Apache Spark is widely regarded as the most promising large-scale data processing engine right now, which is why we invited our partner SURFsara to host a Spark workshop at our Amsterdam office. The workshop took place on Tuesday, August 4th, and was attended by Xomnia's data science team as well as guests from the IND.
The workshop started with the basics of distributed file systems, Hadoop, and MapReduce, then moved on to advanced topics such as machine learning with MLlib and stream processing. Before long, we were programming Spark machine learning tasks in Python.
We would like to thank SURFsara for hosting this great workshop. It was an inspiring day, and we are eager to put our newfound Spark knowledge into practice.