When the number of transformations and ETL jobs grows, every team needs a tool to schedule them in a particular order. Apache Airflow has become one of the most widely adopted open-source workflow schedulers in the big data world.
During this training, we'll dive into Apache Airflow, a highly flexible framework for developing and scheduling maintainable, complex workflows. We'll look at how Airflow works, which use cases it is suited for, and how it interacts with tools like Apache Spark. We'll also discuss some of its pitfalls. The training involves a lot of hands-on Python coding, so be prepared to get your hands dirty!