Latest entries

Lessons Learned from the Migration to Apache Airflow - Radek Maciaszek, Skimlinks

Radek Maciaszek presents his learnings from the migration of machine learning and big data processing pipelines to Apache Airflow.

He highlights how they use Airflow to power their company's big data infrastructure, where they analyze hundreds of terabytes of data. Examples cover building the ETL pipeline and using Airflow to manage the machine learning Spark pipeline workflow.

The talk covers basic Airflow concepts and shows real-life examples of how to define your own workflows in Python code. It finishes with more advanced Airflow topics, such as adding custom task operators, sensors, and plugins, as well as best practices and the pros and cons of the tool.

Links


Control Plane for Large Mesh in a Heterogeneous Environment - Fuyuan Bie & Zhimeng Shi, Pinterest

Building a service mesh in a heterogeneous environment with a large number of clusters is challenging. Pinterest has a complicated mixture of thousands of clusters, ranging from IaaS to dockerized services to Kubernetes, with services developed in C++, Java, Python, Node, Go, and Elixir. Using the open source go-control-plane as the interface to Envoy, the data engineering team at Pinterest meshed Pinterest services with a control plane they developed, named Tower. From edge to backends, 100% of services are managed by Tower. They use the actor model and event sourcing to make it performant, reliable, scalable, and extensible.

Links


Governance on K8s: How to Solve Ownership, Metering & Capacity Planning - Micheal Benedict & Yongwen Xu, Pinterest

Pinterest is a cloud-first visual discovery engine that serves over 250 million users. To support this scale, thousands of services run on tens of thousands of hosts, processing 300+ PB of data. Pinterest operates large Kubernetes clusters across several availability zones and regions. The clusters are autoscaled, with support for pod-level autoscaling. Finally, to utilize resources within the clusters effectively, Pinterest operates heterogeneous workloads on a kitchen sink of instance types.

Links


Tinder's Move to Kubernetes - Chris O'Brien & Chris Thomas, Tinder

This discussion covers how Tinder moved its platform to Kubernetes, the challenges encountered along the way, and how they were solved.

Links


Kubernetizing Big Data and ML Workloads at Uber - Mayank Bansal & Min Cai, Uber

Uber relies on Big Data and ML to make business-critical decisions such as pricing and trip ETA. Today, workloads such as Hive and Spark run on YARN. To save millions of dollars through more efficient use of cluster resources, Uber is planning to use Kubernetes to co-locate BigData/ML and micro-service workloads.

This talk covers the following:
- Learnings of running large-scale BigData/ML on Kubernetes with Peloton
- Colocation of mixed workloads
- Federation across zones
- Feature and API parity with YARN

Links


Handling Risky Business: Cluster Upgrades - Puneet Pruthi, Lyft

This talk is about how Lyft has solved the complexity of automating cluster upgrades and how that is incorporated into the design of k8srotator, a Kubernetes custom controller.

Links