Lessons Learned from the Migration to Apache Airflow - Radek Maciaszek, Skimlinks

Radek Maciaszek presents the lessons he learned from migrating machine learning and big data processing pipelines to Apache Airflow.

He highlights how Skimlinks uses Airflow to power its big data infrastructure, where they analyze hundreds of terabytes of data. Examples cover building the ETL pipeline and using Airflow to manage the machine learning Spark pipeline workflow.

The talk covers the basic Airflow concepts and shows real-life examples of how to define your own workflows in Python code. It finishes with more advanced Apache Airflow topics, such as adding custom task operators, sensors, and plugins, as well as best practices and the pros and cons of the tool.

Links


Control Plane for Large Mesh in a Heterogeneous Environment - Fuyuan Bie & Zhimeng Shi, Pinterest

Building a service mesh in a heterogeneous environment with a large number of clusters is challenging. Pinterest runs a complicated mixture of thousands of clusters ranging from IaaS to Dockerized services to Kubernetes, with services written in C++, Java, Python, Node, Go, and Elixir. Using the open source Go control plane as the interface to Envoy, the data engineering team at Pinterest meshed Pinterest services with a control plane they developed, named Tower. From edge to backends, 100% of services are managed by Tower. They use the actor model and event sourcing to make it performant, reliable, scalable, and extensible.

Links


Governance on K8s: How to Solve Ownership, Metering & Capacity Planning - Micheal Benedict & Yongwen Xu, Pinterest

Pinterest is a cloud-first visual discovery engine that serves over 250 million users. To support this scale, thousands of services run on tens of thousands of hosts, processing 300+ PB of data. Pinterest operates large Kubernetes clusters across several availability zones and regions. The clusters are autoscaled, with support for pod-level autoscaling. Finally, to utilize resources within the clusters effectively, Pinterest runs heterogeneous workloads on a kitchen sink of instance types.

Links


Tinder's Move to Kubernetes - Chris O'Brien & Chris Thomas, Tinder

This discussion covers how Tinder moved its platform to Kubernetes, the challenges encountered along the way, and how they were solved.

Links


Kubernetizing Big Data and ML Workloads at Uber - Mayank Bansal & Min Cai, Uber

Uber relies on Big Data and ML to make business-critical decisions such as pricing and trip ETA. Today, workloads such as Hive and Spark run on YARN. To save millions of dollars through more efficient use of cluster resources, Uber is planning to use Kubernetes to co-locate BigData/ML and micro-service workloads.

This talk covers the following:
- Lessons from running large-scale BigData/ML on Kubernetes with Peloton
- Colocation of mixed workloads
- Federation across zones
- Feature and API parity with YARN

Links


Handling Risky Business: Cluster Upgrades - Puneet Pruthi, Lyft

This talk is about how Lyft has solved the complexity of automating cluster upgrades and how that is incorporated into the design of k8srotator, a Kubernetes custom controller.

Links


Seamless Customer Experience at Walmart Stores Powered by Kubernetes@Edge - Maneesh Vittolia, Principal Architect & Sriram Komma, Principal Product Owner, Walmart

At Walmart, while major application software can and does operate in the cloud, stores and other client edge compute cannot avoid intermittent network events, which degrade the availability and performance of the software while they last. This can lead to poor customer experience and failed transactions during checkout. Because Walmart serves around 265 million customers every week, the combined effect on customer experience and lost revenue is substantial.

To overcome this issue between stores and the cloud, Walmart is building and rolling out the next generation of Point of Sale (POS) systems on a highly resource-constrained edge computing environment, using modern service-mesh-based technologies designed to allow maximum business flexibility, extreme performance, and rapid deployment, all powered by Kubernetes.

Links


Serverless Platform for Large Scale Mini-Apps: From Knative to Production - Yitao Dong & Ke Wang, Ant Financial

Ke Wang and Yitao Dong from Ant Financial share the key workloads they are building with serverless and how they address pain points in production by extending Knative. They introduce the technical details of adopting Knative with a secure container runtime and reinventing the Knative control and data planes, which greatly reduces the deployment and operation effort needed to enable serverless in Kubernetes clusters. The discussion also includes a quick demo illustrating improved serverless app lifecycle management, 0-M-N-0 autoscaling performance, and the operation workflow.

Links


Kubernetes at Cruise: Two Years of Multitenancy - Karl Isenberg, Cruise

This talk is about how Cruise has been using Kubernetes. It highlights the motivations, story, and results of migrating to multitenant Kubernetes, along with some hard-earned pro tips from the trenches. You’ll also learn about the open source tooling they built around Spinnaker, Vault, Google Cloud, and Istio to integrate with their multitenant Kubernetes.

Links


Containing the Container: Developer Experience vs Strict Security Posture - Brian Bagdzinski & Sharat Nellutla, Verizon

Verizon IT manages multiple multi-tenant Kubernetes clusters across on-prem and multiple clouds, hosting hundreds of applications. Containers, Kubernetes, and cloud-native are central pillars, both of its application modernization strategy and of its north star architecture. This discussion is about evolving the developer experience in this space despite the security constraints, leveraging open source tooling such as Skaffold, Harbor, Kaniko, and Jib.

Links