Data Eng Weekly
In this talk, Shir Bromberg, a Big Data team leader at Yotpo, discusses their open-source Docker images for running Spark on Nomad servers. She highlights the following:
* The issues they had running Spark on managed clusters and the solutions they developed.
* How to build a Spark Docker image.
* What can be achieved by running Spark on Nomad.
Nielsen Marketing Cloud needs to ingest billions of events per day into its big data stores for real-time analytics. Etti Gur, Senior Big Data Developer, and Itai Yaffe, Tech Lead of the Big Data group, discuss how they optimized their Spark-based daily in-flight analytics pipeline, cutting its total execution time from over 20 hours to 2 hours and substantially reducing costs.
Omkar Joshi, a senior software engineer at Uber, discusses Marmaray, a new Spark ingestion system designed to ingest billions of Kafka messages at 30-minute intervals.
This talk covers how LinkedIn customizes Apache Kafka to handle 7 trillion messages per day.
Running Kafka on Kubernetes is becoming increasingly popular. Frank Pientka, Principal Software Architect at Materna Information & Communications SE, presents the setup, the components used, and recommendations from his own project running Kafka on Kubernetes. He also shares lessons learned in this still-evolving field.
Patrik Kleindl discusses the use of Kafka at BearingPoint.