Overview for Data Eng Weekly

The Benefits of Running Spark on your own Docker

In this talk, Shir Bromberg, a Big Data team leader at Yotpo, discusses their open-source Docker images for running Spark on Nomad servers. She highlights the following:
* The issues they had running Spark on managed clusters and the solutions they developed.
* How to build a Spark Docker image.
* What can be achieved by running Spark on Nomad.



Optimizing Spark-based data pipelines - are you up for it?

Nielsen Marketing Cloud needs to ingest billions of events per day into its big data stores for real-time analytics. Etti Gur, Senior Big Data Developer, and Itai Yaffe, Tech Lead of the Big Data group, discuss how they significantly optimized a Spark-based daily in-flight analytics pipeline, reducing its total execution time from over 20 hours to 2 hours and cutting costs substantially.


How to performance-tune Spark applications in large clusters

Omkar Joshi, a senior software engineer at Uber, discusses Marmaray, a new Spark ingestion system designed to ingest billions of Kafka messages at 30-minute intervals.




The Need for Speed – Data Streaming in the Cloud with Kafka®

Running Kafka on Kubernetes is becoming increasingly popular. Frank Pientka, Principal Software Architect at Materna Information & Communications SE, introduces the setup, the components used, and recommendations from his own project running Kafka on Kubernetes. He shares the lessons learned in this still-evolving field.
