Overview for monitoring

Attribution Done Right

Thiago Rigo, a software engineer at GetYourGuide, takes you through how GetYourGuide developed a solution that cleans and structures logs from different data sources, applies rules to deal with channel assignment, and finally weights each channel’s contribution to total revenue properly. The talk covers the business and technical challenges solved and how the solution was implemented at GetYourGuide using Spark and Databricks.


War Stories: DIY Kafka

This talk explains some problems Zalando experienced while running Kafka brokers and Kafka Streams applications, as well as the consultations the data team had with other experts on the same issues. It highlights some of the decisions made regarding backups, monitoring, and operations to minimize the time spent administering the Kafka brokers and various stream applications.


How Big Fish Games Developed Real-Time Analytics Using Kafka Streams and Elasticsearch

Big Fish Games, a leading producer and distributor of casual and mid-core games, has distributed more than 2.5 billion games to customers in 150 countries, representing over 450 unique mobile games and over 3,500 unique PC games. Big Fish Games uses Apache Kafka to process data generated across game play. Recently, they introduced real-time analytics of game data using Kafka Streams integrated with Elasticsearch. This allows them to monitor the results of live operations and to make changes to these events after they have gone live.

This presentation gives a detailed explanation of how Big Fish Games used Kafka Streams to transform raw data into Elasticsearch documents, and how the application was scaled to over a million daily active users. It will also touch on the limitations discovered with the Kafka Connect integration with Elasticsearch and how to use Elasticsearch bulk processing with Kafka Streams.


Threading Needles in a Haystack: Sessionizing the Uber firehose in realtime

In this discussion, learn about the evolution of Uber's Rider Session state machine and the challenges involved in managing a realtime stateful streaming pipeline across half a dozen event streams and millions of riders who use Uber every day. Also covered are various aspects of running the job in production, such as managing state checkpointing, monitoring, the experience of moving the job from Spark Streaming to Flink, and ensuring low latency to downstream systems.