See how LinkedIn is using Brooklin Mirror Maker (BMM) to provide improved performance and stability while at the same time facilitating better management through finer control of data pipelines.
Learn how LinkedIn uses Apache Samza for stream processing.
Learn how The New York Times uses Kafka for data storage.
This discussion presents the evolution of “People You May Know” (PYMK) to its current architecture. The focus is on the various systems built along the way, with an emphasis on those behind LinkedIn's most recent architecture: Gaia, a real-time graph computing capability, and Venice, an online feature store with scoring capability. It also covers how LinkedIn integrates these individual systems to generate recommendations in a timely and agile manner while remaining cost-efficient.
Miguel Angel Campo highlights how the Century Fox data team developed a system based on Collaborative (Deep) Metric Learning (CML) to predict purchase probabilities for new theatrical releases. He explains how they trained and evaluated the model on a large dataset of customer histories spanning multiple years, and tested it on a set of movies released outside the training window. Initial experiments show gains relative to models that don't train on collaborative preferences.
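The core idea behind CML can be sketched in a few lines: customers and titles are embedded in one shared metric space, a purchased title is pushed closer to the customer than a non-purchased one via a hinge (triplet) loss, and recommendations are the nearest titles. The embeddings, names, and margin below are purely illustrative, not the talk's actual model or data.

```python
import math

# Toy 2-D embeddings in a shared metric space (illustrative values only).
users = {"u1": (0.0, 0.0), "u2": (5.0, 5.0)}
items = {"m1": (1.0, 0.0), "m2": (4.0, 5.0), "m3": (9.0, 9.0)}

def triplet_loss(user, pos, neg, margin=1.0):
    """Hinge loss: a purchased title (pos) should sit closer to the
    user than a non-purchased one (neg), by at least the margin."""
    dp = math.dist(users[user], items[pos])
    dn = math.dist(users[user], items[neg])
    return max(0.0, dp ** 2 - dn ** 2 + margin)

def recommend(user):
    """Rank titles by proximity to the user in the metric space."""
    return sorted(items, key=lambda m: math.dist(users[user], items[m]))

ranking = recommend("u1")  # nearest titles first, e.g. ["m1", "m2", "m3"]
```

In training, gradient descent on the triplet loss moves the embeddings; here the positions are fixed just to show how the loss and the distance-based ranking interact.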
The New York Times has over 3.6 million paid print and digital subscriptions. Learn how The New York Times changed its infrastructure at the core to keep up with the new expectations of the digital age and its consumers. Every piece of content published by The New York Times throughout the past 166 years and counting is stored in Apache Kafka.
In this talk, Boerge Svingen, The New York Times' Director of Engineering, covers the following:
- An overview of what the publishing infrastructure used to look like
- A deep dive into the log-based architecture of The New York Times' publishing pipeline
- The schema, monolog, and skinny log used for storing articles
- Challenges and lessons learned
- Live questions submitted by the audience
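The log-based design outlined above can be sketched minimally: every article revision is appended to an ordered, append-only log, and any downstream system rebuilds its view of "current content" by replaying that log from the beginning. The event shape and function names here are illustrative assumptions, not the Times' actual schema or pipeline code.

```python
# Append-only log of publishing events: one entry per article revision.
log = []

def publish(article_id, body):
    """Append a new revision; the log, not a database, is the source of truth."""
    log.append({"id": article_id, "body": body})

def rebuild_store():
    """Replay the whole log; later revisions of an article overwrite earlier ones.
    Any consumer (search index, cache, site) can rebuild state this way."""
    store = {}
    for event in log:
        store[event["id"]] = event["body"]
    return store

publish("a1", "First draft")
publish("a2", "Another article")
publish("a1", "Corrected draft")

store = rebuild_store()  # store["a1"] is now "Corrected draft"
```

In the real system a log-compacted Kafka topic plays the role of `log`, and new consumers bootstrap by reading the topic from offset zero; the sketch only shows why replaying an ordered log is enough to reconstruct the current state.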