Overview for predictive analytics

Using Apache Spark to Predict Installer Retention from Messy Clickstream Data

Learn how Zynga uses PySpark to generate thousands of features without having to manually interpret the events of each game.


Oversubscribing Apache Spark Resource Usage for Fun and $$$

Apache Spark is quickly being adopted at Facebook and now powers an important portion of Facebook’s batch ETL workload. While Spark is typically more efficient than Hive, Facebook continues to search for opportunities to further reduce hardware costs. Recently, they started an effort to apply custom resource oversubscription to every unique Spark job.


Image Similarity Detection at Scale Using LSH and Tensorflow

Learning over images and understanding the quality of content play an important role at Pinterest. This talk presents a Spark-based system responsible for detecting near (and far) duplicate images. The system is used to improve the accuracy of recommendations and search results across a number of production surfaces at Pinterest.


Merchant Churn Prediction Using SparkML at PayPal

In this discussion, PayPal presents techniques for retaining merchants using machine learning models built on the SparkML platform.


Scaling Machine Learning at with H2O Sparkling Water and FeatureStore

In this talk, Luca Falsina and Brammert Ottens share their journey from the early days, when models were largely hand-crafted, to the present, where they have tooling for discovering and building reusable online and offline features and self-service tools to deploy models to production quickly. They highlight how Spark and H2O’s Sparkling Water play a key role in building scalable models with large training datasets while allowing fast predictions on


AI as a Service, Build Shared AI Service Platforms Based on Deep Learning Technologies

Suqiang Song, a director and chapter leader at Mastercard, shares the vision and the production journey of how they build enterprise shared AI-as-a-Service platforms with distributed deep learning technologies.