Overview for spark-mllib

How to Rebuild an End-to-End ML Pipeline with Databricks and Upwork

In this talk, Thanh Tran, a Director of Data Science at Upwork, presents the company's modernization efforts in four areas:

1) a holistic data processing infrastructure for batch and stream data processing using S3, Kinesis, Spark and Spark Structured Streaming;

2) model development using Spark MLlib and other ML libraries for Spark;

3) model serving using Databricks Model Scoring, Scoring over Structured Streams and microservices; and

4) orchestration and streamlining of all these processes using Apache Airflow and a CI/CD workflow customized to their data science product engineering needs.


Classifying Text in Money Transfers: A Use Case of Apache Spark in Production for Banking

Learn how BBVA, the second biggest bank in Spain, uses Spark. This discussion focuses on the process followed by the Data Science team: the problem (classifying 700K daily transfers by their text), the data science challenges, the algorithmic and engineering solution, and the achievements and lessons learned.


Intro to Building a Distributed Pipeline for Real Time Analysis of Uber

Applying machine learning to IoT: an end-to-end distributed pipeline for real-time Uber data using Apache APIs: Kafka, Spark and HBase.


Large-Scale Machine Learning: Use Cases and Technologies

This talk highlights Yahoo use cases that best exemplify big data and machine learning technologies. It explains the algorithm and system challenges of scaling ML algorithms to massive datasets, and gives a technical overview of CaffeOnSpark and TensorFlowOnSpark to jumpstart your journey into large-scale machine learning.


Transforming B2B Sales with Spark-Powered Sales Intelligence

Learn how LinkedIn uses Spark to build sales intelligence products. This discussion introduces a comprehensive B2B intelligence system built on top of various open source stacks. The system puts advanced data science to work in a dynamic and complex scenario, in an easily controllable and interpretable way. Balancing flexibility and complexity, the system can deal with various problems in a unified manner and yield actionable insights for the business. You will also learn about some impactful Spark-ML powered applications such as prospect prediction and prioritization, churn prediction and model interpretation, as well as the challenges and lessons learned at LinkedIn while building such a platform.