FlowSpec—Apache Spark Pipelines in Production

This talk is about the use of Spark pipelines at Danske Bank. Data scientists in the organization use Spark pipelines to make the features they generate uniform and to streamline the modelling process. Subramaniam Ramasubramanian, a software engineer with Danske Bank, focuses on how FlowSpec, a simple prototype tool that took a couple of weeks to develop, helped reduce time to market for models, ensure data quality, create a fair and clear separation of duties, and offer a consolidated solution to recurring problems in the arduous process of moving ML models from different teams and departments in a large organization to production.
