Overview for health care

AI in practice: how we help cure diseases using Big Data and AI - Chen Admati @ Intel (Hebrew)

Chen Admati, Head of Intel Pharma Analytics Platform at Intel Corporation, discusses AI in practice and how we help cure diseases using Big Data and AI.


Unafraid of Change: Optimizing ETL, ML, and AI in Fast-Paced Environments

While processing more data through an existing set of ETL or ML/AI pipelines is easy with Spark, dealing with an ever-expanding or changing set of pipelines can be quite challenging, all the more so when there are complex inter-dependencies. Workflow-based job orchestration offers some help with relatively static flows but fails miserably at supporting fast-paced data production such as data science experimentation, ad hoc analytics and root cause analysis.

This talk introduces three patterns for large-scale data production in fast-paced environments: just-in-time dependency resolution (JDR), configuration-addressed production (CAP) and automated lifecycle management (ALM). It includes ETL and ML/AI demos as well as open-source code you can use in your projects. These patterns have been production-tested in Swoop’s petabyte-scale environment, where they have significantly increased human productivity and processing flexibility while reducing costs by more than 10x.


HIPAA Compliant Deployment of Apache Spark on AWS for Healthcare

Healthcare enterprises like Collective Health have to comply with HIPAA, which makes scalable analytics very difficult. After struggling with various strict requirements, Collective Health came up with a unique deployment option for running Spark on EMR: the team used Terraform (by HashiCorp) to build a HIPAA-compliant Spark and Zeppelin cluster on Amazon EMR.

This solution encrypts all data at rest and in flight, logs all user activity, and satisfies many other primitives of a HIPAA-compliant environment. The use of Terraform provides a high degree of control over cluster configuration, data accessibility, scalability, security, and availability.


Dynamic Healthcare Dataset Generation, Curation, and Quality with PySpark

Learn how the data engineering team at Modernizing Medicine has created an object-oriented dynamic dataset generation framework using Python and Spark. The scalability and performance of Spark allow terabytes of data to be extracted from numerous application servers and combined into Parquet files for analysis in clinical research.


Clinical Decision Making with Machine Learning

Oleksii Barash, Ph.D., IVF Laboratory Research Director at the Reproductive Science Center of the San Francisco Bay Area, discusses his team’s approach to applying machine learning to decision making during infertility treatment. Oleksii also gives a quick overview of how he uses Driverless AI to build models for predicting IVF outcomes.