Overview for HBase

Real-Time Detection of Anomalies in the Database Infrastructure using Apache Spark

Learn how CERN, the world's largest physics laboratory, stores and processes the large volumes of data it generates every hour using scalable systems such as Hadoop, Spark, and HBase.


Intro to Building a Distributed Pipeline for Real Time Analysis of Uber

Applying Machine Learning to IoT: an end-to-end distributed pipeline for real-time Uber data using Apache APIs: Kafka, Spark, HBase


Using Spark to Analyze Activity and Performance in High Speed Trading Environments

When it comes to processing and analyzing large-scale data for trades worth hundreds of millions of dollars, every bit matters. The insights derived from analysis in these environments are highly valuable to a financial audience. In this session, learn how Spark is used to analyze trading activity in the world’s largest financial institutions. We’ll explore:
– Why Spark is the right tool for analysis in these intense environments
– Optimizing Spark applications to work with high-volume, real-time trading data
– How to source authoritative, accurate monitoring data without impacting trading system performance
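To make the kind of analysis described above concrete, here is a minimal plain-Python sketch of a windowed aggregation a Spark job might perform over trade ticks: volume-weighted average price (VWAP) per symbol in tumbling one-second windows. The function name, tick layout, and sample values are all illustrative assumptions, not anything from the talk.

```python
from collections import defaultdict

def vwap_by_window(ticks, window_ms=1000):
    """Compute VWAP per (symbol, window) from (timestamp_ms, symbol, price, size) ticks.

    Illustrative stand-in for a Spark windowed aggregation; tick layout is assumed.
    """
    # Accumulate notional (price * size) and total size per (symbol, window) key.
    acc = defaultdict(lambda: [0.0, 0])
    for ts, sym, price, size in ticks:
        key = (sym, ts // window_ms)  # tumbling window index
        acc[key][0] += price * size
        acc[key][1] += size
    return {k: notional / size for k, (notional, size) in acc.items()}

# Made-up sample ticks: two fall in window 1, one in window 2.
ticks = [
    (1000, "AAPL", 190.00, 100),
    (1500, "AAPL", 190.10, 300),
    (2100, "AAPL", 190.20, 200),
]
print(vwap_by_window(ticks))
```

In a real Spark job the same logic would be expressed as a groupBy over a time window; the point here is only the shape of the computation.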


Dynamic DDL: Adding Structure to Streaming Data on the Fly

GoPro has massive amounts of heterogeneous data being streamed from their consumer devices and applications, and they have developed the concept of “dynamic DDL” to structure their streamed data on the fly using Spark Streaming, Kafka, HBase, Hive and S3. The idea is simple: Add structure (schema) to the data as soon as possible; allow the providers of the data to dictate the structure; and automatically create event-based and state-based tables (DDL) for all data sources to allow data scientists to access the data via their lingua franca, SQL, within minutes.
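The core of the “dynamic DDL” idea can be sketched in a few lines of plain Python: infer a column type for each field of an incoming JSON event and emit a CREATE TABLE statement, so the data provider’s structure dictates the schema. The type mapping, table name, and event fields below are illustrative assumptions, not GoPro’s actual implementation.

```python
import json

def sql_type(value):
    """Map a JSON value to a Hive-style column type (assumed mapping)."""
    if isinstance(value, bool):  # check bool before int: bool is a subclass of int
        return "BOOLEAN"
    if isinstance(value, int):
        return "BIGINT"
    if isinstance(value, float):
        return "DOUBLE"
    return "STRING"

def infer_ddl(table_name, event):
    """Build a CREATE TABLE statement from a single JSON event's fields."""
    cols = ", ".join(f"{k} {sql_type(v)}" for k, v in event.items())
    return f"CREATE TABLE IF NOT EXISTS {table_name} ({cols})"

# Made-up sample event from a hypothetical device stream.
event = json.loads('{"device_id": "go123", "battery": 0.87, "recording": true}')
print(infer_ddl("gopro_events", event))
# → CREATE TABLE IF NOT EXISTS gopro_events (device_id STRING, battery DOUBLE, recording BOOLEAN)
```

In the pipeline described in the talk, the equivalent inference would run inside Spark Streaming and the DDL would be issued against Hive.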


Building Data Product Based on Apache Spark at Airbnb

Building data products requires a Lambda Architecture to bridge batch and streaming processing. AirStream is a framework built on top of Apache Spark that lets users easily build data products at Airbnb. It has proven Spark to be impactful and useful in production for mission-critical data products.
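The Lambda Architecture bridging mentioned above boils down to merging a precomputed batch view with a real-time (speed-layer) view at query time. Here is a minimal sketch of that merge for per-key counts; the function, keys, and numbers are illustrative assumptions, not AirStream’s API.

```python
def merge_views(batch_view, realtime_view):
    """Combine per-key counts from the batch layer and the speed layer.

    Illustrative Lambda-Architecture serving step, not AirStream's actual API.
    """
    merged = dict(batch_view)
    for key, count in realtime_view.items():
        # Speed-layer counts cover events that arrived after the last batch run.
        merged[key] = merged.get(key, 0) + count
    return merged

batch_view = {"listing_42": 100, "listing_7": 55}   # from a nightly batch job
realtime_view = {"listing_42": 3, "listing_99": 1}  # from streaming updates
print(merge_views(batch_view, realtime_view))
# → {'listing_42': 103, 'listing_7': 55, 'listing_99': 1}
```

A framework like AirStream lets the same transformation logic feed both views, so the merge stays this simple.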


Applying Machine Learning to IoT: End-to-End Distributed Pipeline for a Real-Time Uber Monitoring Dashboard using Apache APIs: Kafka, Spark, HBase

Carol McDonald discusses the architecture of an end-to-end application that combines streaming data with machine learning to do real-time analysis and visualization of where and when Uber cars are clustered, so as to analyze and visualize the most popular Uber locations.
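Finding where cars are clustered is, at its core, a clustering problem over pickup coordinates. The talk’s pipeline would use Spark ML for this; as a stand-in, here is a tiny plain-Python k-means over made-up (lat, lon) points, with deterministic initialization for reproducibility. All names and coordinates are illustrative assumptions.

```python
def kmeans(points, k, iters=20):
    """Tiny k-means over (lat, lon) tuples; stands in for Spark ML's KMeans."""
    # Deterministic init: use the first k points as starting centers.
    centers = [tuple(p) for p in points[:k]]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to its nearest center (squared distance).
            i = min(range(k),
                    key=lambda c: (p[0] - centers[c][0]) ** 2
                                + (p[1] - centers[c][1]) ** 2)
            clusters[i].append(p)
        for c, members in enumerate(clusters):
            if members:
                # Recompute each center as the mean of its cluster.
                centers[c] = (sum(m[0] for m in members) / len(members),
                              sum(m[1] for m in members) / len(members))
    return centers

# Illustrative pickups: two near midtown Manhattan, two near JFK.
pickups = [(40.75, -73.99), (40.76, -73.98), (40.64, -73.78), (40.65, -73.79)]
centers = kmeans(pickups, k=2)
print(centers)
```

The resulting centers mark the “popular locations”; in the described application, the streaming side would then assign each incoming ride event to its nearest center for the live dashboard.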