real time analytics @ netflix

Real-Time Analytics @ Netflix Cody Rioux - @codyrioux Real-Time Analytics - Insight Engineering

Upload: cody-rioux

Post on 16-Apr-2017



TRANSCRIPT

Page 1: Real time analytics @ netflix

Real-Time Analytics @ Netflix

Cody Rioux - @codyrioux
Real-Time Analytics - Insight Engineering

Page 2: Real time analytics @ netflix

Overview
● Real-Time Analytics
  ○ Anomaly / Outlier Detection
  ○ Canary Analysis
● Architecture
● Challenges
  ○ Cold Start
  ○ Concept Drift
  ○ Configuration
  ○ Change Deployment
  ○ User Acceptance

Page 3: Real time analytics @ netflix

We are drowning in information but starved for knowledge. - John Naisbitt

Real-Time Analytics

Page 4: Real time analytics @ netflix

Real-Time Analytics
● Part of Insight Engineering.
● Build systems that make intelligent decisions about our operational environment.
  ○ Make decisions in near real-time.
  ○ Automate actions in the production environment.
● Support operational availability and reliability.

Page 5: Real time analytics @ netflix

One of these things is not like the others.

Anomaly and Outlier Detection

Page 6: Real time analytics @ netflix

Unexpected value for a given generating mechanism.
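
To make the definition concrete, here is a minimal sketch of one common outlier test, a modified z-score built on the median and the median absolute deviation (MAD). It illustrates the idea only and is not the method used by the team in this talk; the function name, the sample latencies, and the 3.5 cutoff are assumptions chosen for the example.

```python
from statistics import median

def mad_outliers(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds threshold.

    Hypothetical helper: the median and MAD stand in for the "generating
    mechanism", and a value far from them counts as unexpected.
    """
    med = median(values)
    abs_dev = [abs(v - med) for v in values]
    mad = median(abs_dev)
    if mad == 0:
        # At least half of the values equal the median, so flag anything that differs.
        return [i for i, d in enumerate(abs_dev) if d > 0]
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return [i for i, d in enumerate(abs_dev) if 0.6745 * d / mad > threshold]

latencies_ms = [101, 99, 102, 98, 100, 97, 450, 103]
print(mad_outliers(latencies_ms))  # index of the 450 ms sample
```

The median and MAD are used instead of the mean and standard deviation because a single extreme value barely moves them, so the outlier cannot hide itself by inflating the baseline.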

Page 7: Real time analytics @ netflix

Terminology

Outlier Anomaly

Page 8: Real time analytics @ netflix
Page 9: Real time analytics @ netflix

Good builds gone bad!

Automated Canary Analysis

Page 10: Real time analytics @ netflix

Netflix Canary Release Process (diagram)

Customers -> Load Balancer
● Old Version (v1.0): 88 Servers
● New Version (Canary - v1.1): 6 Servers
● Old Version (Control - v1.0): 6 Servers
Metrics -> Analysis
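
The analysis step compares metrics from the canary against the same metrics from the control group. The sketch below shows one simple way such a comparison could be scored. It is an assumption for illustration, not the scoring used in Netflix's canary analysis; the metric names, sample values, and 10% tolerance are hypothetical.

```python
from statistics import mean

def canary_score(control, canary, tolerance=0.10):
    """Fraction of metrics whose canary mean is within tolerance of the control mean."""
    passed = 0
    for name, control_samples in control.items():
        c = mean(control_samples)
        k = mean(canary[name])
        if c == 0:
            # No baseline to scale by: pass only if the canary is also zero.
            ok = k == 0
        else:
            ok = abs(k - c) / abs(c) <= tolerance
        passed += ok
    return passed / len(control)

# Hypothetical metrics collected from the control and canary clusters.
control = {"error_rate": [0.010, 0.012, 0.011], "latency_ms": [100, 104, 98]}
canary = {"error_rate": [0.030, 0.028, 0.035], "latency_ms": [101, 103, 99]}

score = canary_score(control, canary)
print(score)  # 0.5: latency is within tolerance, error rate is not
```

Comparing the canary against an equally sized control group running the old version, rather than against the whole old fleet, keeps the two samples comparable in size and age.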

Page 11: Real time analytics @ netflix

A Data Scientist’s capability to extract value from data is largely coupled with the maturity of the data platform of its company. - Robert Chang

Analytic Architecture

Page 12: Real time analytics @ netflix

First Generation Architecture (diagram)
Components: Customers, CSI (REST)

Page 13: Real time analytics @ netflix

Second Generation Architecture (diagram)
Components: Customers, Load Balancer, OpenPy, ..., OpenPy, Models, RTA Data Poller, Telemetry Data

Page 14: Real time analytics @ netflix

Heating things up beginning at absolute zero

Challenge Zero: Cold Start

Page 16: Real time analytics @ netflix
Page 17: Real time analytics @ netflix

New Architecture (diagram)
Components: Customers, Load Balancer, OpenPy, ..., OpenPy, Models, RTA Data Poller, Telemetry Data, Data Tagger, Data Store

Page 18: Real time analytics @ netflix

Change in data stream over time.

Challenge One: Concept Drift

Page 20: Real time analytics @ netflix

Solutions: Concept Drift

Monitoring the behavior of analytics and soliciting user feedback.
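
As an illustration of the monitoring half of this solution, the sketch below tracks a sliding window of recent values and reports drift when the window mean moves too far from a reference period. The class name, window size, and three-standard-deviation limit are assumptions for the example, not details from the talk.

```python
from collections import deque
from statistics import mean, stdev

class DriftMonitor:
    """Hypothetical monitor: compares a recent window against a reference period."""

    def __init__(self, reference, window=5, max_shift=3.0):
        self.ref_mean = mean(reference)
        self.ref_std = stdev(reference)  # needs at least two reference points
        self.recent = deque(maxlen=window)
        self.max_shift = max_shift

    def observe(self, value):
        """Record a value; return True once the recent window has drifted."""
        self.recent.append(value)
        if len(self.recent) < self.recent.maxlen:
            return False  # not enough recent data to judge yet
        shift = abs(mean(self.recent) - self.ref_mean)
        return shift > self.max_shift * self.ref_std

reference = [10, 11, 9, 10, 10, 11, 9, 10]
monitor = DriftMonitor(reference)
flags = [monitor.observe(v) for v in [10, 11, 15, 16, 15, 16, 15]]
print(flags)
```

A drift signal like this could prompt retraining or a request for user feedback, rather than being treated as an anomaly in its own right.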

Page 21: Real time analytics @ netflix

New Architecture (diagram)
Components: Customers, Load Balancer, OpenPy, ..., OpenPy, Models, RTA Data Poller, Telemetry Data, Data Tagger, Data Store, Feedback

Page 22: Real time analytics @ netflix

Removing the burden of configuration complexity for the user.

Challenge Two: Configuration

Page 23: Real time analytics @ netflix

Configurations are complex.

Assumptions and meta-analytics eliminate user burden.
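
One hedged reading of "meta-analytics" is that the system inspects the data to pick its own settings instead of asking the user for them. The sketch below derives an alert threshold from historical values with a percentile. The function name and the choice of the 99th percentile are assumptions for illustration, not the approach described in the talk.

```python
from statistics import quantiles

def infer_threshold(history, percentile=99):
    """Derive an alert threshold from history instead of asking the user for one.

    percentile must be an integer from 1 to 99; history needs at least two values.
    """
    cut_points = quantiles(history, n=100)  # 99 cut points between percentiles
    return cut_points[percentile - 1]

# Hypothetical history of a metric: the values 1 through 100.
history = list(range(1, 101))
threshold = infer_threshold(history)
print(threshold)
```

The user supplies no numbers here; the only assumption baked in is that roughly the top 1% of past values were unusual.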

Page 24: Real time analytics @ netflix

General is inherently more complex than specific.

Page 25: Real time analytics @ netflix

New Architecture (diagram)
Components: Customers, Load Balancer, OpenPy, ..., OpenPy, Models, RTA Data Poller, Telemetry Data, Data Tagger, Data Store, Feedback

Page 26: Real time analytics @ netflix

Progress is impossible without change, and those who cannot change their minds cannot change anything - George Bernard Shaw

Challenge Three: Change Deployment

Page 27: Real time analytics @ netflix

New Architecture (diagram)
Components: Customers, Load Balancer, OpenPy, ..., OpenPy, Models, RTA Data Poller, Telemetry Data, Data Tagger, Data Store, Feedback

Page 28: Real time analytics @ netflix

REST API

/REST/v1/anomaly

/REST/v2/anomaly

/REST/v3/anomaly

...
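
Versioned endpoints let a changed analytic ship under a new path while existing callers keep the behavior they were built against. The sketch below shows the idea with a plain dispatch table. Only the paths come from the slide; the two detector functions and their rules are hypothetical, and no web framework is assumed.

```python
def detect_v1(values):
    """Hypothetical v1 analytic: flag values above a fixed cutoff."""
    return [v > 100 for v in values]

def detect_v2(values):
    """Hypothetical v2 analytic: flag values more than twice the mean."""
    avg = sum(values) / len(values)
    return [v > 2 * avg for v in values]

# Each version stays reachable at its own path after a newer one is deployed.
ROUTES = {
    "/REST/v1/anomaly": detect_v1,
    "/REST/v2/anomaly": detect_v2,
}

def handle(path, values):
    """Dispatch a request to the analytic version named in the path."""
    try:
        handler = ROUTES[path]
    except KeyError:
        raise LookupError(f"unknown endpoint: {path}") from None
    return handler(values)

samples = [50, 60, 120, 400]
print(handle("/REST/v1/anomaly", samples))
print(handle("/REST/v2/anomaly", samples))
```

Because v1 is never modified in place, a caller tuned against its output sees no change when v2 is introduced.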

Page 29: Real time analytics @ netflix

New Architecture (diagram)
Components: Customers, Models, Mantis (Stream Processor), Spark, Telemetry Data, Data Tagger, Data Store, Feedback, Fact Table, Versioned JAR Files

Page 30: Real time analytics @ netflix

“Machine learning is really good at partially solving just about any problem.” - cdixon

Challenge Four: User Acceptance

Page 31: Real time analytics @ netflix

User Acceptance
● Understandable analytics.
● Favor probabilities for inputs and outputs.
● Conceptual documentation.

Page 32: Real time analytics @ netflix

User Acceptance

?

Page 33: Real time analytics @ netflix

Our platform is less bad than it used to be. :)

Recap

Page 34: Real time analytics @ netflix

New Architecture (diagram)
Components: Customers, Models, Mantis (Stream Processor), Spark, Telemetry Data, Data Tagger, Data Store, Feedback, Fact Table, Versioned JAR Files

Page 35: Real time analytics @ netflix

Recap
● Cold Start: Data Tagging
● Concept Drift: Feedback Loop
● Configuration: Assumptions and Meta Analytics
● Change Deployment: Versioned Analytics
● User Acceptance: Docs, probabilities, understandable analytics.

Page 36: Real time analytics @ netflix

Literature

Machine Learning: The High Interest Credit Card of Technical Debt (Sculley et al., 2014)

Page 37: Real time analytics @ netflix

Literature
● Practical Machine Learning: A New Look at Anomaly Detection (Dunning, 2014)
● Distinguishing cause from effect using observational data: methods and benchmarks (Mooij et al., 2014)
● Enhancing Performance Prediction Robustness by Combining Analytical Modeling and Machine Learning (Didona et al., 2015)

Page 38: Real time analytics @ netflix

[email protected]
@codyrioux
linkedin.com/in/codyrioux