Strata 2014 Anomaly Detection


Upload: ted-dunning

Post on 08-Sep-2014

Category: Technology


DESCRIPTION

This talk describes the general architecture common to anomaly detection systems that are based on probabilistic models. By examining several realistic use cases, I illustrate the common themes and practical implementation methods.

TRANSCRIPT

Page 1: Strata 2014 Anomaly Detection

© MapR Technologies, confidential ®

Anomaly Detection

Page 2: Strata 2014 Anomaly Detection

Agenda

•  What is anomaly detection?
•  Some examples
•  Some generalization
•  More interesting examples
•  Sample implementation methods

Page 3: Strata 2014 Anomaly Detection

Who I am

•  Ted Dunning, Chief Application Architect, MapR [email protected] [email protected] @ted_dunning

•  Committer, mentor, champion, PMC member on several Apache projects
   –  Mahout, Drill, ZooKeeper, others

Page 4: Strata 2014 Anomaly Detection

Who we are

•  MapR makes the technology-leading distribution that includes Hadoop

•  MapR integrates real-time data semantics directly into a system that also runs Hadoop programs seamlessly

•  The biggest and best choose MapR
   –  Google, Amazon
   –  Largest credit card, retailer, health insurance, telco
   –  Ping me for info

Page 5: Strata 2014 Anomaly Detection

What is Anomaly Detection?

•  What just happened that shouldn’t?
   –  but I don’t know what failure looks like (yet)

•  Find the problem before other people see it
   –  especially customers and CEOs

•  But don’t wake me up if it isn’t really broken

Page 6: Strata 2014 Anomaly Detection

Page 7: Strata 2014 Anomaly Detection

Looks pretty anomalous to me

Page 8: Strata 2014 Anomaly Detection

Will the real anomaly please stand up?

Page 9: Strata 2014 Anomaly Detection

What Are We Really Doing

•  We want action when something breaks (dies/falls over/otherwise gets in trouble)

•  But action is expensive
•  So we don’t want false alarms
•  And we don’t want false negatives

•  We need to trade off costs

Page 10: Strata 2014 Anomaly Detection

A Second Look

Page 11: Strata 2014 Anomaly Detection

A Second Look

99.9%-ile

Page 12: Strata 2014 Anomaly Detection

With Spikes

Page 13: Strata 2014 Anomaly Detection

With Spikes

99.9%-ile including spikes

Page 14: Strata 2014 Anomaly Detection

How Hard Can it Be?

[Diagram: each value x goes to an online summarizer that estimates the 99.9%-ile threshold t; if x > t, raise an alarm.]
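The loop on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `NaiveSummarizer` and `detect` are hypothetical names, the summarizer keeps every value in a sorted list (unbounded memory) rather than using a real on-line percentile estimator, and the test signal is synthetic.

```python
import bisect
import random


class NaiveSummarizer:
    """Keeps every observation in sorted order (illustration only)."""

    def __init__(self):
        self.values = []

    def add(self, x):
        bisect.insort(self.values, x)

    def quantile(self, q):
        # Nearest-rank estimate of the q-quantile of everything seen so far.
        if not self.values:
            raise ValueError("no data yet")
        index = min(len(self.values) - 1, int(q * len(self.values)))
        return self.values[index]


def detect(stream, q=0.999, warmup=1000):
    """Yield (index, x, t) for each x that exceeds the running q-quantile t."""
    summary = NaiveSummarizer()
    for i, x in enumerate(stream):
        if i >= warmup:
            t = summary.quantile(q)
            if x > t:
                yield i, x, t
        summary.add(x)


random.seed(0)
signal = [random.gauss(0.0, 1.0) for _ in range(5000)]
signal[4000] = 12.0  # injected spike (hypothetical data)
alarms = list(detect(signal))
```

The threshold is learned from the data itself, so nobody has to know in advance what "too big" means. Ordinary values still cross a 99.9%-ile threshold roughly once per thousand samples, which is why a few alarms besides the injected spike are expected.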

Page 15: Strata 2014 Anomaly Detection

On-line Percentile Estimates

•  Apache Mahout has an on-line percentile estimator
   –  very high accuracy for extreme tails
   –  new in version 0.9 !!

•  What’s the big deal with anomaly detection?

•  This looks like a solved problem

Page 16: Strata 2014 Anomaly Detection

Spot the Anomaly

Anomaly?

Page 17: Strata 2014 Anomaly Detection

Maybe not!

Page 18: Strata 2014 Anomaly Detection

Where’s Waldo?

This is the real anomaly

Page 19: Strata 2014 Anomaly Detection

Normal Isn’t Just Normal

•  What we want is a model of what is normal

•  What doesn’t fit the model is the anomaly

•  For simple signals, the model can be simple …

•  The real world is rarely so accommodating

x ~ N(0,ε)

Pages 20-34: Strata 2014 Anomaly Detection

We Do Windows

Page 35: Strata 2014 Anomaly Detection

Windows on the World

•  The set of windowed signals is a nice model of our original signal

•  Clustering can find the prototypes
   –  Fancier techniques available using sparse coding

•  The result is a dictionary of shapes
•  New signals can be encoded by shifting, scaling and adding shapes from the dictionary
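A minimal sketch of the windowing idea, under several simplifying assumptions that are not from the talk: windows do not overlap, the dictionary comes from a plain k-means, and each window is encoded with a single scaled prototype (no shifting, no adding of several shapes). The function names and the synthetic sine-wave signal are hypothetical.

```python
import math
import random


def windows(signal, width):
    """Split a signal into consecutive non-overlapping windows of equal width."""
    return [signal[i:i + width] for i in range(0, len(signal) - width + 1, width)]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's k-means; returns k prototype windows (the dictionary)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centers[j]))
            groups[nearest].append(p)
        for j, group in enumerate(groups):
            if group:  # keep the old center if a cluster went empty
                centers[j] = [sum(col) / len(group) for col in zip(*group)]
    return centers


def encode(window, dictionary):
    """Best single scaled prototype for a window, plus the squared residual."""
    best = None
    for shape in dictionary:
        energy = sum(s * s for s in shape)
        # Least-squares scale factor for this shape.
        scale = sum(w * s for w, s in zip(window, shape)) / energy if energy else 0.0
        approx = [scale * s for s in shape]
        err = sq_dist(window, approx)
        if best is None or err < best[1]:
            best = (approx, err)
    return best


width = 32
noise = random.Random(1)
train = [math.sin(2 * math.pi * i / width) + noise.gauss(0.0, 0.05)
         for i in range(width * 40)]
dictionary = kmeans(windows(train, width), k=4)

normal = train[:width]
glitch = list(normal)
glitch[10:14] = [3.0, -3.0, 3.0, -3.0]  # hypothetical anomaly

_, normal_err = encode(normal, dictionary)
_, glitch_err = encode(glitch, dictionary)
```

A window that looks like the training data reconstructs with a small residual, while the glitched window does not, so the residual can serve as the one-dimensional signal that a percentile threshold is applied to.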

Page 36: Strata 2014 Anomaly Detection

Most Common Shapes (for EKG)

Page 37: Strata 2014 Anomaly Detection

Reconstructed signal

[Figure panels: original signal, reconstructed signal, reconstruction error]

Page 38: Strata 2014 Anomaly Detection

An Anomaly

The original technique for finding 1-d anomalies works on the reconstruction error

Page 39: Strata 2014 Anomaly Detection

Close-up of anomaly

Not what you want your heart to do. And not what the model expects it to do.

Page 40: Strata 2014 Anomaly Detection

A Different Kind of Anomaly

Page 41: Strata 2014 Anomaly Detection

Model Delta Anomaly Detection

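[Diagram: the model's prediction is subtracted from the input to give δ; an online summarizer estimates the 99.9%-ile threshold t of δ; if δ > t, raise an alarm.]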
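The model-delta detector on this slide might be sketched as follows. The assumptions here are mine, not the talk's: the "model" is simply a known sine wave, the delta is taken as an absolute difference, the running quantile is kept in a sorted list, and `model_delta_alarms` is a hypothetical name.

```python
import bisect
import math
import random


def model_delta_alarms(signal, predicted, q=0.999, warmup=500):
    """Return indices where |signal - predicted| exceeds the running
    q-quantile of the deltas seen earlier."""
    past = []  # sorted deltas seen so far (naive, unbounded memory)
    alarms = []
    for i, (x, expected) in enumerate(zip(signal, predicted)):
        delta = abs(x - expected)
        if i >= warmup:
            t = past[min(len(past) - 1, int(q * len(past)))]
            if delta > t:
                alarms.append(i)
        bisect.insort(past, delta)
    return alarms


rng = random.Random(7)
period = 50
predicted = [math.sin(2 * math.pi * i / period) for i in range(3000)]
signal = [p + rng.gauss(0.0, 0.02) for p in predicted]
signal[2012] = 0.0  # hypothetical glitch: an ordinary value at the wrong moment

alarms = model_delta_alarms(signal, predicted)
```

The glitch value of 0.0 is well inside the normal range of the raw signal, so a threshold on x alone would never fire; it stands out only because the model expected a value near 1 at that moment.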

Page 42: Strata 2014 Anomaly Detection

The Real Inside Scoop

•  The model-delta anomaly detector is really just a mixture distribution
   –  the model we know about already
   –  and a normally distributed error

•  The output (delta) is (roughly) the log probability of the mixture distribution

•  Thinking about probability distributions is good

Page 43: Strata 2014 Anomaly Detection

Example: Event Stream (timing)

•  Events of various types arrive at irregular intervals
   –  we can assume a Poisson distribution

•  The key question is whether frequency has changed relative to expected values

•  Want alert as soon as possible

Page 44: Strata 2014 Anomaly Detection

Poisson Distribution

•  Time between events is exponentially distributed

•  This means that long delays are exponentially rare

•  If we know λ we can select a good threshold
   –  or we can pick a threshold empirically

$\Delta t \sim \lambda e^{-\lambda t}$

$P(\Delta t > T) = e^{-\lambda T}$

$-\log P(\Delta t > T) = \lambda T$
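The last identity says the surprise of a gap of length T is just λT, and a threshold follows by inverting the tail probability. A small sketch, assuming a known constant rate; the function names, the rate of 2 events per second and the event times are all hypothetical.

```python
import math


def gap_surprise(rate, gap):
    """-log P(dt > gap) for a Poisson process with the given rate."""
    return rate * gap


def gap_threshold(rate, tail_probability):
    """Gap length T such that P(dt > T) equals tail_probability."""
    return -math.log(tail_probability) / rate


rate = 2.0  # events per second (assumed known)
threshold = gap_threshold(rate, 0.001)  # gaps this long occur 0.1% of the time

event_times = [0.0, 0.4, 1.1, 1.3, 6.0, 6.2]  # hypothetical arrivals, in seconds
flagged = [later for earlier, later in zip(event_times, event_times[1:])
           if later - earlier > threshold]
```

In a live system the alert would be raised as soon as the time since the last event passes the threshold, without waiting for the next event to arrive, which matches the goal of alerting as soon as possible.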

Page 45: Strata 2014 Anomaly Detection

Converting Event Times to Anomaly

[Figure labels: 99.9%-ile, 99.99%-ile]

Page 46: Strata 2014 Anomaly Detection

Example: Event Stream (sequence)

•  The scenario:
   –  Web visitors are being subjected to phishing attacks
   –  The hook directs the visitors to a mocked login
   –  When they enter their credentials, the attacker uses them to log in
   –  We assume captchas or similar are part of the authentication, so the attacker has to show the images on the login page to the visitor

•  The problem:
   –  We don’t really know how the attack works

Page 47: Strata 2014 Anomaly Detection

Normal Event Flow

[Diagram events: Image-1, Image-2, Image-3, Image-4, login]

Page 48: Strata 2014 Anomaly Detection

Phishing Flow

[Diagram events: two sequences of Image-1, Image-1, Image-1, Image-1, login; labels: Fake login, Real login]

Page 49: Strata 2014 Anomaly Detection

Key Observations

•  Regardless of exact details, there are patterns
•  Event stream per user shows these patterns

•  Phishing will have different patterns at a much lower rate

•  Measuring statistical surprise gives a good anomaly (fraud or malfunction) indicator
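One way to measure statistical surprise on a labeled event stream is a first-order transition model with smoothing, scored by average -log p. This is a stand-in chosen for brevity, not necessarily the model used in the talk; the function names, the `<start>` and `<end>` markers, the add-one smoothing and the training data are assumptions. The two example sessions reuse the event labels from the flow slides as they appear in this transcript, and real attack traffic may look different.

```python
import math
from collections import Counter, defaultdict


def train_transitions(sessions):
    """Count first-order transitions, including start and end markers."""
    counts = defaultdict(Counter)
    vocabulary = {"<end>"}
    for session in sessions:
        path = ["<start>"] + list(session) + ["<end>"]
        vocabulary.update(session)
        for prev, nxt in zip(path, path[1:]):
            counts[prev][nxt] += 1
    return counts, vocabulary


def surprise(session, counts, vocabulary, alpha=1.0):
    """Average -log p per transition under the smoothed transition model."""
    path = ["<start>"] + list(session) + ["<end>"]
    total = 0.0
    for prev, nxt in zip(path, path[1:]):
        seen = counts[prev]
        # Additive smoothing keeps unseen transitions at a small nonzero probability.
        p = (seen[nxt] + alpha) / (sum(seen.values()) + alpha * len(vocabulary))
        total -= math.log(p)
    return total / (len(path) - 1)


normal = ["Image-1", "Image-2", "Image-3", "Image-4", "login"]
phishing = ["Image-1", "Image-1", "Image-1", "Image-1", "login"]

counts, vocabulary = train_transitions([normal] * 1000)  # hypothetical history
normal_score = surprise(normal, counts, vocabulary)
phishing_score = surprise(phishing, counts, vocabulary)
```

Nothing in the model knows how the attack works. The phishing session scores high only because its transitions are rare in ordinary traffic, and that score can then be thresholded with the same percentile approach used for the other signals.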

Page 50: Strata 2014 Anomaly Detection

Recap (out of order)

•  Anomaly detection is best done with a probability model

•  Deep learning is a neat way to build this model
   –  converting to symbolic dynamics simplifies life

•  -log p is a good way to convert to an anomaly measure

•  Adaptive quantile estimation works for auto-setting thresholds

Page 51: Strata 2014 Anomaly Detection

Recap

•  Different systems require different models
•  Continuous time-series
   –  sparse coding or deep learning to build signal model

•  Events in time
   –  rate model based on variable rate Poisson
   –  segregated rate model

•  Events with labels
   –  language modeling
   –  hidden Markov models

Page 52: Strata 2014 Anomaly Detection

How Do I Build Such a System

•  The key is to combine real-time and long-time
   –  real-time evaluates data stream against model
   –  long-time is how we build the model

•  Extended Lambda architecture is my favorite

•  See my other talks on slideshare.net for info

•  Ping me directly

Page 53: Strata 2014 Anomaly Detection

Hadoop is Not Very Real-time

[Timeline diagram, time axis t ending at "now". Labels: Fully processed; Latest full period; Unprocessed data; Hadoop job takes this long for this data]

Page 54: Strata 2014 Anomaly Detection

Real-time and Long-time together

[Timeline diagram, time axis t ending at "now". Labels: Hadoop works great back here; Storm works here; Blended view]

Page 55: Strata 2014 Anomaly Detection

Who I am

•  Ted Dunning, Chief Application Architect, MapR [email protected] [email protected] @ted_dunning

•  Committer, mentor, champion, PMC member on several Apache projects
   –  Mahout, Drill, ZooKeeper, others

Page 56: Strata 2014 Anomaly Detection
