midas: microcluster-based detector of anomalies in edge ...sbhatia/assets/pdf/midasslides.pdf ·...

34
1 MIDAS: Microcluster-Based Detector of Anomalies in Edge Streams Siddharth Bhatia, Bryan Hooi, Minji Yoon, Kijung Shin, Christos Faloutsos [email protected] @siddharthb_

Upload: others

Post on 23-Jul-2020

6 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: MIDAS: Microcluster-Based Detector of Anomalies in Edge ...sbhatia/assets/pdf/midasslides.pdf · Introduction Problem MethodT3.1: Detecting Anomalous Dynamic SubgraphsRelated Work

1

MIDAS: Microcluster-Based Detector of Anomalies in Edge Streams

Siddharth Bhatia, Bryan Hooi, Minji Yoon, Kijung Shin, Christos [email protected]

@siddharthb_

Page 2: MIDAS: Microcluster-Based Detector of Anomalies in Edge ...sbhatia/assets/pdf/midasslides.pdf · Introduction Problem MethodT3.1: Detecting Anomalous Dynamic SubgraphsRelated Work

2Introduction Method Experiments ConclusionProblem Related Work Future Work

Page 3: MIDAS: Microcluster-Based Detector of Anomalies in Edge ...sbhatia/assets/pdf/midasslides.pdf · Introduction Problem MethodT3.1: Detecting Anomalous Dynamic SubgraphsRelated Work

Intrusion Detection

T3.1: Detecting Anomalous Dynamic SubgraphsT3.1: Detecting Anomalous Dynamic Subgraphs 3Introduction Method Experiments ConclusionProblem Related Work Future Work

Page 4: MIDAS: Microcluster-Based Detector of Anomalies in Edge ...sbhatia/assets/pdf/midasslides.pdf · Introduction Problem MethodT3.1: Detecting Anomalous Dynamic SubgraphsRelated Work

Time Evolving Graphs

T3.1: Detecting Anomalous Dynamic SubgraphsT3.1: Detecting Anomalous Dynamic Subgraphs

Computer Networks Twitter / IM Networks

4Introduction Method Experiments ConclusionProblem Related Work Future Work

Page 5: MIDAS: Microcluster-Based Detector of Anomalies in Edge ...sbhatia/assets/pdf/midasslides.pdf · Introduction Problem MethodT3.1: Detecting Anomalous Dynamic SubgraphsRelated Work

Edge Streams

T3.1: Detecting Anomalous Dynamic SubgraphsT3.1: Detecting Anomalous Dynamic Subgraphs

u1

u2

u3

u4

u5

a1 a2a3

u6

5Introduction Method Experiments ConclusionProblem Related Work

𝑢! 𝑢" 𝑢# 𝑢$ 𝑢$ 𝑢% 𝑢% 𝑎! 𝑎! 𝑎! 𝑎" 𝑎" 𝑎" 𝑎# 𝑎# 𝑎#1 2 2 3 4 4 5 6 6 6 6 6 6 6 6 6

𝑢" 𝑢! 𝑢% 𝑢# 𝑢& 𝑢& 𝑢& 𝑢" 𝑢# 𝑢% 𝑢" 𝑢% 𝑢& 𝑢# 𝑢% 𝑢&

Time

Future Work

Page 6: MIDAS: Microcluster-Based Detector of Anomalies in Edge ...sbhatia/assets/pdf/midasslides.pdf · Introduction Problem MethodT3.1: Detecting Anomalous Dynamic SubgraphsRelated Work

Roadmap

T3.1: Detecting Anomalous Dynamic SubgraphsT3.1: Detecting Anomalous Dynamic SubgraphsT3.1: Detecting Anomalous Dynamic Subgraphs 6Introduction Method Experiments ConclusionProblem Related Work

• Problem• Algorithm•MIDAS•MIDAS-R

• Related Work • Experiments• Future Work

Future Work

Page 7: MIDAS: Microcluster-Based Detector of Anomalies in Edge ...sbhatia/assets/pdf/midasslides.pdf · Introduction Problem MethodT3.1: Detecting Anomalous Dynamic SubgraphsRelated Work

Problem

T3.1: Detecting Anomalous Dynamic Subgraphs

Input:• Edge stream 𝑬 from time evolving graph 𝑮• Directed, multigraph, discrete time

Output:• Anomaly Score for each edge

Our Contributions:• Microcluster Detection• Guarantees on False Positive Probability• Constant Memory• Constant Update Time

7Introduction Method Experiments ConclusionProblem Related Work Future Work

Page 8: MIDAS: Microcluster-Based Detector of Anomalies in Edge ...sbhatia/assets/pdf/midasslides.pdf · Introduction Problem MethodT3.1: Detecting Anomalous Dynamic SubgraphsRelated Work

Roadmap

T3.1: Detecting Anomalous Dynamic Subgraphs 8Introduction Method Experiments ConclusionProblem Related Work

• Problem• Algorithm•MIDAS•MIDAS-R

• Related Work • Experiments• Future Work

Future Work

Page 9: MIDAS: Microcluster-Based Detector of Anomalies in Edge ...sbhatia/assets/pdf/midasslides.pdf · Introduction Problem MethodT3.1: Detecting Anomalous Dynamic SubgraphsRelated Work

T3.1: Detecting Anomalous Dynamic Subgraphs

Background

T3.1: Detecting Anomalous Dynamic Subgraphs

Offline:• Time series methods

Online:• Bounded memory• Set of nodes is not fixed

9Introduction Method Experiments ConclusionProblem Related Work

Detecting Microcluster

Future Work

Page 10: MIDAS: Microcluster-Based Detector of Anomalies in Edge ...sbhatia/assets/pdf/midasslides.pdf · Introduction Problem MethodT3.1: Detecting Anomalous Dynamic SubgraphsRelated Work

T3.1: Detecting Anomalous Dynamic SubgraphsT3.1: Detecting Anomalous Dynamic Subgraphs

Gaussian distribution:

• Find mean & standard deviation

• Compute Gaussian likelihood(edge occurrences at 𝑡)

• if 𝑙𝑖𝑘𝑒𝑙𝑖ℎ𝑜𝑜𝑑 < 𝑡ℎ𝑟𝑒𝑠ℎ𝑜𝑙𝑑, declare anomaly

10Introduction Method Experiments ConclusionProblem Related Work Future Work

Naive Solution

Page 11: MIDAS: Microcluster-Based Detector of Anomalies in Edge ...sbhatia/assets/pdf/midasslides.pdf · Introduction Problem MethodT3.1: Detecting Anomalous Dynamic SubgraphsRelated Work

T3.1: Detecting Anomalous Dynamic SubgraphsT3.1: Detecting Anomalous Dynamic Subgraphs

Past(t<10)

Current(t=10)

Expected 900 100Observed 0 1000

11Introduction Method Experiments ConclusionProblem Related Work

Mean Level at 𝑡 = 10 equals Mean Level for 𝑡 < 10

Our assumption

Future Work

Page 12: MIDAS: Microcluster-Based Detector of Anomalies in Edge ...sbhatia/assets/pdf/midasslides.pdf · Introduction Problem MethodT3.1: Detecting Anomalous Dynamic SubgraphsRelated Work

T3.1: Detecting Anomalous Dynamic Subgraphs

𝑠𝑢𝑣 : 𝑢 − 𝑣 edges up to time 𝑡𝑎𝑢𝑣 : 𝑢 − 𝑣 edges at current time 𝑡

�̂�𝑢𝑣 : Approximate total count#𝑎𝑢𝑣 : Approximate current count

12Introduction Method Experiments ConclusionProblem Related Work

Streaming Data Structure: CMS

Future Work

Details

4𝑎𝑢𝑣 ≤ 𝑎𝑢𝑣 + 𝜐𝑁𝑡 with probability at least 1 − 𝜀

𝜐 is the amount of error we can tolerate.1 − 𝜀 is the probability.e.g. with 99% probability only up to 0.5% error

Page 13: MIDAS: Microcluster-Based Detector of Anomalies in Edge ...sbhatia/assets/pdf/midasslides.pdf · Introduction Problem MethodT3.1: Detecting Anomalous Dynamic SubgraphsRelated Work

T3.1: Detecting Anomalous Dynamic Subgraphs

expectedobserved

Details

T3.1: Detecting Anomalous Dynamic Subgraphs 13Introduction Method Experiments ConclusionProblem Related Work Future Work

Anomaly Score: Chi-Squared Test

Page 14: MIDAS: Microcluster-Based Detector of Anomalies in Edge ...sbhatia/assets/pdf/midasslides.pdf · Introduction Problem MethodT3.1: Detecting Anomalous Dynamic SubgraphsRelated Work

Roadmap

T3.1: Detecting Anomalous Dynamic Subgraphs 14Introduction Method Experiments ConclusionProblem Related Work

• Problem• Algorithm•MIDAS•MIDAS-R

• Related Work• Experiments• Future Work

Future Work

Page 15: MIDAS: Microcluster-Based Detector of Anomalies in Edge ...sbhatia/assets/pdf/midasslides.pdf · Introduction Problem MethodT3.1: Detecting Anomalous Dynamic SubgraphsRelated Work

MIDAS-R: Incorporating Relations

T3.1: Detecting Anomalous Dynamic SubgraphsT3.1: Detecting Anomalous Dynamic Subgraphs

Details

Temporal Relations:• Allows past edges to count toward the current time tick• Diminishing weight• At end of every time tick 𝑡, reduce counts by 𝛼 𝜖 [0,1]

15Introduction Method Experiments ConclusionProblem Related Work

1

2

3

4

5

6

7

8

9

444

888

Future Work

Spatial Relations:• Catch large groups of spatially nearby edges• 𝑠𝑢 : edges adjacent to node 𝑢 up to time 𝑡• 𝑎𝑢 : edges adjacent to node 𝑢 at current time 𝑡• max(𝑠𝑐𝑜𝑟𝑒 𝑢, 𝑣, 𝑡 , 𝑠𝑐𝑜𝑟𝑒 𝑢, 𝑡 , 𝑠𝑐𝑜𝑟𝑒 𝑣, 𝑡 )

Page 16: MIDAS: Microcluster-Based Detector of Anomalies in Edge ...sbhatia/assets/pdf/midasslides.pdf · Introduction Problem MethodT3.1: Detecting Anomalous Dynamic SubgraphsRelated Work

Time and Memory Complexity

T3.1: Detecting Anomalous Dynamic SubgraphsT3.1: Detecting Anomalous Dynamic Subgraphs

𝑑: number of hash functions𝑤: number of buckets

Space complexity:• 𝑂(𝑤𝑑)

Time complexity:• 𝑂(𝑑)

16Introduction Method Experiments ConclusionProblem Related Work Future Work

Page 17: MIDAS: Microcluster-Based Detector of Anomalies in Edge ...sbhatia/assets/pdf/midasslides.pdf · Introduction Problem MethodT3.1: Detecting Anomalous Dynamic SubgraphsRelated Work

Roadmap

• Problem• Algorithm•MIDAS•MIDAS-R

• Related Work • Experiments• Future Work

T3.1: Detecting Anomalous Dynamic Subgraphs 17Introduction Method Experiments ConclusionProblem Related Work Future Work

Page 18: MIDAS: Microcluster-Based Detector of Anomalies in Edge ...sbhatia/assets/pdf/midasslides.pdf · Introduction Problem MethodT3.1: Detecting Anomalous Dynamic SubgraphsRelated Work

Related Work

T3.1: Detecting Anomalous Dynamic SubgraphsT3.1: Detecting Anomalous Dynamic Subgraphs

Anomaly Detection in Static Graphs:• Node detection: CATCHSYNC [2016], ODDBALL [2010]• Subgraph detection: [SHIN ET AL., 2018], FRAUDER [2017]• Edge detection: NRMF [2011], AUTOPART [2004]

Anomaly Detection in Graph Streams:• Node detection: DTA [2006] • Subgraph detection: CATCHSYNC [2016], COPYCATCH [2013]• Edge detection: ANOMRANK [2019], SPOTLIGHT [2018]

Anomaly Detection in Edge Streams:• Node detection: [YU ET AL., 2013]• Subgraph detection: DENSEALERT [2017]• Edge detection: SEDANSPOT [2018], RHSS [2016]

18Introduction Method Experiments ConclusionProblem Related Work Future Work

Page 19: MIDAS: Microcluster-Based Detector of Anomalies in Edge ...sbhatia/assets/pdf/midasslides.pdf · Introduction Problem MethodT3.1: Detecting Anomalous Dynamic SubgraphsRelated Work

Comparison

T3.1: Detecting Anomalous Dynamic SubgraphsT3.1: Detecting Anomalous Dynamic Subgraphs 19Introduction Method Experiments ConclusionProblem Related Work Future Work

Page 20: MIDAS: Microcluster-Based Detector of Anomalies in Edge ...sbhatia/assets/pdf/midasslides.pdf · Introduction Problem MethodT3.1: Detecting Anomalous Dynamic SubgraphsRelated Work

Roadmap

20Introduction Method Experiments ConclusionProblem Related Work

• Problem• Algorithm•MIDAS•MIDAS-R

• Related Work • Experiments• Future Work

Future Work

Page 21: MIDAS: Microcluster-Based Detector of Anomalies in Edge ...sbhatia/assets/pdf/midasslides.pdf · Introduction Problem MethodT3.1: Detecting Anomalous Dynamic SubgraphsRelated Work

Datasets

T3.1: Detecting Anomalous Dynamic SubgraphsT3.1: Detecting Anomalous Dynamic Subgraphs

1. DARPA: 4.5M IP-IP communications, 87.7K minutes 2. TwitterSecurity: 2.6M tweets (May-August, 2014)3. TwitterWorldCup: 1.7M tweets (June-July, 2014)

21Introduction Method Experiments ConclusionProblem Related Work Future Work

Page 22: MIDAS: Microcluster-Based Detector of Anomalies in Edge ...sbhatia/assets/pdf/midasslides.pdf · Introduction Problem MethodT3.1: Detecting Anomalous Dynamic SubgraphsRelated Work

Accuracy

T3.1: Detecting Anomalous Dynamic SubgraphsT3.1: Detecting Anomalous Dynamic Subgraphs 22Introduction Method Experiments ConclusionProblem Related Work Future Work

Page 23: MIDAS: Microcluster-Based Detector of Anomalies in Edge ...sbhatia/assets/pdf/midasslides.pdf · Introduction Problem MethodT3.1: Detecting Anomalous Dynamic SubgraphsRelated Work

Running Times

T3.1: Detecting Anomalous Dynamic SubgraphsT3.1: Detecting Anomalous Dynamic Subgraphs 23Introduction Method Experiments ConclusionProblem Related Work Future Work

Page 24: MIDAS: Microcluster-Based Detector of Anomalies in Edge ...sbhatia/assets/pdf/midasslides.pdf · Introduction Problem MethodT3.1: Detecting Anomalous Dynamic SubgraphsRelated Work

Accuracy vs Time

T3.1: Detecting Anomalous Dynamic SubgraphsT3.1: Detecting Anomalous Dynamic Subgraphs 24Introduction Method Experiments ConclusionProblem Related Work

Accu

racy

(AU

C)

0

0.333

0.667

1

Wall-clock time (s)0.1 1 10 100

0.64

0.91 0.95

Midas-R Midas SedanSpot

Future Work

Page 25: MIDAS: Microcluster-Based Detector of Anomalies in Edge ...sbhatia/assets/pdf/midasslides.pdf · Introduction Problem MethodT3.1: Detecting Anomalous Dynamic SubgraphsRelated Work

Scalability

T3.1: Detecting Anomalous Dynamic SubgraphsT3.1: Detecting Anomalous Dynamic Subgraphs 25Introduction Method Experiments ConclusionProblem Related Work Future Work

Page 26: MIDAS: Microcluster-Based Detector of Anomalies in Edge ...sbhatia/assets/pdf/midasslides.pdf · Introduction Problem MethodT3.1: Detecting Anomalous Dynamic SubgraphsRelated Work

Real-World Effectiveness

26T3.1: Detecting Anomalous Dynamic SubgraphsT3.1: Detecting Anomalous Dynamic Subgraphs

1. 13-05-2014. Turkey Mine Accident2. 24-05-2014. Raid3. 30-05-2014. Attack/Ambush

03-06-14. Suicide bombing4. 09-06-14. Suicide/Truck bombings5. 10-06-2014. Iraqi Militants Seize

11-06-2014. Kidnapping6. 15-06-14. Mpeketoni attack7. 26-06-14. Suicide Bombing8. 03-07-14. Gaza conflicts9. 18-07-14. Airplane shot10. 30-07-14. Ebola Virus Outbreak

Page 27: MIDAS: Microcluster-Based Detector of Anomalies in Edge ...sbhatia/assets/pdf/midasslides.pdf · Introduction Problem MethodT3.1: Detecting Anomalous Dynamic SubgraphsRelated Work

Real-World Effectiveness

27T3.1: Detecting Anomalous Dynamic SubgraphsT3.1: Detecting Anomalous Dynamic Subgraphs

1. 13-05-2014. Turkey Mine Accident2. 24-05-2014. Raid3. 30-05-2014. Attack/Ambush

03-06-14. Suicide bombing4. 09-06-14. Suicide/Truck bombings5. 10-06-2014. Iraqi Militants Seize

11-06-2014. Kidnapping6. 15-06-14. Mpeketoni attack7. 26-06-14. Suicide Bombing8. 03-07-14. Gaza conflicts9. 18-07-14. Airplane shot10. 30-07-14. Ebola Virus Outbreak

Page 28: MIDAS: Microcluster-Based Detector of Anomalies in Edge ...sbhatia/assets/pdf/midasslides.pdf · Introduction Problem MethodT3.1: Detecting Anomalous Dynamic SubgraphsRelated Work

Real-World Effectiveness

28T3.1: Detecting Anomalous Dynamic SubgraphsT3.1: Detecting Anomalous Dynamic Subgraphs

1. 13-05-2014. Turkey Mine Accident2. 24-05-2014. Raid3. 30-05-2014. Attack/Ambush

03-06-14. Suicide bombing4. 09-06-14. Suicide/Truck bombings5. 10-06-2014. Iraqi Militants Seize

11-06-2014. Kidnapping6. 15-06-14. Mpeketoni attack7. 26-06-14. Suicide Bombing8. 03-07-14. Gaza conflicts9. 18-07-14. Airplane shot10. 30-07-14. Ebola Virus Outbreak

Page 29: MIDAS: Microcluster-Based Detector of Anomalies in Edge ...sbhatia/assets/pdf/midasslides.pdf · Introduction Problem MethodT3.1: Detecting Anomalous Dynamic SubgraphsRelated Work

Real-World Effectiveness

29T3.1: Detecting Anomalous Dynamic SubgraphsT3.1: Detecting Anomalous Dynamic Subgraphs

1. 13-05-2014. Turkey Mine Accident2. 24-05-2014. Raid3. 30-05-2014. Attack/Ambush

03-06-14. Suicide bombing4. 09-06-14. Suicide/Truck bombings5. 10-06-2014. Iraqi Militants Seize

11-06-2014. Kidnapping6. 15-06-14. Mpeketoni attack7. 26-06-14. Suicide Bombing8. 03-07-14. Gaza conflicts9. 18-07-14. Airplane shot10. 30-07-14. Ebola Virus Outbreak

Page 30: MIDAS: Microcluster-Based Detector of Anomalies in Edge ...sbhatia/assets/pdf/midasslides.pdf · Introduction Problem MethodT3.1: Detecting Anomalous Dynamic SubgraphsRelated Work

Real-World Effectiveness

30T3.1: Detecting Anomalous Dynamic SubgraphsT3.1: Detecting Anomalous Dynamic Subgraphs

1

2

3

4

5

6

7

8

9

444

888

Page 31: MIDAS: Microcluster-Based Detector of Anomalies in Edge ...sbhatia/assets/pdf/midasslides.pdf · Introduction Problem MethodT3.1: Detecting Anomalous Dynamic SubgraphsRelated Work

Roadmap

31Introduction Method Experiments ConclusionProblem Related Work Future Work

• Problem• Algorithm•MIDAS•MIDAS-R

• Related Work • Experiments• Future Work

Page 32: MIDAS: Microcluster-Based Detector of Anomalies in Edge ...sbhatia/assets/pdf/midasslides.pdf · Introduction Problem MethodT3.1: Detecting Anomalous Dynamic SubgraphsRelated Work

MSTREAM

T3.1: Detecting Anomalous Dynamic SubgraphsT3.1: Detecting Anomalous Dynamic Subgraphs 32Introduction Method Experiments ConclusionProblem Related Work

Challenges:

• High number of dimensions• Real-valued features• Correlation between features• Constant memory & time both w.r.t. stream length

and dimensionality

Future Work

Page 33: MIDAS: Microcluster-Based Detector of Anomalies in Edge ...sbhatia/assets/pdf/midasslides.pdf · Introduction Problem MethodT3.1: Detecting Anomalous Dynamic SubgraphsRelated Work

33

How to Contribute?

1. MIDAS-F: Filtering MIDAS

2. Parallel/Distributed/GPU version

3. Hardware implementation over FPGA

4. Knowledge Graphs (NLP)

5. Periodic setting

6. Erlang, F# implementations

Introduction Method Experiments ConclusionProblem Related Work Future Work

Page 34: MIDAS: Microcluster-Based Detector of Anomalies in Edge ...sbhatia/assets/pdf/midasslides.pdf · Introduction Problem MethodT3.1: Detecting Anomalous Dynamic SubgraphsRelated Work

Conclusion

T3.1: Detecting Anomalous Dynamic SubgraphsT3.1: Detecting Anomalous Dynamic Subgraphs 34Introduction Method Experiments ConclusionProblem Related Work

1. Streaming Microcluster Detection:• Constant time and memory

2. Theoretical Guarantees:• False Positive Probability

3. Effectiveness:• MIDAS is 42%-48% more accurate• MIDAS processes the data 162×-644× faster

Siddharth Bhatia, Bryan Hooi, Minji Yoon, Kijung Shin and Christos Faloutsos. “MIDAS: Microcluster-Based Detector of Anomalies in Edge Streams.” AAAI Conference on Artificial Intelligence (AAAI), 2020. https://arxiv.org/abs/1911.04464

Future Work

https://github.com/Stream-AD/MIDAS/