
  • LEARNING FROM SEQUENTIAL DATA FOR ANOMALY

    DETECTION

    A Dissertation Presented

    by

    Esra Nergis Yolaçan

    to

    The Department of Electrical and Computer Engineering

    in partial fulfillment of the requirements

    for the degree of

    Doctor of Philosophy

    in

    Electrical and Computer Engineering

    in the field of

    Computer Engineering

    Northeastern University

    Boston, Massachusetts

    October 2014

  • © Copyright 2015 by Esra Nergis Yolaçan

    All Rights Reserved


  • Abstract

    Anomaly detection has been applied to a wide range of real-world problems and has

    received significant attention in a number of research fields over recent decades.

    Anomaly detection attempts to identify events, activities, or observations that are

    measurably different from an expected behavior or pattern present in a dataset. This

    thesis focuses on a specific set of techniques targeting the detection of anomalous

    behavior in a discrete, symbolic, and sequential dataset. Since profiling complex

    sequential data is still an open problem in anomaly detection, and given that the rate

    of production of sequential data in fields ranging from finance to homeland security

    is exploding, there is a pressing need to develop effective detection algorithms that

    can handle patterns in sequential information flows.

    In this thesis, we address context-aware multi-class anomaly detection as applied

    to discrete sequences and develop a context learning approach using an unsupervised

    learning paradigm. We begin the anomaly detection process by applying our approach

    to differentiate normal behavior classes (contexts) before attempting to model normal

    behavior. This approach leads to stronger learning on each class by taking advantage

    of the power of advanced models to identify normal behavior of the sequence classes.

    We evaluate our discrete sequence-based anomaly detection framework using two

    illustrative applications: 1) System call intrusion detection and 2) Crowd anomaly

    detection. We also evaluate how clustering can guide our context-aware methodology

    to positively impact the anomaly detection rate.
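
One way to picture the context-learning step is to cluster sequences before modeling them. The sketch below is only an illustration under stated assumptions: it represents each sequence by its symbol-frequency histogram and groups the histograms with a plain k-means loop. The feature choice, the farthest-point seeding, the toy system call names, and every function name here are hypothetical and are not taken from the thesis.

```python
from collections import Counter

def histogram(seq, alphabet):
    """Relative frequency of each alphabet symbol in seq (assumed feature)."""
    counts = Counter(seq)
    return [counts[s] / len(seq) for s in alphabet]

def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def kmeans(points, k, iters=20):
    """Plain k-means with deterministic farthest-point seeding."""
    centers = [points[0]]
    while len(centers) < k:
        # Next seed: the point farthest from all centers chosen so far.
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center for every point.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        # Update step: each center becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Toy traces (hypothetical): two file-oriented and two network-oriented.
traces = [
    ["open", "read", "read", "close"],
    ["open", "read", "read", "read", "close"],
    ["socket", "send", "recv", "send"],
    ["socket", "recv", "send", "recv", "send"],
]
alphabet = sorted({s for t in traces for s in t})
points = [histogram(t, alphabet) for t in traces]
labels, centers = kmeans(points, k=2)
print(labels)
```

Each resulting cluster plays the role of one normal-behavior context, which can then be modeled separately.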

    In this thesis, we utilize a Hidden Markov Model (HMM) to perform anomaly

    detection. An HMM is the simplest dynamic Bayesian network. It is a Markov

    model that can be used when the states are not observable, but the observed data

    depend on these hidden states. While there has been a large amount of prior

    work utilizing HMMs for anomaly detection, the proposed

    models became overly complex as they attempted to improve the detection rate while

    reducing the false detection rate.
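
For reference, a discrete HMM and the likelihood it assigns to an observation sequence can be written as below. The symbols (λ, A, B, π, α, N states, T observations) follow common textbook convention and are an assumption here, not notation taken from this abstract.

```latex
\begin{align*}
\lambda &= (A, B, \pi), \quad
  a_{ij} = P(q_{t+1} = j \mid q_t = i), \quad
  b_j(k) = P(o_t = k \mid q_t = j), \quad
  \pi_i = P(q_1 = i) \\
\alpha_1(i) &= \pi_i \, b_i(o_1) \\
\alpha_{t+1}(j) &= \Big[ \sum_{i=1}^{N} \alpha_t(i) \, a_{ij} \Big] \, b_j(o_{t+1}),
  \qquad 1 \le t \le T - 1 \\
P(O \mid \lambda) &= \sum_{i=1}^{N} \alpha_T(i)
\end{align*}
```

A sequence that receives a low likelihood under a model trained on normal behavior is a candidate anomaly.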

    We apply HMMs to perform anomaly detection on discrete sequential data. We

    utilize multiple HMMs, one for each context class. We demonstrate our multi-HMM

    approach to system call anomalies in cyber security and provide results in the presence

    of anomalies. By applying process trace analysis with multiple HMMs, our system call anomaly

    detection achieves better results with better-tuned model settings and a less complex

    model structure.
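
The idea of one HMM per context class can be sketched as follows. This is a minimal illustration, not the thesis implementation: it scores a sequence with the scaled forward algorithm under each context's HMM, keeps the best-scoring context, and flags the sequence when even that score falls below a threshold. The model parameters, the threshold of -1.0, and all function names are made-up assumptions.

```python
import math

def log_likelihood(seq, start, trans, emit):
    """Scaled forward algorithm: log P(seq | HMM) for a discrete HMM."""
    n = len(start)
    alpha = [start[i] * emit[i][seq[0]] for i in range(n)]
    log_p = 0.0
    for t in range(len(seq)):
        if t > 0:
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][seq[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        log_p += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_p

def score(seq, models):
    """Return (best context, per-symbol log-likelihood under its HMM)."""
    best = max(models, key=lambda name: log_likelihood(seq, *models[name]))
    return best, log_likelihood(seq, *models[best]) / len(seq)

def is_anomalous(seq, models, threshold):
    """Flag seq when no context model explains it well enough."""
    return score(seq, models)[1] < threshold

# Hypothetical models over symbols {0, 1, 2}: (start, transition, emission).
models = {
    "a": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]],
          [[0.8, 0.15, 0.05], [0.15, 0.8, 0.05]]),
    "b": ([0.5, 0.5], [[0.5, 0.5], [0.5, 0.5]],
          [[0.05, 0.05, 0.9], [0.1, 0.1, 0.8]]),
}
THRESHOLD = -1.0  # arbitrary value chosen for this toy example

for seq in ([0, 0, 0, 1, 1, 1], [2, 2, 2, 2, 2, 2], [0, 2, 1, 2, 0, 2]):
    print(seq, score(seq, models), is_anomalous(seq, models, THRESHOLD))
```

Normalizing by sequence length keeps scores comparable across sequences of different lengths. In practice the threshold would be tuned on held-out normal data rather than fixed by hand.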

    To evaluate the extensibility of our approach, we consider a second application,

    crowd behavior analytics. We attempt to classify crowd behavior and treat this as an

    anomaly detection problem on sequential data. We convert crowd video data into a

    discrete/symbolic sequence of data. We apply computer vision techniques to generate

    features from objects, and use these features for frame-based representations to model

    the behavior of the crowd in a video stream. We attempt to identify anomalous

    behavior of a crowd in a scene by applying machine learning techniques to understand

    what it means for a video stream to be identified as “normal”. The results of applying

    our context-aware multi-HMMs approach to crowd analytics show the generality of

    our anomaly detection approach, and the power of our context-learning approach.


  • Acknowledgements

    In the name of God, the Most Gracious, the Most Merciful.


    I dedicate this thesis to my beloved husband, Riza, from the depths of my heart

    and soul. You have supported me throughout everything; I could not have accomplished

    this without you. Thank you for your remarkable patience and unwavering

    love during this doctoral journey. To my loving parents, Nermin and Feridun: you

    have made me the person I am becoming by instilling in me the importance

    of hard work and higher education. Thank you for being my inspiration and

    wonderful role models. To my precious brother, Emre: you have always been there

    cheering me up and have stood by me through good times and bad. Thank you for

    your never-ending motivation and for always believing in me. I am grateful to all four of you

    for your presence, and I love you more than you will ever know. Thank you for all of your

    endless love, support, and encouragement.

    I would like to express my gratitude to my advisor, Prof. Dr. David R. Kaeli,

    for his guidance, understanding, and patience during the five years of my dissertation work.

    Thank you for being so supportive by giving advice, providing persistent help, and

    encouraging me to complete this task. I would also like to thank my committee

    members, Prof. Dr. Jennifer G. Dy and Dr. Fatemeh Azmandian, for their

    precious time and guidance throughout this dissertation. Your thoughtful comments

    were invaluable. I thank all my dear colleagues in the NUCAR group for valuable discussions,

    suggestions, and, most importantly, their friendship during my studies

    at Northeastern University. I would like to thank Ayse Yilmazer for being with me

    and sharing her experiences during my research. Additionally, I would like to thank

    our graduate coordinator, Faith Crisley, for her advice and assistance during the years

    of my study.


  • Contents

    Abstract

    Acknowledgements

    1 Introduction

    1.1 Anomaly Detection on Sequential Data

    1.2 Challenges of Working with Sequential Data

    1.3 Contributions of the Work

    1.4 Organization of Dissertation

    2 Background

    2.1 Intrusion Detection

    2.1.1 Data Source

    2.1.2 Detection Method

    2.1.3 Response Type

    2.2 Crowd Analysis

    2.2.1 People Counting/Density Estimation

    2.2.2 People Tracking

    2.2.3 Behavior Learning

    2.3 Anomaly Detection

    2.3.1 Anomaly Detection Algorithms

    2.3.2 Machine Learning

    2.3.3 Hidden Markov Models

    2.3.4 Evaluation Metrics

    3 Related Work

    3.1 Related Work in System Call Analysis

    3.1.1 Data Representation in System Call Analysis

    3.1.2 HMM in System Call Analysis

    3.2 Related Work in Crowd Analysis

    3.2.1 Data Representation in Crowd Analysis

    3.2.2 HMM in Crowd Analysis

    3.3 Related Work in Context-aware Systems

    3.3.1 Context-aware Applications

    3.3.2 Context Inference

    4 Context Learning

    4.1 Context in a Symbolic Sequential Data

    4.2 Clustering for Context Learning

    4.3 Parameter Selection

    4.3.1 Number of Clusters

    4.3.2 Required Length (Number of Symbols)

    4.4 Summary of Context Learning

    5 System Call Anomaly Detection

    5.1 System Call Trace Dataset

    5.2 Preprocessing

    5.2.1 Partitioning

    5.2.2 Reduction

    5.2.3 Clustering for Context Learning

    5.2
