Specification-based Anomaly Detection: A New Approach for Detecting Network Intrusions. R. Sekar, A. Gupta, J. Frullo, T. Shanbhag, A. Tiwari, H. Yang, and S. Zhou, Stony Brook University. 20080418, by Mike Hsiao. ACM Conference on Computer and Communications Security (CCS), 2002


Specification-based Anomaly Detection: A New Approach for Detecting Network Intrusions

R. Sekar, A. Gupta, J. Frullo, T. Shanbhag, A. Tiwari, H. Yang, and S. Zhou

Stony Brook University

20080418, by Mike Hsiao

ACM Conference on Computer and Communications Security (CCS), 2002


Outline

• Introduction, Related Work
• Overview, Benefits
• State-Machine Language
• Specification Development
• Anomaly Detection
  • Sequence and statistical properties, Detecting Anomalies
• Experimental Results
  • 1999 Lincoln, Email Virus
• Conclusions and Comments


Introduction: IDS approaches

• Misuse Detection
  • detects an attack as an instance of a known attack signature
  • ineffective against unknown attacks
• Anomaly Detection
  • any deviation from normal system behavior (the profile) is flagged as a potential attack
  • legitimate but previously unseen behavior may exist, which causes false alarms
• Selecting an appropriate signature or profile is a hard problem.


Introduction: specification-based anomaly

• Detect attacks as deviations from a norm
  • Manually develop specifications that capture legitimate (rather than previously seen) system behaviors
  • Avoids flagging legitimate-but-unseen behavior
  • Time-consuming, but decreases false positives
• This paper, “specification-based anomaly detection”: combines the specification-based and anomaly detection approaches.


Overview

• Develop specifications of hosts and routers in terms of the packets received or transmitted by them.
  • derived from RFCs or other descriptions of protocols such as IP, ARP, TCP, and UDP.

(a specification characterizing the gateway behavior)


Overview: example

• IP fragmentation is not modeled, and only packets from the Internet (but not those sent to the Internet) are captured.
• These packets may be destined for the gateway itself, in which case the state machine makes a transition from the INIT state to the DONE state.
• Otherwise, a packet may be destined for an internal machine, in which case the gateway will first receive it on its external network interface and make a transition from the INIT state to the PKT RCVD state.
• Next, it will relay the packet on its internal network interface, making a transition to the DONE state.
• Occasionally, the relay may not take place. We model such situations with a timeout transition from the PKT RCVD state to the DONE state. Possible causes:
  a) the gateway could not resolve the MAC address corresponding to the IP address of the target machine,
  b) the gateway machine is malfunctioning, etc.
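The gateway machine described on this slide can be sketched in Python. This is only an illustration under stated assumptions, not the authors' implementation: the class `IpMachine`, its method names, `GATEWAY_IP`, and `TIMEOUT_SECONDS` are hypothetical; only the three states and the transitions between them come from the slide.

```python
from dataclasses import dataclass
from typing import Optional

INIT, PKT_RCVD, DONE = "INIT", "PKT RCVD", "DONE"
GATEWAY_IP = "10.0.0.1"   # hypothetical address of the gateway itself
TIMEOUT_SECONDS = 1.0     # hypothetical value; the slides do not give one


@dataclass
class IpMachine:
    """One instance follows one packet that arrives from the Internet."""
    state: str = INIT
    dst: Optional[str] = None
    started_at: float = 0.0

    def rx(self, ifc: str, dst: str, now: float) -> Optional[str]:
        """Packet received on interface `ifc`; returns the transition taken."""
        if self.state == INIT and ifc == "ext":
            self.dst, self.started_at = dst, now
            # Packets for the gateway itself finish at once.
            self.state = DONE if dst == GATEWAY_IP else PKT_RCVD
            return f"{INIT}->{self.state}"
        return None

    def relay(self, ifc: str, dst: str) -> Optional[str]:
        """Packet relayed on the internal interface."""
        if self.state == PKT_RCVD and ifc == "int" and dst == self.dst:
            self.state = DONE
            return f"{PKT_RCVD}->{DONE}"
        return None

    def tick(self, now: float) -> Optional[str]:
        """Fire the timeout transition if the relay never happened."""
        if self.state == PKT_RCVD and now - self.started_at >= TIMEOUT_SECONDS:
            self.state = DONE
            return "timeout"
        return None
```

A packet for an internal host goes through `rx` and then `relay`; if `relay` is never called, a later `tick` takes the timeout transition instead.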


Overview: EFSA

• Extended Finite State Automata (EFSA)
  • transition events have arguments
  • state variables store values
    • e.g., src, dst
• For each IP packet received on the external network interface, a new instance of the IP state machine is created in the INIT state.
• Every instance that can make a transition on a given packet is permitted to do so.


Overview: statistical machine learning

• By learning the statistical properties associated with the IP state machine, the authors can detect several kinds of attacks.
  • the frequency with which a particular transition in the EFSA is taken
  • the most commonly encountered value of a state variable in a particular control state of the EFSA
  • the distribution of values of state variables


Overview: IP sweep

• Typically, an IP sweep can be specified fairly accurately in terms of
  • the number of different IP addresses for which packets were received in the last t seconds
• In this paper,
  • The attacker does not know the legitimate IP addresses in the target domain.
  • This implies that several packets will be sent by the attacker to nonexistent hosts, which results in a sudden spurt of timeout transitions being taken.
• Thus, the statistics on the frequency of timeout transitions from the PKT RCVD state can serve as a reliable indicator of the IP sweep attack.
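The indicator in the last bullet amounts to counting timeout transitions in a recent time window. The sketch below shows one way to do that; the class name `TimeoutRate` and the window length are assumptions made for illustration, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import deque


class TimeoutRate:
    """Counts timeout transitions seen in the last `window` seconds."""

    def __init__(self, window: float):
        self.window = window
        self.times = deque()  # timestamps of recent timeout transitions

    def record(self, now: float) -> int:
        """Register one timeout at time `now`; return the count in the window."""
        self.times.append(now)
        # Drop timeouts that have fallen out of the window.
        while self.times[0] <= now - self.window:
            self.times.popleft()
        return len(self.times)
```

Under normal traffic the returned count stays small; a sweep over mostly nonexistent addresses makes it climb quickly, which is the spurt the slide refers to.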


Benefits

• provides accurate attack detection
  • detects known and unknown attacks
  • low false alarm rates
• simplifies feature selection
• employs redundancy to improve attack detection
• supports unsupervised learning


State-machine language

• EFSA M = (Σ, Q, s, f, V, D, d)
  • Σ: the alphabet of events
  • Q: a finite set of states
  • s: the start state
  • f: the final state
  • V: a finite tuple (v1, …, vn) of state variables
  • D: a finite tuple (D1, …, Dn), where Di denotes the domain of values for the state variable vi
  • d: Q × D × Σ -> (Q, D) is the transition relation


State Machine Specification

• Each transition has the form: event(x1, …, xn) | cond -> action
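A rule of this form, together with the EFSA tuple from the previous slide, can be represented as plain data. The following is a minimal sketch under assumptions: the names `Rule`, `step`, and `RULES` are hypothetical, and the single example rule is modeled on the gateway machine rather than copied from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Vars = Dict[str, object]   # current values of the state variables V
Args = Tuple[object, ...]  # event arguments x1, ..., xn


@dataclass(frozen=True)
class Rule:
    source: str                           # control state in Q
    event: str                            # event name in the alphabet
    cond: Callable[[Vars, Args], bool]    # guard over variables and arguments
    action: Callable[[Vars, Args], Vars]  # returns the updated variables
    target: str                           # next control state


def step(rules: List[Rule], state: str, variables: Vars,
         event: str, args: Args) -> List[Tuple[str, Vars]]:
    """Return every (state, variables) pair reachable on this event.

    The result is a list because d is a relation: several rules may apply.
    """
    return [(r.target, r.action(dict(variables), args))
            for r in rules
            if r.source == state and r.event == event
            and r.cond(variables, args)]


# Hypothetical rule: rx(ifc, dst) | ifc == ext -> remember dst
RULES = [
    Rule("INIT", "rx",
         cond=lambda v, a: a[0] == "ext",
         action=lambda v, a: {**v, "dst": a[1]},
         target="PKT RCVD"),
]
```

Calling `step(RULES, "INIT", {}, "rx", ("ext", "10.0.0.7"))` yields the single successor in the PKT RCVD state with `dst` stored; the same event on the internal interface yields no successor.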


State-machine: non-deterministic

• In general, protocol state machines are non-deterministic.
  • A machine may be able to make any one of k different transitions.
  • They clone k copies of the state machine whenever it can make one of k different transitions.
• They delete the state machine instances that reach their final state.
• Final states are somewhat different from the “accepting states” of an FSA; they are similar to sink states.
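The clone-and-delete scheme can be illustrated with a toy transition table. Everything named here is hypothetical (the states `S0`, `S1`, `S2`, `END`, the events `e` and `f`, and the function `advance`); the sketch only shows the bookkeeping, with state variables left out for brevity.

```python
from typing import Dict, List, Tuple

FINAL = "END"

# Hypothetical non-deterministic table: (state, event) -> possible next states.
TRANSITIONS: Dict[Tuple[str, str], List[str]] = {
    ("S0", "e"): ["S1", "S2"],   # k = 2 choices on event e
    ("S1", "f"): ["END"],
    ("S2", "f"): ["S2"],
}


def advance(instances: List[str], event: str) -> List[str]:
    """Apply one event to every live instance and return the new live set."""
    survivors: List[str] = []
    for state in instances:
        targets = TRANSITIONS.get((state, event))
        if targets is None:
            survivors.append(state)  # event does not apply; keep the instance
            continue
        for target in targets:       # one clone per possible transition
            if target != FINAL:      # instances in the final state are deleted
                survivors.append(target)
    return survivors
```

Starting from one instance in `S0`, event `e` leaves two clones alive; event `f` then removes the clone that reaches `END` and keeps the other.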


State-machine: instances

• There can be many instances of a state machine at runtime.
  • For each incoming event, they may have to search through all of these instances.
• “Sessions”:
  • map event(eventArgs) when condition
  • the condition is a conjunction of tests
  • the left-hand side of each test is an event argument; the right-hand side is a state variable
  • map rx(ifc, pkt) when (ifc == ext)
  • The condition can be implemented as a hash-table lookup of the state machine instance ID.
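Because each test compares an event argument with a state variable, the whole conjunction can serve as a dictionary key. The sketch below assumes a TCP-like session keyed by (ext_ip, ext_port, int_ip, int_port); the class `SessionTable` and the packet field names `src_ip`, `src_port`, `dst_ip`, and `dst_port` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Key = Tuple[str, int, str, int]


class SessionTable:
    """Finds the instances for a packet by hashing instead of scanning."""

    def __init__(self) -> None:
        self._instances: Dict[Key, List[dict]] = defaultdict(list)

    @staticmethod
    def key_from_vars(instance: dict) -> Key:
        # Right-hand sides of the tests: state variables of the instance.
        return (instance["ext_ip"], instance["ext_port"],
                instance["int_ip"], instance["int_port"])

    @staticmethod
    def key_from_packet(pkt: dict) -> Key:
        # Left-hand sides of the tests: arguments of the event. For a packet
        # received on the external interface, the source is the external end.
        return (pkt["src_ip"], pkt["src_port"], pkt["dst_ip"], pkt["dst_port"])

    def add(self, instance: dict) -> None:
        self._instances[self.key_from_vars(instance)].append(instance)

    def lookup(self, pkt: dict) -> List[dict]:
        # .get avoids creating empty entries for unknown sessions.
        return self._instances.get(self.key_from_packet(pkt), [])
```

A lookup touches only the instances that share the packet's session key, so the cost no longer grows with the total number of live instances.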


Specification Development

• They capture only the essential details of most protocols.
  • developing precise specifications would entail more effort.
  • there might be minor differences between implementations.
• Fig. 3: a specification of the TCP state machine, as observed on a gateway connecting an organization’s internal network to the Internet.


Figure 3: TCP Protocol State Machine. (Certain abnormal transitions are not shown.)


Anomaly Detection: mapping packet sequence properties to properties of transitions

• Traces: each corresponds to a path in the state machine
  • rx(ext, pkt)
  • rx(ext, pkt1) rx(int, pkt2)
  • rx(ext, pkt1)
• A trace has fewer properties than a long packet sequence.
• A trace provides concrete clues
  • unexpected packets, absence of expected packets, timeout events.


Anomaly Detection: Two categories of properties

• Type 1: whether a particular transition of the state machine is taken by a trace.
  • Example: is the timeout transition taken by a trace?
• Type 2: the value of a particular state variable or packet field when a transition is traversed by a trace.
  • Example: what is the size of the IP packet when the transition from the INIT state to the PKT RCVD state is taken?
• More complex properties involve multiple transitions
  • e.g., whether a trace traverses a particular combination of transitions
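The two property types map onto two small queries over a trace. In this sketch a trace is assumed to be a list of (transition name, fields) pairs; that representation and the function names `takes_transition` and `values_on_transition` are illustrative choices, not taken from the paper.

```python
from typing import Dict, List, Tuple

Step = Tuple[str, Dict[str, object]]  # (transition name, variables and fields)
Trace = List[Step]


def takes_transition(trace: Trace, transition: str) -> bool:
    """Type 1: is the given transition taken anywhere in the trace?"""
    return any(name == transition for name, _ in trace)


def values_on_transition(trace: Trace, transition: str, field: str) -> List[object]:
    """Type 2: values of `field` observed whenever `transition` is taken."""
    return [fields[field] for name, fields in trace
            if name == transition and field in fields]
```

For a trace that enters PKT RCVD with a 60-byte packet and then times out, `takes_transition(trace, "timeout")` answers the type 1 example and `values_on_transition(trace, "INIT->PKT RCVD", "size")` answers the type 2 example.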


Anomaly Detection: learning statistical properties

• how frequently a transition is taken (for type 1), or the values of state variables encountered on a transition (for type 2)
  • use distributions, rather than averages
  • use recent traces, rather than those from long in the past, or
  • use traces from a host of interest and/or to a particular host, or all fragmented packets, e.g.,
    • on all frequency timescale (0.001, 0.002, 0.5, 10, 100, 1000)
    • on all frequency wrt(src) size 100 timescale (0.001, 0.002, 0.5, 10, 100, 1000)
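One reading of the second declaration is "count events per source address over several window lengths, tracking at most 100 distinct sources". The sketch below follows that reading; treating `size` as a cap on tracked keys with least-recently-seen eviction is an assumption, as are the class name `FrequencyWrt` and the requirement that timestamps be non-decreasing and timescales positive.

```python
from collections import OrderedDict, deque
from typing import Dict, Hashable, Sequence

TIMESCALES = (0.001, 0.002, 0.5, 10, 100, 1000)  # values shown on the slide


class FrequencyWrt:
    """Per-key event counts over several timescales, with a bounded table."""

    def __init__(self, timescales: Sequence[float], size: int):
        self.timescales = tuple(timescales)
        self.size = size
        self.events: "OrderedDict[Hashable, deque]" = OrderedDict()

    def record(self, key: Hashable, now: float) -> Dict[float, int]:
        """Register one event for `key`; return its count per timescale."""
        times = self.events.pop(key, None)
        if times is None:
            times = deque()
            if len(self.events) >= self.size:
                self.events.popitem(last=False)  # evict least recently seen key
        self.events[key] = times                 # key is now most recently seen
        times.append(now)
        horizon = now - max(self.timescales)
        while times[0] <= horizon:               # forget events older than the
            times.popleft()                      # longest timescale
        return {ts: sum(1 for t in times if t > now - ts)
                for ts in self.timescales}
```

The declaration from the slide would correspond to `FrequencyWrt(TIMESCALES, size=100)` with the packet source as the key; passing one constant key gives the plain `on all frequency timescale` form.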


Anomaly Detection: detecting anomalies

• If the statistics (in the detection phase) vary substantially from what was learnt, then an anomaly is raised.
  • They are currently investigating ways to precisely control what is considered a “substantial difference.”
• Meanwhile, their implementation uses a simple thresholding scheme.
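The slides do not say how the threshold is chosen, so the following is only one plausible form of a simple thresholding scheme: take the largest count seen during training and allow some slack above it. The function names and the `slack` factor are hypothetical.

```python
from typing import Iterable


def learn_threshold(training_counts: Iterable[int], slack: float = 2.0) -> float:
    """Largest count seen in training, scaled by a hypothetical slack factor.

    Raises ValueError if no training counts are supplied.
    """
    return max(training_counts) * slack


def is_anomalous(count: int, threshold: float) -> bool:
    """Raise an anomaly only when the observed count exceeds the threshold."""
    return count > threshold
```

In the detection phase, each per-window count produced by the learned statistics would be checked with `is_anomalous` against the threshold learned for that statistic.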


Experimental Results: Lincoln 1999

• Their experiments have focused on attacks on lower-layer protocols such as IP and TCP, because they have so far developed state machine models of only these two protocols.
• Since their approach recognizes anomalies based on repetition, at least two packets must be involved in an attack before it can be expected to be detected.
  • They remove six attacks that need only one packet, and some instances without complete TCP traces.


Experimental Results: results

• Excellent attack detection
  • All of the attacks within the scope of the prototype were detected.
  • Their approach has no knowledge about sweeps encoded into it.
• Low false positives
  • 5.5 false alarms per day
• Adequate processing capacity
  • excluding I/O time, 700 MB of data is processed within ten minutes on a 700 MHz Pentium III with 1 GB of memory.


Experimental Results: Attacks detected by IP machine

• ts = (0.001, 0.01, 0.1, 1, 10, 100 and 1000)
  • 1) on all frequency timescale ts
  • 2) on all frequency wrt (src) size 100 timescale ts
  • 3) on all frequency wrt (dst) size 100 timescale ts
  • 4) on all frequency wrt (src, dst) size 100 timescale ts
• IP sweep (using statistics 2 and 1), Ping of Death (3, 2, 4), and Smurf (1, 3) can be detected.


Experimental Results: Attacks detected by TCP machine

• 5) on all frequency timescale ts
• 6) on all frequency wrt (ext_ip) size 1000 timescale ts
• 7) on all frequency wrt (int_ip) size 1000 timescale ts
• 8) on all frequency wrt (ext_ip, int_ip) size 1000 timescale ts
• 9) on all frequency wrt (int_ip, int_port) size 1000 timescale ts
• 10) on all frequency wrt (ext_ip, int_ip, int_port) size 1000 timescale ts
• 11) on all frequency wrt (ext_ip, ext_port, int_ip, int_port) size 1000 timescale ts


Experimental Results: Attacks detected by TCP machine (cont’d)

• Portsweep (7, 8)
• Queso (abnormal transition, LISTEN to LISTEN)
• Neptune (SYN flood, 6-11)
• Satan/Saint (similar to portsweep)
• Mscan (similar to portsweep)
• Mailbomb (sending a large number of emails to overflow the server’s mail queue, 7-11)
• Apache2 (sending a large number of MIME headers, which increases the frequency of packets received in the ESTABLISHED state, 9, 10)
• Back (DoS, similar to Apache2)


Experimental Results: Email Virus Propagation in an intranet

• ts = (10, 30, 120, 500, 2000, 8000, 25000)
  • 1) on all frequency timescale ts
  • 2) on all frequency wrt (sender) timescale ts
• 400 email clients and one sendmail server; hundreds of runs with about 10 different viruses.
• Their approach can detect all of the viruses, whereas the other defense mechanism lost 7 runs.


Conclusions

• Specification-based anomaly detection.
  • benefits from both approaches
• Simply monitors the frequency distribution information associated with state machine transitions
• Specifications can be easily extended.


Comments

• The construction of the FSA is too vague.
  • The IP protocol cannot be mapped directly to Figure 1; the machine relies on knowledge of gateway operation, as does the one for email virus propagation.
• The learning mechanism is good, but it still relies on the traditional concept of anomaly detection (frequency or distribution).
• We use the deviation of protocol behavior as a basis, and construct the FSM for all the temporal states leading to the abnormality.
• We focus on the exploitation phase, rather than probing, scanning, or propagation.
• We have an inference model for attack assessment.