Variable-Stride Multi-Pattern Matching For Scalable Deep Packet Inspection

Author: Nan Hua, Haoyu Song, T. V. Lakshman
Publisher: INFOCOM 2009
Presenter: Chun-Yi Li
Date: 2009/04/22


Outline

• Related Work: Winnowing Algorithm
• Variable-Stride DFA
• Algorithm Optimizations
• Performance

Related Work

Winnowing Algorithm

1. Calculate the hash value of every k consecutive characters.
2. Use a sliding window of size w to select the minimum hash value in the window. A tie is broken by selecting the rightmost minimum value.

Example: winnowing with k = 2 and w = 3

Input string: s e r v e r [ p a t h ]
Hash values:  177 87 210 119 87 178 63 92 57 56 71

[Figure: the string is shown a second time with the resulting delimiters marked.]
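The two steps can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, and cutting immediately after the first character of each selected k-gram is an assumption, since the segmented figure is not recoverable from the transcript. The hash values are the ones listed on the slide for server[path] with k = 2 and w = 3.

```python
def winnow_positions(hashes, w):
    """Return the indices of the hash values selected by winnowing.

    Every window of w consecutive hash values selects its minimum;
    a tie is broken by taking the rightmost minimum.
    """
    selected = set()
    for start in range(len(hashes) - w + 1):
        window = hashes[start:start + w]
        smallest = min(window)
        # Rightmost occurrence of the minimum inside this window.
        offset = max(j for j, h in enumerate(window) if h == smallest)
        selected.add(start + offset)
    return sorted(selected)


def split_after(text, positions):
    """Cut text right after each character index in positions (assumed rule)."""
    blocks, begin = [], 0
    for pos in positions:
        blocks.append(text[begin:pos + 1])
        begin = pos + 1
    if begin < len(text):
        blocks.append(text[begin:])
    return blocks


# Hash values from the slide: one per 2-gram of "server[path]".
slide_hashes = [177, 87, 210, 119, 87, 178, 63, 92, 57, 56, 71]
positions = winnow_positions(slide_hashes, w=3)
blocks = split_after("server[path]", positions)
print(positions)
print(blocks)
```

With the slide's hash values the selected indices are 1, 4, 6, 8 and 9. Under the assumed cutting rule this gives the blocks se, rve, r[, pa, t and h], each between 1 and w bytes long.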


Variable-Stride DFA

Segmentation Scheme Properties

Property 1:
• The size of any segmented block is in the range [1, w].
• Tail block sizes are in the range [k−1, w+k−2].
• Indivisible pattern sizes are in the range [1, w+k−2].
• Coreless pattern sizes are in the range [w+k−1, 2w+k−2].

[Figure labels: coreless pattern, indivisible pattern]
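The ranges in Property 1 can be recovered with a short counting argument. The following is a sketch under one assumption that the slides do not state explicitly: a delimiter is placed immediately after the first character of each selected k-gram. The symbols n (pattern length), h_j, i_first and i_last are introduced here only for illustration.

```latex
\begin{align*}
&\text{A pattern of } n \text{ characters has hash values } h_0,\dots,h_{n-k};
  \text{ a full window exists only if } n-k+1 \ge w. \\
&\text{Indivisible (no full window):}\quad n-k+1 < w \iff 1 \le n \le w+k-2. \\
&\text{The first window } \{0,\dots,w-1\} \text{ selects } i_{\text{first}} \in [0,\,w-1]
  \;\Rightarrow\; |\text{head}| = i_{\text{first}}+1 \in [1,\,w]. \\
&\text{The last window } \{n-k-w+1,\dots,n-k\} \text{ selects } i_{\text{last}}
  \;\Rightarrow\; |\text{tail}| = n-1-i_{\text{last}} \in [k-1,\,w+k-2]. \\
&\text{Coreless } (i_{\text{first}} = i_{\text{last}}):\quad
  n = |\text{head}| + |\text{tail}| \le w + (w+k-2) = 2w+k-2,
  \qquad n \ge w+k-1.
\end{align*}
```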

Property 2:

If a pattern appears in a data stream, then segmenting the data stream produces exactly the same delimiters for the core blocks of the pattern.

The head block can be affected by the prefix and the tail block can be affected by the suffix. However, the core blocks are confined entirely to the pattern and are isolated from the context.

Example:
input stream: ...A|BCh|ij|kl|m|nD|EF|...
pattern:      hij|kl|mn

Here the core block kl is segmented identically in both, while the head hij and the tail mn are split differently by the surrounding bytes.
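Property 2 can be checked experimentally. The sketch below segments a pattern on its own and again inside a longer stream, then compares the delimiters that fall between the pattern's first and last delimiter. The function names are hypothetical, CRC-32 is an arbitrary stand-in for the hash function, and the choice of k = 2, w = 3 is only for the demonstration.

```python
import zlib


def gram_hash(gram):
    # Any deterministic hash works here; CRC-32 is an arbitrary stand-in.
    return zlib.crc32(gram.encode("ascii"))


def selected_positions(text, k, w):
    """Indices of the k-grams selected by winnowing (rightmost minimum wins)."""
    hashes = [gram_hash(text[i:i + k]) for i in range(len(text) - k + 1)]
    selected = set()
    for start in range(len(hashes) - w + 1):
        window = hashes[start:start + w]
        smallest = min(window)
        offset = max(j for j, h in enumerate(window) if h == smallest)
        selected.add(start + offset)
    return sorted(selected)


def delimiters_inside(pattern, stream, k, w):
    """Return the pattern's own delimiters and the delimiters the stream
    places between the pattern's first and last delimiter."""
    own = selected_positions(pattern, k, w)
    offset = stream.index(pattern)  # first occurrence of the pattern
    in_stream = [p - offset for p in selected_positions(stream, k, w)]
    inside = [p for p in in_stream if own[0] <= p <= own[-1]]
    return own, inside


own, inside = delimiters_inside("ridiculous", "xxABCridiculousDEFyy", k=2, w=3)
print(own == inside)
```

The comparison should print True for any deterministic hash: windows that lie fully inside the pattern see the same hash values in both cases, and windows that straddle the pattern boundary can only select positions at or outside the pattern's first and last delimiter.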

Finite Automaton Construction

      pattern        head string   core string    tail string
s1    ridiculous     r             id ic ulo u    s
s2    authenticate   auth          ent ica        te
s3    identical      id            ent ica        l
s4    confident      conf          id             ent
s5    confidential   conf          id ent         ial
s6    entire         ent           (empty)        ire
s7    set            ---           (indivisible)  ---

[Figure: the resulting automaton; legend: quasi-match state]

System Design and Basic Data Structure

State Transition Table (STT)

Hash Key                  Value
Start State    Block      End State
q0             id         q14
q0             ent        q1
q14            ic         q2
q2             ulo        q3
q3             u          q11
q14            ent        q15
q1             ica        q12
q15            ica        q12

Match Table (MT)

State   Head   Tail   Depth
q11     r      s      4
q12     auth   te     2
q12     id     l      2
q14     conf   ent    1
q15     conf   ial    2
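The two tables can be exercised with a small Python sketch. The entries are copied from the STT and MT on the slide; everything else is an assumption made for illustration, in particular the function names and the fallback rule used when the current state has no entry for a block (retry from q0, otherwise return to q0).

```python
# Transition and match entries copied from the slide's STT and MT.
STT = {
    ("q0", "id"): "q14", ("q0", "ent"): "q1",
    ("q14", "ic"): "q2", ("q2", "ulo"): "q3",
    ("q3", "u"): "q11", ("q14", "ent"): "q15",
    ("q1", "ica"): "q12", ("q15", "ica"): "q12",
}
MT = {
    "q11": [("r", "s", 4)],
    "q12": [("auth", "te", 2), ("id", "l", 2)],
    "q14": [("conf", "ent", 1)],
    "q15": [("conf", "ial", 2)],
}


def step(state, block):
    """Consume one block. The fallback order is an assumption: retry from
    q0 when the current state has no entry for the block."""
    if (state, block) in STT:
        return STT[(state, block)]
    return STT.get(("q0", block), "q0")


def walk(blocks):
    """Yield (block, state, candidates) for each block of a segmented stream."""
    state = "q0"
    for block in blocks:
        state = step(state, block)
        # A non-empty candidate list marks a quasi-match state.
        yield block, state, MT.get(state, [])


trace = list(walk("A|BCr|id|ic|ulo|u|sD|EF".split("|")))
for block, state, candidates in trace:
    print(block, state, candidates)
```

Each lookup consumes a whole block, so the automaton advances by a variable number of bytes per step. Reaching q14 or q11 only yields candidates (head, tail, depth); the head and tail still have to be verified before a match is reported.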

Head Queue (HQ)

To enable match verification at the quasi-match states, we need to maintain a Head Queue (HQ) that remembers the block-matching history.

• Each entry is w bytes wide.
• The queue holds D entries, where D is the length of the longest forwarding path of the VS-DFA.

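Example 1:

Data stream: ‥‥A|BCr|id|ic|ulo|u|sD|EF‥‥

Head Queue (HQ), five entries indexed 0 to 4 on the slide, listed from most recent to oldest:

u l o u
c u l o
i d i c
C r i d
A B C r

The core blocks id, ic, ulo, u lead to state q11, whose Match Table entry is Head r, Tail s, Depth 4. The HQ entry recorded four blocks earlier is ABCr, which ends with the head r, and the bytes after the last core block start with the tail s, so ridiculous is matched.

Example 2:

Data stream: ‥‥ABCD|Eau|th|ent|ica|te‥‥

Head Queue (HQ), five entries indexed 0 to 4 on the slide, listed from most recent to oldest:

t i c a
h e n t
a u t h
D E a u
A B C D

The core blocks ent, ica lead to state q12, which has two Match Table entries with Depth 2: (auth, te) and (id, l). The HQ entry recorded two blocks earlier is auth, so only the first entry passes head verification. The following bytes te match its tail, so authenticate is matched.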
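The Head Queue lookup in the two examples can be sketched as follows. This is a software illustration under stated assumptions: the function names are hypothetical, the queue is kept as an unbounded list rather than a fixed D-entry buffer, and the tail check simply looks ahead in the already segmented stream. The window size w = 4 is inferred from the four-byte entries shown on the slides.

```python
def head_queue(blocks, w):
    """Snapshot the last w bytes of the stream at every block boundary."""
    seen, snapshots = "", []
    for block in blocks:
        seen += block
        snapshots.append(seen[-w:])
    return snapshots


def verify(blocks, index, head, tail, depth, w):
    """Check a quasi-match reported after blocks[index].

    The last `depth` blocks ending at blocks[index] are the core blocks;
    the head must end right before them and the tail must start right
    after them.
    """
    if index - depth < 0:
        return False  # not enough history to hold the head
    before_core = head_queue(blocks, w)[index - depth]
    after_core = "".join(blocks[index + 1:])
    return before_core.endswith(head) and after_core.startswith(tail)


blocks1 = "A|BCr|id|ic|ulo|u|sD|EF".split("|")
print(head_queue(blocks1, w=4)[1:6])
print(verify(blocks1, index=5, head="r", tail="s", depth=4, w=4))

blocks2 = "ABCD|Eau|th|ent|ica|te".split("|")
print(verify(blocks2, index=4, head="auth", tail="te", depth=2, w=4))
print(verify(blocks2, index=4, head="id", tail="l", depth=2, w=4))
```

Storing the last w bytes instead of the last block matters in the second example: the head auth spans the two stream blocks Eau and th, so it can only be recovered from a byte-oriented snapshot.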

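Short Pattern Handling

Using TCAM for short pattern lookups

TCAM entry layout: Head (w bytes) followed by Tail (w+k−2 bytes)

* e n t i r e *     Coreless pattern (entire)
s e t * * * * *     Indivisible pattern (set)
* s e t * * * *     Indivisible pattern (set)
* * s e t * * *     Indivisible pattern (set)
* * * s e t * *     Indivisible pattern (set)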
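A ternary lookup over these entries can be emulated in a few lines of Python. This is only a software sketch with hypothetical names: a real TCAM compares all entries in parallel, and using the character * as the wildcard means a literal * byte cannot be expressed in an entry here.

```python
# Entries copied from the slide; "*" is a don't-care position.
TCAM_ENTRIES = [
    ("*entire*", "entire"),  # coreless pattern
    ("set*****", "set"),     # indivisible pattern, four alignments
    ("*set****", "set"),
    ("**set***", "set"),
    ("***set**", "set"),
]


def ternary_match(key, entry):
    """True if key equals entry at every non-wildcard position."""
    return len(key) == len(entry) and all(
        e == "*" or e == c for c, e in zip(key, entry)
    )


def lookup(key):
    """Return the pattern of the first matching entry, or None.

    The loop stands in for the parallel comparison a TCAM performs.
    """
    for entry, pattern in TCAM_ENTRIES:
        if ternary_match(key, entry):
            return pattern
    return None


print(lookup("xentirey"))
print(lookup("abset123"))
print(lookup("abcdefgh"))
```

The indivisible pattern set needs one entry per alignment because its position relative to the lookup key is not fixed, whereas the coreless pattern entire has a single delimiter that pins it to one alignment.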


Algorithm Optimizations

Reducing Single-Byte Blocks

It is possible to construct specific inputs that are segmented into only single-byte blocks, independent of the chosen hash function and window parameters.
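One input that illustrates this, offered as an example rather than as the construction the authors had in mind, is a run of one repeated byte. All k-grams are then identical, so every hash value is equal whatever hash function is used, and the rightmost-minimum tie rule selects a new position in every window. The sketch below uses hypothetical names, CRC-32 as an arbitrary hash, and assumes that a delimiter is placed right after the first character of each selected k-gram.

```python
import zlib


def segment(text, k, w):
    """Segment text with winnowing (rightmost minimum wins on ties)."""
    hashes = [zlib.crc32(text[i:i + k].encode("ascii"))
              for i in range(len(text) - k + 1)]
    cuts = set()
    for start in range(len(hashes) - w + 1):
        window = hashes[start:start + w]
        smallest = min(window)
        cuts.add(start + max(j for j, h in enumerate(window) if h == smallest))
    # Assumed rule: cut right after the first character of a selected k-gram.
    blocks, begin = [], 0
    for pos in sorted(cuts):
        blocks.append(text[begin:pos + 1])
        begin = pos + 1
    if begin < len(text):
        blocks.append(text[begin:])
    return blocks


blocks = segment("a" * 12, k=2, w=3)
print(blocks)
```

After the first block, every block up to the tail is a single byte, so the automaton advances only one byte per lookup on such input. This is the worst case that the combination rules in this section are meant to reduce.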

Combination Rule 1 (applied on data stream)

[Figure: data stream ‥‥ c1 c2 c3 c4 c5 c6 c7 c8 c9 ‥‥ shown before and after applying Combination Rule 1, with w = 3. Block boundaries are not recoverable from the transcript.]

Combination Rule 1 (applied on pattern)

[Figure: pattern c1 c2 ... c15 with window size w = 3. Step 1 shows the pattern twice. Step 2 (Replicate) shows six replicated variants of the pattern, numbered 1 to 6. Block boundaries are not recoverable from the transcript.]

[Figures: pattern 1 to pattern 6. In each, a data stream containing c1 c2 ... c15 is shown before and after applying Combination Rule 1 and is marked as a match for that pattern variant. The bytes preceding c1 are a1 a2 for patterns 1 and 4, ‥ a1 for patterns 2 and 5, and ‥‥ for patterns 3 and 6.]

Combination Rule 2 (applied on data stream)

[Figure: data stream ‥‥ c1 c2 c3 c4 c5 c6 c7 c8 c9 ‥‥ shown before and after applying Combination Rule 2, with window size w' = w+1 = 3+1 = 4. Block boundaries are not recoverable from the transcript.]

Combination Rule 2 (applied on pattern)

[Figure: pattern c1 c2 ... c15 with window size w' = w+1 = 3+1 = 4. Step 1 shows the pattern twice. Step 2 (Replicate) shows two replicated variants of the pattern, numbered 1 and 2.]

[Figures: pattern 1 and pattern 2. In each, a data stream containing c1 c2 ... c15 is shown before and after applying Combination Rule 2 and is marked as a match for that pattern variant. The bytes following c15 are a1 ‥ for pattern 1 and ‥ ‥ for pattern 2.]

Algorithm Optimizations

Three STTs Design

Start STT

Hash Key   Value
id         q’1
ent        q’6

Main STT

q’1 (q14)   -> s4
q’2 (q2)    ulo
q’3 (q3)    u
q’4 (q11)   -> s1
q’5 (q15)   -> s5
q’6 (q1)    ica
q’7 (q12)   -> s2

Jump STT

Hash Key    Value
q’1 ic      q’2
q’1 ent     q’4
q’5 ica     q’6


Performance

Mem1 denotes the memory consumed by “Start STT”; Mem2 denotes that for “Three STT”.

Fixed: patterns extracted from the fixed string rules.
Full: the expanded pattern sets that also include the fixed strings extracted from the regular expression rules.

[Figure panels: SNORT-fixed, ClamAV-fixed]