SHOCK: A Worst-Case Ensured Sub-linear Time Pattern Matching Algorithm for Inline Anti-Virus Scanning
Authors: Nen-Fu Huang, Wen-Yen Tsai
Publisher: IEEE ICC, 2010
Presenter: Kai-Yang, Liu
Date: 2012/1/4
INTRODUCTION
•Challenges of an inline multi-pattern matching algorithm:
It must be fast enough to scan millions of packets in a gigabit environment.
It should have a small memory footprint so that it scales well with the ever-growing set of virus patterns.
It must perform well under a high volume of virus-infected traffic to avoid becoming the bottleneck.
ClamAV
•ClamAV provides an anti-virus engine and a regularly updated virus database.
•ClamAV virus signatures can be classified into one of four categories: basic, regular expression (regex), MD5, and others.
The Proposed SHOCK Algorithm
•The SHOCK (Shift/Hash with Overlap Check) algorithm consists of an offline preprocessing phase and an online pattern matching phase.
•The shift table is constructed with the same approach as in the Wu-Manber (WM) algorithm, using a block size of two, and the hash value of the 2-byte prefix of each pattern is calculated.
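The preprocessing step above can be sketched as follows. This is a minimal illustration of a WM-style shift table with block size two, plus an index keyed on the 2-byte prefix of each pattern. The slides do not give the actual hash function or table layout, so `prefix_hash`, `build_tables`, and the dictionary-based tables are assumptions made for the example, not the authors' implementation.

```python
from collections import defaultdict

BLOCK = 2  # block size stated in the slides


def prefix_hash(pat):
    # Assumed hash: the two prefix bytes read as a 16-bit big-endian integer.
    return int.from_bytes(pat[:BLOCK], "big")


def build_tables(patterns):
    """Build a WM-style shift table and a 2-byte-prefix index (illustrative)."""
    # As in WM, only the first m bytes of each pattern feed the shift table,
    # where m is the length of the shortest pattern.
    m = min(len(p) for p in patterns)
    if m < BLOCK:
        raise ValueError("every pattern must be at least BLOCK bytes long")

    # A block that occurs in no pattern allows the maximum shift.
    default_shift = m - BLOCK + 1
    shift = {}                        # block -> safe shift distance
    prefix_index = defaultdict(list)  # prefix hash -> pattern ids

    for pid, pat in enumerate(patterns):
        for end in range(BLOCK, m + 1):
            block = pat[end - BLOCK:end]
            # Keep the smallest distance from this block to the window end.
            shift[block] = min(shift.get(block, default_shift), m - end)
        prefix_index[prefix_hash(pat)].append(pid)

    return m, default_shift, shift, dict(prefix_index)
```

For example, with the patterns `b"abcde"` and `b"bcdxy"`, the block `b"bc"` gets a shift of 2 and the blocks ending either pattern's 5-byte window (`b"de"`, `b"xy"`) get a shift of 0, which is the point where candidate patterns would be verified.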
•When a matched pattern is found, there may be another consecutive pattern in the text whose prefix overlaps the suffix of the currently matched one.
•For a pattern to be stored in the nextPat list of the current pattern, the number of its prefix characters that overlap the suffix of the current pattern must be greater than or equal to PSP_TH.
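The nextPat rule can be illustrated with a short sketch. It computes, for each pair of patterns, how many prefix bytes of the candidate overlap the suffix of the current pattern, and keeps the candidate when that count reaches PSP_TH. The function names, the brute-force pairwise loop, and the choice to count only proper overlaps (shorter than both patterns) are assumptions for this example; the slides do not specify how the lists are actually built.

```python
PSP_TH = 8  # threshold value mentioned in the slides


def overlap_len(cur, nxt):
    """Longest k such that the last k bytes of cur equal the first k bytes of nxt."""
    # Assumption: only proper overlaps count, so k is shorter than both patterns.
    limit = min(len(cur), len(nxt)) - 1
    for k in range(limit, 0, -1):
        if cur[-k:] == nxt[:k]:
            return k
    return 0


def build_next_pat(patterns, psp_th=PSP_TH):
    """Map each pattern id to the ids of patterns that may follow it (illustrative)."""
    next_pat = {pid: [] for pid in range(len(patterns))}
    for i, cur in enumerate(patterns):
        for j, nxt in enumerate(patterns):
            if overlap_len(cur, nxt) >= psp_th:
                next_pat[i].append(j)
    return next_pat
```

With the toy patterns `b"abcdef"`, `b"defghi"`, and `b"efxyz"` and a threshold of 3, only `b"defghi"` (overlap `def`, length 3) enters the nextPat list of `b"abcdef"`; lowering the threshold to 2 also admits `b"efxyz"`. A lower threshold therefore produces longer lists.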
•Although only a small number of patterns have a long nextPat list when PSP_TH = 8, they must be handled specially to avoid the worst-case scenario.
Bitmap-offset-indexing structure
•Applied only to patterns whose nextPat list length is greater than the parameter PBMAP_TH.