Gregex: GPU based High Speed Regular Expression Matching Engine
Date: 101/1/11
Publisher: 2011 Fifth International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing
Author: Lei Wang, Shuhui Chen, Yong Tang, Jinshu Su
Presenter: Shi-qu Yu
INTRODUCTION
Gregex is a Graphics Processing Unit (GPU) based regular expression matching engine for deep packet inspection (DPI).
Gregex leverages the computational power and high memory bandwidth of GPUs by storing data in the appropriate GPU memory spaces and running a massive number of GPU threads concurrently to process many packets in parallel.
THE PROPOSED GREGEX-Framework
The matching result buffer is a one-dimensional array allocated in global device memory; its size equals the number of packets that the GPU processes at a time.
THE PROPOSED GREGEX-Workflow
1. Pre-processing phase
2. Signature matching phase
3. Post-processing phase
Pre-processing phase
Compiling regular expressions to DFA
Once the DFA has been constructed, the state transition table is copied to the texture memory of the GPU in two steps:
1. Copy the state transition table from CPU memory to GPU global memory;
2. Bind the state transition table in global memory to the texture cache.
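As a rough illustration of the kind of table that step 1 copies, the sketch below builds a dense state transition table for a single literal pattern and flattens it into one contiguous array, indexed as `state * 256 + byte`. The literal-pattern construction only stands in for a real regular expression compiler, and the names `build_literal_dfa`, `next_state`, and `ALPHABET`, as well as the 16-bit state IDs, are assumptions for illustration rather than details of Gregex.

```python
from array import array

ALPHABET = 256  # assumption: one table column per possible byte value


def build_literal_dfa(pattern):
    """Build a flat DFA table that finds `pattern` anywhere in the input.

    Returns (table, accept_state). State 0 is the start state, and the next
    state for (state, byte) is stored at table[state * ALPHABET + byte].
    """
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must be non-empty")
    # Hypothetical choice: 16-bit unsigned state IDs in one contiguous array.
    table = array("H", [0] * ((m + 1) * ALPHABET))
    table[pattern[0]] = 1
    fallback = 0  # state reached by the pattern minus its first byte
    for state in range(1, m + 1):
        row = state * ALPHABET
        fb = fallback * ALPHABET
        # On a mismatch, behave like the fallback state.
        table[row:row + ALPHABET] = table[fb:fb + ALPHABET]
        if state < m:
            table[row + pattern[state]] = state + 1
            fallback = table[fb + pattern[state]]
    return table, m


def next_state(table, state, byte):
    """Single table lookup, the operation a matching thread repeats per byte."""
    return table[state * ALPHABET + byte]
```

A flat layout like this is convenient because the whole table is one block of memory that can be copied in a single transfer and then addressed with simple index arithmetic.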
Transferring packets to GPU
Gregex copies packets to device memory in batches.
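One way to picture batched transfer is to pack a group of packets into a single contiguous host buffer, so that the whole batch can be handed over in one copy. The sketch below is a host-side model only: the fixed slot size, the truncation and zero padding, and the names `pack_batch`, `iter_batches`, and `SLOT_BYTES` are assumptions for illustration and do not come from the slides.

```python
SLOT_BYTES = 64  # hypothetical fixed slot size per packet


def pack_batch(packets, slot_bytes=SLOT_BYTES):
    """Pack packets into one contiguous buffer plus a per-packet length list.

    Packet k occupies bytes [k * slot_bytes, (k + 1) * slot_bytes). Payloads
    longer than a slot are truncated and shorter ones are zero padded
    (an assumption made to keep the sketch small).
    """
    buffer = bytearray(len(packets) * slot_bytes)
    lengths = []
    for k, payload in enumerate(packets):
        used = min(len(payload), slot_bytes)
        start = k * slot_bytes
        buffer[start:start + used] = payload[:used]
        lengths.append(used)
    return buffer, lengths


def iter_batches(packets, batch_size):
    """Yield consecutive groups of at most batch_size packets."""
    for i in range(0, len(packets), batch_size):
        yield packets[i:i + batch_size]
```

With fixed-size slots, the offset of packet k is simply `k * slot_bytes`, so no separate offset table is needed.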
Signature matching phase
Post-processing phase
When all GPU threads finish matching, the matching result array is copied to CPU memory. The kth cell of the matching result array contains the ID of the regular expression that matches the kth packet; if no match occurs, it is set to zero.
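The result layout described here can be modelled on the CPU as follows. Each loop iteration stands in for one matching thread: it walks a flat DFA table over its packet and writes one cell of the result array. The toy DFA (a single signature with ID 1 that matches the byte sequence "ab"), the `accept_ids` mapping, and all function names are hypothetical. On the GPU these walks would run concurrently rather than in a loop.

```python
ALPHABET = 256
NO_MATCH = 0  # a result cell stays zero when no signature matched


def toy_dfa():
    """Hand-built DFA that accepts any input containing b"ab".

    States: 0 = start, 1 = saw 'a', 2 = accept. Returns (table, accept_ids),
    where accept_ids maps an accepting state to a regular expression ID.
    """
    table = [0] * (3 * ALPHABET)
    a, b = ord("a"), ord("b")
    table[0 * ALPHABET + a] = 1
    table[1 * ALPHABET + a] = 1
    table[1 * ALPHABET + b] = 2
    return table, {2: 1}


def match_packet(table, accept_ids, packet):
    """Walk the DFA over one packet; return the matched ID or NO_MATCH."""
    state = 0
    for byte in packet:
        state = table[state * ALPHABET + byte]
        if state in accept_ids:
            return accept_ids[state]
    return NO_MATCH


def match_batch(table, accept_ids, packets):
    """Fill the result array: cell k holds the ID matched by packet k."""
    results = [NO_MATCH] * len(packets)
    for k, packet in enumerate(packets):  # conceptually one thread per packet
        results[k] = match_packet(table, accept_ids, packet)
    return results
```

For example, `match_batch(*toy_dfa(), [b"xxab", b"ba"])` yields one cell per packet, with zero in the cell of the packet that does not contain "ab".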
Optimizations
1) Asynchronous packets Transfer with Page-locked memory (ATP):
Asynchronous copy: transfers issued with the cudaMemcpyAsync function are non-blocking; control is returned immediately to the host thread.
Zero copy: zero copy requires mapped page-locked memory and enables GPU threads to directly access host memory.
2) Coalesced global memory access in regular expression matching
Coalesced global memory Access by Buffering packets to shared memory (CAB): in this work, coalesced global memory access is obtained by having each half warp read contiguous locations of global memory into shared memory.
We use s_packets, a 32×32 shared memory array of 32-bit words, to "buffer" packets from global memory for every thread.
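The access pattern behind CAB can be illustrated with a small CPU-side model. It is one plausible reading of the scheme, not the authors' kernel: global memory is modelled as a flat list of 32-bit words with packets stored back to back, and the 32 threads of a warp cooperate to fill each row of the tile, so that consecutive threads read consecutive words. The names `load_tile`, `is_contiguous`, and `TILE`, and the fixed `packet_words` layout, are assumptions.

```python
TILE = 32  # the shared buffer is described as a 32x32 array of 32-bit words


def load_tile(global_words, first_packet, packet_words, chunk):
    """Cooperatively copy one 32-word chunk of 32 packets into a tile.

    Returns (tile, addresses), where addresses[r] lists the global indices
    read by threads 0..31 while filling row r of the tile.
    """
    tile = [[0] * TILE for _ in range(TILE)]
    addresses = []
    for r in range(TILE):          # row r buffers packet first_packet + r
        base = (first_packet + r) * packet_words + chunk * TILE
        row_addrs = []
        for t in range(TILE):      # t models the thread index within the warp
            tile[r][t] = global_words[base + t]
            row_addrs.append(base + t)
        addresses.append(row_addrs)
    return tile, addresses


def is_contiguous(addrs):
    """True when consecutive threads touch consecutive words."""
    return all(b - a == 1 for a, b in zip(addrs, addrs[1:]))
```

After the tile is filled, thread t would read its own packet from row `tile[t]` in shared memory instead of issuing scattered reads to global memory.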
EVALUATION RESULTS
The test platform is a PC with a 2.66 GHz Intel Core 2 Duo processor, 4 GB of memory, and an NVIDIA GeForce GTX 260 GPU card. The GTX 260 GPU contains 216 SPs organized in 27 SMs, running at 1.35 GHz with 896 MB of global memory.
Gregex uses signatures from the rule set released with Snort 2.7. The rule set consists of 56 different signature sets.
Packets Transfer Performance
Regular Expression Matching Performance
Overall throughput of Gregex