quantitative network monitoring with netqrealur/sigcomm17.pdf · 2017. 6. 26. · in network...

14
antitative Network Monitoring with NetQRE Yifei Yuan University of Pennsylvania [email protected] Dong Lin LinkedIn Inc. [email protected] Ankit Mishra University of Pennsylvania [email protected] Sajal Marwaha University of Pennsylvania [email protected] Rajeev Alur University of Pennsylvania [email protected] Boon Thau Loo University of Pennsylvania [email protected] ABSTRACT In network management today, dynamic updates are required for trac engineering and for timely response to security threats. De- cisions for such updates are based on monitoring network trac to compute numerical quantities based on a variety of network and application-level performance metrics. Today’s state-of-the-art tools lack programming abstractions that capture application or session-layer semantics, and thus require network operators to specify and reason about complex state machines and interactions across layers. To address this limitation, we present the design and implementation of NetQRE, a high-level declarative toolkit that aims to simplify the specication and implementation of such quan- titative network policies. NetQRE integrates regular-expression-like pattern matching at ow-level as well as application-level payloads with aggregation operations such as sum and average counts. We describe a compiler for NetQRE that automatically generates an ef- cient implementation with low memory footprint. Our evaluation results demonstrate that NetQRE allows natural specication of a wide range of quantitative network tasks ranging from detecting se- curity attacks to enforcing application-layer network management policies. NetQRE results in high performance that is comparable with optimized manually-written low-level code and is signicantly more ecient than alternative solutions, and can provide timely enforcement of network policies that require quantitative network monitoring. CCS CONCEPTS Networks Network monitoring; Programmable networks; KEYWORDS NetQRE, network monitoring language, quantitative regular ex- pression ACM Reference format: Yifei Yuan, Dong Lin, Ankit Mishra, Sajal Marwaha, Rajeev Alur, and Boon Thau Loo. 2017. Quantitative Network Monitoring with NetQRE. In Proceed- ings of SIGCOMM ’17, Los Angeles, CA, USA, August 21-25, 2017, 14 pages. DOI: 10.1145/3098822.3098830 Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for prot or commercial advantage and that copies bear this notice and the full citation on the rst page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specic permission and/or a fee. Request permissions from [email protected]. SIGCOMM ’17, Los Angeles, CA, USA © 2017 ACM. 978-1-4503-4653-5/17/08. . . $15.00 DOI: 10.1145/3098822.3098830 1 INTRODUCTION Network management today often requires dynamic updates in response to trac engineering and security events. For example, in data centers, heavy hitters [8, 40] need to be detected in real-time and bandwidth limits may be imposed on them. Within an enter- prise network, users can be rate-limited if they exceed quotas on their application usage. Network trac anomalies that are detected require immediate mitigation strategies to block potential security attacks [22]. Decisions for such updates require quantitative network moni- toring capabilities, which is a combination of monitoring a variety of network and application-layer performance metrics with known trac patterns, and the real-time computation of quantitative ag- gregate values that are used as a basis for network conguration updates in order to meet performance and security goals. Imperative languages provide low-level abstractions, which makes it cumbersome to perform complex quantitative analysis on packet streams. To illustrate this dicult, we consider a quantitative net- work monitoring task of detecting two well-known denial-of-service attacks: Slowloris [36] and the SSL renegotiation attack [6]. The former attack requires tracking TCP connections, and raising an anomaly when the number of bytes transferred is unusually lower than normal. The latter attack requires identifying trac signature patterns for TLS renegotiations, and counting the number of such renegotiations across multiple HTTP sessions. Detecting both at- tacks require identifying HTTP or TCP trac in packet streams, extracting out attack patterns, and monitoring quantitative values to identify possible anomalies. As new packets arrive, state has to be incrementally updated as TCP sessions are established and torn down. A more natural and intuitive approach for a programmer to detect these attacks is to provide a framework to express these monitoring tasks in a modular way, i.e., dene the pattern of a TCP connection or HTTP request, recognizing data transfer packets for the connection or TLS renegotiations, and then aggregating trac rates or renegotiation counts as connections are established. Quantitative network monitoring is not only restricted to se- curity use cases, but are also useful for enforcing network man- agement policies based on a given application-level metric. For example, we consider a policy that aims to enforce a particular bandwidth quota for a given application (e.g. VoIP) on a per-user basis. This policy requires identifying VoIP trac in packet streams, manually maintaining state across packets to detect the start of each VoIP session, extracting out user information, and monitoring aggregate VoIP usage per user even as their IP address is updated. Supporting such monitoring functionality requires identifying VoIP

Upload: others

Post on 04-Sep-2021

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Quantitative Network Monitoring with NetQREalur/Sigcomm17.pdf · 2017. 6. 26. · In network management today, ... Thau Loo. 2017. Quantitative Network Monitoring with NetQRE. In

antitative Network Monitoring with NetQREYifei Yuan

University of Pennsylvaniayifeiycisupennedu

Dong LinLinkedIn Inc

dolinlinkedincom

Ankit MishraUniversity of Pennsylvania

mankitseasupennedu

Sajal MarwahaUniversity of Pennsylvania

sajalmseasupennedu

Rajeev AlurUniversity of Pennsylvania

alurcisupennedu

Boon Thau LooUniversity of Pennsylvaniaboonlooseasupennedu

ABSTRACTIn network management today dynamic updates are required fortrac engineering and for timely response to security threats De-cisions for such updates are based on monitoring network tracto compute numerical quantities based on a variety of networkand application-level performance metrics Todayrsquos state-of-the-arttools lack programming abstractions that capture application orsession-layer semantics and thus require network operators tospecify and reason about complex state machines and interactionsacross layers To address this limitation we present the design andimplementation of NetQRE a high-level declarative toolkit thataims to simplify the specication and implementation of such quan-titative network policies NetQRE integrates regular-expression-likepattern matching at ow-level as well as application-level payloadswith aggregation operations such as sum and average counts Wedescribe a compiler for NetQRE that automatically generates an ef-cient implementation with low memory footprint Our evaluationresults demonstrate that NetQRE allows natural specication of awide range of quantitative network tasks ranging from detecting se-curity attacks to enforcing application-layer network managementpolicies NetQRE results in high performance that is comparablewith optimized manually-written low-level code and is signicantlymore ecient than alternative solutions and can provide timelyenforcement of network policies that require quantitative networkmonitoring

CCS CONCEPTSbullNetworksrarrNetworkmonitoringProgrammable networks

KEYWORDSNetQRE network monitoring language quantitative regular ex-pression

ACM Reference formatYifei Yuan Dong Lin Ankit Mishra Sajal Marwaha Rajeev Alur and BoonThau Loo 2017 Quantitative Network Monitoring with NetQRE In Proceed-ings of SIGCOMM rsquo17 Los Angeles CA USA August 21-25 2017 14 pagesDOI 10114530988223098830

Permission to make digital or hard copies of all or part of this work for personal orclassroom use is granted without fee provided that copies are not made or distributedfor prot or commercial advantage and that copies bear this notice and the full citationon the rst page Copyrights for components of this work owned by others than ACMmust be honored Abstracting with credit is permitted To copy otherwise or republishto post on servers or to redistribute to lists requires prior specic permission andor afee Request permissions from permissionsacmorgSIGCOMM rsquo17 Los Angeles CA USAcopy 2017 ACM 978-1-4503-4653-51708 $1500DOI 10114530988223098830

1 INTRODUCTIONNetwork management today often requires dynamic updates inresponse to trac engineering and security events For example indata centers heavy hitters [8 40] need to be detected in real-timeand bandwidth limits may be imposed on them Within an enter-prise network users can be rate-limited if they exceed quotas ontheir application usage Network trac anomalies that are detectedrequire immediate mitigation strategies to block potential securityattacks [22]

Decisions for such updates require quantitative network moni-toring capabilities which is a combination of monitoring a varietyof network and application-layer performance metrics with knowntrac patterns and the real-time computation of quantitative ag-gregate values that are used as a basis for network congurationupdates in order to meet performance and security goals

Imperative languages provide low-level abstractions which makesit cumbersome to perform complex quantitative analysis on packetstreams To illustrate this dicult we consider a quantitative net-work monitoring task of detecting two well-known denial-of-serviceattacks Slowloris [36] and the SSL renegotiation attack [6] Theformer attack requires tracking TCP connections and raising ananomaly when the number of bytes transferred is unusually lowerthan normal The latter attack requires identifying trac signaturepatterns for TLS renegotiations and counting the number of suchrenegotiations across multiple HTTP sessions Detecting both at-tacks require identifying HTTP or TCP trac in packet streamsextracting out attack patterns and monitoring quantitative valuesto identify possible anomalies As new packets arrive state has tobe incrementally updated as TCP sessions are established and torndown A more natural and intuitive approach for a programmerto detect these attacks is to provide a framework to express thesemonitoring tasks in a modular way ie dene the pattern of a TCPconnection or HTTP request recognizing data transfer packets forthe connection or TLS renegotiations and then aggregating tracrates or renegotiation counts as connections are established

Quantitative network monitoring is not only restricted to se-curity use cases but are also useful for enforcing network man-agement policies based on a given application-level metric Forexample we consider a policy that aims to enforce a particularbandwidth quota for a given application (eg VoIP) on a per-userbasis This policy requires identifying VoIP trac in packet streamsmanually maintaining state across packets to detect the start ofeach VoIP session extracting out user information and monitoringaggregate VoIP usage per user even as their IP address is updatedSupporting such monitoring functionality requires identifying VoIP

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

session for each user from input packet streams monitor usage foreach identied VoIP session and aggregating usage across all VoIPsessions grouped-by all users being tracked

Today there are several point-solutions to support certain as-pects of quantitative network monitoring However they suerfrom one or more of the following limitations First a large ma-jority of network measurement tools focus on ow-level measure-ments [15 18 19 39ndash41] that do not capture application-level orsession-level semantics This excludes a range of policies that areapplication-dependent for example tracking security signatures atthe application level or rate-limiting users based on their VoIP callusage Second these tools tend to provide ad-hoc solutions that aredicult to generalize or customize Programming frameworks forsoftware-dened networks (SDN) [21 30 32] do not support inte-gration with queries beyond basic ow-level counters and none ofthese languages support quantitative monitoring at the applicationor session level Finally tools such as Bro [33] requires networkoperators to have signicant programming expertise to implementstate machines and reason about state transitions

To address the above limitations we present NetQRE a practicaltool aimed at simplifying the specication and implementation ofquantitative network policies Our proposal is based on the observa-tion that trac patterns such as a TCP connection and a application-level session can be specied using regular expressions which arean abstraction that network operators who may not be well-versedin programming nd natural to use Regular expressions are widelyused in systems management for example in the popular grep toolfor searching for patterns in systems logs application-level packetclassication [1] signature-based attack detection [3 34] We makethe following contributionsbullNetQRE language We present the NetQRE declarative languagethat provides high-level abstractions and express a variety of quan-titative policies that span multiple packets grouped into ows andapplication-level sessions NetQRE is based on novel theoreticalfoundation of parameterized quantitative regular expressions (PQRE)PQRE extends prior work on QRE [9] which provides a formal foun-dation of combining traditional regular expressions with numericalcomputations NetQRE integrates regular-expression-like patternmatching at ow-level and application-level payloads with aggre-gation operations such as sum and average counts The languagefurther supports quantitative network policies by allowing actionson packets to be generated as output of monitoring applicationsbull Compilation and ecient runtime system We developed acompiler that can automatically generate ecient NetQRE imple-mentations with low memory footprint As part of the compilationprocess the compiler automatically infers the state that needs tobe maintained for NetQRE programs and optimizes the compiledimperative code A NetQRE runtime then executes the generatedcode eciently In fact this process also eases the burden on pro-grammers having to handwrite optimized versions in low-levelcodebullNetQRE implementation and evaluation We have developeda NetQRE prototype which we evaluate over a range of quantita-tive network monitoring tasks Our evaluation results demonstratethat NetQRE can express a wide range of quantitative networkpolicies with no more than 18 lines of code while specifying them

in imperative languages requires at least 100 lines of code Thecompiled implementations incur no more than 9 overhead inthroughput compared with optimized manually-written low-levelcode are able to handle more than 10Gbps of trac using a singlecore (and is amenable to parallelization) and are signicantly moreecient (eg 11times) compared to monitoring solutions based on Broand OpenSketch [40]

2 OVERVIEW

runtime

raw packets

Policy specification in NetQRE

NetQREcompiler Policy

implementation

Network

packets actions

updates

Figure 1 NetQRE Architecture

Fig 1 shows the architecture of NetQRE To program a quantita-tive network monitoring task a programmer simply species thismonitoring task in a declarative fashion by viewing the input as astream of packets The NetQRE compiler automatically generates ef-cient low-level imperative code that implements the specicationEach incoming raw packet from the network is rst parsed and pre-processed by the runtime into a form that can be referenced by themonitoring implementation In our deployment NetQRE relies onthe underlying runtime to handle packet reordering and retries (ifneeded) due to losses in TCP connections and also fragmentationand defragmentation of IP packets The stream of processed packetsare then sent to the appropriate NetQRE program for execution

As the NetQRE compiled program executes the incoming pack-ets are processed in a streaming fashion The output of a NetQREprogram is the generation of analysis results or actions taken torecongure the network The NetQRE tool can be deployed in dier-ent settings for example a tap on a SPAN port analyzing mirroredtrac an inline solution or running in the cloud as a virtualizedmiddlebox Our language design and compiler is agnostic to thedeployment setting While we focus on packet stream processingin this paper the NetQRE language and compilation is agnostic tothe input stream and can handle any item streams (eg raw packetsemail messages video frames)

21 Motivating ExampleAs a motivating example we consider a network management pol-icy where an enterprise monitors the usage of Voice-over-IP (VoIP)for each user and alert the user whose usage is signicantly higherthan the average usage over all users The actual language specica-tion will be described in later sections We provide a high-level intu-ition of programming language features necessary to support thispolicy Note that our examples are not solely limited to VoIP Sec-tion 4 provides more examples to showcase the wide-applicabilityof NetQRE and our evaluation section has a detailed listing

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

We use the Session Initiation Protocol (SIP) used in VoIP applica-tions as our example Typically a call based on SIP consists of threephases init call and end In the init phase the caller and the calleeexchange messages (eg call ID user name the connection channel)in order to set up the call session Once established the call phaseallows VoIP data to be transmitted between the caller and calleeusing the channel dened in the rst phase Finally either side canend the call as shown in the end phase

To analyze the SIP call in the midst of network trac from alltypes of protocols the protocol analyzer is required to (1) identifyall SIP trac traversing the network (2) separate the SIP tracinto dierent sessions for dierent callercallee pairs (3) grouppackets within each session into phases (init call and end) andthen perform a count of bytes only within the second (call) phase

NetQRE oers a natural programming model to implement thispolicy Conceptually the input to a NetQRE program is modeled asa stream of packets The programmer may assume that all receivedpackets have been stored and presented as the input Based onthis input NetQRE then provides programming abstractions for aprogrammer to implement various functions to process the streamThese functions can then be composed in a modular fashion toimplement the quantitative policy Using the VoIP example theprogrammer species a lter function to identify SIP trac of eachuser an iteration function that splits the SIP trac stream into asequence of VoIP sessions a split function that further breaks upeach VoIP session into the three phases and nally an aggregationfunction that sums up the total bandwidth in the call phase

These functions are specied by users in a declarative fashionand composed together in a fashion that requires users to onlythink in terms of protocol patterns over multiple packets Explicitstate maintenance to track session state is handled by the NetQREruntime Note that this programming model is a conceptual oneand the actual execution is optimized by our compiler For exampleit will incur unacceptable storage and performance overhead ifthe runtime system has to log all incoming packets and presentthem as a program input We have implemented a NetQRE compilerwhich compiles a NetQRE program into an optimized imperativeprogram The compiler automatically infers the states that needs tobe maintained for the NetQRE program and generates executioncode that implements a streaming algorithm to update the statesfor each input packet and evaluate the output in an incrementalway at runtime

22 Alternative ApproachesTo implement this policy in a low-level language a programmeroften faces the challenges such as what state needs to be maintainedand updated in order to track the progress of the SIP protocol Theprogrammer also needs to track each userrsquos media trac based onthe correct SIP state and aggregate the average VoIP usage across allusers This makes the implementation of the policy highly coupledwith the low-level implementation of the SIP protocol While indi-vidual lter split and aggregation functions can be implementedin a low-level language composing them together to implementthe correct functionality is challenging Although there are toolstoday for VoIP monitoring (eg [17]) they lack extensibility to sup-port other applications and policies For example if a new policy

requires to monitor video usage instead of VoIP one has to useother tools

Recently proposed domain-specic languages and tools cannotaddress this problem either For example trac measurement toolstypically focus on per source-destination trac measurement andcannot be used to monitor an applicationrsquos usage SDN program-ming frameworks only oer basic ow statistics based on ow-levelcounters on switches and do not have the abstraction to addressthe above mentioned challenge Event-based intrusion detectionsystems (IDS) such as Bro [33] generate high-level events at thesession level and application level from network trac Howevertheir focus is primarily on intrusion detection and hence lack lan-guage primitives that make it natural to support a wide range ofquantitative monitoring capabilities For example we observe thatit is feasible to implement an analyzer that counts the number ofVoIP calls using Bro but signicantly harder to further identifythe packets corresponding to the call phase to calculate bandwidthutilization Bro also requires the user to be an expert programmerknowledgeable with state machine models

3 THE NETQRE LANGUAGEIn NetQRE a packet is modeled as a sequence of bytes and we useparsing functions to extract information from the packet Commonparsing functions include srcip which returns the source IP of thepacket srcport (source port) syn (SYN bit) data (bytes in payload)and time (the time stamp on receipt of the packet) The parsingfunctions can be customizable by the user for example to extractapplication-level headers

Values variables and functions in NetQRE are typed NetQREoers basic types such as int bool string as well as a set of domain-specic types such as IP (IP addresses) Port (TCP and UDP ports)packet (all packets) and action The action type consists of pre-dened functions that either generate alerts or send updates toswitches NetQRE also provides high-level types such as Conn (tu-ples of source IP-port and destination IP-port) which is used forTCP and UDP packets

NetQRE oers a convenient way to write stream functions toprocess packet streams A stream function takes as input a streamof packets and produces as output values (eg monitoring resultsto an application) actions (eg alerts to the controller) or packets(eg packets ltered from an application) A stream function can bespecied as below

sfun type func_name(type var) = exp

The stream function declaration includes the keyword sfun returnedtype of the stream function followed by the name of the functionAny other arguments are specied following the function nameThe body of each stream function (right-hand-side to rsquo=rsquo) consistsof expressions which are used to specify the functionalities of thestream function

Figure 2 shows a summary of the syntax of NetQRE expressionsExpressions in stream functions are based on the theoretical founda-tions of quantitative regular expressions (QRE) [9] a novel proposalthat integrates regular expressions with numerical computationsIn the rest of this section we describe the features of NetQRE ex-pressions in stages We rst introduce how to use an extension toregular expressions to detect patterns of the input stream Second

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

Predicate P = | [eld = value]| [eld = variable]| PampP | (P lsquo| |rsquo P ) | P

Regular Exp re = P | re re | re | re lsquo| |rsquo reConditional cond = exp exp | exp exp expAggregation aдд = aggop exp | type variable

aggop = sum | avg | max | minSplit split = split(exp exp aggop)Iteration iter = iter(exp aggop)Composition comp = exp gtgt expExpression exp = value | action | op exp

| exp op exp | re | cond| aдд | split | iter | comp

Figure 2 Syntax of NetQRE expressions

we discuss how to associate values and actions for the stream Fi-nally we describe a set of high-level operations that allow modularprogramming of stream functions

31 Pattern Matching over StreamsThe basic feature of NetQRE is to detect the patterns of the inputpacket stream As the basic building block NetQRE uses an exten-sion of regular expressions to which we refer as parameterizedsymbolic regular expressions (PSRE) for pattern matching over theinput stream Regular expressions (RE) are widely used for patternmatching over strings (sequences of bytes) Typically a RE usesa xed nite alphabet and the atoms in a RE are symbols in thealphabet In NetQRE a PSRE generalizes a RE in the two ways

First the atoms in a PSRE are predicates over packets instead ofsingle symbols which allows a PSRE to handle very large and po-tentially innite alphabet This property makes PSRE an appealingt for customized network ltering since the space of packets isoften very large and typically one is only interested in the valuesof some elds in a packet (eg the source and destination) insteadof the whole packet itself

Second PSRE allows the use of parameters to represent unknownvalues in the predicate With this generalization a PSRE can detecta variety of patterns using dierent instantiation of the parametersAs a result it allows the programmer to specify applications suchas counting the number of distinct IP addresses appeared in thestream which is hard if not impossible to be specied withoutparameters because no concrete predicates can be dened withoutknowing those IP addresses at runtimePredicate A basic packet predicate [f = v] checks whether thevalue in the eld f of a packet is v As an example the predicate[srcip=`1001] matches all packets whose source IP address is1001 As an example of the use of parameters [srcip=x] denesa function from the domain of x (ie IP addresses in this case) toa concrete predicate over packets We also use to denote thepredicate that matches all packets Predicates can be composedusing standard boolean combinations NetQRE also provides handymacros for widely used predicates For example is_tcp(c) is a short-hand for the predicate that matches a TCP packet in the connectionc

PSRE Like REs the basic operations for a PSRE are concatenationunion and Kleene star For example [syn=1][syn=0] matches astream of packets where the rst packet is a SYN packet followedby a sequence of non-SYN packets (including 0 non-SYN packets)Note that syntactically predicates are enclosed in square bracketsand a PSRE is enclosed in a pair of slashes in NetQRE and is theKleene star operator To illustrate the use of parameters considerthe example to count distinct source IP addresses in the streamThe core PSRE to implement this example is the function exist(x)

which checks whether a source IP x appeared in the stream Thefunction can be specied using the PSRE as below

sfun bool exist(IP x) = [ srcip=x]

We defer the discussion of the full expression for this example insect35

32 Conditional ExpressionsGiven the pattern matching capability for the input stream it isnatural to use conditional expressions to incorporate PSRE withother values and actions in order to assign costgenerate actions tothe stream

In NetQRE a conditional expression has the form exp exp1where exp is an expression that returns boolean values such asa PSRE dened above and exp1 is an expression in NetQRE Astream evaluated to true by exp will be applied to exp1 otherwisethis expression is not dened and returns undef

For example the expression 1 returns 1 if the stream matchesthe PSRE (ie the stream consists of a single packet) otherwiseit returns undef the expression size(last) returns the size ofthe last packet in the stream The keyword last denotes the lastreceived packet in the input stream As another example (countgtk)alert returns an action alert when the total number of packets in thestream is larger than k Here count is a stream function that countsthe number of packets in the stream and alert is a pre-denedaction to generate an alert event for example to a controller nodeThese actions are a basis for implementating quantitative networkpolicies using NetQRE

A conditional expression can also be specied as exp exp1 exp2which applies exp2 to the stream if exp is not satised To ensurethe consistency exp1 and exp2 should return the same type forthe stream As an example the expression [srcip=`1001]10

returns 1 if the stream contains a single packet with source IP1001 and it returns 0 if the stream contains multiple packets orthe only packet in the stream does not have the source IP

33 Stream SplitIn many network applications the input stream consists of multiplephases For example a VoIP call splits into three phases as shown inthe motivating example It is convenient to handle dierent phasesof the stream separately and combine the results of each phase ina modular way Following the operators dened in QuantitativeRegular Expressions [9] NetQRE uses the split operator to splitthe stream and compose two stream functions

A stream split expression has the form split(fgaggop) wheref and g are two expressions and aggop is an aggregation operatorsuch as sum avg (average) max and min

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

For an input stream ρ a split function splits ρ into two sub-streams ρ1 and ρ2 such that f is dened on the rst substream ρ1and g is dened on ρ2 f and g are applied respectively to the twosubstreams and the returned value is aggregated using aggop asthe return value of split(fgaggop) The split operation is a naturalquantitative generalization of the concatenation operation in regu-lar expression and Fig 3 illustrates how a split expression works

Figure 3 Illustration of split

Note that in order to return a unique value for a split expressionit is required that the splitting is unambiguous That is for allinput streams and all values of parameters in the expression thereis at most one way to split stream for f and g The property ofunambiguity can be checked eciently at compile time [9] Whenno unambiguous splitting is possible the split expression returnsundef for the input stream

As an example of split consider the example of counting thenumber of packets since the last SYN packet in the stream Naturallyone can split the stream into two substreams separated by thelast SYN packet and count the number of packets in the secondsubstream as shown in the following expression

split(any0 last_syncount sum)

Here any is the PSRE that matches any packet streams andlast_syn is the PSRE [syn=1][syn=0] that matches a stream thatstarts with a SYN packet followed by non-SYN packets Puttingtogether the split expression splits the input stream before thelast appearance of a SYN packet Note that the rst substream isassigned the value 0 while the second substream is applied to thecount function (dened later in sect34) to count the number of packets

34 Stream IterationA network application often requires to iterate over the input streamFor example counting the number of packets in the stream requiresto iterate over all single packets

Similar to the split expression NetQRE oers the iter opera-tor to iterate over the stream An iter expression is of the formiter(faggop) where f is an expression in NetQRE and aggop is anaggregation operator The iter expression splits the input streaminto multiple substreams such that f is dened on each substreamIt then iterates through all substreams and evaluates f on each sub-stream and nally aggregates the return value on each substreamusing the aggregation operator The iter operator is a quantitativegeneralization of the Kleene star operation and Figure 4 illustratesthis process Again the splitting is required to be unambiguous forall input streams to f

Figure 4 Illustration of iter

As an example the function count that counts the number ofpackets in the stream can be specied below

sfun int count = iter (1 sum)

This function splits the stream into single packets and the innerexpression returns 1 for each single packet As a result the wholeiter expression counts how many packets in the stream

35 Aggregation over ParametersAggregation is a common feature that many network monitoringapplications share For example computing the average ow sizeneeds to aggregate the average size across multiple ows NetQREoers high-level aggregation expressions to aggregate functionsover parameters

Aggregation expressions are of the form aggopf | T x wherex is a parameter with its type T and f is an expression where theparameter x appears in it Intuitively the aggregation expressiongoes through all possible values of x and evaluate f using thesubstituted value for x and nally aggregates all valid return valuesusing the aggregation operator aggop Revisiting the example ofcounting distinct source IP addresses in a stream we can use theexpression sumexist(x)10 | IP x

36 Stream CompositionTypically the input stream consists of packets from multiple sourcesand destinations Oftentimes the programmer only wants to handlea stream from a particular source for example NetQRE oers thestream composition operator gtgt that allows to preprocess the streambefore applying another stream function

Stream composition expressions are of the form f gtgt g where f

and g are two stream functions The stream composition allows theprocessing of a stream using the rst stream function f repeatedlyon every prex of the stream and the returned outputs yield a newstream that is then piped as the input stream to a second streamfunction g For example if the stream contains two packets (P1 P2)f is rst applied to P1 as a single-packet stream and then (P1 P2)The output of f on the two substreams is then piped to g

Stream composition is useful when the stream of packets needto be preprocessed in order to lter related packets for furtherprocessing For example counting the number of all packets ina TCP connection c can be specied using filter_tcp(c)gtgt countwhich rst lters all TCP packets in the connection c using thefunction filter_tcp and then counts the ltered stream using countThe function filter_tcp is dened as follows

sfun packet filter_tcp(Conn c) =

[ is_tcp(c)] last

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

Filter functions are convenient and used extensively in our usecases As a short-land NetQRE uses filter(p) for the lter functionthat lters packets satisfying the predicate p For example the abovelter function can be abbreviated as filter(is_tcp(c))

Stream composition can also be used to lter packets accordingto timestamps NetQRE builds in two lter functions based ontimestamps The recent(t) function lters the stream in the recent tseconds and the every(t) function periodically lters the stream inevery t seconds For example recent(5)gtgtcount counts the numberof packets in the recent 5 seconds Since the two build-in lterfunctions are not in the core of NetQRE we only allow the useof time-based ltering outside the high-level NetQRE operationsdescribed above

4 USE CASESWe provide use cases ranging from ow-level trac measurementsto complex examples involving application-level content analysisand dynamic updates Since the high-level constructs in NetQRElanguage is based on regular expressions we can only supportqueries for regular trac patterns Nevertheless we note that thiscovers a wide range of useful queries as we will show in this sectionMore examples can be found in our technical report

41 Flow-level Trac MeasurementsWe highlight two measurement tasks that have been proposedrecently as a means to do ow scheduling and attack detectionHeavy hitter Heavy hitters [19] are ows that consume band-width larger than a threshold T A key step to identify a heavy hitteris to count the size of a ow In this example we consider a ow as asource-destination pair A natural way to specify this functionalityis to rst lter all packets from a source x to a destination y andthen count the size of the ltered stream

sfun int hh(IP x IP y) =

filter(srcip=x dstip=y) gtgt count_size

hh rst lters packets based on the source and destinations IP andsecond the ltered stream is piped into count_size which countsthe size of all packets in the stream Due to space limits we do notshow the denition of count_size which is similar to that of count

Second to alert a heavy hitter in real time to the controller wecan use the following program

sfun action alert_hh =

(hh(lastsrcip lastdstip)gtT)

alert(lastsrcip lastdstip)

The function alert_hh checks for every newly received packet (ielast in the stream) that whether its ow reaches the threshold T andissues an alert correspondingly By further applying the time-basedltering every(5)gtgtalert_hh one can detect heavy hitters in every 5secondsSuper spreader A super spreader [40] is the host that contactsmore than k distinct destinations during a time interval Like theuse case of heavy hitter detection a key step is to count how manydistinct destinations a source x contacted The following NetQREprogram does this counting

sfun int ss(IP x) =

sum exist_pair(x y)10 | IP y

The function exist_pair(xy) checks whether the source-destinationpair (xy) appeared in the stream The ss function uses an aggre-gation function sum to aggregate all possible destinations and thusgives the total number of distinct destination addresses x contacted

Similar to heavy hitter detection we can use stream compositionto lter the input stream based on time as well as trac types Forbrevity of the paper we omit the discussion of these functionalities

42 TCP State MonitoringWe next showcase applications that rely on monitoring the stateswithin a TCP owSYN ood attacks In this example NetQRE is used to detect SYNood attacks by counting the number of incomplete TCP hand-shakes in a time interval We consider an incomplete TCP hand-shake as a packet trace consisting of a SYN packet and a SYNACKpacket with corresponding sequence number and acknowledgenumber but does not have a further ACK packet to complete thehandshake As the key step the following program counts thenumber of incomplete handshakes assuming that the input streamconsists of TCP packets between the same source and destination

sfun int bad_tcp_pat(int x int y) =

concat(syn(x) synack(yx+1) no_ack(y+1))

sfun int incomplete_handshake_num =

sumbad_tcp_pat(xy)1 | int x int y

The function bad_tcp_pat species the pattern of an incompletehandshake there is a SYN packet with a sequence number x in thestream followed by a SYNACK packet with acknowledge numberx+1 and sequence number y and then followed by packets that donot include an ACK with acknowledge number y+1 to complete thehandshake The sum aggregation in incomplete_handshake_num sums upthe number of appearances of such patterns for all x and y Usingthe stream composition of ltering functions based on time andpacket type the complete function can be specied as follows

sfun int syn_flood(Conn c) =

recent (5) gtgt

filter_tcp(c) gtgt

incomplete_handshake_num gtTblock(csrcip)

Completed ows Our next example counts the number of legiti-mate ows that are completed delineated by a SYN at the beginningand ending with a FIN Note that the last iter uses the regular ex-pression which matches a stream ending with a SYN and a FINpackets and no SYN-FIN pairs appear before Therefore the iter

expression splits the entire (ltered) stream into sessions whereeach session contains exactly one complete ow

sfun int count_flow(Conn c) =

filter_tcp(c) gtgt

filter(flag=SYN || flag=FIN) gtgt

iter ([fin =1][ syn =1][ syn =1][ fin =1]1 sum)

Slowloris attacks We describe a detection of the aforementionedSlowloris attack in NetQRE based on computing the average rateof packet transferring in all legitimate TCP connections shownbelow

sfun double conn_rate = iter(tcp_patternrate sum)sfun double avg_rate =

sumfilter_tcp(c) gtgt conn_rate | Conn c

sumcount_flow(c) | Conn c

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

First to compute the rate of a legitimate TCP connection we cansimply specify the pattern of a legitimate TCP connection usingRE and compute its transferring rate (the rate function can bespecied easily and thus not shown here) Then we can iterateall TCP connections given a 5-tuple and compute the aggregatedtransferring rate as shown in the function conn_rate

Second we use sum operation to aggregate transferring rate acrosstrac on all 5-tuples Dividing the aggregated rate by the numberof ows computed using the function dened above gives us theaverage transferring rate over all connections as shown in avg_rate

43 Application-level MonitoringOur nal example revisits the Voice-over-IP (VoIP) usage usecasein sect2

Let us rst focus on the function to monitor the usage of a VoIPcall based on a SIP connection sip_conn and media connection m_connAs introduced in sect2 a VoIP call consists of three phases and amodular programming way is to handle the three phases usingthree stream functions and then compose them using the split

operator The function is shown belowsfun int usage_per_call(Conn sip_conn Conn m_conn

string user string id) =

split(init(iduser m_conn)0call(m_conn)count_size

end(id)0 sum)

Here init call and end are simply the PSREs that capture the pat-terns of each phase Note that the init and the end phase return avalue 0 and only the usage of the call phase is counted as required

Next a programmer can view the trac using the SIP connectionsip_conn and the media connection m_conn as a sequence of callsUsing the iter operator the programmer can easily aggregate theusage of all calls in the trac as shown below

sfun int usage_per_conn(Conn sip_conn Conn m_conn

string user string id) =

filter_sip(sip_conn m_conn user id) gtgt

iter(split(any0 usage_per_call(sip_conn m_conn

user id) sum)sum)

Note that we rst lter all trac that belongs to sip_conn or m_conn inorder to get the stream of interest This ltering may result in non-VoIP trac between two VoIP calls and thus we simply compose thePSRE any with usage_per_call inside the iter expression to handleany trac between two calls

Now we can aggregate the usage for each user across multipleconnections and calls using the aggregation operation and furthercompute the average usage over all users as shown below

sfun int usage_per_user(string user) =

sumusage_per_conn(sip_conn m_conn user id) | Connsip_conn Conn m_conn string id

sfun int average_usage =

avgusage_per_user(u) | string u

Suppose we need to implement a quantitative network policythat sends an alert to the controller if a userrsquos usage is larger thanve times of average usage we can simply specify the policy asbelow

sfun action alert_high_usage(string user) =

(usage_per_user(user) gt 5 average_usage) alert(user

)

5 COMPILATION ALGORITHMWe next describe how to compile a NetQRE program into an e-cient executable that can run with low memory footprint Typicallythe stream of packets is very large thus it is not feasible to store theentire stream of packets Therefore the compiled program shouldevaluate incrementally on each single incoming packet withoutstoring the history of the stream To achieve this goal we have toaddress the following challenges First the compiled program needsto maintain parameters in NetQRE in a succinct way Second inorder to implement the semantics of split and iter these operationshave to be performed in a single online pass over streaming packetsLastly the compiled program needs to handle aggregation expres-sions over parameters In the following subsections we describethe compilation of selected NetQRE expressions

51 Compilation of PSREFirst we describe the algorithm for compiling a PSRE as the basicbuilding block for stream functions Similar to traditional RE aPSRE can be translated to an equivalent nite state machine whichwe refer to as parameterized symbolic automaton (PSA) For exampleconsider the following PSRE [srcip=x] Intuitively this PSREchecks whether there exists a packet with source IP x in the streamIts corresponding PSA is shown in Fig 5

Figure 5 An Example PSA

However PSA cannot be directly updated due to its uninstan-tiated parameters To illustrate this challenge consider a naivesolution which is to instantiate the parameters using all possiblevalues and maintain all the instantiated state machine namelysymbolic automata (SA) for each instantiation of the parametersWhen reading an input packet we update all the instantiated SA inthe standard way However it is not hard to note that this approachis not feasible due to the large space of parameters As an examplethe PSA in Fig 5 with a single IP parameter has up to 232 instan-tiated symbolic automata Whereas at runtime the function onlyneeds to store the source addresses that appeared in the streamwhich can be signicantly less than 232

To address this challenge we propose a new algorithm thatupdates PSA on-demand As the high-level idea suppose we instan-tiated a PSA to a set of SA using all possible instantiations Noticethat even though there is a large space of instantiated SA howevermany of them will keep the same states at runtime given a streamof packets As an example consider feeding a rst packet with thesource IP 10001 to all SA instantiated from the PSA in Figure 5 Itis easy to see that the only SA that will transition to state q1 is theone where x is instantiated by 10001 and all other SA will stay atq0

Therefore the compiled program at runtime maintains guardedstates for the PSA where a guarded state is pair (s ϕ) Here ϕ is

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

a predicate of the parameters in the PSA (which we will refer asa guard) and s is a state in PSA The pair (s ϕ) means that whenreading the input stream all instantiated SA will be at state s if theyare instantiated by the values satisfying ϕ For example initiallythere is only one guarded state maintained for the PSA in Figure 5which is (q0 true) meaning that all instantiated SA is at the initialstate q0 of the PSA

To update the PSA the key step is to update each guarded statedynamically when reading packets from the stream The updatingalgorithm for a guarded state is shown in Algorithm 1 This algo-

Algorithm 1 update_psa((s ϕ) packet)

1 for all transitions originated from s do2 let t be the destinate state and P be the parameterized predi-

cate3 let T be the guard instantiated from P using the packet4 let ϕ prime = ϕ andT5 emit (t ϕ prime) if ϕ false6 end for

rithm iterates through all transitions from the state s and for thepredicate (with parameters) P on the transition it instantiates Pusing the current packet it receives in the stream This step simplyreplaces the function name in P by the return value of it on thepacket The obtained predicateT is a predicate over the parametersin the PSA Finally the algorithm emits a new pair (t ϕ prime) if ϕ prime is notfalse Intuitively this step accounts the fact that the instantiatedSA under ϕ prime will transition to the next state t

Using Algorithm 1 the overall updating algorithm is straight-forward Initially the compiled program only contains a guardedstate (q0 true) where q0 is the initial state of the PSA Every timereading a new packet from the stream we call Algorithm 1 on everyguarded state and the new guarded states consists of all the onesemitted from Algorithm 1 To check the state given concrete valuesfor parameters we simply go through all maintained guarded statesand return the state if the guard is satisedExample Using the example in Figure 5 we illustrate the execu-tion of Algorithm 1 given the pair (q0 true) and the packet withsource IP 10001 Consider the transition from q0 to q1 By sub-stituting srcip in the predicate with the source IP of the packetwe get the guard T which is x=10001 Since ϕ is true now ϕ prime issimply x=10001 At the end the algorithm will emit the pair (q1x=10001) Similarly for the other transition the algorithm emit thepair (q0 x=10001) Therefore there are two new guarded statesnamely (q1 x=10001) and (q0 x=10001) The updated guardedstates account for the fact that 10001 has appeared and thus thecorresponding instantiated SA transitioned to state q1

52 Compilation of split

We next discuss how to compile a split expression which is of theform split(f g aggop) First we highlight the key insight from [9]to compile split without parameters and then describe how togeneralize this idea to compile split with parametersWithout parameters The challenge of compiling split is to splitthe stream dynamically at runtime without revisiting the entirehistory of the stream For example in the split example in sect33 we

need to split the stream at the last appearance of the SYN packet Anatural way to implement the semantics of split is to maintain allcases to split the stream For example Fig 6 shows two cases to splitthe stream consisting of a non-SYN and a SYN packet at runtimefor the example in sect33 Moreover the following two observationsensure that the compiled code only uses a constant space First eachcase can be represented using a triple (sf sд F ) where F is a agindicating whether the stream is split and sf (sд resp) is the stateof f (д resp) on evaluating the prex(sux resp) of the streamSecond the number of maintained (distinct) cases is bounded by aconstant only related to д at any time point

Figure 6 Example run of split

With parameters Similar to PSRE compilation in the generalscenario we need to maintain guarded cases That is we maintaina set of (T ϕ) pairs where T is a triple (sf sд F ) correspondingto a case as dened above and ϕ is a guard It means that if theparameters are instantiated by values satisfying ϕ there is a caseof splitting the stream which can be represented as T

Algorithm 2 update_split((T ϕ) packet)

1 suppose T = (sf sд F )2 if F is false then3 for all emitted (s primef ϕ prime) from update_f((sf ϕ) packet) do4 if f is dened on s primef then

5 emit ((s primef s0д true) ϕ prime) s0д is the initial state of g 6 end if7 emit ((s primef sд false) ϕ prime)8 end for9 else

10 for all emitted (s primeд ϕ prime) from update_g((sд ϕ) packet) do11 emit ((sf s primeд true) ϕ prime)12 end for13 end if

Algorithm 2 shows the algorithm to update a pair (T ϕ) Thealgorithm feeds the input packet to f or g corresponding to whetherthe stream has been split as indicated by F For example whenF is false namely the stream has not been split yet in this casethe algorithm need to update f on the packet (line 2-9) In thiscase for each emitted guarded state (s primef ϕ prime) from the update of fa corresponding guarded case is emitted (line 7) Moreover if f isdened on s primef we may split the stream at the current position andthus emit the guarded case ((s primef s0д true) ϕ prime) (line 4-6) Here s0д isthe initial state of g and the ag F is set to true to indicate thatthe stream is split For the else branch from line 10 to 12 in thealgorithm we need to update (sд ϕ) using grsquos updating procedurewhich may emit a set of guarded states of g Correspondingly theguard needs to be rened

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

Given concrete values for parameters evaluating a split expres-sion is as follows First we check if there exist a guarded case ((sf sд F )ϕ) such that ϕ is satised and both f and g are dened on thestates in the case Then we evaluate the two expressions based ontheir states in the case and nally aggregate the evaluated valuesusing aggop If no such guarded cases exist the split expression isnot dened

53 Compilation of iter

We rst review the key ideas of evaluating an iter expression whenthere are no parameters [9] and then describe how to handle pa-rameters

Consider an iter expression iter(f aggop) without any param-eters Similar to split expressions the iter expression needs tomaintain all possible cases of splitting the input stream Thoughthe number of such cases is only determined by f (thus not relevantto the input stream) how to succinctly represent each case is achallenge Specically unlike a split expression which only splitsthe stream into a prex and a sux an iter expression needs tosplit the stream into multiple substreams such that each one canbe applied to the expression f Therefore naively maintaining allthe states of f for each substream may use a large space as thestream grows Fortunately our choice of the aggregation operatorsallows us to summarizes the application of f to all substreams butthe last one using the aggregated return value For example if aggopis sum we only need to remember the aggregated sum on thesesubstreams Thus we can represent a case as a pair (v sf ) wherev is the aggregated value and sf is the state of f on evaluating thelast substream

In the general scenario with parameters we simply maintain aguarded case as ((v sf ) ϕ) The update of a guarded case ((v sf )ϕ)is shown in Algorithm 3 The evaluation of f given guarded casesis similar to that of a split expression

Algorithm 3 update_iter((T ϕ) packet)

1 suppose T = (v sf )

2 for all emitted (s primef ϕ prime) from update_f((sf ϕ) packet) do3 if f is dened on s primef then4 let v prime be the evaluated value of f on s primef5 emit ((v primeprime s0f ) ϕ prime) where v primeprime=aggop (vv prime) and s0f is the

initial state of f

6 end if7 emit ((v s primef ) ϕ prime)8 end for

54 Compilation of AggregationNow we describe the aggregation function aggopf | T x Followingthe denition of the aggregation function an aggregation functionmaintains guarded states (sf ϕ) To update the aggregation expres-sion we just need to update each guarded state using frsquos updatingprocedure To illustrate the evaluation procedure suppose the ag-gregation operator is sum Given values for the parameters we needto iterate through all guarded states (sf ϕ) If ϕ is satised we eval-uate f on sf say the return value is v and count the number of

values of x which satises ϕ say it is k we sum up v lowast k into theaggregated sum For other aggregation operators the evaluationprocess works similarly

55 Compilation of Stream CompositionLastly we discuss stream composition Recall that stream composi-tion allows to process the input stream using a function f and thenapply the second function g on the processed stream Following itsdenition each time a new packet arrives we need to rst processit with f and the returned value (eg a ltered packet) is then fedinto g

Therefore the stream composition expression fgtgtg needs to main-tain guarded states ((sf sд )ϕ) Algorithm 4 shows the algorithm toupdate a guarded state following the intuition described above Theevaluation of the expression is similar to that of previous discussedexpressions

Algorithm 4 update_comp((S ϕ) packet)

1 suppose S = (sf sд )2 for all emitted (s primef ϕ prime) from update_f((sf ϕ)packet) do3 if f is dened on s primef then4 let ret be the returned packet of f on s primef5 emit ((s primef s primeд ) ϕ primeprime) for all emitted (s primeд ϕ primeprime) from

update_g((sд ϕ prime) ret)6 else7 emit ((s primef sд ) ϕ prime)8 end if9 end for

6 IMPLEMENTATIONWe have implemented a prototype of the NetQRE system in C++which consists of two main components namely the compiler forNetQRE and the NetQRE runtimeNetQRE Compiler The NetQRE compiler implements the com-pilation algorithm described in sect5 The compiler rst generates aC++ program from an input NetQRE program which is then com-piled by the gcc compiler into executable Our compiled programuses a tree data structure to represent predicates over parametersThe choice of tree is driven by its simplicity ease of encoding andlookup performance

In addition to the basic compilation algorithm we include ad-ditional optimizations to the compiler First we use the standardminimization algorithm to minimize the state machine for a regularexpression Second for aggregation expressions with sum and avgwe use an incremental updating algorithm to update the expres-sion we maintain the running sum as the state of the aggregationexpression and we update the sum incrementally when the stateof the inner expression is updatedNetQRE Parallelization The NetQRE compiler can optionallycompile the NetQRE specication into parallelized implementationsbased on the instantiation of parameters in the NetQRE specica-tion For the example described in sect 51 a packet with source IPaddress s can be processed by the hash(s )-th thread which handlesthat instantiation of the parameter x

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

NetQRERuntime The NetQRE runtime includes a packet captureagent implemented using the pcap library [28] Each packet thatarrives at the runtime is parsed and then processed by the compiledNetQRE program as shown in Figure 1 Currently our runtimesupports actions that include sending alert events to a controllerand directly installing a rule on a switch The NetQRE runtime isnot specically optimized for fast packet capture and processingand as future work we plan to explore the use of DPDK [24] Ourcurrent implementation analyzes every packet though orthogonalsampling techniques can be also explored in future

7 EVALUATIONWe evaluate our NetQRE prototype centered around answeringthree key questions First can the NetQRE language express a widerange of quantitative network monitoring applications in a conciseand intuitive manner Second is the generated code ecient interms of throughput and memory footprint Third can NetQRE beused in a real-time monitoring setting that provides rapid mitigationbased on output of quantitative network monitoring applicationsUnless otherwise specied all our experiments are carried out ona cluster of commodity servers each of which has 10-core 26GHzXeon E5-2660V3 Each core has a 256KB L2 cache and shared 25MB L3 cache The memory size is 64 GB The OS is Ubuntu 1404and the kernel version is 420

71 ExpressivenessWe have implemented a set of quantitative network monitoring ap-plications using NetQRE The examples are drawn from a literaturesurvey on quantitative network monitoring applications [13 18 3540 41] Table 1 summarizes the example applications in NetQREthat we use as well as and the lines of code of each NetQRE pro-gram

LoCHeavy Hitter (sect41) 6Super Spreader (sect41) 2Entropy Estimation [40] 6Flow size dist [18] 8Trac change detection [35] 10Count trac [40] 2Completed ows (sect42) 6SYN ood detection (sect42) 9Slowloris detection (sect42) 12Lifetime of connection 8Newly opened connection recently 11 duplicated ACKs 5 VoIP call 7VoIP usage (sect43) 18Key word counting in emails 11DNS tunnel detection [12] 4DNS amplication [20] 4

Table 1 Examplemonitoring applicationsNetQRE supports

We make the following observations First NetQRE is able toexpress a large variety of quantitative network monitoring applica-tions ranging from ow-level trac measurement to application-level quantitative monitoring We validate the correctness of ourapplications by running them and comparing their output withhand-crafted implementations Second programming in NetQRE isconcise All example programs can be specied within 18 lines ofcode in NetQRE This count includes commonly used lter functionsas well as regular expressions which can be built into a library forreference As an interesting comparison we encoded the VoIP calluse case in Bro [33] a well-known intrusion detection system Brorequired 51 lines of code as compared to only 7 in NetQRE How-ever Bro is unable to easily support the VoIP usage case written inNetQRE In particular Bro cannot use patterns to separate packetsinto dierent VoIP sessions for the purposes of counting bandwidthconsumption per session This is not surprising as Brorsquos primaryuse is that of an intrusion detection system We validated the cor-rectness of our implementation on actual SIP trac by comparingBrorsquos output against NetQRErsquos

72 PerformanceOur next set of experiments evaluate the performance of NetQRErsquoscompiled implementations NetQRE runs on a single machine basedon the setup described earlier We use a single core to measureNetQRErsquos performance As a point of comparison we compare eachNetQRE program against an equivalent carefully hand-crafted C++implementation We note that all our hand-crafted implementationsrequire at least 100 lines of code and often times require users toexplicit manage internal state across packets ndash a programming taskthat NetQRE abstracts away

Fig 7 shows the performance of NetQRE implementations (NetQREin the gure) compared with manually optimized implementations(Baseline in the gure) that we spent days optimizing Our examplesinclude heavy hitter super spreader entropy SYN ood detectioncompleted ows count and Slowloris use cases presented in sect4 Asworkload we use a one-minute CAIDA trac trace [7] capturedon a high-speed Internet backbone link which contains 37 millionpackets (ie about 620k packetssec) We replay the trac traceto the implementations Since the trac trace does not containpayloads we report the throughput in million packets per second(MPPS)

We make the following observations First the NetQRE imple-mentations are able to achieve line-rate processing speed withoutbeing the bottleneck In particular for three out of six applicationsthe throughput of NetQRE implementations is about 20MPPS usinga single thread The average packet size of our input trac traceis 888B For a 10Gbps network 14M packets of size 888B can beforwarded per second which is well below the throughput achievedby NetQRE The throughput can be further improved by paralleliza-tion (presented later in this section) Second the compiled NetQREimplementation incurs negligible overhead compared with the man-ually optimized low-level implementation The dierence betweenthe throughput of the compiled NetQRE implementation and thatof the manual implementation is within 9 across all use cases westudied Third NetQRE incurs slightly increase (up to 60) in terms

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

05

10152025

Thro

ughp

ut (M

PPS)

Baseline NetQRE OpenSketch

(a) Throughput

1

10

100

1000

Mem

ory

(MB

) Baseline NetQRE OpenSketch

(b) Memory

Figure 7 Performance comparison

0

10

20

30

40

0 2 4 6 8

Thro

ughp

ut (M

PPS)

Number13 of13 Threads

super spreadersyn floodslowloris

Figure 8 Throughput with paral-lelization

of memory footprint (Fig 7b) compared with the manually opti-mized implementation However the overall memory footprint ofNetQRE remains low We have also run the experiments on anotherCAIDA trac [2] and the comparison result is similarComparison with OpenSketch In addition to comparisons withour manually optimized code we also compare with OpenSketch [40]a popular trac measurement tool based on sketching techniquesGiven OpenSketchrsquos limitation in handling stateful monitoringtasks we only compare the performance of NetQRE implementa-tions on the heavy hitter and super spreader examples with thatof OpenSketch (in software) We use the same CAIDA trac traceas before and follow the default setting of the reference code ofOpenSketch [4] for the two examples The NetQRE compiled imple-mentations signicantly outperform OpenSketch implementationsThe throughput of NetQRE compiled code is 11times and 18times as thatof OpenSketch (Fig 7a) As OpenSketch focuses on optimizingmemory usage it is not surprising that OpenSketch uses less mem-ory than NetQRE implementations Nevertheless we observe thatNetQRE uses only 11 more memory on the super spreader examplethan that of OpenSketch (Fig 7b) Note that we use an open-sourceversion of OpenSketch software that runs on a similar x86 machineas NetQRE in order to have an ldquoapples-to-applesrdquo comparison onsimilar hardware NetQRE may also exploit similar optimizationsperformed by OpenSketch in order to get higher performance ona NetFPGA Exploiting a hardware platform such as NetFPGA inorder to further improve NetQRE performance is an interestingavenue for future workComparison with Bro We further compare with Bro Given thelimitations of Bro in handling the complete VoIP use cases we sim-plify the use case to simply counting the number of VoIP calls (asopposed to bandwidth consumption) per user NetQRE compiledimplementation can nish counting within 1 second while Brotakes about 23 seconds for a SIP trac trace containing 4338 VoIPcalls Both programs output the same results for this use case Webelieve there are at least two reasons for Brorsquos slower performanceFirst Bro is designed for intrusion detection and not aimed atand optimized for monitoring tasks Second unlike NetQRE whichcompiles high-level code to ecient low-level code Bro uses aninterpreter to execute the script written in Brorsquos language whichcould result in considerable overheadParallelization speedup NetQRE compiler automatically com-piles parallelized implementations as described in sect6 In the con-sidered examples parallelization involves sending ows of packets

to dierent NetQRE instances running on dierent threads Fig 8plots the throughput of the NetQRE implementations with dierentnumbers of threads for the super spreader syn ood and slowlorisdetection examples In all examples the NetQRE parallelized codeis able to achieve at least 39times speedup with 8 threads Note thatlinear speedup is not achieved as the trac (in our nite trace) isnot uniformly distributed across all 8 threads This is an artifactof the trace data itself When we include the overhead of the soft-ware load balancer that dispatches packet ows to dierent threadsthe NetQRE implementations achieve at least 26times speedup for allexamples This overhead can be mitigated with a hardware load-balancer Overall we observe that NetQRErsquos throughput is morethan sucient to handle line-rate trac and the NetQRE runtimeitself is not the bottleneck

73 End-to-end ValidationIn the nal set of experiments we validate the end-to-end use ofNetQRE for enforcing quantitative network policies which appliesactions to quantitative network monitoring programs We set upa network of two clients C1 and C2 one server S and one SDNswitch that mirrors trac to a NetQRE runtime running the SYNood detector presented in sect42 In addition a NetQRE controllerbased on POX [29] controls the switch The network is emulatedusing Mininet [25] with link bandwidth set to 100Mbps

In the experimentC1 sends normal trac to S using iperf at rate1Mbps and at the 7th second C2 starts SYN ood attack generatedusing our generator to the same server The attack is detected byNetQRE which generates an alert event to the controller whichsubsequently installs a forwarding rule on the switch to block tracfrom C2 To illustrate the attack detection and blocking Figure 9a(left gure) shows the bandwidth utilization (Mbps) on the serverWe observe that NetQRE successfully blocks C2rsquos attacking tracin real-time

As a second experiment using a similar setup our NetQRE heavyhitter program (sect41) running at a switch-side NetQRE runtimemonitors the trac over a sliding window of 5 seconds and issuesan alert to the controller when detecting a heavy hitter whichfurther blocks the trac As a point of comparison we compareour approach against two alternatives (1) sending all packets to thecontroller which runs an equivalent heavy hitter detection program(2) monitoring the ow counters on the switch to detect heavyhitters as considered in other languages [30] and systems [31] Inthe second alternative the controller reads the counter every 1

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

0 2 4 6 8 10 12 14 16Time (sec)

0

1

2

3

4

5

6B

andw

idth

(Mbp

s)NetQRE

(a) SYN ood

0 2 4 6 8 10 12 14 16Time (sec)

0

2

4

6

8

10

12

14

Ban

dwid

th (M

bps)

NetQRE

stats

forward

(b) Heavy hitter

0 20 40 60 80 100Time (sec)

0

1

2

3

4

5

6

7

Ban

dwid

th (M

bps)

NetQRE

(c) VoIP

Figure 9 Bandwidth utilization (Mbps) at the server

second and compute the bandwidth usage for each ow over asimilar 5 seconds window

Figure 9b (middle gure) plots the bandwidth utilization of theserver The above alternative solutions are labeled as forward andstats respectively Compared with these two approaches our pro-posed approach can detect the heavy hitter and further respond toit in a more timely fashion Moreover compared with the forwardapproach we send signicantly fewer trac to the controller andis a more scalable solution

In our nal experiment we validate the VoIP use case rst pre-sented in our introduction We use a synthetic VoIP trac tracegenerated using SIPp [5] replay a VoIP call (with accompanyingH264 MPEG video) made from a single user via client C2 We re-play the trac at 5Mbps and run iperf on C1 as the backgroundtrac Our NetQRE program enforces a policy that blocks a callafter the user making the call exceeds a SIP bandwidth usage of1875MB (around 30 seconds) We congure the NetQRE programto send an alert to the NetQRE controller which then blocks thetrac Figure 9c (right gure) plots the bandwidth utilization of theserver Again this veries that NetQRE can be used to intercept aSIP session monitor usage on a per-user basis and react to highusage in a timely fashion

74 Summary of EvaluationRevisiting the questions at the beginning of our evaluation weobserve that (1) NetQRE can express a wide range of quantita-tive monitoring applications with a few lines of code (2) achievesline-rate throughput performance which is comparable to that ofcarefully hand-crafted optimized low-level code while signicantlyoutperforms other measurement and IDS tools and (3) can be usedin an SDN setting to monitor network trac and update switchesin real-time in response to specied NetQRE policies

8 DISCUSSIONIn this section we discuss some of NetQRErsquos limitations and alsosome future work

Expressiveness As a language focusing on monitoring policyNetQRE cannot specify general network policies such as servicechaining policies and trac engineering policies As an interesting

future work it might be possible to extend NetQRErsquos actions insupport of more complex policies

Given NetQRErsquos basis on regular expressions NetQRE is notsuited for monitoring tasks that can not be specied using regulartrac patterns Though this is a fundamental limitation of NetQRErsquosexpressiveness we believe that the high-level constructs in NetQREprovides a natural programming abstraction to write tasks withhierarchical regular structures and NetQRE can specify a widerange of monitoring tasks as shown in sect4

Handling packet lossreorderingretransmission For the perfor-mance consideration NetQRE does not buer packets in the streambut processes packets in a streaming fashion Thus if packets inthe stream are lost reordered or retransmitted the query speciedin NetQRE might not be evaluated correctly since the internal statemay transition to a wrong one Current design of NetQRE relies onthe runtime to handle packet loss reordering and retransmission

Network-widemonitoring For the similar reasons discussed abovecurrent deployment model of NetQRE is based on a centralize fash-ion in order to receive all (ordered) packet in the stream of interestFor example a ow counting task has to be deployed at a placewhere all packets in a ow can traverse This is not a fundamentallimitation as one can easily deploy NetQRE in a distributed fashionprovided that the query is insensitive to packet reordering in thestream We leave it as future work to study the decomposition of acentralized NetQRE program into distributed queries

9 RELATEDWORKQuantitative regular expressions NetQRE is inspired by quan-titative regular expressions (QRE) [9] which provides a theoreticalfoundation for regular programming of data streams NetQRE ex-tends QRE in the following ways First NetQRE introduces param-eters in support of queries handling unknown values (eg countingdistinct sources) Second NetQRE proposes aggregation operatorsto compute aggregated values across parameters It is shown insect4 that both extensions are essential to specify a wide range ofinteresting network monitoring policies

StreamQRE [27] is another recently proposed extension to QRENetQRE is dierent from StreamQRE in several aspects First NetQREis specialized to networking applications while StreamQRE focuseson database application Second StreamQRE allows relations as

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

types and supports relational operations such as join and key-basedpartitioning These operations cannot be evaluated eciently in astreaming fashion in general NetQRE instead uses parameters andaggregation over parameters with a focus on ecient evaluationIntrusion detection systems and protocol analyzers Thereare a number of key dierences between NetQRE and intrusiondetection systems (IDS) and protocol analyzers First systems suchas Snort [34] allow regular expression matches on single packetsbut is not perform regular expression matches that span multiplepackets let alone track sessions across packets While Bro [33] sup-ports aspects of NetQRE the scripting language requires complexprocedural programming where one has to maintain states anddata structures as opposed to NetQRErsquos declarative programmingapproach with high-level constructs over packet streams As anIDS Bro does not lend itself naturally to handle some quantitativemonitoring use cases eg the VoIP usage example While HILTI [37]uses Bro and oers abstract machine models for trac analysis itrequires the operator to implement low-level state machinesDomain specic languages in networking There are severalrecent proposal on domain specic languages on networks whichincludes Frenetic [21] Pyretic [30] NetKAT [10] Flowlog [32]Maple [38] Network Datalog [26] To the best of our knowledgenone of these systems support monitoring queries beyond basicow-level counters provided by OpenFlow

Recently proposed SNAP [11] and Sonata [23] oer abstrac-tions for trac monitoring over packet streams but in a morelow-level procedural or SQL-like way NetQRE oers a complemen-tary abstraction on quantitative queries It may be interesting toincorporate NetQRE with these languages in support for complexquantitative queriesStreaming database languagesDatabase-style query languages [1416] provide SQL-like language support for running continuousqueries over data streams While there are constructs to do aggrega-tion over sliding windows they are designed with simple relationalqueries over packet headers or packet counts and cannot handlethe complex queries based on trac patterns that NetQRE supports

10 CONCLUSIONThis paper presents NetQRE a high-level declarative language andpractical toolkit for quantitative network monitoring NetQRE of-fers an intuitive programming model using regular expressionsover streams of packets and can naturally express ow-level aswell as application-level quantitative policies We developed a com-pilation algorithm that can compile NetQRE programs into ecientimperative programs Our evaluation results demonstrate the ex-pressiveness of NetQRE in support a wide range of quantitativenetwork monitoring applications its ability to match carefully opti-mized manually written low level code and outperform equivalentimplements in Bro and OpenSketch A proof-of-concept scenariowith an SDN controller and switches demonstrate NetQRErsquos abilityto support timely and ecient enforcement of quantitative networkpolicies

ACKNOWLEDGEMENTWe would like to thank our shepherd Arjun Guha for his helpfulcomments We also want to thank the anonymous reviewers for

their insightful feedbacks on this work This research was partiallysupported by NSF Expeditions in Computing award CCF 1138996NSF CNS-1513679 and NSF CNS-1218066

REFERENCES[1] Application Layer Packet Classier for Linux httpwwwmcafeecomus

productsnetwork-security-platformaspx[2] CAIDA Trac Trace httpsdatacaidaorgdatasets securityddos-20070804 [3] McAfee Network Security Platform http l7-ltersourceforgenet [4] OpenSketch reference code httpsgithubcomUSC-NSLopensketch[5] SIPp http sippsourceforgenet [6] SSL renegotiation DoS httpswwwietforgmail-archivewebtlscurrent

msg07553html[7] Anonymized 2015 Internet Traces httpsdatacaidaorgdatasetspassive-2015

2015[8] Mohammad Al-Fares Sivasankar Radhakrishnan Barath Raghavan Nelson

Huang and Amin Vahdat Hedera Dynamic Flow Scheduling for Data CenterNetworks In NSDI volume 10 pages 19ndash19 2010

[9] Rajeev Alur Dana Fisman and Mukund Raghothaman Regular Programmingfor Quantitative Properties of Data Streams In 25th European Symposium onProgramming ESOP 2016

[10] Carolyn Jane Anderson Nate Foster Arjun Guha Jean-Baptiste Jeannin DexterKozen Cole Schlesinger and David Walker NetKAT Semantic foundations fornetworks In Proceedings of the 41st annual ACM SIGPLAN-SIGACT symposiumon Principles of programming languages pages 113ndash126 ACM 2014

[11] Mina Tahmasbi Arashloo Yaron Koral Michael Greenberg Jennifer Rexford andDavid Walker Snap Stateful network-wide abstractions for packet processingIn Proceedings of the 2016 ACM SIGCOMM Conference SIGCOMM rsquo16 pages29ndash43 New York NY USA 2016 ACM

[12] Kevin Borders Jonathan Springer and Matthew Burnside Chimera A declarativelanguage for streaming network trac analysis In Proceedings of the 21st USENIXConference on Security Symposium Securityrsquo12 pages 19ndash19 Berkeley CA USA2012 USENIX Association

[13] Varun Chandola Arindam Banerjee and Vipin Kumar Anomaly Detection ASurvey ACM computing surveys (CSUR) 41(3)15 2009

[14] Sirish Chandrasekaran Owen Cooper Amol Deshpande Michael J FranklinJoseph M Hellerstein Wei Hong Sailesh Krishnamurthy Samuel R MaddenFred Reiss and Mehul A Shah TelegraphCQ Continuous Dataow ProcessingIn Proceedings of the 2003 ACM SIGMOD International Conference on Managementof Data SIGMOD rsquo03 pages 668ndash668 New York NY USA 2003 ACM

[15] Benoit Claise Cisco systems NetFlow services export version 9 2004[16] Chuck Cranor Theodore Johnson Oliver Spataschek and Vladislav Shkapenyuk

Gigascope A Stream Database for Network Applications In Proceedings of the2003 ACM SIGMOD International Conference on Management of Data SIGMODrsquo03 pages 647ndash651 New York NY USA 2003 ACM

[17] Luca Deri Open source VoIP trac monitoring In Proceedings of SANE volume2006 2006

[18] Nick Dueld Carsten Lund and Mikkel Thorup Estimating Flow Distributionsfrom Sampled Flow Statistics In Proceedings of the 2003 conference on Applicationstechnologies architectures and protocols for computer communications pages 325ndash336 ACM 2003

[19] Cristian Estan and George Varghese New Directions in Trac Measurement andAccounting volume 32 ACM 2002

[20] Seyed K Fayaz Yoshiaki Tobioka Vyas Sekar and Michael Bailey BohateiFlexible and Elastic DDoS Defense In 24th USENIX Security Symposium (USENIXSecurity 15) pages 817ndash832 Washington DC August 2015 USENIX Association

[21] Nate Foster Rob Harrison Michael J Freedman Christopher Monsanto JenniferRexford Alec Story and David Walker Frenetic A Network ProgrammingLanguage In ACM SIGPLAN Notices volume 46 pages 279ndash291 ACM 2011

[22] Pedro Garcia-Teodoro J Diaz-Verdejo Gabriel Maciaacute-Fernaacutendez and EnriqueVaacutezquez Anomaly-based Network Intrusion Detection Techniques Systemsand Challenges computers amp security 28(1)18ndash28 2009

[23] Arpit Gupta Ruumldiger Birkner Marco Canini Nick Feamster Chris Mac-Stokerand Walter Willinger Network Monitoring As a Streaming Analytics ProblemIn Proceedings of the 15th ACM Workshop on Hot Topics in Networks HotNets rsquo16pages 106ndash112 ACM 2016

[24] DPDK Intel Data Plane Development Kit httpdpdkorg[25] Bob Lantz Brandon Heller and Nick McKeown A Network in a Laptop Rapid

Prototyping for Software-dened Networks In Proceedings of the 9th ACMSIGCOMM Workshop on Hot Topics in Networks Hotnets-IX pages 191ndash196ACM 2010

[26] Boon Thau Loo Tyson Condie Minos Garofalakis David E Gay Joseph MHellerstein Petros Maniatis Raghu Ramakrishnan Timothy Roscoe and IonStoica Declarative Networking CACM 2009

[27] Konstantinos Mamouras Mukund Raghotaman Rajeev Alur Zachary G Ivesand Sanjeev Khanna StreamQRE Modular Specication and Ecient Evaluation

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

of Quantitative Queries over Streaming Data In Proceedings of the 38th ACMSIGPLANConference on Programming Language Design and Implementation ACM2017

[28] Steve McCanne Craig Leres and Van Jacobson Libpcap httpwwwtcpdumporg1989

[29] J Mccauley POX A Python-based Openow Controller 2014[30] Christopher Monsanto Joshua Reich Nate Foster Jennifer Rexford David Walker

et al Composing Software Dened Networks In NSDI pages 1ndash13 2013[31] Masoud Moshref Minlan Yu Ramesh Govindan and Amin Vahdat DREAM

dynamic resource allocation for software-dened measurement In Proceedingsof the 2014 ACM conference on SIGCOMM pages 419ndash430 ACM 2014

[32] Tim Nelson Andrew D Ferguson Michael JG Scheer and Shriram KrishnamurthiTierless Programming and Reasoning for Software-Dened Networks NSDIApr 2014

[33] Vern Paxson Bro A System for Detecting Network Intruders in Real-timeComput Netw 31(23-24)2435ndash2463 December 1999

[34] Martin Roesch et al Snort Lightweight Intrusion Detection for Networks InLISA volume 99 pages 229ndash238 1999

[35] Vyas Sekar Michael K Reiter and Hui Zhang Revisiting the case for a minimalistapproach for network ow monitoring In Proceedings of the 10th ACM SIGCOMM

conference on Internet measurement pages 328ndash341 ACM 2010[36] David Senecal Slow DoS on the rise httpsblogsakamaicom201309

slow-dos-on-the-risehtml[37] Robin Sommer Matthias Vallentin Lorenzo De Carli and Vern Paxson HILTI

An Abstract Execution Environment for Deep Stateful Network Trac AnalysisIn Proceedings of the 2014 Conference on Internet Measurement Conference IMCrsquo14 pages 461ndash474 New York NY USA 2014 ACM

[38] Andreas Voellmy Junchang Wang Y Richard Yang Bryan Ford and Paul HudakMaple Simplifying SDN Programming using Algorithmic Policies In Proceedingsof the ACM SIGCOMM 2013 conference on SIGCOMM pages 87ndash98 ACM 2013

[39] Mea Wang Baochun Li and Zongpeng Li sFlow Towards resource-ecient andagile service federation in service overlay networks In Distributed ComputingSystems 2004 Proceedings 24th International Conference on pages 628ndash635 IEEE2004

[40] Minlan Yu Lavanya Jose and Rui Miao Software Dened Trac Measurementwith OpenSketch In NSDI volume 13 pages 29ndash42 2013

[41] Lihua Yuan Chen-Nee Chuah and Prasant Mohapatra ProgME Towards Pro-grammable Network Measurement IEEEACM Transactions on Networking (TON)19(1)115ndash128 2011

  • Abstract
  • 1 Introduction
  • 2 Overview
    • 21 Motivating Example
    • 22 Alternative Approaches
      • 3 The NetQRE Language
        • 31 Pattern Matching over Streams
        • 32 Conditional Expressions
        • 33 Stream Split
        • 34 Stream Iteration
        • 35 Aggregation over Parameters
        • 36 Stream Composition
          • 4 Use Cases
            • 41 Flow-level Traffic Measurements
            • 42 TCP State Monitoring
            • 43 Application-level Monitoring
              • 5 Compilation Algorithm
                • 51 Compilation of PSRE
                • 52 Compilation of split
                • 53 Compilation of iter
                • 54 Compilation of Aggregation
                • 55 Compilation of Stream Composition
                  • 6 Implementation
                  • 7 Evaluation
                    • 71 Expressiveness
                    • 72 Performance
                    • 73 End-to-end Validation
                    • 74 Summary of Evaluation
                      • 8 Discussion
                      • 9 Related Work
                      • 10 Conclusion
                      • References
Page 2: Quantitative Network Monitoring with NetQREalur/Sigcomm17.pdf · 2017. 6. 26. · In network management today, ... Thau Loo. 2017. Quantitative Network Monitoring with NetQRE. In

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

session for each user from input packet streams monitor usage foreach identied VoIP session and aggregating usage across all VoIPsessions grouped-by all users being tracked

Today there are several point-solutions to support certain as-pects of quantitative network monitoring However they suerfrom one or more of the following limitations First a large ma-jority of network measurement tools focus on ow-level measure-ments [15 18 19 39ndash41] that do not capture application-level orsession-level semantics This excludes a range of policies that areapplication-dependent for example tracking security signatures atthe application level or rate-limiting users based on their VoIP callusage Second these tools tend to provide ad-hoc solutions that aredicult to generalize or customize Programming frameworks forsoftware-dened networks (SDN) [21 30 32] do not support inte-gration with queries beyond basic ow-level counters and none ofthese languages support quantitative monitoring at the applicationor session level Finally tools such as Bro [33] requires networkoperators to have signicant programming expertise to implementstate machines and reason about state transitions

To address the above limitations we present NetQRE a practicaltool aimed at simplifying the specication and implementation ofquantitative network policies Our proposal is based on the observa-tion that trac patterns such as a TCP connection and a application-level session can be specied using regular expressions which arean abstraction that network operators who may not be well-versedin programming nd natural to use Regular expressions are widelyused in systems management for example in the popular grep toolfor searching for patterns in systems logs application-level packetclassication [1] signature-based attack detection [3 34] We makethe following contributionsbullNetQRE language We present the NetQRE declarative languagethat provides high-level abstractions and express a variety of quan-titative policies that span multiple packets grouped into ows andapplication-level sessions NetQRE is based on novel theoreticalfoundation of parameterized quantitative regular expressions (PQRE)PQRE extends prior work on QRE [9] which provides a formal foun-dation of combining traditional regular expressions with numericalcomputations NetQRE integrates regular-expression-like patternmatching at ow-level and application-level payloads with aggre-gation operations such as sum and average counts The languagefurther supports quantitative network policies by allowing actionson packets to be generated as output of monitoring applicationsbull Compilation and ecient runtime system We developed acompiler that can automatically generate ecient NetQRE imple-mentations with low memory footprint As part of the compilationprocess the compiler automatically infers the state that needs tobe maintained for NetQRE programs and optimizes the compiledimperative code A NetQRE runtime then executes the generatedcode eciently In fact this process also eases the burden on pro-grammers having to handwrite optimized versions in low-levelcodebullNetQRE implementation and evaluation We have developeda NetQRE prototype which we evaluate over a range of quantita-tive network monitoring tasks Our evaluation results demonstratethat NetQRE can express a wide range of quantitative networkpolicies with no more than 18 lines of code while specifying them

in imperative languages requires at least 100 lines of code Thecompiled implementations incur no more than 9 overhead inthroughput compared with optimized manually-written low-levelcode are able to handle more than 10Gbps of trac using a singlecore (and is amenable to parallelization) and are signicantly moreecient (eg 11times) compared to monitoring solutions based on Broand OpenSketch [40]

2 OVERVIEW

runtime

raw packets

Policy specification in NetQRE

NetQREcompiler Policy

implementation

Network

packets actions

updates

Figure 1 NetQRE Architecture

Fig 1 shows the architecture of NetQRE To program a quantita-tive network monitoring task a programmer simply species thismonitoring task in a declarative fashion by viewing the input as astream of packets The NetQRE compiler automatically generates ef-cient low-level imperative code that implements the specicationEach incoming raw packet from the network is rst parsed and pre-processed by the runtime into a form that can be referenced by themonitoring implementation In our deployment NetQRE relies onthe underlying runtime to handle packet reordering and retries (ifneeded) due to losses in TCP connections and also fragmentationand defragmentation of IP packets The stream of processed packetsare then sent to the appropriate NetQRE program for execution

As the NetQRE compiled program executes the incoming pack-ets are processed in a streaming fashion The output of a NetQREprogram is the generation of analysis results or actions taken torecongure the network The NetQRE tool can be deployed in dier-ent settings for example a tap on a SPAN port analyzing mirroredtrac an inline solution or running in the cloud as a virtualizedmiddlebox Our language design and compiler is agnostic to thedeployment setting While we focus on packet stream processingin this paper the NetQRE language and compilation is agnostic tothe input stream and can handle any item streams (eg raw packetsemail messages video frames)

21 Motivating ExampleAs a motivating example we consider a network management pol-icy where an enterprise monitors the usage of Voice-over-IP (VoIP)for each user and alert the user whose usage is signicantly higherthan the average usage over all users The actual language specica-tion will be described in later sections We provide a high-level intu-ition of programming language features necessary to support thispolicy Note that our examples are not solely limited to VoIP Sec-tion 4 provides more examples to showcase the wide-applicabilityof NetQRE and our evaluation section has a detailed listing

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

We use the Session Initiation Protocol (SIP) used in VoIP applica-tions as our example Typically a call based on SIP consists of threephases init call and end In the init phase the caller and the calleeexchange messages (eg call ID user name the connection channel)in order to set up the call session Once established the call phaseallows VoIP data to be transmitted between the caller and calleeusing the channel dened in the rst phase Finally either side canend the call as shown in the end phase

To analyze the SIP call in the midst of network trac from alltypes of protocols the protocol analyzer is required to (1) identifyall SIP trac traversing the network (2) separate the SIP tracinto dierent sessions for dierent callercallee pairs (3) grouppackets within each session into phases (init call and end) andthen perform a count of bytes only within the second (call) phase

NetQRE oers a natural programming model to implement thispolicy Conceptually the input to a NetQRE program is modeled asa stream of packets The programmer may assume that all receivedpackets have been stored and presented as the input Based onthis input NetQRE then provides programming abstractions for aprogrammer to implement various functions to process the streamThese functions can then be composed in a modular fashion toimplement the quantitative policy Using the VoIP example theprogrammer species a lter function to identify SIP trac of eachuser an iteration function that splits the SIP trac stream into asequence of VoIP sessions a split function that further breaks upeach VoIP session into the three phases and nally an aggregationfunction that sums up the total bandwidth in the call phase

These functions are specied by users in a declarative fashionand composed together in a fashion that requires users to onlythink in terms of protocol patterns over multiple packets Explicitstate maintenance to track session state is handled by the NetQREruntime Note that this programming model is a conceptual oneand the actual execution is optimized by our compiler For exampleit will incur unacceptable storage and performance overhead ifthe runtime system has to log all incoming packets and presentthem as a program input We have implemented a NetQRE compilerwhich compiles a NetQRE program into an optimized imperativeprogram The compiler automatically infers the states that needs tobe maintained for the NetQRE program and generates executioncode that implements a streaming algorithm to update the statesfor each input packet and evaluate the output in an incrementalway at runtime

22 Alternative ApproachesTo implement this policy in a low-level language a programmeroften faces the challenges such as what state needs to be maintainedand updated in order to track the progress of the SIP protocol Theprogrammer also needs to track each userrsquos media trac based onthe correct SIP state and aggregate the average VoIP usage across allusers This makes the implementation of the policy highly coupledwith the low-level implementation of the SIP protocol While indi-vidual lter split and aggregation functions can be implementedin a low-level language composing them together to implementthe correct functionality is challenging Although there are toolstoday for VoIP monitoring (eg [17]) they lack extensibility to sup-port other applications and policies For example if a new policy

requires to monitor video usage instead of VoIP one has to useother tools

Recently proposed domain-specic languages and tools cannotaddress this problem either For example trac measurement toolstypically focus on per source-destination trac measurement andcannot be used to monitor an applicationrsquos usage SDN program-ming frameworks only oer basic ow statistics based on ow-levelcounters on switches and do not have the abstraction to addressthe above mentioned challenge Event-based intrusion detectionsystems (IDS) such as Bro [33] generate high-level events at thesession level and application level from network trac Howevertheir focus is primarily on intrusion detection and hence lack lan-guage primitives that make it natural to support a wide range ofquantitative monitoring capabilities For example we observe thatit is feasible to implement an analyzer that counts the number ofVoIP calls using Bro but signicantly harder to further identifythe packets corresponding to the call phase to calculate bandwidthutilization Bro also requires the user to be an expert programmerknowledgeable with state machine models

3 THE NETQRE LANGUAGEIn NetQRE a packet is modeled as a sequence of bytes and we useparsing functions to extract information from the packet Commonparsing functions include srcip which returns the source IP of thepacket srcport (source port) syn (SYN bit) data (bytes in payload)and time (the time stamp on receipt of the packet) The parsingfunctions can be customizable by the user for example to extractapplication-level headers

Values variables and functions in NetQRE are typed NetQREoers basic types such as int bool string as well as a set of domain-specic types such as IP (IP addresses) Port (TCP and UDP ports)packet (all packets) and action The action type consists of pre-dened functions that either generate alerts or send updates toswitches NetQRE also provides high-level types such as Conn (tu-ples of source IP-port and destination IP-port) which is used forTCP and UDP packets

NetQRE oers a convenient way to write stream functions toprocess packet streams A stream function takes as input a streamof packets and produces as output values (eg monitoring resultsto an application) actions (eg alerts to the controller) or packets(eg packets ltered from an application) A stream function can bespecied as below

sfun type func_name(type var) = exp

The stream function declaration includes the keyword sfun returnedtype of the stream function followed by the name of the functionAny other arguments are specied following the function nameThe body of each stream function (right-hand-side to rsquo=rsquo) consistsof expressions which are used to specify the functionalities of thestream function

Figure 2 shows a summary of the syntax of NetQRE expressionsExpressions in stream functions are based on the theoretical founda-tions of quantitative regular expressions (QRE) [9] a novel proposalthat integrates regular expressions with numerical computationsIn the rest of this section we describe the features of NetQRE ex-pressions in stages We rst introduce how to use an extension toregular expressions to detect patterns of the input stream Second

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

Predicate P = | [eld = value]| [eld = variable]| PampP | (P lsquo| |rsquo P ) | P

Regular Exp re = P | re re | re | re lsquo| |rsquo reConditional cond = exp exp | exp exp expAggregation aдд = aggop exp | type variable

aggop = sum | avg | max | minSplit split = split(exp exp aggop)Iteration iter = iter(exp aggop)Composition comp = exp gtgt expExpression exp = value | action | op exp

| exp op exp | re | cond| aдд | split | iter | comp

Figure 2 Syntax of NetQRE expressions

we discuss how to associate values and actions for the stream Fi-nally we describe a set of high-level operations that allow modularprogramming of stream functions

31 Pattern Matching over StreamsThe basic feature of NetQRE is to detect the patterns of the inputpacket stream As the basic building block NetQRE uses an exten-sion of regular expressions to which we refer as parameterizedsymbolic regular expressions (PSRE) for pattern matching over theinput stream Regular expressions (RE) are widely used for patternmatching over strings (sequences of bytes) Typically a RE usesa xed nite alphabet and the atoms in a RE are symbols in thealphabet In NetQRE a PSRE generalizes a RE in the two ways

First the atoms in a PSRE are predicates over packets instead ofsingle symbols which allows a PSRE to handle very large and po-tentially innite alphabet This property makes PSRE an appealingt for customized network ltering since the space of packets isoften very large and typically one is only interested in the valuesof some elds in a packet (eg the source and destination) insteadof the whole packet itself

Second PSRE allows the use of parameters to represent unknownvalues in the predicate With this generalization a PSRE can detecta variety of patterns using dierent instantiation of the parametersAs a result it allows the programmer to specify applications suchas counting the number of distinct IP addresses appeared in thestream which is hard if not impossible to be specied withoutparameters because no concrete predicates can be dened withoutknowing those IP addresses at runtimePredicate A basic packet predicate [f = v] checks whether thevalue in the eld f of a packet is v As an example the predicate[srcip=`1001] matches all packets whose source IP address is1001 As an example of the use of parameters [srcip=x] denesa function from the domain of x (ie IP addresses in this case) toa concrete predicate over packets We also use to denote thepredicate that matches all packets Predicates can be composedusing standard boolean combinations NetQRE also provides handymacros for widely used predicates For example is_tcp(c) is a short-hand for the predicate that matches a TCP packet in the connectionc

PSRE Like REs the basic operations for a PSRE are concatenationunion and Kleene star For example [syn=1][syn=0] matches astream of packets where the rst packet is a SYN packet followedby a sequence of non-SYN packets (including 0 non-SYN packets)Note that syntactically predicates are enclosed in square bracketsand a PSRE is enclosed in a pair of slashes in NetQRE and is theKleene star operator To illustrate the use of parameters considerthe example to count distinct source IP addresses in the streamThe core PSRE to implement this example is the function exist(x)

which checks whether a source IP x appeared in the stream Thefunction can be specied using the PSRE as below

sfun bool exist(IP x) = [ srcip=x]

We defer the discussion of the full expression for this example insect35

32 Conditional ExpressionsGiven the pattern matching capability for the input stream it isnatural to use conditional expressions to incorporate PSRE withother values and actions in order to assign costgenerate actions tothe stream

In NetQRE a conditional expression has the form exp exp1where exp is an expression that returns boolean values such asa PSRE dened above and exp1 is an expression in NetQRE Astream evaluated to true by exp will be applied to exp1 otherwisethis expression is not dened and returns undef

For example the expression 1 returns 1 if the stream matchesthe PSRE (ie the stream consists of a single packet) otherwiseit returns undef the expression size(last) returns the size ofthe last packet in the stream The keyword last denotes the lastreceived packet in the input stream As another example (countgtk)alert returns an action alert when the total number of packets in thestream is larger than k Here count is a stream function that countsthe number of packets in the stream and alert is a pre-denedaction to generate an alert event for example to a controller nodeThese actions are a basis for implementating quantitative networkpolicies using NetQRE

A conditional expression can also be specied as exp exp1 exp2which applies exp2 to the stream if exp is not satised To ensurethe consistency exp1 and exp2 should return the same type forthe stream As an example the expression [srcip=`1001]10

returns 1 if the stream contains a single packet with source IP1001 and it returns 0 if the stream contains multiple packets orthe only packet in the stream does not have the source IP

33 Stream SplitIn many network applications the input stream consists of multiplephases For example a VoIP call splits into three phases as shown inthe motivating example It is convenient to handle dierent phasesof the stream separately and combine the results of each phase ina modular way Following the operators dened in QuantitativeRegular Expressions [9] NetQRE uses the split operator to splitthe stream and compose two stream functions

A stream split expression has the form split(fgaggop) wheref and g are two expressions and aggop is an aggregation operatorsuch as sum avg (average) max and min

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

For an input stream ρ a split function splits ρ into two sub-streams ρ1 and ρ2 such that f is dened on the rst substream ρ1and g is dened on ρ2 f and g are applied respectively to the twosubstreams and the returned value is aggregated using aggop asthe return value of split(fgaggop) The split operation is a naturalquantitative generalization of the concatenation operation in regu-lar expression and Fig 3 illustrates how a split expression works

Figure 3 Illustration of split

Note that in order to return a unique value for a split expressionit is required that the splitting is unambiguous That is for allinput streams and all values of parameters in the expression thereis at most one way to split stream for f and g The property ofunambiguity can be checked eciently at compile time [9] Whenno unambiguous splitting is possible the split expression returnsundef for the input stream

As an example of split consider the example of counting thenumber of packets since the last SYN packet in the stream Naturallyone can split the stream into two substreams separated by thelast SYN packet and count the number of packets in the secondsubstream as shown in the following expression

split(any0 last_syncount sum)

Here any is the PSRE that matches any packet streams andlast_syn is the PSRE [syn=1][syn=0] that matches a stream thatstarts with a SYN packet followed by non-SYN packets Puttingtogether the split expression splits the input stream before thelast appearance of a SYN packet Note that the rst substream isassigned the value 0 while the second substream is applied to thecount function (dened later in sect34) to count the number of packets

34 Stream IterationA network application often requires to iterate over the input streamFor example counting the number of packets in the stream requiresto iterate over all single packets

Similar to the split expression NetQRE oers the iter opera-tor to iterate over the stream An iter expression is of the formiter(faggop) where f is an expression in NetQRE and aggop is anaggregation operator The iter expression splits the input streaminto multiple substreams such that f is dened on each substreamIt then iterates through all substreams and evaluates f on each sub-stream and nally aggregates the return value on each substreamusing the aggregation operator The iter operator is a quantitativegeneralization of the Kleene star operation and Figure 4 illustratesthis process Again the splitting is required to be unambiguous forall input streams to f

Figure 4 Illustration of iter

As an example the function count that counts the number ofpackets in the stream can be specied below

sfun int count = iter (1 sum)

This function splits the stream into single packets and the innerexpression returns 1 for each single packet As a result the wholeiter expression counts how many packets in the stream

35 Aggregation over ParametersAggregation is a common feature that many network monitoringapplications share For example computing the average ow sizeneeds to aggregate the average size across multiple ows NetQREoers high-level aggregation expressions to aggregate functionsover parameters

Aggregation expressions are of the form aggopf | T x wherex is a parameter with its type T and f is an expression where theparameter x appears in it Intuitively the aggregation expressiongoes through all possible values of x and evaluate f using thesubstituted value for x and nally aggregates all valid return valuesusing the aggregation operator aggop Revisiting the example ofcounting distinct source IP addresses in a stream we can use theexpression sumexist(x)10 | IP x

36 Stream CompositionTypically the input stream consists of packets from multiple sourcesand destinations Oftentimes the programmer only wants to handlea stream from a particular source for example NetQRE oers thestream composition operator gtgt that allows to preprocess the streambefore applying another stream function

Stream composition expressions are of the form f gtgt g where f

and g are two stream functions The stream composition allows theprocessing of a stream using the rst stream function f repeatedlyon every prex of the stream and the returned outputs yield a newstream that is then piped as the input stream to a second streamfunction g For example if the stream contains two packets (P1 P2)f is rst applied to P1 as a single-packet stream and then (P1 P2)The output of f on the two substreams is then piped to g

Stream composition is useful when the stream of packets needto be preprocessed in order to lter related packets for furtherprocessing For example counting the number of all packets ina TCP connection c can be specied using filter_tcp(c)gtgt countwhich rst lters all TCP packets in the connection c using thefunction filter_tcp and then counts the ltered stream using countThe function filter_tcp is dened as follows

sfun packet filter_tcp(Conn c) =

[ is_tcp(c)] last

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

Filter functions are convenient and used extensively in our usecases As a short-land NetQRE uses filter(p) for the lter functionthat lters packets satisfying the predicate p For example the abovelter function can be abbreviated as filter(is_tcp(c))

Stream composition can also be used to lter packets accordingto timestamps NetQRE builds in two lter functions based ontimestamps The recent(t) function lters the stream in the recent tseconds and the every(t) function periodically lters the stream inevery t seconds For example recent(5)gtgtcount counts the numberof packets in the recent 5 seconds Since the two build-in lterfunctions are not in the core of NetQRE we only allow the useof time-based ltering outside the high-level NetQRE operationsdescribed above

4 USE CASESWe provide use cases ranging from ow-level trac measurementsto complex examples involving application-level content analysisand dynamic updates Since the high-level constructs in NetQRElanguage is based on regular expressions we can only supportqueries for regular trac patterns Nevertheless we note that thiscovers a wide range of useful queries as we will show in this sectionMore examples can be found in our technical report

41 Flow-level Trac MeasurementsWe highlight two measurement tasks that have been proposedrecently as a means to do ow scheduling and attack detectionHeavy hitter Heavy hitters [19] are ows that consume band-width larger than a threshold T A key step to identify a heavy hitteris to count the size of a ow In this example we consider a ow as asource-destination pair A natural way to specify this functionalityis to rst lter all packets from a source x to a destination y andthen count the size of the ltered stream

sfun int hh(IP x IP y) =

filter(srcip=x dstip=y) gtgt count_size

hh rst lters packets based on the source and destinations IP andsecond the ltered stream is piped into count_size which countsthe size of all packets in the stream Due to space limits we do notshow the denition of count_size which is similar to that of count

Second to alert a heavy hitter in real time to the controller wecan use the following program

sfun action alert_hh =

(hh(lastsrcip lastdstip)gtT)

alert(lastsrcip lastdstip)

The function alert_hh checks for every newly received packet (ielast in the stream) that whether its ow reaches the threshold T andissues an alert correspondingly By further applying the time-basedltering every(5)gtgtalert_hh one can detect heavy hitters in every 5secondsSuper spreader A super spreader [40] is the host that contactsmore than k distinct destinations during a time interval Like theuse case of heavy hitter detection a key step is to count how manydistinct destinations a source x contacted The following NetQREprogram does this counting

sfun int ss(IP x) =

sum exist_pair(x y)10 | IP y

The function exist_pair(xy) checks whether the source-destinationpair (xy) appeared in the stream The ss function uses an aggre-gation function sum to aggregate all possible destinations and thusgives the total number of distinct destination addresses x contacted

Similar to heavy hitter detection we can use stream compositionto lter the input stream based on time as well as trac types Forbrevity of the paper we omit the discussion of these functionalities

42 TCP State MonitoringWe next showcase applications that rely on monitoring the stateswithin a TCP owSYN ood attacks In this example NetQRE is used to detect SYNood attacks by counting the number of incomplete TCP hand-shakes in a time interval We consider an incomplete TCP hand-shake as a packet trace consisting of a SYN packet and a SYNACKpacket with corresponding sequence number and acknowledgenumber but does not have a further ACK packet to complete thehandshake As the key step the following program counts thenumber of incomplete handshakes assuming that the input streamconsists of TCP packets between the same source and destination

sfun int bad_tcp_pat(int x int y) =

concat(syn(x) synack(yx+1) no_ack(y+1))

sfun int incomplete_handshake_num =

sumbad_tcp_pat(xy)1 | int x int y

The function bad_tcp_pat species the pattern of an incompletehandshake there is a SYN packet with a sequence number x in thestream followed by a SYNACK packet with acknowledge numberx+1 and sequence number y and then followed by packets that donot include an ACK with acknowledge number y+1 to complete thehandshake The sum aggregation in incomplete_handshake_num sums upthe number of appearances of such patterns for all x and y Usingthe stream composition of ltering functions based on time andpacket type the complete function can be specied as follows

sfun int syn_flood(Conn c) =

recent (5) gtgt

filter_tcp(c) gtgt

incomplete_handshake_num gtTblock(csrcip)

Completed ows Our next example counts the number of legiti-mate ows that are completed delineated by a SYN at the beginningand ending with a FIN Note that the last iter uses the regular ex-pression which matches a stream ending with a SYN and a FINpackets and no SYN-FIN pairs appear before Therefore the iter

expression splits the entire (ltered) stream into sessions whereeach session contains exactly one complete ow

sfun int count_flow(Conn c) =

filter_tcp(c) gtgt

filter(flag=SYN || flag=FIN) gtgt

iter ([fin =1][ syn =1][ syn =1][ fin =1]1 sum)

Slowloris attacks We describe a detection of the aforementionedSlowloris attack in NetQRE based on computing the average rateof packet transferring in all legitimate TCP connections shownbelow

sfun double conn_rate = iter(tcp_patternrate sum)sfun double avg_rate =

sumfilter_tcp(c) gtgt conn_rate | Conn c

sumcount_flow(c) | Conn c

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

First to compute the rate of a legitimate TCP connection we cansimply specify the pattern of a legitimate TCP connection usingRE and compute its transferring rate (the rate function can bespecied easily and thus not shown here) Then we can iterateall TCP connections given a 5-tuple and compute the aggregatedtransferring rate as shown in the function conn_rate

Second we use sum operation to aggregate transferring rate acrosstrac on all 5-tuples Dividing the aggregated rate by the numberof ows computed using the function dened above gives us theaverage transferring rate over all connections as shown in avg_rate

43 Application-level MonitoringOur nal example revisits the Voice-over-IP (VoIP) usage usecasein sect2

Let us rst focus on the function to monitor the usage of a VoIPcall based on a SIP connection sip_conn and media connection m_connAs introduced in sect2 a VoIP call consists of three phases and amodular programming way is to handle the three phases usingthree stream functions and then compose them using the split

operator The function is shown belowsfun int usage_per_call(Conn sip_conn Conn m_conn

string user string id) =

split(init(iduser m_conn)0call(m_conn)count_size

end(id)0 sum)

Here init call and end are simply the PSREs that capture the pat-terns of each phase Note that the init and the end phase return avalue 0 and only the usage of the call phase is counted as required

Next a programmer can view the trac using the SIP connectionsip_conn and the media connection m_conn as a sequence of callsUsing the iter operator the programmer can easily aggregate theusage of all calls in the trac as shown below

sfun int usage_per_conn(Conn sip_conn Conn m_conn

string user string id) =

filter_sip(sip_conn m_conn user id) gtgt

iter(split(any0 usage_per_call(sip_conn m_conn

user id) sum)sum)

Note that we rst lter all trac that belongs to sip_conn or m_conn inorder to get the stream of interest This ltering may result in non-VoIP trac between two VoIP calls and thus we simply compose thePSRE any with usage_per_call inside the iter expression to handleany trac between two calls

Now we can aggregate the usage for each user across multipleconnections and calls using the aggregation operation and furthercompute the average usage over all users as shown below

sfun int usage_per_user(string user) =

sumusage_per_conn(sip_conn m_conn user id) | Connsip_conn Conn m_conn string id

sfun int average_usage =

avgusage_per_user(u) | string u

Suppose we need to implement a quantitative network policythat sends an alert to the controller if a userrsquos usage is larger thanve times of average usage we can simply specify the policy asbelow

sfun action alert_high_usage(string user) =

(usage_per_user(user) gt 5 average_usage) alert(user

)

5 COMPILATION ALGORITHMWe next describe how to compile a NetQRE program into an e-cient executable that can run with low memory footprint Typicallythe stream of packets is very large thus it is not feasible to store theentire stream of packets Therefore the compiled program shouldevaluate incrementally on each single incoming packet withoutstoring the history of the stream To achieve this goal we have toaddress the following challenges First the compiled program needsto maintain parameters in NetQRE in a succinct way Second inorder to implement the semantics of split and iter these operationshave to be performed in a single online pass over streaming packetsLastly the compiled program needs to handle aggregation expres-sions over parameters In the following subsections we describethe compilation of selected NetQRE expressions

51 Compilation of PSREFirst we describe the algorithm for compiling a PSRE as the basicbuilding block for stream functions Similar to traditional RE aPSRE can be translated to an equivalent nite state machine whichwe refer to as parameterized symbolic automaton (PSA) For exampleconsider the following PSRE [srcip=x] Intuitively this PSREchecks whether there exists a packet with source IP x in the streamIts corresponding PSA is shown in Fig 5

Figure 5 An Example PSA

However PSA cannot be directly updated due to its uninstan-tiated parameters To illustrate this challenge consider a naivesolution which is to instantiate the parameters using all possiblevalues and maintain all the instantiated state machine namelysymbolic automata (SA) for each instantiation of the parametersWhen reading an input packet we update all the instantiated SA inthe standard way However it is not hard to note that this approachis not feasible due to the large space of parameters As an examplethe PSA in Fig 5 with a single IP parameter has up to 232 instan-tiated symbolic automata Whereas at runtime the function onlyneeds to store the source addresses that appeared in the streamwhich can be signicantly less than 232

To address this challenge we propose a new algorithm thatupdates PSA on-demand As the high-level idea suppose we instan-tiated a PSA to a set of SA using all possible instantiations Noticethat even though there is a large space of instantiated SA howevermany of them will keep the same states at runtime given a streamof packets As an example consider feeding a rst packet with thesource IP 10001 to all SA instantiated from the PSA in Figure 5 Itis easy to see that the only SA that will transition to state q1 is theone where x is instantiated by 10001 and all other SA will stay atq0

Therefore the compiled program at runtime maintains guardedstates for the PSA where a guarded state is pair (s ϕ) Here ϕ is

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

a predicate of the parameters in the PSA (which we will refer asa guard) and s is a state in PSA The pair (s ϕ) means that whenreading the input stream all instantiated SA will be at state s if theyare instantiated by the values satisfying ϕ For example initiallythere is only one guarded state maintained for the PSA in Figure 5which is (q0 true) meaning that all instantiated SA is at the initialstate q0 of the PSA

To update the PSA the key step is to update each guarded statedynamically when reading packets from the stream The updatingalgorithm for a guarded state is shown in Algorithm 1 This algo-

Algorithm 1 update_psa((s ϕ) packet)

1 for all transitions originated from s do2 let t be the destinate state and P be the parameterized predi-

cate3 let T be the guard instantiated from P using the packet4 let ϕ prime = ϕ andT5 emit (t ϕ prime) if ϕ false6 end for

rithm iterates through all transitions from the state s and for thepredicate (with parameters) P on the transition it instantiates Pusing the current packet it receives in the stream This step simplyreplaces the function name in P by the return value of it on thepacket The obtained predicateT is a predicate over the parametersin the PSA Finally the algorithm emits a new pair (t ϕ prime) if ϕ prime is notfalse Intuitively this step accounts the fact that the instantiatedSA under ϕ prime will transition to the next state t

Using Algorithm 1 the overall updating algorithm is straight-forward Initially the compiled program only contains a guardedstate (q0 true) where q0 is the initial state of the PSA Every timereading a new packet from the stream we call Algorithm 1 on everyguarded state and the new guarded states consists of all the onesemitted from Algorithm 1 To check the state given concrete valuesfor parameters we simply go through all maintained guarded statesand return the state if the guard is satisedExample Using the example in Figure 5 we illustrate the execu-tion of Algorithm 1 given the pair (q0 true) and the packet withsource IP 10001 Consider the transition from q0 to q1 By sub-stituting srcip in the predicate with the source IP of the packetwe get the guard T which is x=10001 Since ϕ is true now ϕ prime issimply x=10001 At the end the algorithm will emit the pair (q1x=10001) Similarly for the other transition the algorithm emit thepair (q0 x=10001) Therefore there are two new guarded statesnamely (q1 x=10001) and (q0 x=10001) The updated guardedstates account for the fact that 10001 has appeared and thus thecorresponding instantiated SA transitioned to state q1

52 Compilation of split

We next discuss how to compile a split expression which is of theform split(f g aggop) First we highlight the key insight from [9]to compile split without parameters and then describe how togeneralize this idea to compile split with parametersWithout parameters The challenge of compiling split is to splitthe stream dynamically at runtime without revisiting the entirehistory of the stream For example in the split example in sect33 we

need to split the stream at the last appearance of the SYN packet Anatural way to implement the semantics of split is to maintain allcases to split the stream For example Fig 6 shows two cases to splitthe stream consisting of a non-SYN and a SYN packet at runtimefor the example in sect33 Moreover the following two observationsensure that the compiled code only uses a constant space First eachcase can be represented using a triple (sf sд F ) where F is a agindicating whether the stream is split and sf (sд resp) is the stateof f (д resp) on evaluating the prex(sux resp) of the streamSecond the number of maintained (distinct) cases is bounded by aconstant only related to д at any time point

Figure 6 Example run of split

With parameters Similar to PSRE compilation in the generalscenario we need to maintain guarded cases That is we maintaina set of (T ϕ) pairs where T is a triple (sf sд F ) correspondingto a case as dened above and ϕ is a guard It means that if theparameters are instantiated by values satisfying ϕ there is a caseof splitting the stream which can be represented as T

Algorithm 2 update_split((T ϕ) packet)

1 suppose T = (sf sд F )2 if F is false then3 for all emitted (s primef ϕ prime) from update_f((sf ϕ) packet) do4 if f is dened on s primef then

5 emit ((s primef s0д true) ϕ prime) s0д is the initial state of g 6 end if7 emit ((s primef sд false) ϕ prime)8 end for9 else

10 for all emitted (s primeд ϕ prime) from update_g((sд ϕ) packet) do11 emit ((sf s primeд true) ϕ prime)12 end for13 end if

Algorithm 2 shows the algorithm to update a pair (T ϕ) Thealgorithm feeds the input packet to f or g corresponding to whetherthe stream has been split as indicated by F For example whenF is false namely the stream has not been split yet in this casethe algorithm need to update f on the packet (line 2-9) In thiscase for each emitted guarded state (s primef ϕ prime) from the update of fa corresponding guarded case is emitted (line 7) Moreover if f isdened on s primef we may split the stream at the current position andthus emit the guarded case ((s primef s0д true) ϕ prime) (line 4-6) Here s0д isthe initial state of g and the ag F is set to true to indicate thatthe stream is split For the else branch from line 10 to 12 in thealgorithm we need to update (sд ϕ) using grsquos updating procedurewhich may emit a set of guarded states of g Correspondingly theguard needs to be rened

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

Given concrete values for parameters evaluating a split expres-sion is as follows First we check if there exist a guarded case ((sf sд F )ϕ) such that ϕ is satised and both f and g are dened on thestates in the case Then we evaluate the two expressions based ontheir states in the case and nally aggregate the evaluated valuesusing aggop If no such guarded cases exist the split expression isnot dened

53 Compilation of iter

We rst review the key ideas of evaluating an iter expression whenthere are no parameters [9] and then describe how to handle pa-rameters

Consider an iter expression iter(f aggop) without any param-eters Similar to split expressions the iter expression needs tomaintain all possible cases of splitting the input stream Thoughthe number of such cases is only determined by f (thus not relevantto the input stream) how to succinctly represent each case is achallenge Specically unlike a split expression which only splitsthe stream into a prex and a sux an iter expression needs tosplit the stream into multiple substreams such that each one canbe applied to the expression f Therefore naively maintaining allthe states of f for each substream may use a large space as thestream grows Fortunately our choice of the aggregation operatorsallows us to summarizes the application of f to all substreams butthe last one using the aggregated return value For example if aggopis sum we only need to remember the aggregated sum on thesesubstreams Thus we can represent a case as a pair (v sf ) wherev is the aggregated value and sf is the state of f on evaluating thelast substream

In the general scenario with parameters we simply maintain aguarded case as ((v sf ) ϕ) The update of a guarded case ((v sf )ϕ)is shown in Algorithm 3 The evaluation of f given guarded casesis similar to that of a split expression

Algorithm 3 update_iter((T ϕ) packet)

1 suppose T = (v sf )

2 for all emitted (s primef ϕ prime) from update_f((sf ϕ) packet) do3 if f is dened on s primef then4 let v prime be the evaluated value of f on s primef5 emit ((v primeprime s0f ) ϕ prime) where v primeprime=aggop (vv prime) and s0f is the

initial state of f

6 end if7 emit ((v s primef ) ϕ prime)8 end for

54 Compilation of AggregationNow we describe the aggregation function aggopf | T x Followingthe denition of the aggregation function an aggregation functionmaintains guarded states (sf ϕ) To update the aggregation expres-sion we just need to update each guarded state using frsquos updatingprocedure To illustrate the evaluation procedure suppose the ag-gregation operator is sum Given values for the parameters we needto iterate through all guarded states (sf ϕ) If ϕ is satised we eval-uate f on sf say the return value is v and count the number of

values of x which satises ϕ say it is k we sum up v lowast k into theaggregated sum For other aggregation operators the evaluationprocess works similarly

55 Compilation of Stream CompositionLastly we discuss stream composition Recall that stream composi-tion allows to process the input stream using a function f and thenapply the second function g on the processed stream Following itsdenition each time a new packet arrives we need to rst processit with f and the returned value (eg a ltered packet) is then fedinto g

Therefore the stream composition expression fgtgtg needs to main-tain guarded states ((sf sд )ϕ) Algorithm 4 shows the algorithm toupdate a guarded state following the intuition described above Theevaluation of the expression is similar to that of previous discussedexpressions

Algorithm 4 update_comp((S ϕ) packet)

1 suppose S = (sf sд )2 for all emitted (s primef ϕ prime) from update_f((sf ϕ)packet) do3 if f is dened on s primef then4 let ret be the returned packet of f on s primef5 emit ((s primef s primeд ) ϕ primeprime) for all emitted (s primeд ϕ primeprime) from

update_g((sд ϕ prime) ret)6 else7 emit ((s primef sд ) ϕ prime)8 end if9 end for

6 IMPLEMENTATIONWe have implemented a prototype of the NetQRE system in C++which consists of two main components namely the compiler forNetQRE and the NetQRE runtimeNetQRE Compiler The NetQRE compiler implements the com-pilation algorithm described in sect5 The compiler rst generates aC++ program from an input NetQRE program which is then com-piled by the gcc compiler into executable Our compiled programuses a tree data structure to represent predicates over parametersThe choice of tree is driven by its simplicity ease of encoding andlookup performance

In addition to the basic compilation algorithm we include ad-ditional optimizations to the compiler First we use the standardminimization algorithm to minimize the state machine for a regularexpression Second for aggregation expressions with sum and avgwe use an incremental updating algorithm to update the expres-sion we maintain the running sum as the state of the aggregationexpression and we update the sum incrementally when the stateof the inner expression is updatedNetQRE Parallelization The NetQRE compiler can optionallycompile the NetQRE specication into parallelized implementationsbased on the instantiation of parameters in the NetQRE specica-tion For the example described in sect 51 a packet with source IPaddress s can be processed by the hash(s )-th thread which handlesthat instantiation of the parameter x

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

NetQRERuntime The NetQRE runtime includes a packet captureagent implemented using the pcap library [28] Each packet thatarrives at the runtime is parsed and then processed by the compiledNetQRE program as shown in Figure 1 Currently our runtimesupports actions that include sending alert events to a controllerand directly installing a rule on a switch The NetQRE runtime isnot specically optimized for fast packet capture and processingand as future work we plan to explore the use of DPDK [24] Ourcurrent implementation analyzes every packet though orthogonalsampling techniques can be also explored in future

7 EVALUATIONWe evaluate our NetQRE prototype centered around answeringthree key questions First can the NetQRE language express a widerange of quantitative network monitoring applications in a conciseand intuitive manner Second is the generated code ecient interms of throughput and memory footprint Third can NetQRE beused in a real-time monitoring setting that provides rapid mitigationbased on output of quantitative network monitoring applicationsUnless otherwise specied all our experiments are carried out ona cluster of commodity servers each of which has 10-core 26GHzXeon E5-2660V3 Each core has a 256KB L2 cache and shared 25MB L3 cache The memory size is 64 GB The OS is Ubuntu 1404and the kernel version is 420

71 ExpressivenessWe have implemented a set of quantitative network monitoring ap-plications using NetQRE The examples are drawn from a literaturesurvey on quantitative network monitoring applications [13 18 3540 41] Table 1 summarizes the example applications in NetQREthat we use as well as and the lines of code of each NetQRE pro-gram

LoCHeavy Hitter (sect41) 6Super Spreader (sect41) 2Entropy Estimation [40] 6Flow size dist [18] 8Trac change detection [35] 10Count trac [40] 2Completed ows (sect42) 6SYN ood detection (sect42) 9Slowloris detection (sect42) 12Lifetime of connection 8Newly opened connection recently 11 duplicated ACKs 5 VoIP call 7VoIP usage (sect43) 18Key word counting in emails 11DNS tunnel detection [12] 4DNS amplication [20] 4

Table 1 Examplemonitoring applicationsNetQRE supports

We make the following observations First NetQRE is able toexpress a large variety of quantitative network monitoring applica-tions ranging from ow-level trac measurement to application-level quantitative monitoring We validate the correctness of ourapplications by running them and comparing their output withhand-crafted implementations Second programming in NetQRE isconcise All example programs can be specied within 18 lines ofcode in NetQRE This count includes commonly used lter functionsas well as regular expressions which can be built into a library forreference As an interesting comparison we encoded the VoIP calluse case in Bro [33] a well-known intrusion detection system Brorequired 51 lines of code as compared to only 7 in NetQRE How-ever Bro is unable to easily support the VoIP usage case written inNetQRE In particular Bro cannot use patterns to separate packetsinto dierent VoIP sessions for the purposes of counting bandwidthconsumption per session This is not surprising as Brorsquos primaryuse is that of an intrusion detection system We validated the cor-rectness of our implementation on actual SIP trac by comparingBrorsquos output against NetQRErsquos

72 PerformanceOur next set of experiments evaluate the performance of NetQRErsquoscompiled implementations NetQRE runs on a single machine basedon the setup described earlier We use a single core to measureNetQRErsquos performance As a point of comparison we compare eachNetQRE program against an equivalent carefully hand-crafted C++implementation We note that all our hand-crafted implementationsrequire at least 100 lines of code and often times require users toexplicit manage internal state across packets ndash a programming taskthat NetQRE abstracts away

Fig 7 shows the performance of NetQRE implementations (NetQREin the gure) compared with manually optimized implementations(Baseline in the gure) that we spent days optimizing Our examplesinclude heavy hitter super spreader entropy SYN ood detectioncompleted ows count and Slowloris use cases presented in sect4 Asworkload we use a one-minute CAIDA trac trace [7] capturedon a high-speed Internet backbone link which contains 37 millionpackets (ie about 620k packetssec) We replay the trac traceto the implementations Since the trac trace does not containpayloads we report the throughput in million packets per second(MPPS)

We make the following observations First the NetQRE imple-mentations are able to achieve line-rate processing speed withoutbeing the bottleneck In particular for three out of six applicationsthe throughput of NetQRE implementations is about 20MPPS usinga single thread The average packet size of our input trac traceis 888B For a 10Gbps network 14M packets of size 888B can beforwarded per second which is well below the throughput achievedby NetQRE The throughput can be further improved by paralleliza-tion (presented later in this section) Second the compiled NetQREimplementation incurs negligible overhead compared with the man-ually optimized low-level implementation The dierence betweenthe throughput of the compiled NetQRE implementation and thatof the manual implementation is within 9 across all use cases westudied Third NetQRE incurs slightly increase (up to 60) in terms

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

05

10152025

Thro

ughp

ut (M

PPS)

Baseline NetQRE OpenSketch

(a) Throughput

1

10

100

1000

Mem

ory

(MB

) Baseline NetQRE OpenSketch

(b) Memory

Figure 7 Performance comparison

0

10

20

30

40

0 2 4 6 8

Thro

ughp

ut (M

PPS)

Number13 of13 Threads

super spreadersyn floodslowloris

Figure 8 Throughput with paral-lelization

of memory footprint (Fig 7b) compared with the manually opti-mized implementation However the overall memory footprint ofNetQRE remains low We have also run the experiments on anotherCAIDA trac [2] and the comparison result is similarComparison with OpenSketch In addition to comparisons withour manually optimized code we also compare with OpenSketch [40]a popular trac measurement tool based on sketching techniquesGiven OpenSketchrsquos limitation in handling stateful monitoringtasks we only compare the performance of NetQRE implementa-tions on the heavy hitter and super spreader examples with thatof OpenSketch (in software) We use the same CAIDA trac traceas before and follow the default setting of the reference code ofOpenSketch [4] for the two examples The NetQRE compiled imple-mentations signicantly outperform OpenSketch implementationsThe throughput of NetQRE compiled code is 11times and 18times as thatof OpenSketch (Fig 7a) As OpenSketch focuses on optimizingmemory usage it is not surprising that OpenSketch uses less mem-ory than NetQRE implementations Nevertheless we observe thatNetQRE uses only 11 more memory on the super spreader examplethan that of OpenSketch (Fig 7b) Note that we use an open-sourceversion of OpenSketch software that runs on a similar x86 machineas NetQRE in order to have an ldquoapples-to-applesrdquo comparison onsimilar hardware NetQRE may also exploit similar optimizationsperformed by OpenSketch in order to get higher performance ona NetFPGA Exploiting a hardware platform such as NetFPGA inorder to further improve NetQRE performance is an interestingavenue for future workComparison with Bro We further compare with Bro Given thelimitations of Bro in handling the complete VoIP use cases we sim-plify the use case to simply counting the number of VoIP calls (asopposed to bandwidth consumption) per user NetQRE compiledimplementation can nish counting within 1 second while Brotakes about 23 seconds for a SIP trac trace containing 4338 VoIPcalls Both programs output the same results for this use case Webelieve there are at least two reasons for Brorsquos slower performanceFirst Bro is designed for intrusion detection and not aimed atand optimized for monitoring tasks Second unlike NetQRE whichcompiles high-level code to ecient low-level code Bro uses aninterpreter to execute the script written in Brorsquos language whichcould result in considerable overheadParallelization speedup NetQRE compiler automatically com-piles parallelized implementations as described in sect6 In the con-sidered examples parallelization involves sending ows of packets

to dierent NetQRE instances running on dierent threads Fig 8plots the throughput of the NetQRE implementations with dierentnumbers of threads for the super spreader syn ood and slowlorisdetection examples In all examples the NetQRE parallelized codeis able to achieve at least 39times speedup with 8 threads Note thatlinear speedup is not achieved as the trac (in our nite trace) isnot uniformly distributed across all 8 threads This is an artifactof the trace data itself When we include the overhead of the soft-ware load balancer that dispatches packet ows to dierent threadsthe NetQRE implementations achieve at least 26times speedup for allexamples This overhead can be mitigated with a hardware load-balancer Overall we observe that NetQRErsquos throughput is morethan sucient to handle line-rate trac and the NetQRE runtimeitself is not the bottleneck

73 End-to-end ValidationIn the nal set of experiments we validate the end-to-end use ofNetQRE for enforcing quantitative network policies which appliesactions to quantitative network monitoring programs We set upa network of two clients C1 and C2 one server S and one SDNswitch that mirrors trac to a NetQRE runtime running the SYNood detector presented in sect42 In addition a NetQRE controllerbased on POX [29] controls the switch The network is emulatedusing Mininet [25] with link bandwidth set to 100Mbps

In the experimentC1 sends normal trac to S using iperf at rate1Mbps and at the 7th second C2 starts SYN ood attack generatedusing our generator to the same server The attack is detected byNetQRE which generates an alert event to the controller whichsubsequently installs a forwarding rule on the switch to block tracfrom C2 To illustrate the attack detection and blocking Figure 9a(left gure) shows the bandwidth utilization (Mbps) on the serverWe observe that NetQRE successfully blocks C2rsquos attacking tracin real-time

As a second experiment using a similar setup our NetQRE heavyhitter program (sect41) running at a switch-side NetQRE runtimemonitors the trac over a sliding window of 5 seconds and issuesan alert to the controller when detecting a heavy hitter whichfurther blocks the trac As a point of comparison we compareour approach against two alternatives (1) sending all packets to thecontroller which runs an equivalent heavy hitter detection program(2) monitoring the ow counters on the switch to detect heavyhitters as considered in other languages [30] and systems [31] Inthe second alternative the controller reads the counter every 1

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

0 2 4 6 8 10 12 14 16Time (sec)

0

1

2

3

4

5

6B

andw

idth

(Mbp

s)NetQRE

(a) SYN ood

0 2 4 6 8 10 12 14 16Time (sec)

0

2

4

6

8

10

12

14

Ban

dwid

th (M

bps)

NetQRE

stats

forward

(b) Heavy hitter

0 20 40 60 80 100Time (sec)

0

1

2

3

4

5

6

7

Ban

dwid

th (M

bps)

NetQRE

(c) VoIP

Figure 9 Bandwidth utilization (Mbps) at the server

second and compute the bandwidth usage for each ow over asimilar 5 seconds window

Figure 9b (middle gure) plots the bandwidth utilization of theserver The above alternative solutions are labeled as forward andstats respectively Compared with these two approaches our pro-posed approach can detect the heavy hitter and further respond toit in a more timely fashion Moreover compared with the forwardapproach we send signicantly fewer trac to the controller andis a more scalable solution

In our nal experiment we validate the VoIP use case rst pre-sented in our introduction We use a synthetic VoIP trac tracegenerated using SIPp [5] replay a VoIP call (with accompanyingH264 MPEG video) made from a single user via client C2 We re-play the trac at 5Mbps and run iperf on C1 as the backgroundtrac Our NetQRE program enforces a policy that blocks a callafter the user making the call exceeds a SIP bandwidth usage of1875MB (around 30 seconds) We congure the NetQRE programto send an alert to the NetQRE controller which then blocks thetrac Figure 9c (right gure) plots the bandwidth utilization of theserver Again this veries that NetQRE can be used to intercept aSIP session monitor usage on a per-user basis and react to highusage in a timely fashion

74 Summary of EvaluationRevisiting the questions at the beginning of our evaluation weobserve that (1) NetQRE can express a wide range of quantita-tive monitoring applications with a few lines of code (2) achievesline-rate throughput performance which is comparable to that ofcarefully hand-crafted optimized low-level code while signicantlyoutperforms other measurement and IDS tools and (3) can be usedin an SDN setting to monitor network trac and update switchesin real-time in response to specied NetQRE policies

8 DISCUSSIONIn this section we discuss some of NetQRErsquos limitations and alsosome future work

Expressiveness As a language focusing on monitoring policyNetQRE cannot specify general network policies such as servicechaining policies and trac engineering policies As an interesting

future work it might be possible to extend NetQRErsquos actions insupport of more complex policies

Given NetQRErsquos basis on regular expressions NetQRE is notsuited for monitoring tasks that can not be specied using regulartrac patterns Though this is a fundamental limitation of NetQRErsquosexpressiveness we believe that the high-level constructs in NetQREprovides a natural programming abstraction to write tasks withhierarchical regular structures and NetQRE can specify a widerange of monitoring tasks as shown in sect4

Handling packet lossreorderingretransmission For the perfor-mance consideration NetQRE does not buer packets in the streambut processes packets in a streaming fashion Thus if packets inthe stream are lost reordered or retransmitted the query speciedin NetQRE might not be evaluated correctly since the internal statemay transition to a wrong one Current design of NetQRE relies onthe runtime to handle packet loss reordering and retransmission

Network-widemonitoring For the similar reasons discussed abovecurrent deployment model of NetQRE is based on a centralize fash-ion in order to receive all (ordered) packet in the stream of interestFor example a ow counting task has to be deployed at a placewhere all packets in a ow can traverse This is not a fundamentallimitation as one can easily deploy NetQRE in a distributed fashionprovided that the query is insensitive to packet reordering in thestream We leave it as future work to study the decomposition of acentralized NetQRE program into distributed queries

9 RELATEDWORKQuantitative regular expressions NetQRE is inspired by quan-titative regular expressions (QRE) [9] which provides a theoreticalfoundation for regular programming of data streams NetQRE ex-tends QRE in the following ways First NetQRE introduces param-eters in support of queries handling unknown values (eg countingdistinct sources) Second NetQRE proposes aggregation operatorsto compute aggregated values across parameters It is shown insect4 that both extensions are essential to specify a wide range ofinteresting network monitoring policies

StreamQRE [27] is another recently proposed extension to QRENetQRE is dierent from StreamQRE in several aspects First NetQREis specialized to networking applications while StreamQRE focuseson database application Second StreamQRE allows relations as

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

types and supports relational operations such as join and key-basedpartitioning These operations cannot be evaluated eciently in astreaming fashion in general NetQRE instead uses parameters andaggregation over parameters with a focus on ecient evaluationIntrusion detection systems and protocol analyzers Thereare a number of key dierences between NetQRE and intrusiondetection systems (IDS) and protocol analyzers First systems suchas Snort [34] allow regular expression matches on single packetsbut is not perform regular expression matches that span multiplepackets let alone track sessions across packets While Bro [33] sup-ports aspects of NetQRE the scripting language requires complexprocedural programming where one has to maintain states anddata structures as opposed to NetQRErsquos declarative programmingapproach with high-level constructs over packet streams As anIDS Bro does not lend itself naturally to handle some quantitativemonitoring use cases eg the VoIP usage example While HILTI [37]uses Bro and oers abstract machine models for trac analysis itrequires the operator to implement low-level state machinesDomain specic languages in networking There are severalrecent proposal on domain specic languages on networks whichincludes Frenetic [21] Pyretic [30] NetKAT [10] Flowlog [32]Maple [38] Network Datalog [26] To the best of our knowledgenone of these systems support monitoring queries beyond basicow-level counters provided by OpenFlow

Recently proposed SNAP [11] and Sonata [23] oer abstrac-tions for trac monitoring over packet streams but in a morelow-level procedural or SQL-like way NetQRE oers a complemen-tary abstraction on quantitative queries It may be interesting toincorporate NetQRE with these languages in support for complexquantitative queriesStreaming database languagesDatabase-style query languages [1416] provide SQL-like language support for running continuousqueries over data streams While there are constructs to do aggrega-tion over sliding windows they are designed with simple relationalqueries over packet headers or packet counts and cannot handlethe complex queries based on trac patterns that NetQRE supports

10 CONCLUSIONThis paper presents NetQRE a high-level declarative language andpractical toolkit for quantitative network monitoring NetQRE of-fers an intuitive programming model using regular expressionsover streams of packets and can naturally express ow-level aswell as application-level quantitative policies We developed a com-pilation algorithm that can compile NetQRE programs into ecientimperative programs Our evaluation results demonstrate the ex-pressiveness of NetQRE in support a wide range of quantitativenetwork monitoring applications its ability to match carefully opti-mized manually written low level code and outperform equivalentimplements in Bro and OpenSketch A proof-of-concept scenariowith an SDN controller and switches demonstrate NetQRErsquos abilityto support timely and ecient enforcement of quantitative networkpolicies

ACKNOWLEDGEMENTWe would like to thank our shepherd Arjun Guha for his helpfulcomments We also want to thank the anonymous reviewers for

their insightful feedbacks on this work This research was partiallysupported by NSF Expeditions in Computing award CCF 1138996NSF CNS-1513679 and NSF CNS-1218066

REFERENCES[1] Application Layer Packet Classier for Linux httpwwwmcafeecomus

productsnetwork-security-platformaspx[2] CAIDA Trac Trace httpsdatacaidaorgdatasets securityddos-20070804 [3] McAfee Network Security Platform http l7-ltersourceforgenet [4] OpenSketch reference code httpsgithubcomUSC-NSLopensketch[5] SIPp http sippsourceforgenet [6] SSL renegotiation DoS httpswwwietforgmail-archivewebtlscurrent

msg07553html[7] Anonymized 2015 Internet Traces httpsdatacaidaorgdatasetspassive-2015

2015[8] Mohammad Al-Fares Sivasankar Radhakrishnan Barath Raghavan Nelson

Huang and Amin Vahdat Hedera Dynamic Flow Scheduling for Data CenterNetworks In NSDI volume 10 pages 19ndash19 2010

[9] Rajeev Alur Dana Fisman and Mukund Raghothaman Regular Programmingfor Quantitative Properties of Data Streams In 25th European Symposium onProgramming ESOP 2016

[10] Carolyn Jane Anderson Nate Foster Arjun Guha Jean-Baptiste Jeannin DexterKozen Cole Schlesinger and David Walker NetKAT Semantic foundations fornetworks In Proceedings of the 41st annual ACM SIGPLAN-SIGACT symposiumon Principles of programming languages pages 113ndash126 ACM 2014

[11] Mina Tahmasbi Arashloo Yaron Koral Michael Greenberg Jennifer Rexford andDavid Walker Snap Stateful network-wide abstractions for packet processingIn Proceedings of the 2016 ACM SIGCOMM Conference SIGCOMM rsquo16 pages29ndash43 New York NY USA 2016 ACM

[12] Kevin Borders Jonathan Springer and Matthew Burnside Chimera A declarativelanguage for streaming network trac analysis In Proceedings of the 21st USENIXConference on Security Symposium Securityrsquo12 pages 19ndash19 Berkeley CA USA2012 USENIX Association

[13] Varun Chandola Arindam Banerjee and Vipin Kumar Anomaly Detection ASurvey ACM computing surveys (CSUR) 41(3)15 2009

[14] Sirish Chandrasekaran Owen Cooper Amol Deshpande Michael J FranklinJoseph M Hellerstein Wei Hong Sailesh Krishnamurthy Samuel R MaddenFred Reiss and Mehul A Shah TelegraphCQ Continuous Dataow ProcessingIn Proceedings of the 2003 ACM SIGMOD International Conference on Managementof Data SIGMOD rsquo03 pages 668ndash668 New York NY USA 2003 ACM

[15] Benoit Claise Cisco systems NetFlow services export version 9 2004[16] Chuck Cranor Theodore Johnson Oliver Spataschek and Vladislav Shkapenyuk

Gigascope A Stream Database for Network Applications In Proceedings of the2003 ACM SIGMOD International Conference on Management of Data SIGMODrsquo03 pages 647ndash651 New York NY USA 2003 ACM

[17] Luca Deri Open source VoIP trac monitoring In Proceedings of SANE volume2006 2006

[18] Nick Dueld Carsten Lund and Mikkel Thorup Estimating Flow Distributionsfrom Sampled Flow Statistics In Proceedings of the 2003 conference on Applicationstechnologies architectures and protocols for computer communications pages 325ndash336 ACM 2003

[19] Cristian Estan and George Varghese New Directions in Trac Measurement andAccounting volume 32 ACM 2002

[20] Seyed K Fayaz Yoshiaki Tobioka Vyas Sekar and Michael Bailey BohateiFlexible and Elastic DDoS Defense In 24th USENIX Security Symposium (USENIXSecurity 15) pages 817ndash832 Washington DC August 2015 USENIX Association

[21] Nate Foster Rob Harrison Michael J Freedman Christopher Monsanto JenniferRexford Alec Story and David Walker Frenetic A Network ProgrammingLanguage In ACM SIGPLAN Notices volume 46 pages 279ndash291 ACM 2011

[22] Pedro Garcia-Teodoro J Diaz-Verdejo Gabriel Maciaacute-Fernaacutendez and EnriqueVaacutezquez Anomaly-based Network Intrusion Detection Techniques Systemsand Challenges computers amp security 28(1)18ndash28 2009

[23] Arpit Gupta Ruumldiger Birkner Marco Canini Nick Feamster Chris Mac-Stokerand Walter Willinger Network Monitoring As a Streaming Analytics ProblemIn Proceedings of the 15th ACM Workshop on Hot Topics in Networks HotNets rsquo16pages 106ndash112 ACM 2016

[24] DPDK Intel Data Plane Development Kit httpdpdkorg[25] Bob Lantz Brandon Heller and Nick McKeown A Network in a Laptop Rapid

Prototyping for Software-dened Networks In Proceedings of the 9th ACMSIGCOMM Workshop on Hot Topics in Networks Hotnets-IX pages 191ndash196ACM 2010

[26] Boon Thau Loo Tyson Condie Minos Garofalakis David E Gay Joseph MHellerstein Petros Maniatis Raghu Ramakrishnan Timothy Roscoe and IonStoica Declarative Networking CACM 2009

[27] Konstantinos Mamouras Mukund Raghotaman Rajeev Alur Zachary G Ivesand Sanjeev Khanna StreamQRE Modular Specication and Ecient Evaluation

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

of Quantitative Queries over Streaming Data In Proceedings of the 38th ACMSIGPLANConference on Programming Language Design and Implementation ACM2017

[28] Steve McCanne Craig Leres and Van Jacobson Libpcap httpwwwtcpdumporg1989

[29] J Mccauley POX A Python-based Openow Controller 2014[30] Christopher Monsanto Joshua Reich Nate Foster Jennifer Rexford David Walker

et al Composing Software Dened Networks In NSDI pages 1ndash13 2013[31] Masoud Moshref Minlan Yu Ramesh Govindan and Amin Vahdat DREAM

dynamic resource allocation for software-dened measurement In Proceedingsof the 2014 ACM conference on SIGCOMM pages 419ndash430 ACM 2014

[32] Tim Nelson Andrew D Ferguson Michael JG Scheer and Shriram KrishnamurthiTierless Programming and Reasoning for Software-Dened Networks NSDIApr 2014

[33] Vern Paxson Bro A System for Detecting Network Intruders in Real-timeComput Netw 31(23-24)2435ndash2463 December 1999

[34] Martin Roesch et al Snort Lightweight Intrusion Detection for Networks InLISA volume 99 pages 229ndash238 1999

[35] Vyas Sekar Michael K Reiter and Hui Zhang Revisiting the case for a minimalistapproach for network ow monitoring In Proceedings of the 10th ACM SIGCOMM

conference on Internet measurement pages 328ndash341 ACM 2010[36] David Senecal Slow DoS on the rise httpsblogsakamaicom201309

slow-dos-on-the-risehtml[37] Robin Sommer Matthias Vallentin Lorenzo De Carli and Vern Paxson HILTI

An Abstract Execution Environment for Deep Stateful Network Trac AnalysisIn Proceedings of the 2014 Conference on Internet Measurement Conference IMCrsquo14 pages 461ndash474 New York NY USA 2014 ACM

[38] Andreas Voellmy Junchang Wang Y Richard Yang Bryan Ford and Paul HudakMaple Simplifying SDN Programming using Algorithmic Policies In Proceedingsof the ACM SIGCOMM 2013 conference on SIGCOMM pages 87ndash98 ACM 2013

[39] Mea Wang Baochun Li and Zongpeng Li sFlow Towards resource-ecient andagile service federation in service overlay networks In Distributed ComputingSystems 2004 Proceedings 24th International Conference on pages 628ndash635 IEEE2004

[40] Minlan Yu Lavanya Jose and Rui Miao Software Dened Trac Measurementwith OpenSketch In NSDI volume 13 pages 29ndash42 2013

[41] Lihua Yuan Chen-Nee Chuah and Prasant Mohapatra ProgME Towards Pro-grammable Network Measurement IEEEACM Transactions on Networking (TON)19(1)115ndash128 2011

  • Abstract
  • 1 Introduction
  • 2 Overview
    • 21 Motivating Example
    • 22 Alternative Approaches
      • 3 The NetQRE Language
        • 31 Pattern Matching over Streams
        • 32 Conditional Expressions
        • 33 Stream Split
        • 34 Stream Iteration
        • 35 Aggregation over Parameters
        • 36 Stream Composition
          • 4 Use Cases
            • 41 Flow-level Traffic Measurements
            • 42 TCP State Monitoring
            • 43 Application-level Monitoring
              • 5 Compilation Algorithm
                • 51 Compilation of PSRE
                • 52 Compilation of split
                • 53 Compilation of iter
                • 54 Compilation of Aggregation
                • 55 Compilation of Stream Composition
                  • 6 Implementation
                  • 7 Evaluation
                    • 71 Expressiveness
                    • 72 Performance
                    • 73 End-to-end Validation
                    • 74 Summary of Evaluation
                      • 8 Discussion
                      • 9 Related Work
                      • 10 Conclusion
                      • References
Page 3: Quantitative Network Monitoring with NetQREalur/Sigcomm17.pdf · 2017. 6. 26. · In network management today, ... Thau Loo. 2017. Quantitative Network Monitoring with NetQRE. In

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

We use the Session Initiation Protocol (SIP) used in VoIP applica-tions as our example Typically a call based on SIP consists of threephases init call and end In the init phase the caller and the calleeexchange messages (eg call ID user name the connection channel)in order to set up the call session Once established the call phaseallows VoIP data to be transmitted between the caller and calleeusing the channel dened in the rst phase Finally either side canend the call as shown in the end phase

To analyze the SIP call in the midst of network trac from alltypes of protocols the protocol analyzer is required to (1) identifyall SIP trac traversing the network (2) separate the SIP tracinto dierent sessions for dierent callercallee pairs (3) grouppackets within each session into phases (init call and end) andthen perform a count of bytes only within the second (call) phase

NetQRE oers a natural programming model to implement thispolicy Conceptually the input to a NetQRE program is modeled asa stream of packets The programmer may assume that all receivedpackets have been stored and presented as the input Based onthis input NetQRE then provides programming abstractions for aprogrammer to implement various functions to process the streamThese functions can then be composed in a modular fashion toimplement the quantitative policy Using the VoIP example theprogrammer species a lter function to identify SIP trac of eachuser an iteration function that splits the SIP trac stream into asequence of VoIP sessions a split function that further breaks upeach VoIP session into the three phases and nally an aggregationfunction that sums up the total bandwidth in the call phase

These functions are specied by users in a declarative fashionand composed together in a fashion that requires users to onlythink in terms of protocol patterns over multiple packets Explicitstate maintenance to track session state is handled by the NetQREruntime Note that this programming model is a conceptual oneand the actual execution is optimized by our compiler For exampleit will incur unacceptable storage and performance overhead ifthe runtime system has to log all incoming packets and presentthem as a program input We have implemented a NetQRE compilerwhich compiles a NetQRE program into an optimized imperativeprogram The compiler automatically infers the states that needs tobe maintained for the NetQRE program and generates executioncode that implements a streaming algorithm to update the statesfor each input packet and evaluate the output in an incrementalway at runtime

22 Alternative ApproachesTo implement this policy in a low-level language a programmeroften faces the challenges such as what state needs to be maintainedand updated in order to track the progress of the SIP protocol Theprogrammer also needs to track each userrsquos media trac based onthe correct SIP state and aggregate the average VoIP usage across allusers This makes the implementation of the policy highly coupledwith the low-level implementation of the SIP protocol While indi-vidual lter split and aggregation functions can be implementedin a low-level language composing them together to implementthe correct functionality is challenging Although there are toolstoday for VoIP monitoring (eg [17]) they lack extensibility to sup-port other applications and policies For example if a new policy

requires to monitor video usage instead of VoIP one has to useother tools

Recently proposed domain-specic languages and tools cannotaddress this problem either For example trac measurement toolstypically focus on per source-destination trac measurement andcannot be used to monitor an applicationrsquos usage SDN program-ming frameworks only oer basic ow statistics based on ow-levelcounters on switches and do not have the abstraction to addressthe above mentioned challenge Event-based intrusion detectionsystems (IDS) such as Bro [33] generate high-level events at thesession level and application level from network trac Howevertheir focus is primarily on intrusion detection and hence lack lan-guage primitives that make it natural to support a wide range ofquantitative monitoring capabilities For example we observe thatit is feasible to implement an analyzer that counts the number ofVoIP calls using Bro but signicantly harder to further identifythe packets corresponding to the call phase to calculate bandwidthutilization Bro also requires the user to be an expert programmerknowledgeable with state machine models

3 THE NETQRE LANGUAGEIn NetQRE a packet is modeled as a sequence of bytes and we useparsing functions to extract information from the packet Commonparsing functions include srcip which returns the source IP of thepacket srcport (source port) syn (SYN bit) data (bytes in payload)and time (the time stamp on receipt of the packet) The parsingfunctions can be customizable by the user for example to extractapplication-level headers

Values variables and functions in NetQRE are typed NetQREoers basic types such as int bool string as well as a set of domain-specic types such as IP (IP addresses) Port (TCP and UDP ports)packet (all packets) and action The action type consists of pre-dened functions that either generate alerts or send updates toswitches NetQRE also provides high-level types such as Conn (tu-ples of source IP-port and destination IP-port) which is used forTCP and UDP packets

NetQRE oers a convenient way to write stream functions toprocess packet streams A stream function takes as input a streamof packets and produces as output values (eg monitoring resultsto an application) actions (eg alerts to the controller) or packets(eg packets ltered from an application) A stream function can bespecied as below

sfun type func_name(type var) = exp

The stream function declaration includes the keyword sfun returnedtype of the stream function followed by the name of the functionAny other arguments are specied following the function nameThe body of each stream function (right-hand-side to rsquo=rsquo) consistsof expressions which are used to specify the functionalities of thestream function

Figure 2 shows a summary of the syntax of NetQRE expressionsExpressions in stream functions are based on the theoretical founda-tions of quantitative regular expressions (QRE) [9] a novel proposalthat integrates regular expressions with numerical computationsIn the rest of this section we describe the features of NetQRE ex-pressions in stages We rst introduce how to use an extension toregular expressions to detect patterns of the input stream Second

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

Predicate P = | [eld = value]| [eld = variable]| PampP | (P lsquo| |rsquo P ) | P

Regular Exp re = P | re re | re | re lsquo| |rsquo reConditional cond = exp exp | exp exp expAggregation aдд = aggop exp | type variable

aggop = sum | avg | max | minSplit split = split(exp exp aggop)Iteration iter = iter(exp aggop)Composition comp = exp gtgt expExpression exp = value | action | op exp

| exp op exp | re | cond| aдд | split | iter | comp

Figure 2 Syntax of NetQRE expressions

we discuss how to associate values and actions for the stream Fi-nally we describe a set of high-level operations that allow modularprogramming of stream functions

31 Pattern Matching over StreamsThe basic feature of NetQRE is to detect the patterns of the inputpacket stream As the basic building block NetQRE uses an exten-sion of regular expressions to which we refer as parameterizedsymbolic regular expressions (PSRE) for pattern matching over theinput stream Regular expressions (RE) are widely used for patternmatching over strings (sequences of bytes) Typically a RE usesa xed nite alphabet and the atoms in a RE are symbols in thealphabet In NetQRE a PSRE generalizes a RE in the two ways

First the atoms in a PSRE are predicates over packets instead ofsingle symbols which allows a PSRE to handle very large and po-tentially innite alphabet This property makes PSRE an appealingt for customized network ltering since the space of packets isoften very large and typically one is only interested in the valuesof some elds in a packet (eg the source and destination) insteadof the whole packet itself

Second PSRE allows the use of parameters to represent unknownvalues in the predicate With this generalization a PSRE can detecta variety of patterns using dierent instantiation of the parametersAs a result it allows the programmer to specify applications suchas counting the number of distinct IP addresses appeared in thestream which is hard if not impossible to be specied withoutparameters because no concrete predicates can be dened withoutknowing those IP addresses at runtimePredicate A basic packet predicate [f = v] checks whether thevalue in the eld f of a packet is v As an example the predicate[srcip=`1001] matches all packets whose source IP address is1001 As an example of the use of parameters [srcip=x] denesa function from the domain of x (ie IP addresses in this case) toa concrete predicate over packets We also use to denote thepredicate that matches all packets Predicates can be composedusing standard boolean combinations NetQRE also provides handymacros for widely used predicates For example is_tcp(c) is a short-hand for the predicate that matches a TCP packet in the connectionc

PSRE Like REs the basic operations for a PSRE are concatenationunion and Kleene star For example [syn=1][syn=0] matches astream of packets where the rst packet is a SYN packet followedby a sequence of non-SYN packets (including 0 non-SYN packets)Note that syntactically predicates are enclosed in square bracketsand a PSRE is enclosed in a pair of slashes in NetQRE and is theKleene star operator To illustrate the use of parameters considerthe example to count distinct source IP addresses in the streamThe core PSRE to implement this example is the function exist(x)

which checks whether a source IP x appeared in the stream Thefunction can be specied using the PSRE as below

sfun bool exist(IP x) = [ srcip=x]

We defer the discussion of the full expression for this example insect35

32 Conditional ExpressionsGiven the pattern matching capability for the input stream it isnatural to use conditional expressions to incorporate PSRE withother values and actions in order to assign costgenerate actions tothe stream

In NetQRE a conditional expression has the form exp exp1where exp is an expression that returns boolean values such asa PSRE dened above and exp1 is an expression in NetQRE Astream evaluated to true by exp will be applied to exp1 otherwisethis expression is not dened and returns undef

For example the expression 1 returns 1 if the stream matchesthe PSRE (ie the stream consists of a single packet) otherwiseit returns undef the expression size(last) returns the size ofthe last packet in the stream The keyword last denotes the lastreceived packet in the input stream As another example (countgtk)alert returns an action alert when the total number of packets in thestream is larger than k Here count is a stream function that countsthe number of packets in the stream and alert is a pre-denedaction to generate an alert event for example to a controller nodeThese actions are a basis for implementating quantitative networkpolicies using NetQRE

A conditional expression can also be specied as exp exp1 exp2which applies exp2 to the stream if exp is not satised To ensurethe consistency exp1 and exp2 should return the same type forthe stream As an example the expression [srcip=`1001]10

returns 1 if the stream contains a single packet with source IP1001 and it returns 0 if the stream contains multiple packets orthe only packet in the stream does not have the source IP

33 Stream SplitIn many network applications the input stream consists of multiplephases For example a VoIP call splits into three phases as shown inthe motivating example It is convenient to handle dierent phasesof the stream separately and combine the results of each phase ina modular way Following the operators dened in QuantitativeRegular Expressions [9] NetQRE uses the split operator to splitthe stream and compose two stream functions

A stream split expression has the form split(fgaggop) wheref and g are two expressions and aggop is an aggregation operatorsuch as sum avg (average) max and min

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

For an input stream ρ a split function splits ρ into two sub-streams ρ1 and ρ2 such that f is dened on the rst substream ρ1and g is dened on ρ2 f and g are applied respectively to the twosubstreams and the returned value is aggregated using aggop asthe return value of split(fgaggop) The split operation is a naturalquantitative generalization of the concatenation operation in regu-lar expression and Fig 3 illustrates how a split expression works

Figure 3 Illustration of split

Note that in order to return a unique value for a split expressionit is required that the splitting is unambiguous That is for allinput streams and all values of parameters in the expression thereis at most one way to split stream for f and g The property ofunambiguity can be checked eciently at compile time [9] Whenno unambiguous splitting is possible the split expression returnsundef for the input stream

As an example of split consider the example of counting thenumber of packets since the last SYN packet in the stream Naturallyone can split the stream into two substreams separated by thelast SYN packet and count the number of packets in the secondsubstream as shown in the following expression

split(any0 last_syncount sum)

Here any is the PSRE that matches any packet streams andlast_syn is the PSRE [syn=1][syn=0] that matches a stream thatstarts with a SYN packet followed by non-SYN packets Puttingtogether the split expression splits the input stream before thelast appearance of a SYN packet Note that the rst substream isassigned the value 0 while the second substream is applied to thecount function (dened later in sect34) to count the number of packets

34 Stream IterationA network application often requires to iterate over the input streamFor example counting the number of packets in the stream requiresto iterate over all single packets

Similar to the split expression NetQRE oers the iter opera-tor to iterate over the stream An iter expression is of the formiter(faggop) where f is an expression in NetQRE and aggop is anaggregation operator The iter expression splits the input streaminto multiple substreams such that f is dened on each substreamIt then iterates through all substreams and evaluates f on each sub-stream and nally aggregates the return value on each substreamusing the aggregation operator The iter operator is a quantitativegeneralization of the Kleene star operation and Figure 4 illustratesthis process Again the splitting is required to be unambiguous forall input streams to f

Figure 4 Illustration of iter

As an example the function count that counts the number ofpackets in the stream can be specied below

sfun int count = iter (1 sum)

This function splits the stream into single packets and the innerexpression returns 1 for each single packet As a result the wholeiter expression counts how many packets in the stream

35 Aggregation over ParametersAggregation is a common feature that many network monitoringapplications share For example computing the average ow sizeneeds to aggregate the average size across multiple ows NetQREoers high-level aggregation expressions to aggregate functionsover parameters

Aggregation expressions are of the form aggopf | T x wherex is a parameter with its type T and f is an expression where theparameter x appears in it Intuitively the aggregation expressiongoes through all possible values of x and evaluate f using thesubstituted value for x and nally aggregates all valid return valuesusing the aggregation operator aggop Revisiting the example ofcounting distinct source IP addresses in a stream we can use theexpression sumexist(x)10 | IP x

36 Stream CompositionTypically the input stream consists of packets from multiple sourcesand destinations Oftentimes the programmer only wants to handlea stream from a particular source for example NetQRE oers thestream composition operator gtgt that allows to preprocess the streambefore applying another stream function

Stream composition expressions are of the form f gtgt g where f

and g are two stream functions The stream composition allows theprocessing of a stream using the rst stream function f repeatedlyon every prex of the stream and the returned outputs yield a newstream that is then piped as the input stream to a second streamfunction g For example if the stream contains two packets (P1 P2)f is rst applied to P1 as a single-packet stream and then (P1 P2)The output of f on the two substreams is then piped to g

Stream composition is useful when the stream of packets needto be preprocessed in order to lter related packets for furtherprocessing For example counting the number of all packets ina TCP connection c can be specied using filter_tcp(c)gtgt countwhich rst lters all TCP packets in the connection c using thefunction filter_tcp and then counts the ltered stream using countThe function filter_tcp is dened as follows

sfun packet filter_tcp(Conn c) =

[ is_tcp(c)] last

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

Filter functions are convenient and used extensively in our usecases As a short-land NetQRE uses filter(p) for the lter functionthat lters packets satisfying the predicate p For example the abovelter function can be abbreviated as filter(is_tcp(c))

Stream composition can also be used to lter packets accordingto timestamps NetQRE builds in two lter functions based ontimestamps The recent(t) function lters the stream in the recent tseconds and the every(t) function periodically lters the stream inevery t seconds For example recent(5)gtgtcount counts the numberof packets in the recent 5 seconds Since the two build-in lterfunctions are not in the core of NetQRE we only allow the useof time-based ltering outside the high-level NetQRE operationsdescribed above

4 USE CASESWe provide use cases ranging from ow-level trac measurementsto complex examples involving application-level content analysisand dynamic updates Since the high-level constructs in NetQRElanguage is based on regular expressions we can only supportqueries for regular trac patterns Nevertheless we note that thiscovers a wide range of useful queries as we will show in this sectionMore examples can be found in our technical report

41 Flow-level Trac MeasurementsWe highlight two measurement tasks that have been proposedrecently as a means to do ow scheduling and attack detectionHeavy hitter Heavy hitters [19] are ows that consume band-width larger than a threshold T A key step to identify a heavy hitteris to count the size of a ow In this example we consider a ow as asource-destination pair A natural way to specify this functionalityis to rst lter all packets from a source x to a destination y andthen count the size of the ltered stream

sfun int hh(IP x IP y) =

filter(srcip=x dstip=y) gtgt count_size

hh rst lters packets based on the source and destinations IP andsecond the ltered stream is piped into count_size which countsthe size of all packets in the stream Due to space limits we do notshow the denition of count_size which is similar to that of count

Second to alert a heavy hitter in real time to the controller wecan use the following program

sfun action alert_hh =

(hh(lastsrcip lastdstip)gtT)

alert(lastsrcip lastdstip)

The function alert_hh checks for every newly received packet (ielast in the stream) that whether its ow reaches the threshold T andissues an alert correspondingly By further applying the time-basedltering every(5)gtgtalert_hh one can detect heavy hitters in every 5secondsSuper spreader A super spreader [40] is the host that contactsmore than k distinct destinations during a time interval Like theuse case of heavy hitter detection a key step is to count how manydistinct destinations a source x contacted The following NetQREprogram does this counting

sfun int ss(IP x) =

sum exist_pair(x y)10 | IP y

The function exist_pair(xy) checks whether the source-destinationpair (xy) appeared in the stream The ss function uses an aggre-gation function sum to aggregate all possible destinations and thusgives the total number of distinct destination addresses x contacted

Similar to heavy hitter detection we can use stream compositionto lter the input stream based on time as well as trac types Forbrevity of the paper we omit the discussion of these functionalities

42 TCP State MonitoringWe next showcase applications that rely on monitoring the stateswithin a TCP owSYN ood attacks In this example NetQRE is used to detect SYNood attacks by counting the number of incomplete TCP hand-shakes in a time interval We consider an incomplete TCP hand-shake as a packet trace consisting of a SYN packet and a SYNACKpacket with corresponding sequence number and acknowledgenumber but does not have a further ACK packet to complete thehandshake As the key step the following program counts thenumber of incomplete handshakes assuming that the input streamconsists of TCP packets between the same source and destination

sfun int bad_tcp_pat(int x int y) =

concat(syn(x) synack(yx+1) no_ack(y+1))

sfun int incomplete_handshake_num =

sumbad_tcp_pat(xy)1 | int x int y

The function bad_tcp_pat species the pattern of an incompletehandshake there is a SYN packet with a sequence number x in thestream followed by a SYNACK packet with acknowledge numberx+1 and sequence number y and then followed by packets that donot include an ACK with acknowledge number y+1 to complete thehandshake The sum aggregation in incomplete_handshake_num sums upthe number of appearances of such patterns for all x and y Usingthe stream composition of ltering functions based on time andpacket type the complete function can be specied as follows

sfun int syn_flood(Conn c) =

recent (5) gtgt

filter_tcp(c) gtgt

incomplete_handshake_num gtTblock(csrcip)

Completed ows Our next example counts the number of legiti-mate ows that are completed delineated by a SYN at the beginningand ending with a FIN Note that the last iter uses the regular ex-pression which matches a stream ending with a SYN and a FINpackets and no SYN-FIN pairs appear before Therefore the iter

expression splits the entire (ltered) stream into sessions whereeach session contains exactly one complete ow

sfun int count_flow(Conn c) =

filter_tcp(c) gtgt

filter(flag=SYN || flag=FIN) gtgt

iter ([fin =1][ syn =1][ syn =1][ fin =1]1 sum)

Slowloris attacks We describe a detection of the aforementionedSlowloris attack in NetQRE based on computing the average rateof packet transferring in all legitimate TCP connections shownbelow

sfun double conn_rate = iter(tcp_patternrate sum)sfun double avg_rate =

sumfilter_tcp(c) gtgt conn_rate | Conn c

sumcount_flow(c) | Conn c

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

First to compute the rate of a legitimate TCP connection we cansimply specify the pattern of a legitimate TCP connection usingRE and compute its transferring rate (the rate function can bespecied easily and thus not shown here) Then we can iterateall TCP connections given a 5-tuple and compute the aggregatedtransferring rate as shown in the function conn_rate

Second we use sum operation to aggregate transferring rate acrosstrac on all 5-tuples Dividing the aggregated rate by the numberof ows computed using the function dened above gives us theaverage transferring rate over all connections as shown in avg_rate

43 Application-level MonitoringOur nal example revisits the Voice-over-IP (VoIP) usage usecasein sect2

Let us rst focus on the function to monitor the usage of a VoIPcall based on a SIP connection sip_conn and media connection m_connAs introduced in sect2 a VoIP call consists of three phases and amodular programming way is to handle the three phases usingthree stream functions and then compose them using the split

operator The function is shown belowsfun int usage_per_call(Conn sip_conn Conn m_conn

string user string id) =

split(init(iduser m_conn)0call(m_conn)count_size

end(id)0 sum)

Here init call and end are simply the PSREs that capture the pat-terns of each phase Note that the init and the end phase return avalue 0 and only the usage of the call phase is counted as required

Next a programmer can view the trac using the SIP connectionsip_conn and the media connection m_conn as a sequence of callsUsing the iter operator the programmer can easily aggregate theusage of all calls in the trac as shown below

sfun int usage_per_conn(Conn sip_conn Conn m_conn

string user string id) =

filter_sip(sip_conn m_conn user id) gtgt

iter(split(any0 usage_per_call(sip_conn m_conn

user id) sum)sum)

Note that we rst lter all trac that belongs to sip_conn or m_conn inorder to get the stream of interest This ltering may result in non-VoIP trac between two VoIP calls and thus we simply compose thePSRE any with usage_per_call inside the iter expression to handleany trac between two calls

Now we can aggregate the usage for each user across multipleconnections and calls using the aggregation operation and furthercompute the average usage over all users as shown below

sfun int usage_per_user(string user) =

sumusage_per_conn(sip_conn m_conn user id) | Connsip_conn Conn m_conn string id

sfun int average_usage =

avgusage_per_user(u) | string u

Suppose we need to implement a quantitative network policythat sends an alert to the controller if a userrsquos usage is larger thanve times of average usage we can simply specify the policy asbelow

sfun action alert_high_usage(string user) =

(usage_per_user(user) gt 5 average_usage) alert(user

)

5 COMPILATION ALGORITHMWe next describe how to compile a NetQRE program into an e-cient executable that can run with low memory footprint Typicallythe stream of packets is very large thus it is not feasible to store theentire stream of packets Therefore the compiled program shouldevaluate incrementally on each single incoming packet withoutstoring the history of the stream To achieve this goal we have toaddress the following challenges First the compiled program needsto maintain parameters in NetQRE in a succinct way Second inorder to implement the semantics of split and iter these operationshave to be performed in a single online pass over streaming packetsLastly the compiled program needs to handle aggregation expres-sions over parameters In the following subsections we describethe compilation of selected NetQRE expressions

51 Compilation of PSREFirst we describe the algorithm for compiling a PSRE as the basicbuilding block for stream functions Similar to traditional RE aPSRE can be translated to an equivalent nite state machine whichwe refer to as parameterized symbolic automaton (PSA) For exampleconsider the following PSRE [srcip=x] Intuitively this PSREchecks whether there exists a packet with source IP x in the streamIts corresponding PSA is shown in Fig 5

Figure 5 An Example PSA

However PSA cannot be directly updated due to its uninstan-tiated parameters To illustrate this challenge consider a naivesolution which is to instantiate the parameters using all possiblevalues and maintain all the instantiated state machine namelysymbolic automata (SA) for each instantiation of the parametersWhen reading an input packet we update all the instantiated SA inthe standard way However it is not hard to note that this approachis not feasible due to the large space of parameters As an examplethe PSA in Fig 5 with a single IP parameter has up to 232 instan-tiated symbolic automata Whereas at runtime the function onlyneeds to store the source addresses that appeared in the streamwhich can be signicantly less than 232

To address this challenge we propose a new algorithm thatupdates PSA on-demand As the high-level idea suppose we instan-tiated a PSA to a set of SA using all possible instantiations Noticethat even though there is a large space of instantiated SA howevermany of them will keep the same states at runtime given a streamof packets As an example consider feeding a rst packet with thesource IP 10001 to all SA instantiated from the PSA in Figure 5 Itis easy to see that the only SA that will transition to state q1 is theone where x is instantiated by 10001 and all other SA will stay atq0

Therefore the compiled program at runtime maintains guardedstates for the PSA where a guarded state is pair (s ϕ) Here ϕ is

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

a predicate of the parameters in the PSA (which we will refer asa guard) and s is a state in PSA The pair (s ϕ) means that whenreading the input stream all instantiated SA will be at state s if theyare instantiated by the values satisfying ϕ For example initiallythere is only one guarded state maintained for the PSA in Figure 5which is (q0 true) meaning that all instantiated SA is at the initialstate q0 of the PSA

To update the PSA the key step is to update each guarded statedynamically when reading packets from the stream The updatingalgorithm for a guarded state is shown in Algorithm 1 This algo-

Algorithm 1 update_psa((s ϕ) packet)

1 for all transitions originated from s do2 let t be the destinate state and P be the parameterized predi-

cate3 let T be the guard instantiated from P using the packet4 let ϕ prime = ϕ andT5 emit (t ϕ prime) if ϕ false6 end for

rithm iterates through all transitions from the state s and for thepredicate (with parameters) P on the transition it instantiates Pusing the current packet it receives in the stream This step simplyreplaces the function name in P by the return value of it on thepacket The obtained predicateT is a predicate over the parametersin the PSA Finally the algorithm emits a new pair (t ϕ prime) if ϕ prime is notfalse Intuitively this step accounts the fact that the instantiatedSA under ϕ prime will transition to the next state t

Using Algorithm 1 the overall updating algorithm is straight-forward Initially the compiled program only contains a guardedstate (q0 true) where q0 is the initial state of the PSA Every timereading a new packet from the stream we call Algorithm 1 on everyguarded state and the new guarded states consists of all the onesemitted from Algorithm 1 To check the state given concrete valuesfor parameters we simply go through all maintained guarded statesand return the state if the guard is satisedExample Using the example in Figure 5 we illustrate the execu-tion of Algorithm 1 given the pair (q0 true) and the packet withsource IP 10001 Consider the transition from q0 to q1 By sub-stituting srcip in the predicate with the source IP of the packetwe get the guard T which is x=10001 Since ϕ is true now ϕ prime issimply x=10001 At the end the algorithm will emit the pair (q1x=10001) Similarly for the other transition the algorithm emit thepair (q0 x=10001) Therefore there are two new guarded statesnamely (q1 x=10001) and (q0 x=10001) The updated guardedstates account for the fact that 10001 has appeared and thus thecorresponding instantiated SA transitioned to state q1

52 Compilation of split

We next discuss how to compile a split expression which is of theform split(f g aggop) First we highlight the key insight from [9]to compile split without parameters and then describe how togeneralize this idea to compile split with parametersWithout parameters The challenge of compiling split is to splitthe stream dynamically at runtime without revisiting the entirehistory of the stream For example in the split example in sect33 we

need to split the stream at the last appearance of the SYN packet Anatural way to implement the semantics of split is to maintain allcases to split the stream For example Fig 6 shows two cases to splitthe stream consisting of a non-SYN and a SYN packet at runtimefor the example in sect33 Moreover the following two observationsensure that the compiled code only uses a constant space First eachcase can be represented using a triple (sf sд F ) where F is a agindicating whether the stream is split and sf (sд resp) is the stateof f (д resp) on evaluating the prex(sux resp) of the streamSecond the number of maintained (distinct) cases is bounded by aconstant only related to д at any time point

Figure 6 Example run of split

With parameters Similar to PSRE compilation in the generalscenario we need to maintain guarded cases That is we maintaina set of (T ϕ) pairs where T is a triple (sf sд F ) correspondingto a case as dened above and ϕ is a guard It means that if theparameters are instantiated by values satisfying ϕ there is a caseof splitting the stream which can be represented as T

Algorithm 2 update_split((T ϕ) packet)

1 suppose T = (sf sд F )2 if F is false then3 for all emitted (s primef ϕ prime) from update_f((sf ϕ) packet) do4 if f is dened on s primef then

5 emit ((s primef s0д true) ϕ prime) s0д is the initial state of g 6 end if7 emit ((s primef sд false) ϕ prime)8 end for9 else

10 for all emitted (s primeд ϕ prime) from update_g((sд ϕ) packet) do11 emit ((sf s primeд true) ϕ prime)12 end for13 end if

Algorithm 2 shows the algorithm to update a pair (T ϕ) Thealgorithm feeds the input packet to f or g corresponding to whetherthe stream has been split as indicated by F For example whenF is false namely the stream has not been split yet in this casethe algorithm need to update f on the packet (line 2-9) In thiscase for each emitted guarded state (s primef ϕ prime) from the update of fa corresponding guarded case is emitted (line 7) Moreover if f isdened on s primef we may split the stream at the current position andthus emit the guarded case ((s primef s0д true) ϕ prime) (line 4-6) Here s0д isthe initial state of g and the ag F is set to true to indicate thatthe stream is split For the else branch from line 10 to 12 in thealgorithm we need to update (sд ϕ) using grsquos updating procedurewhich may emit a set of guarded states of g Correspondingly theguard needs to be rened

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

Given concrete values for parameters evaluating a split expres-sion is as follows First we check if there exist a guarded case ((sf sд F )ϕ) such that ϕ is satised and both f and g are dened on thestates in the case Then we evaluate the two expressions based ontheir states in the case and nally aggregate the evaluated valuesusing aggop If no such guarded cases exist the split expression isnot dened

53 Compilation of iter

We rst review the key ideas of evaluating an iter expression whenthere are no parameters [9] and then describe how to handle pa-rameters

Consider an iter expression iter(f aggop) without any param-eters Similar to split expressions the iter expression needs tomaintain all possible cases of splitting the input stream Thoughthe number of such cases is only determined by f (thus not relevantto the input stream) how to succinctly represent each case is achallenge Specically unlike a split expression which only splitsthe stream into a prex and a sux an iter expression needs tosplit the stream into multiple substreams such that each one canbe applied to the expression f Therefore naively maintaining allthe states of f for each substream may use a large space as thestream grows Fortunately our choice of the aggregation operatorsallows us to summarizes the application of f to all substreams butthe last one using the aggregated return value For example if aggopis sum we only need to remember the aggregated sum on thesesubstreams Thus we can represent a case as a pair (v sf ) wherev is the aggregated value and sf is the state of f on evaluating thelast substream

In the general scenario with parameters we simply maintain aguarded case as ((v sf ) ϕ) The update of a guarded case ((v sf )ϕ)is shown in Algorithm 3 The evaluation of f given guarded casesis similar to that of a split expression

Algorithm 3 update_iter((T ϕ) packet)

1 suppose T = (v sf )

2 for all emitted (s primef ϕ prime) from update_f((sf ϕ) packet) do3 if f is dened on s primef then4 let v prime be the evaluated value of f on s primef5 emit ((v primeprime s0f ) ϕ prime) where v primeprime=aggop (vv prime) and s0f is the

initial state of f

6 end if7 emit ((v s primef ) ϕ prime)8 end for

54 Compilation of AggregationNow we describe the aggregation function aggopf | T x Followingthe denition of the aggregation function an aggregation functionmaintains guarded states (sf ϕ) To update the aggregation expres-sion we just need to update each guarded state using frsquos updatingprocedure To illustrate the evaluation procedure suppose the ag-gregation operator is sum Given values for the parameters we needto iterate through all guarded states (sf ϕ) If ϕ is satised we eval-uate f on sf say the return value is v and count the number of

values of x which satises ϕ say it is k we sum up v lowast k into theaggregated sum For other aggregation operators the evaluationprocess works similarly

55 Compilation of Stream CompositionLastly we discuss stream composition Recall that stream composi-tion allows to process the input stream using a function f and thenapply the second function g on the processed stream Following itsdenition each time a new packet arrives we need to rst processit with f and the returned value (eg a ltered packet) is then fedinto g

Therefore the stream composition expression fgtgtg needs to main-tain guarded states ((sf sд )ϕ) Algorithm 4 shows the algorithm toupdate a guarded state following the intuition described above Theevaluation of the expression is similar to that of previous discussedexpressions

Algorithm 4 update_comp((S ϕ) packet)

1 suppose S = (sf sд )2 for all emitted (s primef ϕ prime) from update_f((sf ϕ)packet) do3 if f is dened on s primef then4 let ret be the returned packet of f on s primef5 emit ((s primef s primeд ) ϕ primeprime) for all emitted (s primeд ϕ primeprime) from

update_g((sд ϕ prime) ret)6 else7 emit ((s primef sд ) ϕ prime)8 end if9 end for

6 IMPLEMENTATIONWe have implemented a prototype of the NetQRE system in C++which consists of two main components namely the compiler forNetQRE and the NetQRE runtimeNetQRE Compiler The NetQRE compiler implements the com-pilation algorithm described in sect5 The compiler rst generates aC++ program from an input NetQRE program which is then com-piled by the gcc compiler into executable Our compiled programuses a tree data structure to represent predicates over parametersThe choice of tree is driven by its simplicity ease of encoding andlookup performance

In addition to the basic compilation algorithm we include ad-ditional optimizations to the compiler First we use the standardminimization algorithm to minimize the state machine for a regularexpression Second for aggregation expressions with sum and avgwe use an incremental updating algorithm to update the expres-sion we maintain the running sum as the state of the aggregationexpression and we update the sum incrementally when the stateof the inner expression is updatedNetQRE Parallelization The NetQRE compiler can optionallycompile the NetQRE specication into parallelized implementationsbased on the instantiation of parameters in the NetQRE specica-tion For the example described in sect 51 a packet with source IPaddress s can be processed by the hash(s )-th thread which handlesthat instantiation of the parameter x

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

NetQRERuntime The NetQRE runtime includes a packet captureagent implemented using the pcap library [28] Each packet thatarrives at the runtime is parsed and then processed by the compiledNetQRE program as shown in Figure 1 Currently our runtimesupports actions that include sending alert events to a controllerand directly installing a rule on a switch The NetQRE runtime isnot specically optimized for fast packet capture and processingand as future work we plan to explore the use of DPDK [24] Ourcurrent implementation analyzes every packet though orthogonalsampling techniques can be also explored in future

7 EVALUATIONWe evaluate our NetQRE prototype centered around answeringthree key questions First can the NetQRE language express a widerange of quantitative network monitoring applications in a conciseand intuitive manner Second is the generated code ecient interms of throughput and memory footprint Third can NetQRE beused in a real-time monitoring setting that provides rapid mitigationbased on output of quantitative network monitoring applicationsUnless otherwise specied all our experiments are carried out ona cluster of commodity servers each of which has 10-core 26GHzXeon E5-2660V3 Each core has a 256KB L2 cache and shared 25MB L3 cache The memory size is 64 GB The OS is Ubuntu 1404and the kernel version is 420

71 ExpressivenessWe have implemented a set of quantitative network monitoring ap-plications using NetQRE The examples are drawn from a literaturesurvey on quantitative network monitoring applications [13 18 3540 41] Table 1 summarizes the example applications in NetQREthat we use as well as and the lines of code of each NetQRE pro-gram

LoCHeavy Hitter (sect41) 6Super Spreader (sect41) 2Entropy Estimation [40] 6Flow size dist [18] 8Trac change detection [35] 10Count trac [40] 2Completed ows (sect42) 6SYN ood detection (sect42) 9Slowloris detection (sect42) 12Lifetime of connection 8Newly opened connection recently 11 duplicated ACKs 5 VoIP call 7VoIP usage (sect43) 18Key word counting in emails 11DNS tunnel detection [12] 4DNS amplication [20] 4

Table 1 Examplemonitoring applicationsNetQRE supports

We make the following observations First NetQRE is able toexpress a large variety of quantitative network monitoring applica-tions ranging from ow-level trac measurement to application-level quantitative monitoring We validate the correctness of ourapplications by running them and comparing their output withhand-crafted implementations Second programming in NetQRE isconcise All example programs can be specied within 18 lines ofcode in NetQRE This count includes commonly used lter functionsas well as regular expressions which can be built into a library forreference As an interesting comparison we encoded the VoIP calluse case in Bro [33] a well-known intrusion detection system Brorequired 51 lines of code as compared to only 7 in NetQRE How-ever Bro is unable to easily support the VoIP usage case written inNetQRE In particular Bro cannot use patterns to separate packetsinto dierent VoIP sessions for the purposes of counting bandwidthconsumption per session This is not surprising as Brorsquos primaryuse is that of an intrusion detection system We validated the cor-rectness of our implementation on actual SIP trac by comparingBrorsquos output against NetQRErsquos

72 PerformanceOur next set of experiments evaluate the performance of NetQRErsquoscompiled implementations NetQRE runs on a single machine basedon the setup described earlier We use a single core to measureNetQRErsquos performance As a point of comparison we compare eachNetQRE program against an equivalent carefully hand-crafted C++implementation We note that all our hand-crafted implementationsrequire at least 100 lines of code and often times require users toexplicit manage internal state across packets ndash a programming taskthat NetQRE abstracts away

Fig 7 shows the performance of NetQRE implementations (NetQREin the gure) compared with manually optimized implementations(Baseline in the gure) that we spent days optimizing Our examplesinclude heavy hitter super spreader entropy SYN ood detectioncompleted ows count and Slowloris use cases presented in sect4 Asworkload we use a one-minute CAIDA trac trace [7] capturedon a high-speed Internet backbone link which contains 37 millionpackets (ie about 620k packetssec) We replay the trac traceto the implementations Since the trac trace does not containpayloads we report the throughput in million packets per second(MPPS)

We make the following observations First the NetQRE imple-mentations are able to achieve line-rate processing speed withoutbeing the bottleneck In particular for three out of six applicationsthe throughput of NetQRE implementations is about 20MPPS usinga single thread The average packet size of our input trac traceis 888B For a 10Gbps network 14M packets of size 888B can beforwarded per second which is well below the throughput achievedby NetQRE The throughput can be further improved by paralleliza-tion (presented later in this section) Second the compiled NetQREimplementation incurs negligible overhead compared with the man-ually optimized low-level implementation The dierence betweenthe throughput of the compiled NetQRE implementation and thatof the manual implementation is within 9 across all use cases westudied Third NetQRE incurs slightly increase (up to 60) in terms

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

05

10152025

Thro

ughp

ut (M

PPS)

Baseline NetQRE OpenSketch

(a) Throughput

1

10

100

1000

Mem

ory

(MB

) Baseline NetQRE OpenSketch

(b) Memory

Figure 7 Performance comparison

0

10

20

30

40

0 2 4 6 8

Thro

ughp

ut (M

PPS)

Number13 of13 Threads

super spreadersyn floodslowloris

Figure 8 Throughput with paral-lelization

of memory footprint (Fig 7b) compared with the manually opti-mized implementation However the overall memory footprint ofNetQRE remains low We have also run the experiments on anotherCAIDA trac [2] and the comparison result is similarComparison with OpenSketch In addition to comparisons withour manually optimized code we also compare with OpenSketch [40]a popular trac measurement tool based on sketching techniquesGiven OpenSketchrsquos limitation in handling stateful monitoringtasks we only compare the performance of NetQRE implementa-tions on the heavy hitter and super spreader examples with thatof OpenSketch (in software) We use the same CAIDA trac traceas before and follow the default setting of the reference code ofOpenSketch [4] for the two examples The NetQRE compiled imple-mentations signicantly outperform OpenSketch implementationsThe throughput of NetQRE compiled code is 11times and 18times as thatof OpenSketch (Fig 7a) As OpenSketch focuses on optimizingmemory usage it is not surprising that OpenSketch uses less mem-ory than NetQRE implementations Nevertheless we observe thatNetQRE uses only 11 more memory on the super spreader examplethan that of OpenSketch (Fig 7b) Note that we use an open-sourceversion of OpenSketch software that runs on a similar x86 machineas NetQRE in order to have an ldquoapples-to-applesrdquo comparison onsimilar hardware NetQRE may also exploit similar optimizationsperformed by OpenSketch in order to get higher performance ona NetFPGA Exploiting a hardware platform such as NetFPGA inorder to further improve NetQRE performance is an interestingavenue for future workComparison with Bro We further compare with Bro Given thelimitations of Bro in handling the complete VoIP use cases we sim-plify the use case to simply counting the number of VoIP calls (asopposed to bandwidth consumption) per user NetQRE compiledimplementation can nish counting within 1 second while Brotakes about 23 seconds for a SIP trac trace containing 4338 VoIPcalls Both programs output the same results for this use case Webelieve there are at least two reasons for Brorsquos slower performanceFirst Bro is designed for intrusion detection and not aimed atand optimized for monitoring tasks Second unlike NetQRE whichcompiles high-level code to ecient low-level code Bro uses aninterpreter to execute the script written in Brorsquos language whichcould result in considerable overheadParallelization speedup NetQRE compiler automatically com-piles parallelized implementations as described in sect6 In the con-sidered examples parallelization involves sending ows of packets

to dierent NetQRE instances running on dierent threads Fig 8plots the throughput of the NetQRE implementations with dierentnumbers of threads for the super spreader syn ood and slowlorisdetection examples In all examples the NetQRE parallelized codeis able to achieve at least 39times speedup with 8 threads Note thatlinear speedup is not achieved as the trac (in our nite trace) isnot uniformly distributed across all 8 threads This is an artifactof the trace data itself When we include the overhead of the soft-ware load balancer that dispatches packet ows to dierent threadsthe NetQRE implementations achieve at least 26times speedup for allexamples This overhead can be mitigated with a hardware load-balancer Overall we observe that NetQRErsquos throughput is morethan sucient to handle line-rate trac and the NetQRE runtimeitself is not the bottleneck

73 End-to-end ValidationIn the nal set of experiments we validate the end-to-end use ofNetQRE for enforcing quantitative network policies which appliesactions to quantitative network monitoring programs We set upa network of two clients C1 and C2 one server S and one SDNswitch that mirrors trac to a NetQRE runtime running the SYNood detector presented in sect42 In addition a NetQRE controllerbased on POX [29] controls the switch The network is emulatedusing Mininet [25] with link bandwidth set to 100Mbps

In the experimentC1 sends normal trac to S using iperf at rate1Mbps and at the 7th second C2 starts SYN ood attack generatedusing our generator to the same server The attack is detected byNetQRE which generates an alert event to the controller whichsubsequently installs a forwarding rule on the switch to block tracfrom C2 To illustrate the attack detection and blocking Figure 9a(left gure) shows the bandwidth utilization (Mbps) on the serverWe observe that NetQRE successfully blocks C2rsquos attacking tracin real-time

As a second experiment using a similar setup our NetQRE heavyhitter program (sect41) running at a switch-side NetQRE runtimemonitors the trac over a sliding window of 5 seconds and issuesan alert to the controller when detecting a heavy hitter whichfurther blocks the trac As a point of comparison we compareour approach against two alternatives (1) sending all packets to thecontroller which runs an equivalent heavy hitter detection program(2) monitoring the ow counters on the switch to detect heavyhitters as considered in other languages [30] and systems [31] Inthe second alternative the controller reads the counter every 1

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

0 2 4 6 8 10 12 14 16Time (sec)

0

1

2

3

4

5

6B

andw

idth

(Mbp

s)NetQRE

(a) SYN ood

0 2 4 6 8 10 12 14 16Time (sec)

0

2

4

6

8

10

12

14

Ban

dwid

th (M

bps)

NetQRE

stats

forward

(b) Heavy hitter

0 20 40 60 80 100Time (sec)

0

1

2

3

4

5

6

7

Ban

dwid

th (M

bps)

NetQRE

(c) VoIP

Figure 9 Bandwidth utilization (Mbps) at the server

second and compute the bandwidth usage for each ow over asimilar 5 seconds window

Figure 9b (middle gure) plots the bandwidth utilization of theserver The above alternative solutions are labeled as forward andstats respectively Compared with these two approaches our pro-posed approach can detect the heavy hitter and further respond toit in a more timely fashion Moreover compared with the forwardapproach we send signicantly fewer trac to the controller andis a more scalable solution

In our nal experiment we validate the VoIP use case rst pre-sented in our introduction We use a synthetic VoIP trac tracegenerated using SIPp [5] replay a VoIP call (with accompanyingH264 MPEG video) made from a single user via client C2 We re-play the trac at 5Mbps and run iperf on C1 as the backgroundtrac Our NetQRE program enforces a policy that blocks a callafter the user making the call exceeds a SIP bandwidth usage of1875MB (around 30 seconds) We congure the NetQRE programto send an alert to the NetQRE controller which then blocks thetrac Figure 9c (right gure) plots the bandwidth utilization of theserver Again this veries that NetQRE can be used to intercept aSIP session monitor usage on a per-user basis and react to highusage in a timely fashion

74 Summary of EvaluationRevisiting the questions at the beginning of our evaluation weobserve that (1) NetQRE can express a wide range of quantita-tive monitoring applications with a few lines of code (2) achievesline-rate throughput performance which is comparable to that ofcarefully hand-crafted optimized low-level code while signicantlyoutperforms other measurement and IDS tools and (3) can be usedin an SDN setting to monitor network trac and update switchesin real-time in response to specied NetQRE policies

8 DISCUSSIONIn this section we discuss some of NetQRErsquos limitations and alsosome future work

Expressiveness As a language focusing on monitoring policyNetQRE cannot specify general network policies such as servicechaining policies and trac engineering policies As an interesting

future work it might be possible to extend NetQRErsquos actions insupport of more complex policies

Given NetQRErsquos basis on regular expressions NetQRE is notsuited for monitoring tasks that can not be specied using regulartrac patterns Though this is a fundamental limitation of NetQRErsquosexpressiveness we believe that the high-level constructs in NetQREprovides a natural programming abstraction to write tasks withhierarchical regular structures and NetQRE can specify a widerange of monitoring tasks as shown in sect4

Handling packet lossreorderingretransmission For the perfor-mance consideration NetQRE does not buer packets in the streambut processes packets in a streaming fashion Thus if packets inthe stream are lost reordered or retransmitted the query speciedin NetQRE might not be evaluated correctly since the internal statemay transition to a wrong one Current design of NetQRE relies onthe runtime to handle packet loss reordering and retransmission

Network-widemonitoring For the similar reasons discussed abovecurrent deployment model of NetQRE is based on a centralize fash-ion in order to receive all (ordered) packet in the stream of interestFor example a ow counting task has to be deployed at a placewhere all packets in a ow can traverse This is not a fundamentallimitation as one can easily deploy NetQRE in a distributed fashionprovided that the query is insensitive to packet reordering in thestream We leave it as future work to study the decomposition of acentralized NetQRE program into distributed queries

9 RELATEDWORKQuantitative regular expressions NetQRE is inspired by quan-titative regular expressions (QRE) [9] which provides a theoreticalfoundation for regular programming of data streams NetQRE ex-tends QRE in the following ways First NetQRE introduces param-eters in support of queries handling unknown values (eg countingdistinct sources) Second NetQRE proposes aggregation operatorsto compute aggregated values across parameters It is shown insect4 that both extensions are essential to specify a wide range ofinteresting network monitoring policies

StreamQRE [27] is another recently proposed extension to QRENetQRE is dierent from StreamQRE in several aspects First NetQREis specialized to networking applications while StreamQRE focuseson database application Second StreamQRE allows relations as

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

types and supports relational operations such as join and key-basedpartitioning These operations cannot be evaluated eciently in astreaming fashion in general NetQRE instead uses parameters andaggregation over parameters with a focus on ecient evaluationIntrusion detection systems and protocol analyzers Thereare a number of key dierences between NetQRE and intrusiondetection systems (IDS) and protocol analyzers First systems suchas Snort [34] allow regular expression matches on single packetsbut is not perform regular expression matches that span multiplepackets let alone track sessions across packets While Bro [33] sup-ports aspects of NetQRE the scripting language requires complexprocedural programming where one has to maintain states anddata structures as opposed to NetQRErsquos declarative programmingapproach with high-level constructs over packet streams As anIDS Bro does not lend itself naturally to handle some quantitativemonitoring use cases eg the VoIP usage example While HILTI [37]uses Bro and oers abstract machine models for trac analysis itrequires the operator to implement low-level state machinesDomain specic languages in networking There are severalrecent proposal on domain specic languages on networks whichincludes Frenetic [21] Pyretic [30] NetKAT [10] Flowlog [32]Maple [38] Network Datalog [26] To the best of our knowledgenone of these systems support monitoring queries beyond basicow-level counters provided by OpenFlow

Recently proposed SNAP [11] and Sonata [23] oer abstrac-tions for trac monitoring over packet streams but in a morelow-level procedural or SQL-like way NetQRE oers a complemen-tary abstraction on quantitative queries It may be interesting toincorporate NetQRE with these languages in support for complexquantitative queriesStreaming database languagesDatabase-style query languages [1416] provide SQL-like language support for running continuousqueries over data streams While there are constructs to do aggrega-tion over sliding windows they are designed with simple relationalqueries over packet headers or packet counts and cannot handlethe complex queries based on trac patterns that NetQRE supports

10 CONCLUSIONThis paper presents NetQRE a high-level declarative language andpractical toolkit for quantitative network monitoring NetQRE of-fers an intuitive programming model using regular expressionsover streams of packets and can naturally express ow-level aswell as application-level quantitative policies We developed a com-pilation algorithm that can compile NetQRE programs into ecientimperative programs Our evaluation results demonstrate the ex-pressiveness of NetQRE in support a wide range of quantitativenetwork monitoring applications its ability to match carefully opti-mized manually written low level code and outperform equivalentimplements in Bro and OpenSketch A proof-of-concept scenariowith an SDN controller and switches demonstrate NetQRErsquos abilityto support timely and ecient enforcement of quantitative networkpolicies

ACKNOWLEDGEMENTWe would like to thank our shepherd Arjun Guha for his helpfulcomments We also want to thank the anonymous reviewers for

their insightful feedbacks on this work This research was partiallysupported by NSF Expeditions in Computing award CCF 1138996NSF CNS-1513679 and NSF CNS-1218066

REFERENCES[1] Application Layer Packet Classier for Linux httpwwwmcafeecomus

productsnetwork-security-platformaspx[2] CAIDA Trac Trace httpsdatacaidaorgdatasets securityddos-20070804 [3] McAfee Network Security Platform http l7-ltersourceforgenet [4] OpenSketch reference code httpsgithubcomUSC-NSLopensketch[5] SIPp http sippsourceforgenet [6] SSL renegotiation DoS httpswwwietforgmail-archivewebtlscurrent

msg07553html[7] Anonymized 2015 Internet Traces httpsdatacaidaorgdatasetspassive-2015

2015[8] Mohammad Al-Fares Sivasankar Radhakrishnan Barath Raghavan Nelson

Huang and Amin Vahdat Hedera Dynamic Flow Scheduling for Data CenterNetworks In NSDI volume 10 pages 19ndash19 2010

[9] Rajeev Alur Dana Fisman and Mukund Raghothaman Regular Programmingfor Quantitative Properties of Data Streams In 25th European Symposium onProgramming ESOP 2016

[10] Carolyn Jane Anderson Nate Foster Arjun Guha Jean-Baptiste Jeannin DexterKozen Cole Schlesinger and David Walker NetKAT Semantic foundations fornetworks In Proceedings of the 41st annual ACM SIGPLAN-SIGACT symposiumon Principles of programming languages pages 113ndash126 ACM 2014

[11] Mina Tahmasbi Arashloo Yaron Koral Michael Greenberg Jennifer Rexford andDavid Walker Snap Stateful network-wide abstractions for packet processingIn Proceedings of the 2016 ACM SIGCOMM Conference SIGCOMM rsquo16 pages29ndash43 New York NY USA 2016 ACM

[12] Kevin Borders Jonathan Springer and Matthew Burnside Chimera A declarativelanguage for streaming network trac analysis In Proceedings of the 21st USENIXConference on Security Symposium Securityrsquo12 pages 19ndash19 Berkeley CA USA2012 USENIX Association

[13] Varun Chandola Arindam Banerjee and Vipin Kumar Anomaly Detection ASurvey ACM computing surveys (CSUR) 41(3)15 2009

[14] Sirish Chandrasekaran Owen Cooper Amol Deshpande Michael J FranklinJoseph M Hellerstein Wei Hong Sailesh Krishnamurthy Samuel R MaddenFred Reiss and Mehul A Shah TelegraphCQ Continuous Dataow ProcessingIn Proceedings of the 2003 ACM SIGMOD International Conference on Managementof Data SIGMOD rsquo03 pages 668ndash668 New York NY USA 2003 ACM

[15] Benoit Claise Cisco systems NetFlow services export version 9 2004[16] Chuck Cranor Theodore Johnson Oliver Spataschek and Vladislav Shkapenyuk

Gigascope A Stream Database for Network Applications In Proceedings of the2003 ACM SIGMOD International Conference on Management of Data SIGMODrsquo03 pages 647ndash651 New York NY USA 2003 ACM

[17] Luca Deri Open source VoIP trac monitoring In Proceedings of SANE volume2006 2006

[18] Nick Dueld Carsten Lund and Mikkel Thorup Estimating Flow Distributionsfrom Sampled Flow Statistics In Proceedings of the 2003 conference on Applicationstechnologies architectures and protocols for computer communications pages 325ndash336 ACM 2003

[19] Cristian Estan and George Varghese New Directions in Trac Measurement andAccounting volume 32 ACM 2002

[20] Seyed K Fayaz Yoshiaki Tobioka Vyas Sekar and Michael Bailey BohateiFlexible and Elastic DDoS Defense In 24th USENIX Security Symposium (USENIXSecurity 15) pages 817ndash832 Washington DC August 2015 USENIX Association

[21] Nate Foster Rob Harrison Michael J Freedman Christopher Monsanto JenniferRexford Alec Story and David Walker Frenetic A Network ProgrammingLanguage In ACM SIGPLAN Notices volume 46 pages 279ndash291 ACM 2011

[22] Pedro Garcia-Teodoro J Diaz-Verdejo Gabriel Maciaacute-Fernaacutendez and EnriqueVaacutezquez Anomaly-based Network Intrusion Detection Techniques Systemsand Challenges computers amp security 28(1)18ndash28 2009

[23] Arpit Gupta Ruumldiger Birkner Marco Canini Nick Feamster Chris Mac-Stokerand Walter Willinger Network Monitoring As a Streaming Analytics ProblemIn Proceedings of the 15th ACM Workshop on Hot Topics in Networks HotNets rsquo16pages 106ndash112 ACM 2016

[24] DPDK Intel Data Plane Development Kit httpdpdkorg[25] Bob Lantz Brandon Heller and Nick McKeown A Network in a Laptop Rapid

Prototyping for Software-dened Networks In Proceedings of the 9th ACMSIGCOMM Workshop on Hot Topics in Networks Hotnets-IX pages 191ndash196ACM 2010

[26] Boon Thau Loo Tyson Condie Minos Garofalakis David E Gay Joseph MHellerstein Petros Maniatis Raghu Ramakrishnan Timothy Roscoe and IonStoica Declarative Networking CACM 2009

[27] Konstantinos Mamouras Mukund Raghotaman Rajeev Alur Zachary G Ivesand Sanjeev Khanna StreamQRE Modular Specication and Ecient Evaluation

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

of Quantitative Queries over Streaming Data In Proceedings of the 38th ACMSIGPLANConference on Programming Language Design and Implementation ACM2017

[28] Steve McCanne Craig Leres and Van Jacobson Libpcap httpwwwtcpdumporg1989

[29] J Mccauley POX A Python-based Openow Controller 2014[30] Christopher Monsanto Joshua Reich Nate Foster Jennifer Rexford David Walker

et al Composing Software Dened Networks In NSDI pages 1ndash13 2013[31] Masoud Moshref Minlan Yu Ramesh Govindan and Amin Vahdat DREAM

dynamic resource allocation for software-dened measurement In Proceedingsof the 2014 ACM conference on SIGCOMM pages 419ndash430 ACM 2014

[32] Tim Nelson Andrew D Ferguson Michael JG Scheer and Shriram KrishnamurthiTierless Programming and Reasoning for Software-Dened Networks NSDIApr 2014

[33] Vern Paxson Bro A System for Detecting Network Intruders in Real-timeComput Netw 31(23-24)2435ndash2463 December 1999

[34] Martin Roesch et al Snort Lightweight Intrusion Detection for Networks InLISA volume 99 pages 229ndash238 1999

[35] Vyas Sekar Michael K Reiter and Hui Zhang Revisiting the case for a minimalistapproach for network ow monitoring In Proceedings of the 10th ACM SIGCOMM

conference on Internet measurement pages 328ndash341 ACM 2010[36] David Senecal Slow DoS on the rise httpsblogsakamaicom201309

slow-dos-on-the-risehtml[37] Robin Sommer Matthias Vallentin Lorenzo De Carli and Vern Paxson HILTI

An Abstract Execution Environment for Deep Stateful Network Trac AnalysisIn Proceedings of the 2014 Conference on Internet Measurement Conference IMCrsquo14 pages 461ndash474 New York NY USA 2014 ACM

[38] Andreas Voellmy Junchang Wang Y Richard Yang Bryan Ford and Paul HudakMaple Simplifying SDN Programming using Algorithmic Policies In Proceedingsof the ACM SIGCOMM 2013 conference on SIGCOMM pages 87ndash98 ACM 2013

[39] Mea Wang Baochun Li and Zongpeng Li sFlow Towards resource-ecient andagile service federation in service overlay networks In Distributed ComputingSystems 2004 Proceedings 24th International Conference on pages 628ndash635 IEEE2004

[40] Minlan Yu Lavanya Jose and Rui Miao Software Dened Trac Measurementwith OpenSketch In NSDI volume 13 pages 29ndash42 2013

[41] Lihua Yuan Chen-Nee Chuah and Prasant Mohapatra ProgME Towards Pro-grammable Network Measurement IEEEACM Transactions on Networking (TON)19(1)115ndash128 2011

  • Abstract
  • 1 Introduction
  • 2 Overview
    • 21 Motivating Example
    • 22 Alternative Approaches
      • 3 The NetQRE Language
        • 31 Pattern Matching over Streams
        • 32 Conditional Expressions
        • 33 Stream Split
        • 34 Stream Iteration
        • 35 Aggregation over Parameters
        • 36 Stream Composition
          • 4 Use Cases
            • 41 Flow-level Traffic Measurements
            • 42 TCP State Monitoring
            • 43 Application-level Monitoring
              • 5 Compilation Algorithm
                • 51 Compilation of PSRE
                • 52 Compilation of split
                • 53 Compilation of iter
                • 54 Compilation of Aggregation
                • 55 Compilation of Stream Composition
                  • 6 Implementation
                  • 7 Evaluation
                    • 71 Expressiveness
                    • 72 Performance
                    • 73 End-to-end Validation
                    • 74 Summary of Evaluation
                      • 8 Discussion
                      • 9 Related Work
                      • 10 Conclusion
                      • References
Page 4: Quantitative Network Monitoring with NetQREalur/Sigcomm17.pdf · 2017. 6. 26. · In network management today, ... Thau Loo. 2017. Quantitative Network Monitoring with NetQRE. In

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

Predicate P = | [eld = value]| [eld = variable]| PampP | (P lsquo| |rsquo P ) | P

Regular Exp re = P | re re | re | re lsquo| |rsquo reConditional cond = exp exp | exp exp expAggregation aдд = aggop exp | type variable

aggop = sum | avg | max | minSplit split = split(exp exp aggop)Iteration iter = iter(exp aggop)Composition comp = exp gtgt expExpression exp = value | action | op exp

| exp op exp | re | cond| aдд | split | iter | comp

Figure 2 Syntax of NetQRE expressions

we discuss how to associate values and actions for the stream Fi-nally we describe a set of high-level operations that allow modularprogramming of stream functions

31 Pattern Matching over StreamsThe basic feature of NetQRE is to detect the patterns of the inputpacket stream As the basic building block NetQRE uses an exten-sion of regular expressions to which we refer as parameterizedsymbolic regular expressions (PSRE) for pattern matching over theinput stream Regular expressions (RE) are widely used for patternmatching over strings (sequences of bytes) Typically a RE usesa xed nite alphabet and the atoms in a RE are symbols in thealphabet In NetQRE a PSRE generalizes a RE in the two ways

First the atoms in a PSRE are predicates over packets instead ofsingle symbols which allows a PSRE to handle very large and po-tentially innite alphabet This property makes PSRE an appealingt for customized network ltering since the space of packets isoften very large and typically one is only interested in the valuesof some elds in a packet (eg the source and destination) insteadof the whole packet itself

Second PSRE allows the use of parameters to represent unknownvalues in the predicate With this generalization a PSRE can detecta variety of patterns using dierent instantiation of the parametersAs a result it allows the programmer to specify applications suchas counting the number of distinct IP addresses appeared in thestream which is hard if not impossible to be specied withoutparameters because no concrete predicates can be dened withoutknowing those IP addresses at runtimePredicate A basic packet predicate [f = v] checks whether thevalue in the eld f of a packet is v As an example the predicate[srcip=`1001] matches all packets whose source IP address is1001 As an example of the use of parameters [srcip=x] denesa function from the domain of x (ie IP addresses in this case) toa concrete predicate over packets We also use to denote thepredicate that matches all packets Predicates can be composedusing standard boolean combinations NetQRE also provides handymacros for widely used predicates For example is_tcp(c) is a short-hand for the predicate that matches a TCP packet in the connectionc

PSRE Like REs the basic operations for a PSRE are concatenationunion and Kleene star For example [syn=1][syn=0] matches astream of packets where the rst packet is a SYN packet followedby a sequence of non-SYN packets (including 0 non-SYN packets)Note that syntactically predicates are enclosed in square bracketsand a PSRE is enclosed in a pair of slashes in NetQRE and is theKleene star operator To illustrate the use of parameters considerthe example to count distinct source IP addresses in the streamThe core PSRE to implement this example is the function exist(x)

which checks whether a source IP x appeared in the stream Thefunction can be specied using the PSRE as below

sfun bool exist(IP x) = [ srcip=x]

We defer the discussion of the full expression for this example insect35

32 Conditional ExpressionsGiven the pattern matching capability for the input stream it isnatural to use conditional expressions to incorporate PSRE withother values and actions in order to assign costgenerate actions tothe stream

In NetQRE a conditional expression has the form exp exp1where exp is an expression that returns boolean values such asa PSRE dened above and exp1 is an expression in NetQRE Astream evaluated to true by exp will be applied to exp1 otherwisethis expression is not dened and returns undef

For example the expression 1 returns 1 if the stream matchesthe PSRE (ie the stream consists of a single packet) otherwiseit returns undef the expression size(last) returns the size ofthe last packet in the stream The keyword last denotes the lastreceived packet in the input stream As another example (countgtk)alert returns an action alert when the total number of packets in thestream is larger than k Here count is a stream function that countsthe number of packets in the stream and alert is a pre-denedaction to generate an alert event for example to a controller nodeThese actions are a basis for implementating quantitative networkpolicies using NetQRE

A conditional expression can also be specied as exp exp1 exp2which applies exp2 to the stream if exp is not satised To ensurethe consistency exp1 and exp2 should return the same type forthe stream As an example the expression [srcip=`1001]10

returns 1 if the stream contains a single packet with source IP1001 and it returns 0 if the stream contains multiple packets orthe only packet in the stream does not have the source IP

33 Stream SplitIn many network applications the input stream consists of multiplephases For example a VoIP call splits into three phases as shown inthe motivating example It is convenient to handle dierent phasesof the stream separately and combine the results of each phase ina modular way Following the operators dened in QuantitativeRegular Expressions [9] NetQRE uses the split operator to splitthe stream and compose two stream functions

A stream split expression has the form split(fgaggop) wheref and g are two expressions and aggop is an aggregation operatorsuch as sum avg (average) max and min

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

For an input stream ρ a split function splits ρ into two sub-streams ρ1 and ρ2 such that f is dened on the rst substream ρ1and g is dened on ρ2 f and g are applied respectively to the twosubstreams and the returned value is aggregated using aggop asthe return value of split(fgaggop) The split operation is a naturalquantitative generalization of the concatenation operation in regu-lar expression and Fig 3 illustrates how a split expression works

Figure 3 Illustration of split

Note that in order to return a unique value for a split expressionit is required that the splitting is unambiguous That is for allinput streams and all values of parameters in the expression thereis at most one way to split stream for f and g The property ofunambiguity can be checked eciently at compile time [9] Whenno unambiguous splitting is possible the split expression returnsundef for the input stream

As an example of split consider the example of counting thenumber of packets since the last SYN packet in the stream Naturallyone can split the stream into two substreams separated by thelast SYN packet and count the number of packets in the secondsubstream as shown in the following expression

split(any0 last_syncount sum)

Here any is the PSRE that matches any packet streams andlast_syn is the PSRE [syn=1][syn=0] that matches a stream thatstarts with a SYN packet followed by non-SYN packets Puttingtogether the split expression splits the input stream before thelast appearance of a SYN packet Note that the rst substream isassigned the value 0 while the second substream is applied to thecount function (dened later in sect34) to count the number of packets

34 Stream IterationA network application often requires to iterate over the input streamFor example counting the number of packets in the stream requiresto iterate over all single packets

Similar to the split expression NetQRE oers the iter opera-tor to iterate over the stream An iter expression is of the formiter(faggop) where f is an expression in NetQRE and aggop is anaggregation operator The iter expression splits the input streaminto multiple substreams such that f is dened on each substreamIt then iterates through all substreams and evaluates f on each sub-stream and nally aggregates the return value on each substreamusing the aggregation operator The iter operator is a quantitativegeneralization of the Kleene star operation and Figure 4 illustratesthis process Again the splitting is required to be unambiguous forall input streams to f

Figure 4 Illustration of iter

As an example the function count that counts the number ofpackets in the stream can be specied below

sfun int count = iter (1 sum)

This function splits the stream into single packets and the innerexpression returns 1 for each single packet As a result the wholeiter expression counts how many packets in the stream

35 Aggregation over ParametersAggregation is a common feature that many network monitoringapplications share For example computing the average ow sizeneeds to aggregate the average size across multiple ows NetQREoers high-level aggregation expressions to aggregate functionsover parameters

Aggregation expressions are of the form aggopf | T x wherex is a parameter with its type T and f is an expression where theparameter x appears in it Intuitively the aggregation expressiongoes through all possible values of x and evaluate f using thesubstituted value for x and nally aggregates all valid return valuesusing the aggregation operator aggop Revisiting the example ofcounting distinct source IP addresses in a stream we can use theexpression sumexist(x)10 | IP x

36 Stream CompositionTypically the input stream consists of packets from multiple sourcesand destinations Oftentimes the programmer only wants to handlea stream from a particular source for example NetQRE oers thestream composition operator gtgt that allows to preprocess the streambefore applying another stream function

Stream composition expressions are of the form f gtgt g where f

and g are two stream functions The stream composition allows theprocessing of a stream using the rst stream function f repeatedlyon every prex of the stream and the returned outputs yield a newstream that is then piped as the input stream to a second streamfunction g For example if the stream contains two packets (P1 P2)f is rst applied to P1 as a single-packet stream and then (P1 P2)The output of f on the two substreams is then piped to g

Stream composition is useful when the stream of packets needto be preprocessed in order to lter related packets for furtherprocessing For example counting the number of all packets ina TCP connection c can be specied using filter_tcp(c)gtgt countwhich rst lters all TCP packets in the connection c using thefunction filter_tcp and then counts the ltered stream using countThe function filter_tcp is dened as follows

sfun packet filter_tcp(Conn c) =

[ is_tcp(c)] last

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

Filter functions are convenient and used extensively in our usecases As a short-land NetQRE uses filter(p) for the lter functionthat lters packets satisfying the predicate p For example the abovelter function can be abbreviated as filter(is_tcp(c))

Stream composition can also be used to lter packets accordingto timestamps NetQRE builds in two lter functions based ontimestamps The recent(t) function lters the stream in the recent tseconds and the every(t) function periodically lters the stream inevery t seconds For example recent(5)gtgtcount counts the numberof packets in the recent 5 seconds Since the two build-in lterfunctions are not in the core of NetQRE we only allow the useof time-based ltering outside the high-level NetQRE operationsdescribed above

4 USE CASESWe provide use cases ranging from ow-level trac measurementsto complex examples involving application-level content analysisand dynamic updates Since the high-level constructs in NetQRElanguage is based on regular expressions we can only supportqueries for regular trac patterns Nevertheless we note that thiscovers a wide range of useful queries as we will show in this sectionMore examples can be found in our technical report

41 Flow-level Trac MeasurementsWe highlight two measurement tasks that have been proposedrecently as a means to do ow scheduling and attack detectionHeavy hitter Heavy hitters [19] are ows that consume band-width larger than a threshold T A key step to identify a heavy hitteris to count the size of a ow In this example we consider a ow as asource-destination pair A natural way to specify this functionalityis to rst lter all packets from a source x to a destination y andthen count the size of the ltered stream

sfun int hh(IP x IP y) =

filter(srcip=x dstip=y) gtgt count_size

hh rst lters packets based on the source and destinations IP andsecond the ltered stream is piped into count_size which countsthe size of all packets in the stream Due to space limits we do notshow the denition of count_size which is similar to that of count

Second to alert a heavy hitter in real time to the controller wecan use the following program

sfun action alert_hh =

(hh(lastsrcip lastdstip)gtT)

alert(lastsrcip lastdstip)

The function alert_hh checks for every newly received packet (ielast in the stream) that whether its ow reaches the threshold T andissues an alert correspondingly By further applying the time-basedltering every(5)gtgtalert_hh one can detect heavy hitters in every 5secondsSuper spreader A super spreader [40] is the host that contactsmore than k distinct destinations during a time interval Like theuse case of heavy hitter detection a key step is to count how manydistinct destinations a source x contacted The following NetQREprogram does this counting

sfun int ss(IP x) =

sum exist_pair(x y)10 | IP y

The function exist_pair(xy) checks whether the source-destinationpair (xy) appeared in the stream The ss function uses an aggre-gation function sum to aggregate all possible destinations and thusgives the total number of distinct destination addresses x contacted

Similar to heavy hitter detection we can use stream compositionto lter the input stream based on time as well as trac types Forbrevity of the paper we omit the discussion of these functionalities

42 TCP State MonitoringWe next showcase applications that rely on monitoring the stateswithin a TCP owSYN ood attacks In this example NetQRE is used to detect SYNood attacks by counting the number of incomplete TCP hand-shakes in a time interval We consider an incomplete TCP hand-shake as a packet trace consisting of a SYN packet and a SYNACKpacket with corresponding sequence number and acknowledgenumber but does not have a further ACK packet to complete thehandshake As the key step the following program counts thenumber of incomplete handshakes assuming that the input streamconsists of TCP packets between the same source and destination

sfun int bad_tcp_pat(int x int y) =

concat(syn(x) synack(yx+1) no_ack(y+1))

sfun int incomplete_handshake_num =

sumbad_tcp_pat(xy)1 | int x int y

The function bad_tcp_pat species the pattern of an incompletehandshake there is a SYN packet with a sequence number x in thestream followed by a SYNACK packet with acknowledge numberx+1 and sequence number y and then followed by packets that donot include an ACK with acknowledge number y+1 to complete thehandshake The sum aggregation in incomplete_handshake_num sums upthe number of appearances of such patterns for all x and y Usingthe stream composition of ltering functions based on time andpacket type the complete function can be specied as follows

sfun int syn_flood(Conn c) =

recent (5) gtgt

filter_tcp(c) gtgt

incomplete_handshake_num gtTblock(csrcip)

Completed ows Our next example counts the number of legiti-mate ows that are completed delineated by a SYN at the beginningand ending with a FIN Note that the last iter uses the regular ex-pression which matches a stream ending with a SYN and a FINpackets and no SYN-FIN pairs appear before Therefore the iter

expression splits the entire (ltered) stream into sessions whereeach session contains exactly one complete ow

sfun int count_flow(Conn c) =

filter_tcp(c) gtgt

filter(flag=SYN || flag=FIN) gtgt

iter ([fin =1][ syn =1][ syn =1][ fin =1]1 sum)

Slowloris attacks We describe a detection of the aforementionedSlowloris attack in NetQRE based on computing the average rateof packet transferring in all legitimate TCP connections shownbelow

sfun double conn_rate = iter(tcp_patternrate sum)sfun double avg_rate =

sumfilter_tcp(c) gtgt conn_rate | Conn c

sumcount_flow(c) | Conn c

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

First to compute the rate of a legitimate TCP connection we cansimply specify the pattern of a legitimate TCP connection usingRE and compute its transferring rate (the rate function can bespecied easily and thus not shown here) Then we can iterateall TCP connections given a 5-tuple and compute the aggregatedtransferring rate as shown in the function conn_rate

Second we use sum operation to aggregate transferring rate acrosstrac on all 5-tuples Dividing the aggregated rate by the numberof ows computed using the function dened above gives us theaverage transferring rate over all connections as shown in avg_rate

43 Application-level MonitoringOur nal example revisits the Voice-over-IP (VoIP) usage usecasein sect2

Let us rst focus on the function to monitor the usage of a VoIPcall based on a SIP connection sip_conn and media connection m_connAs introduced in sect2 a VoIP call consists of three phases and amodular programming way is to handle the three phases usingthree stream functions and then compose them using the split

operator The function is shown belowsfun int usage_per_call(Conn sip_conn Conn m_conn

string user string id) =

split(init(iduser m_conn)0call(m_conn)count_size

end(id)0 sum)

Here init call and end are simply the PSREs that capture the pat-terns of each phase Note that the init and the end phase return avalue 0 and only the usage of the call phase is counted as required

Next a programmer can view the trac using the SIP connectionsip_conn and the media connection m_conn as a sequence of callsUsing the iter operator the programmer can easily aggregate theusage of all calls in the trac as shown below

sfun int usage_per_conn(Conn sip_conn Conn m_conn

string user string id) =

filter_sip(sip_conn m_conn user id) gtgt

iter(split(any0 usage_per_call(sip_conn m_conn

user id) sum)sum)

Note that we rst lter all trac that belongs to sip_conn or m_conn inorder to get the stream of interest This ltering may result in non-VoIP trac between two VoIP calls and thus we simply compose thePSRE any with usage_per_call inside the iter expression to handleany trac between two calls

Now we can aggregate the usage for each user across multipleconnections and calls using the aggregation operation and furthercompute the average usage over all users as shown below

sfun int usage_per_user(string user) =

sumusage_per_conn(sip_conn m_conn user id) | Connsip_conn Conn m_conn string id

sfun int average_usage =

avgusage_per_user(u) | string u

Suppose we need to implement a quantitative network policythat sends an alert to the controller if a userrsquos usage is larger thanve times of average usage we can simply specify the policy asbelow

sfun action alert_high_usage(string user) =

(usage_per_user(user) gt 5 average_usage) alert(user

)

5 COMPILATION ALGORITHMWe next describe how to compile a NetQRE program into an e-cient executable that can run with low memory footprint Typicallythe stream of packets is very large thus it is not feasible to store theentire stream of packets Therefore the compiled program shouldevaluate incrementally on each single incoming packet withoutstoring the history of the stream To achieve this goal we have toaddress the following challenges First the compiled program needsto maintain parameters in NetQRE in a succinct way Second inorder to implement the semantics of split and iter these operationshave to be performed in a single online pass over streaming packetsLastly the compiled program needs to handle aggregation expres-sions over parameters In the following subsections we describethe compilation of selected NetQRE expressions

51 Compilation of PSREFirst we describe the algorithm for compiling a PSRE as the basicbuilding block for stream functions Similar to traditional RE aPSRE can be translated to an equivalent nite state machine whichwe refer to as parameterized symbolic automaton (PSA) For exampleconsider the following PSRE [srcip=x] Intuitively this PSREchecks whether there exists a packet with source IP x in the streamIts corresponding PSA is shown in Fig 5

Figure 5 An Example PSA

However PSA cannot be directly updated due to its uninstan-tiated parameters To illustrate this challenge consider a naivesolution which is to instantiate the parameters using all possiblevalues and maintain all the instantiated state machine namelysymbolic automata (SA) for each instantiation of the parametersWhen reading an input packet we update all the instantiated SA inthe standard way However it is not hard to note that this approachis not feasible due to the large space of parameters As an examplethe PSA in Fig 5 with a single IP parameter has up to 232 instan-tiated symbolic automata Whereas at runtime the function onlyneeds to store the source addresses that appeared in the streamwhich can be signicantly less than 232

To address this challenge we propose a new algorithm thatupdates PSA on-demand As the high-level idea suppose we instan-tiated a PSA to a set of SA using all possible instantiations Noticethat even though there is a large space of instantiated SA howevermany of them will keep the same states at runtime given a streamof packets As an example consider feeding a rst packet with thesource IP 10001 to all SA instantiated from the PSA in Figure 5 Itis easy to see that the only SA that will transition to state q1 is theone where x is instantiated by 10001 and all other SA will stay atq0

Therefore the compiled program at runtime maintains guardedstates for the PSA where a guarded state is pair (s ϕ) Here ϕ is

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

a predicate of the parameters in the PSA (which we will refer asa guard) and s is a state in PSA The pair (s ϕ) means that whenreading the input stream all instantiated SA will be at state s if theyare instantiated by the values satisfying ϕ For example initiallythere is only one guarded state maintained for the PSA in Figure 5which is (q0 true) meaning that all instantiated SA is at the initialstate q0 of the PSA

To update the PSA the key step is to update each guarded statedynamically when reading packets from the stream The updatingalgorithm for a guarded state is shown in Algorithm 1 This algo-

Algorithm 1 update_psa((s ϕ) packet)

1 for all transitions originated from s do2 let t be the destinate state and P be the parameterized predi-

cate3 let T be the guard instantiated from P using the packet4 let ϕ prime = ϕ andT5 emit (t ϕ prime) if ϕ false6 end for

rithm iterates through all transitions from the state s and for thepredicate (with parameters) P on the transition it instantiates Pusing the current packet it receives in the stream This step simplyreplaces the function name in P by the return value of it on thepacket The obtained predicateT is a predicate over the parametersin the PSA Finally the algorithm emits a new pair (t ϕ prime) if ϕ prime is notfalse Intuitively this step accounts the fact that the instantiatedSA under ϕ prime will transition to the next state t

Using Algorithm 1 the overall updating algorithm is straight-forward Initially the compiled program only contains a guardedstate (q0 true) where q0 is the initial state of the PSA Every timereading a new packet from the stream we call Algorithm 1 on everyguarded state and the new guarded states consists of all the onesemitted from Algorithm 1 To check the state given concrete valuesfor parameters we simply go through all maintained guarded statesand return the state if the guard is satisedExample Using the example in Figure 5 we illustrate the execu-tion of Algorithm 1 given the pair (q0 true) and the packet withsource IP 10001 Consider the transition from q0 to q1 By sub-stituting srcip in the predicate with the source IP of the packetwe get the guard T which is x=10001 Since ϕ is true now ϕ prime issimply x=10001 At the end the algorithm will emit the pair (q1x=10001) Similarly for the other transition the algorithm emit thepair (q0 x=10001) Therefore there are two new guarded statesnamely (q1 x=10001) and (q0 x=10001) The updated guardedstates account for the fact that 10001 has appeared and thus thecorresponding instantiated SA transitioned to state q1

52 Compilation of split

We next discuss how to compile a split expression which is of theform split(f g aggop) First we highlight the key insight from [9]to compile split without parameters and then describe how togeneralize this idea to compile split with parametersWithout parameters The challenge of compiling split is to splitthe stream dynamically at runtime without revisiting the entirehistory of the stream For example in the split example in sect33 we

need to split the stream at the last appearance of the SYN packet Anatural way to implement the semantics of split is to maintain allcases to split the stream For example Fig 6 shows two cases to splitthe stream consisting of a non-SYN and a SYN packet at runtimefor the example in sect33 Moreover the following two observationsensure that the compiled code only uses a constant space First eachcase can be represented using a triple (sf sд F ) where F is a agindicating whether the stream is split and sf (sд resp) is the stateof f (д resp) on evaluating the prex(sux resp) of the streamSecond the number of maintained (distinct) cases is bounded by aconstant only related to д at any time point

Figure 6 Example run of split

With parameters Similar to PSRE compilation in the generalscenario we need to maintain guarded cases That is we maintaina set of (T ϕ) pairs where T is a triple (sf sд F ) correspondingto a case as dened above and ϕ is a guard It means that if theparameters are instantiated by values satisfying ϕ there is a caseof splitting the stream which can be represented as T

Algorithm 2 update_split((T ϕ) packet)

1 suppose T = (sf sд F )2 if F is false then3 for all emitted (s primef ϕ prime) from update_f((sf ϕ) packet) do4 if f is dened on s primef then

5 emit ((s primef s0д true) ϕ prime) s0д is the initial state of g 6 end if7 emit ((s primef sд false) ϕ prime)8 end for9 else

10 for all emitted (s primeд ϕ prime) from update_g((sд ϕ) packet) do11 emit ((sf s primeд true) ϕ prime)12 end for13 end if

Algorithm 2 shows the algorithm to update a pair (T ϕ) Thealgorithm feeds the input packet to f or g corresponding to whetherthe stream has been split as indicated by F For example whenF is false namely the stream has not been split yet in this casethe algorithm need to update f on the packet (line 2-9) In thiscase for each emitted guarded state (s primef ϕ prime) from the update of fa corresponding guarded case is emitted (line 7) Moreover if f isdened on s primef we may split the stream at the current position andthus emit the guarded case ((s primef s0д true) ϕ prime) (line 4-6) Here s0д isthe initial state of g and the ag F is set to true to indicate thatthe stream is split For the else branch from line 10 to 12 in thealgorithm we need to update (sд ϕ) using grsquos updating procedurewhich may emit a set of guarded states of g Correspondingly theguard needs to be rened

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

Given concrete values for parameters evaluating a split expres-sion is as follows First we check if there exist a guarded case ((sf sд F )ϕ) such that ϕ is satised and both f and g are dened on thestates in the case Then we evaluate the two expressions based ontheir states in the case and nally aggregate the evaluated valuesusing aggop If no such guarded cases exist the split expression isnot dened

53 Compilation of iter

We rst review the key ideas of evaluating an iter expression whenthere are no parameters [9] and then describe how to handle pa-rameters

Consider an iter expression iter(f aggop) without any param-eters Similar to split expressions the iter expression needs tomaintain all possible cases of splitting the input stream Thoughthe number of such cases is only determined by f (thus not relevantto the input stream) how to succinctly represent each case is achallenge Specically unlike a split expression which only splitsthe stream into a prex and a sux an iter expression needs tosplit the stream into multiple substreams such that each one canbe applied to the expression f Therefore naively maintaining allthe states of f for each substream may use a large space as thestream grows Fortunately our choice of the aggregation operatorsallows us to summarizes the application of f to all substreams butthe last one using the aggregated return value For example if aggopis sum we only need to remember the aggregated sum on thesesubstreams Thus we can represent a case as a pair (v sf ) wherev is the aggregated value and sf is the state of f on evaluating thelast substream

In the general scenario with parameters we simply maintain aguarded case as ((v sf ) ϕ) The update of a guarded case ((v sf )ϕ)is shown in Algorithm 3 The evaluation of f given guarded casesis similar to that of a split expression

Algorithm 3 update_iter((T ϕ) packet)

1 suppose T = (v sf )

2 for all emitted (s primef ϕ prime) from update_f((sf ϕ) packet) do3 if f is dened on s primef then4 let v prime be the evaluated value of f on s primef5 emit ((v primeprime s0f ) ϕ prime) where v primeprime=aggop (vv prime) and s0f is the

initial state of f

6 end if7 emit ((v s primef ) ϕ prime)8 end for

54 Compilation of AggregationNow we describe the aggregation function aggopf | T x Followingthe denition of the aggregation function an aggregation functionmaintains guarded states (sf ϕ) To update the aggregation expres-sion we just need to update each guarded state using frsquos updatingprocedure To illustrate the evaluation procedure suppose the ag-gregation operator is sum Given values for the parameters we needto iterate through all guarded states (sf ϕ) If ϕ is satised we eval-uate f on sf say the return value is v and count the number of

values of x which satises ϕ say it is k we sum up v lowast k into theaggregated sum For other aggregation operators the evaluationprocess works similarly

55 Compilation of Stream CompositionLastly we discuss stream composition Recall that stream composi-tion allows to process the input stream using a function f and thenapply the second function g on the processed stream Following itsdenition each time a new packet arrives we need to rst processit with f and the returned value (eg a ltered packet) is then fedinto g

Therefore the stream composition expression fgtgtg needs to main-tain guarded states ((sf sд )ϕ) Algorithm 4 shows the algorithm toupdate a guarded state following the intuition described above Theevaluation of the expression is similar to that of previous discussedexpressions

Algorithm 4 update_comp((S ϕ) packet)

1 suppose S = (sf sд )2 for all emitted (s primef ϕ prime) from update_f((sf ϕ)packet) do3 if f is dened on s primef then4 let ret be the returned packet of f on s primef5 emit ((s primef s primeд ) ϕ primeprime) for all emitted (s primeд ϕ primeprime) from

update_g((sд ϕ prime) ret)6 else7 emit ((s primef sд ) ϕ prime)8 end if9 end for

6 IMPLEMENTATIONWe have implemented a prototype of the NetQRE system in C++which consists of two main components namely the compiler forNetQRE and the NetQRE runtimeNetQRE Compiler The NetQRE compiler implements the com-pilation algorithm described in sect5 The compiler rst generates aC++ program from an input NetQRE program which is then com-piled by the gcc compiler into executable Our compiled programuses a tree data structure to represent predicates over parametersThe choice of tree is driven by its simplicity ease of encoding andlookup performance

In addition to the basic compilation algorithm we include ad-ditional optimizations to the compiler First we use the standardminimization algorithm to minimize the state machine for a regularexpression Second for aggregation expressions with sum and avgwe use an incremental updating algorithm to update the expres-sion we maintain the running sum as the state of the aggregationexpression and we update the sum incrementally when the stateof the inner expression is updatedNetQRE Parallelization The NetQRE compiler can optionallycompile the NetQRE specication into parallelized implementationsbased on the instantiation of parameters in the NetQRE specica-tion For the example described in sect 51 a packet with source IPaddress s can be processed by the hash(s )-th thread which handlesthat instantiation of the parameter x

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

NetQRERuntime The NetQRE runtime includes a packet captureagent implemented using the pcap library [28] Each packet thatarrives at the runtime is parsed and then processed by the compiledNetQRE program as shown in Figure 1 Currently our runtimesupports actions that include sending alert events to a controllerand directly installing a rule on a switch The NetQRE runtime isnot specically optimized for fast packet capture and processingand as future work we plan to explore the use of DPDK [24] Ourcurrent implementation analyzes every packet though orthogonalsampling techniques can be also explored in future

7 EVALUATIONWe evaluate our NetQRE prototype centered around answeringthree key questions First can the NetQRE language express a widerange of quantitative network monitoring applications in a conciseand intuitive manner Second is the generated code ecient interms of throughput and memory footprint Third can NetQRE beused in a real-time monitoring setting that provides rapid mitigationbased on output of quantitative network monitoring applicationsUnless otherwise specied all our experiments are carried out ona cluster of commodity servers each of which has 10-core 26GHzXeon E5-2660V3 Each core has a 256KB L2 cache and shared 25MB L3 cache The memory size is 64 GB The OS is Ubuntu 1404and the kernel version is 420

71 ExpressivenessWe have implemented a set of quantitative network monitoring ap-plications using NetQRE The examples are drawn from a literaturesurvey on quantitative network monitoring applications [13 18 3540 41] Table 1 summarizes the example applications in NetQREthat we use as well as and the lines of code of each NetQRE pro-gram

LoCHeavy Hitter (sect41) 6Super Spreader (sect41) 2Entropy Estimation [40] 6Flow size dist [18] 8Trac change detection [35] 10Count trac [40] 2Completed ows (sect42) 6SYN ood detection (sect42) 9Slowloris detection (sect42) 12Lifetime of connection 8Newly opened connection recently 11 duplicated ACKs 5 VoIP call 7VoIP usage (sect43) 18Key word counting in emails 11DNS tunnel detection [12] 4DNS amplication [20] 4

Table 1 Examplemonitoring applicationsNetQRE supports

We make the following observations First NetQRE is able toexpress a large variety of quantitative network monitoring applica-tions ranging from ow-level trac measurement to application-level quantitative monitoring We validate the correctness of ourapplications by running them and comparing their output withhand-crafted implementations Second programming in NetQRE isconcise All example programs can be specied within 18 lines ofcode in NetQRE This count includes commonly used lter functionsas well as regular expressions which can be built into a library forreference As an interesting comparison we encoded the VoIP calluse case in Bro [33] a well-known intrusion detection system Brorequired 51 lines of code as compared to only 7 in NetQRE How-ever Bro is unable to easily support the VoIP usage case written inNetQRE In particular Bro cannot use patterns to separate packetsinto dierent VoIP sessions for the purposes of counting bandwidthconsumption per session This is not surprising as Brorsquos primaryuse is that of an intrusion detection system We validated the cor-rectness of our implementation on actual SIP trac by comparingBrorsquos output against NetQRErsquos

72 PerformanceOur next set of experiments evaluate the performance of NetQRErsquoscompiled implementations NetQRE runs on a single machine basedon the setup described earlier We use a single core to measureNetQRErsquos performance As a point of comparison we compare eachNetQRE program against an equivalent carefully hand-crafted C++implementation We note that all our hand-crafted implementationsrequire at least 100 lines of code and often times require users toexplicit manage internal state across packets ndash a programming taskthat NetQRE abstracts away

Fig 7 shows the performance of NetQRE implementations (NetQREin the gure) compared with manually optimized implementations(Baseline in the gure) that we spent days optimizing Our examplesinclude heavy hitter super spreader entropy SYN ood detectioncompleted ows count and Slowloris use cases presented in sect4 Asworkload we use a one-minute CAIDA trac trace [7] capturedon a high-speed Internet backbone link which contains 37 millionpackets (ie about 620k packetssec) We replay the trac traceto the implementations Since the trac trace does not containpayloads we report the throughput in million packets per second(MPPS)

We make the following observations First the NetQRE imple-mentations are able to achieve line-rate processing speed withoutbeing the bottleneck In particular for three out of six applicationsthe throughput of NetQRE implementations is about 20MPPS usinga single thread The average packet size of our input trac traceis 888B For a 10Gbps network 14M packets of size 888B can beforwarded per second which is well below the throughput achievedby NetQRE The throughput can be further improved by paralleliza-tion (presented later in this section) Second the compiled NetQREimplementation incurs negligible overhead compared with the man-ually optimized low-level implementation The dierence betweenthe throughput of the compiled NetQRE implementation and thatof the manual implementation is within 9 across all use cases westudied Third NetQRE incurs slightly increase (up to 60) in terms

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

05

10152025

Thro

ughp

ut (M

PPS)

Baseline NetQRE OpenSketch

(a) Throughput

1

10

100

1000

Mem

ory

(MB

) Baseline NetQRE OpenSketch

(b) Memory

Figure 7 Performance comparison

0

10

20

30

40

0 2 4 6 8

Thro

ughp

ut (M

PPS)

Number13 of13 Threads

super spreadersyn floodslowloris

Figure 8 Throughput with paral-lelization

of memory footprint (Fig 7b) compared with the manually opti-mized implementation However the overall memory footprint ofNetQRE remains low We have also run the experiments on anotherCAIDA trac [2] and the comparison result is similarComparison with OpenSketch In addition to comparisons withour manually optimized code we also compare with OpenSketch [40]a popular trac measurement tool based on sketching techniquesGiven OpenSketchrsquos limitation in handling stateful monitoringtasks we only compare the performance of NetQRE implementa-tions on the heavy hitter and super spreader examples with thatof OpenSketch (in software) We use the same CAIDA trac traceas before and follow the default setting of the reference code ofOpenSketch [4] for the two examples The NetQRE compiled imple-mentations signicantly outperform OpenSketch implementationsThe throughput of NetQRE compiled code is 11times and 18times as thatof OpenSketch (Fig 7a) As OpenSketch focuses on optimizingmemory usage it is not surprising that OpenSketch uses less mem-ory than NetQRE implementations Nevertheless we observe thatNetQRE uses only 11 more memory on the super spreader examplethan that of OpenSketch (Fig 7b) Note that we use an open-sourceversion of OpenSketch software that runs on a similar x86 machineas NetQRE in order to have an ldquoapples-to-applesrdquo comparison onsimilar hardware NetQRE may also exploit similar optimizationsperformed by OpenSketch in order to get higher performance ona NetFPGA Exploiting a hardware platform such as NetFPGA inorder to further improve NetQRE performance is an interestingavenue for future workComparison with Bro We further compare with Bro Given thelimitations of Bro in handling the complete VoIP use cases we sim-plify the use case to simply counting the number of VoIP calls (asopposed to bandwidth consumption) per user NetQRE compiledimplementation can nish counting within 1 second while Brotakes about 23 seconds for a SIP trac trace containing 4338 VoIPcalls Both programs output the same results for this use case Webelieve there are at least two reasons for Brorsquos slower performanceFirst Bro is designed for intrusion detection and not aimed atand optimized for monitoring tasks Second unlike NetQRE whichcompiles high-level code to ecient low-level code Bro uses aninterpreter to execute the script written in Brorsquos language whichcould result in considerable overheadParallelization speedup NetQRE compiler automatically com-piles parallelized implementations as described in sect6 In the con-sidered examples parallelization involves sending ows of packets

to dierent NetQRE instances running on dierent threads Fig 8plots the throughput of the NetQRE implementations with dierentnumbers of threads for the super spreader syn ood and slowlorisdetection examples In all examples the NetQRE parallelized codeis able to achieve at least 39times speedup with 8 threads Note thatlinear speedup is not achieved as the trac (in our nite trace) isnot uniformly distributed across all 8 threads This is an artifactof the trace data itself When we include the overhead of the soft-ware load balancer that dispatches packet ows to dierent threadsthe NetQRE implementations achieve at least 26times speedup for allexamples This overhead can be mitigated with a hardware load-balancer Overall we observe that NetQRErsquos throughput is morethan sucient to handle line-rate trac and the NetQRE runtimeitself is not the bottleneck

73 End-to-end ValidationIn the nal set of experiments we validate the end-to-end use ofNetQRE for enforcing quantitative network policies which appliesactions to quantitative network monitoring programs We set upa network of two clients C1 and C2 one server S and one SDNswitch that mirrors trac to a NetQRE runtime running the SYNood detector presented in sect42 In addition a NetQRE controllerbased on POX [29] controls the switch The network is emulatedusing Mininet [25] with link bandwidth set to 100Mbps

In the experimentC1 sends normal trac to S using iperf at rate1Mbps and at the 7th second C2 starts SYN ood attack generatedusing our generator to the same server The attack is detected byNetQRE which generates an alert event to the controller whichsubsequently installs a forwarding rule on the switch to block tracfrom C2 To illustrate the attack detection and blocking Figure 9a(left gure) shows the bandwidth utilization (Mbps) on the serverWe observe that NetQRE successfully blocks C2rsquos attacking tracin real-time

As a second experiment using a similar setup our NetQRE heavyhitter program (sect41) running at a switch-side NetQRE runtimemonitors the trac over a sliding window of 5 seconds and issuesan alert to the controller when detecting a heavy hitter whichfurther blocks the trac As a point of comparison we compareour approach against two alternatives (1) sending all packets to thecontroller which runs an equivalent heavy hitter detection program(2) monitoring the ow counters on the switch to detect heavyhitters as considered in other languages [30] and systems [31] Inthe second alternative the controller reads the counter every 1

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

0 2 4 6 8 10 12 14 16Time (sec)

0

1

2

3

4

5

6B

andw

idth

(Mbp

s)NetQRE

(a) SYN ood

0 2 4 6 8 10 12 14 16Time (sec)

0

2

4

6

8

10

12

14

Ban

dwid

th (M

bps)

NetQRE

stats

forward

(b) Heavy hitter

0 20 40 60 80 100Time (sec)

0

1

2

3

4

5

6

7

Ban

dwid

th (M

bps)

NetQRE

(c) VoIP

Figure 9 Bandwidth utilization (Mbps) at the server

second and compute the bandwidth usage for each ow over asimilar 5 seconds window

Figure 9b (middle gure) plots the bandwidth utilization of theserver The above alternative solutions are labeled as forward andstats respectively Compared with these two approaches our pro-posed approach can detect the heavy hitter and further respond toit in a more timely fashion Moreover compared with the forwardapproach we send signicantly fewer trac to the controller andis a more scalable solution

In our nal experiment we validate the VoIP use case rst pre-sented in our introduction We use a synthetic VoIP trac tracegenerated using SIPp [5] replay a VoIP call (with accompanyingH264 MPEG video) made from a single user via client C2 We re-play the trac at 5Mbps and run iperf on C1 as the backgroundtrac Our NetQRE program enforces a policy that blocks a callafter the user making the call exceeds a SIP bandwidth usage of1875MB (around 30 seconds) We congure the NetQRE programto send an alert to the NetQRE controller which then blocks thetrac Figure 9c (right gure) plots the bandwidth utilization of theserver Again this veries that NetQRE can be used to intercept aSIP session monitor usage on a per-user basis and react to highusage in a timely fashion

74 Summary of EvaluationRevisiting the questions at the beginning of our evaluation weobserve that (1) NetQRE can express a wide range of quantita-tive monitoring applications with a few lines of code (2) achievesline-rate throughput performance which is comparable to that ofcarefully hand-crafted optimized low-level code while signicantlyoutperforms other measurement and IDS tools and (3) can be usedin an SDN setting to monitor network trac and update switchesin real-time in response to specied NetQRE policies

8 DISCUSSIONIn this section we discuss some of NetQRErsquos limitations and alsosome future work

Expressiveness As a language focusing on monitoring policyNetQRE cannot specify general network policies such as servicechaining policies and trac engineering policies As an interesting

future work it might be possible to extend NetQRErsquos actions insupport of more complex policies

Given NetQRErsquos basis on regular expressions NetQRE is notsuited for monitoring tasks that can not be specied using regulartrac patterns Though this is a fundamental limitation of NetQRErsquosexpressiveness we believe that the high-level constructs in NetQREprovides a natural programming abstraction to write tasks withhierarchical regular structures and NetQRE can specify a widerange of monitoring tasks as shown in sect4

Handling packet lossreorderingretransmission For the perfor-mance consideration NetQRE does not buer packets in the streambut processes packets in a streaming fashion Thus if packets inthe stream are lost reordered or retransmitted the query speciedin NetQRE might not be evaluated correctly since the internal statemay transition to a wrong one Current design of NetQRE relies onthe runtime to handle packet loss reordering and retransmission

Network-widemonitoring For the similar reasons discussed abovecurrent deployment model of NetQRE is based on a centralize fash-ion in order to receive all (ordered) packet in the stream of interestFor example a ow counting task has to be deployed at a placewhere all packets in a ow can traverse This is not a fundamentallimitation as one can easily deploy NetQRE in a distributed fashionprovided that the query is insensitive to packet reordering in thestream We leave it as future work to study the decomposition of acentralized NetQRE program into distributed queries

9 RELATEDWORKQuantitative regular expressions NetQRE is inspired by quan-titative regular expressions (QRE) [9] which provides a theoreticalfoundation for regular programming of data streams NetQRE ex-tends QRE in the following ways First NetQRE introduces param-eters in support of queries handling unknown values (eg countingdistinct sources) Second NetQRE proposes aggregation operatorsto compute aggregated values across parameters It is shown insect4 that both extensions are essential to specify a wide range ofinteresting network monitoring policies

StreamQRE [27] is another recently proposed extension to QRENetQRE is dierent from StreamQRE in several aspects First NetQREis specialized to networking applications while StreamQRE focuseson database application Second StreamQRE allows relations as

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

types and supports relational operations such as join and key-basedpartitioning These operations cannot be evaluated eciently in astreaming fashion in general NetQRE instead uses parameters andaggregation over parameters with a focus on ecient evaluationIntrusion detection systems and protocol analyzers Thereare a number of key dierences between NetQRE and intrusiondetection systems (IDS) and protocol analyzers First systems suchas Snort [34] allow regular expression matches on single packetsbut is not perform regular expression matches that span multiplepackets let alone track sessions across packets While Bro [33] sup-ports aspects of NetQRE the scripting language requires complexprocedural programming where one has to maintain states anddata structures as opposed to NetQRErsquos declarative programmingapproach with high-level constructs over packet streams As anIDS Bro does not lend itself naturally to handle some quantitativemonitoring use cases eg the VoIP usage example While HILTI [37]uses Bro and oers abstract machine models for trac analysis itrequires the operator to implement low-level state machinesDomain specic languages in networking There are severalrecent proposal on domain specic languages on networks whichincludes Frenetic [21] Pyretic [30] NetKAT [10] Flowlog [32]Maple [38] Network Datalog [26] To the best of our knowledgenone of these systems support monitoring queries beyond basicow-level counters provided by OpenFlow

Recently proposed SNAP [11] and Sonata [23] oer abstrac-tions for trac monitoring over packet streams but in a morelow-level procedural or SQL-like way NetQRE oers a complemen-tary abstraction on quantitative queries It may be interesting toincorporate NetQRE with these languages in support for complexquantitative queriesStreaming database languagesDatabase-style query languages [1416] provide SQL-like language support for running continuousqueries over data streams While there are constructs to do aggrega-tion over sliding windows they are designed with simple relationalqueries over packet headers or packet counts and cannot handlethe complex queries based on trac patterns that NetQRE supports

10 CONCLUSIONThis paper presents NetQRE a high-level declarative language andpractical toolkit for quantitative network monitoring NetQRE of-fers an intuitive programming model using regular expressionsover streams of packets and can naturally express ow-level aswell as application-level quantitative policies We developed a com-pilation algorithm that can compile NetQRE programs into ecientimperative programs Our evaluation results demonstrate the ex-pressiveness of NetQRE in support a wide range of quantitativenetwork monitoring applications its ability to match carefully opti-mized manually written low level code and outperform equivalentimplements in Bro and OpenSketch A proof-of-concept scenariowith an SDN controller and switches demonstrate NetQRErsquos abilityto support timely and ecient enforcement of quantitative networkpolicies

ACKNOWLEDGEMENTWe would like to thank our shepherd Arjun Guha for his helpfulcomments We also want to thank the anonymous reviewers for

their insightful feedbacks on this work This research was partiallysupported by NSF Expeditions in Computing award CCF 1138996NSF CNS-1513679 and NSF CNS-1218066

REFERENCES[1] Application Layer Packet Classier for Linux httpwwwmcafeecomus

productsnetwork-security-platformaspx[2] CAIDA Trac Trace httpsdatacaidaorgdatasets securityddos-20070804 [3] McAfee Network Security Platform http l7-ltersourceforgenet [4] OpenSketch reference code httpsgithubcomUSC-NSLopensketch[5] SIPp http sippsourceforgenet [6] SSL renegotiation DoS httpswwwietforgmail-archivewebtlscurrent

msg07553html[7] Anonymized 2015 Internet Traces httpsdatacaidaorgdatasetspassive-2015

2015[8] Mohammad Al-Fares Sivasankar Radhakrishnan Barath Raghavan Nelson

Huang and Amin Vahdat Hedera Dynamic Flow Scheduling for Data CenterNetworks In NSDI volume 10 pages 19ndash19 2010

[9] Rajeev Alur Dana Fisman and Mukund Raghothaman Regular Programmingfor Quantitative Properties of Data Streams In 25th European Symposium onProgramming ESOP 2016

[10] Carolyn Jane Anderson Nate Foster Arjun Guha Jean-Baptiste Jeannin DexterKozen Cole Schlesinger and David Walker NetKAT Semantic foundations fornetworks In Proceedings of the 41st annual ACM SIGPLAN-SIGACT symposiumon Principles of programming languages pages 113ndash126 ACM 2014

[11] Mina Tahmasbi Arashloo Yaron Koral Michael Greenberg Jennifer Rexford andDavid Walker Snap Stateful network-wide abstractions for packet processingIn Proceedings of the 2016 ACM SIGCOMM Conference SIGCOMM rsquo16 pages29ndash43 New York NY USA 2016 ACM

[12] Kevin Borders Jonathan Springer and Matthew Burnside Chimera A declarativelanguage for streaming network trac analysis In Proceedings of the 21st USENIXConference on Security Symposium Securityrsquo12 pages 19ndash19 Berkeley CA USA2012 USENIX Association

[13] Varun Chandola Arindam Banerjee and Vipin Kumar Anomaly Detection ASurvey ACM computing surveys (CSUR) 41(3)15 2009

[14] Sirish Chandrasekaran Owen Cooper Amol Deshpande Michael J FranklinJoseph M Hellerstein Wei Hong Sailesh Krishnamurthy Samuel R MaddenFred Reiss and Mehul A Shah TelegraphCQ Continuous Dataow ProcessingIn Proceedings of the 2003 ACM SIGMOD International Conference on Managementof Data SIGMOD rsquo03 pages 668ndash668 New York NY USA 2003 ACM

[15] Benoit Claise Cisco systems NetFlow services export version 9 2004[16] Chuck Cranor Theodore Johnson Oliver Spataschek and Vladislav Shkapenyuk

Gigascope A Stream Database for Network Applications In Proceedings of the2003 ACM SIGMOD International Conference on Management of Data SIGMODrsquo03 pages 647ndash651 New York NY USA 2003 ACM

[17] Luca Deri Open source VoIP trac monitoring In Proceedings of SANE volume2006 2006

[18] Nick Dueld Carsten Lund and Mikkel Thorup Estimating Flow Distributionsfrom Sampled Flow Statistics In Proceedings of the 2003 conference on Applicationstechnologies architectures and protocols for computer communications pages 325ndash336 ACM 2003

[19] Cristian Estan and George Varghese New Directions in Trac Measurement andAccounting volume 32 ACM 2002

[20] Seyed K Fayaz Yoshiaki Tobioka Vyas Sekar and Michael Bailey BohateiFlexible and Elastic DDoS Defense In 24th USENIX Security Symposium (USENIXSecurity 15) pages 817ndash832 Washington DC August 2015 USENIX Association

[21] Nate Foster Rob Harrison Michael J Freedman Christopher Monsanto JenniferRexford Alec Story and David Walker Frenetic A Network ProgrammingLanguage In ACM SIGPLAN Notices volume 46 pages 279ndash291 ACM 2011

[22] Pedro Garcia-Teodoro J Diaz-Verdejo Gabriel Maciaacute-Fernaacutendez and EnriqueVaacutezquez Anomaly-based Network Intrusion Detection Techniques Systemsand Challenges computers amp security 28(1)18ndash28 2009

[23] Arpit Gupta Ruumldiger Birkner Marco Canini Nick Feamster Chris Mac-Stokerand Walter Willinger Network Monitoring As a Streaming Analytics ProblemIn Proceedings of the 15th ACM Workshop on Hot Topics in Networks HotNets rsquo16pages 106ndash112 ACM 2016

[24] DPDK Intel Data Plane Development Kit httpdpdkorg[25] Bob Lantz Brandon Heller and Nick McKeown A Network in a Laptop Rapid

Prototyping for Software-dened Networks In Proceedings of the 9th ACMSIGCOMM Workshop on Hot Topics in Networks Hotnets-IX pages 191ndash196ACM 2010

[26] Boon Thau Loo Tyson Condie Minos Garofalakis David E Gay Joseph MHellerstein Petros Maniatis Raghu Ramakrishnan Timothy Roscoe and IonStoica Declarative Networking CACM 2009

[27] Konstantinos Mamouras Mukund Raghotaman Rajeev Alur Zachary G Ivesand Sanjeev Khanna StreamQRE Modular Specication and Ecient Evaluation

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

of Quantitative Queries over Streaming Data In Proceedings of the 38th ACMSIGPLANConference on Programming Language Design and Implementation ACM2017

[28] Steve McCanne Craig Leres and Van Jacobson Libpcap httpwwwtcpdumporg1989

[29] J Mccauley POX A Python-based Openow Controller 2014[30] Christopher Monsanto Joshua Reich Nate Foster Jennifer Rexford David Walker

et al Composing Software Dened Networks In NSDI pages 1ndash13 2013[31] Masoud Moshref Minlan Yu Ramesh Govindan and Amin Vahdat DREAM

dynamic resource allocation for software-dened measurement In Proceedingsof the 2014 ACM conference on SIGCOMM pages 419ndash430 ACM 2014

[32] Tim Nelson Andrew D Ferguson Michael JG Scheer and Shriram KrishnamurthiTierless Programming and Reasoning for Software-Dened Networks NSDIApr 2014

[33] Vern Paxson Bro A System for Detecting Network Intruders in Real-timeComput Netw 31(23-24)2435ndash2463 December 1999

[34] Martin Roesch et al Snort Lightweight Intrusion Detection for Networks InLISA volume 99 pages 229ndash238 1999

[35] Vyas Sekar Michael K Reiter and Hui Zhang Revisiting the case for a minimalistapproach for network ow monitoring In Proceedings of the 10th ACM SIGCOMM

conference on Internet measurement pages 328ndash341 ACM 2010[36] David Senecal Slow DoS on the rise httpsblogsakamaicom201309

slow-dos-on-the-risehtml[37] Robin Sommer Matthias Vallentin Lorenzo De Carli and Vern Paxson HILTI

An Abstract Execution Environment for Deep Stateful Network Trac AnalysisIn Proceedings of the 2014 Conference on Internet Measurement Conference IMCrsquo14 pages 461ndash474 New York NY USA 2014 ACM

[38] Andreas Voellmy Junchang Wang Y Richard Yang Bryan Ford and Paul HudakMaple Simplifying SDN Programming using Algorithmic Policies In Proceedingsof the ACM SIGCOMM 2013 conference on SIGCOMM pages 87ndash98 ACM 2013

[39] Mea Wang Baochun Li and Zongpeng Li sFlow Towards resource-ecient andagile service federation in service overlay networks In Distributed ComputingSystems 2004 Proceedings 24th International Conference on pages 628ndash635 IEEE2004

[40] Minlan Yu Lavanya Jose and Rui Miao Software Dened Trac Measurementwith OpenSketch In NSDI volume 13 pages 29ndash42 2013

[41] Lihua Yuan Chen-Nee Chuah and Prasant Mohapatra ProgME Towards Pro-grammable Network Measurement IEEEACM Transactions on Networking (TON)19(1)115ndash128 2011

  • Abstract
  • 1 Introduction
  • 2 Overview
    • 21 Motivating Example
    • 22 Alternative Approaches
      • 3 The NetQRE Language
        • 31 Pattern Matching over Streams
        • 32 Conditional Expressions
        • 33 Stream Split
        • 34 Stream Iteration
        • 35 Aggregation over Parameters
        • 36 Stream Composition
          • 4 Use Cases
            • 41 Flow-level Traffic Measurements
            • 42 TCP State Monitoring
            • 43 Application-level Monitoring
              • 5 Compilation Algorithm
                • 51 Compilation of PSRE
                • 52 Compilation of split
                • 53 Compilation of iter
                • 54 Compilation of Aggregation
                • 55 Compilation of Stream Composition
                  • 6 Implementation
                  • 7 Evaluation
                    • 71 Expressiveness
                    • 72 Performance
                    • 73 End-to-end Validation
                    • 74 Summary of Evaluation
                      • 8 Discussion
                      • 9 Related Work
                      • 10 Conclusion
                      • References
Page 5: Quantitative Network Monitoring with NetQREalur/Sigcomm17.pdf · 2017. 6. 26. · In network management today, ... Thau Loo. 2017. Quantitative Network Monitoring with NetQRE. In

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

For an input stream ρ a split function splits ρ into two sub-streams ρ1 and ρ2 such that f is dened on the rst substream ρ1and g is dened on ρ2 f and g are applied respectively to the twosubstreams and the returned value is aggregated using aggop asthe return value of split(fgaggop) The split operation is a naturalquantitative generalization of the concatenation operation in regu-lar expression and Fig 3 illustrates how a split expression works

Figure 3 Illustration of split

Note that in order to return a unique value for a split expressionit is required that the splitting is unambiguous That is for allinput streams and all values of parameters in the expression thereis at most one way to split stream for f and g The property ofunambiguity can be checked eciently at compile time [9] Whenno unambiguous splitting is possible the split expression returnsundef for the input stream

As an example of split consider the example of counting thenumber of packets since the last SYN packet in the stream Naturallyone can split the stream into two substreams separated by thelast SYN packet and count the number of packets in the secondsubstream as shown in the following expression

split(any0 last_syncount sum)

Here any is the PSRE that matches any packet streams andlast_syn is the PSRE [syn=1][syn=0] that matches a stream thatstarts with a SYN packet followed by non-SYN packets Puttingtogether the split expression splits the input stream before thelast appearance of a SYN packet Note that the rst substream isassigned the value 0 while the second substream is applied to thecount function (dened later in sect34) to count the number of packets

34 Stream IterationA network application often requires to iterate over the input streamFor example counting the number of packets in the stream requiresto iterate over all single packets

Similar to the split expression NetQRE oers the iter opera-tor to iterate over the stream An iter expression is of the formiter(faggop) where f is an expression in NetQRE and aggop is anaggregation operator The iter expression splits the input streaminto multiple substreams such that f is dened on each substreamIt then iterates through all substreams and evaluates f on each sub-stream and nally aggregates the return value on each substreamusing the aggregation operator The iter operator is a quantitativegeneralization of the Kleene star operation and Figure 4 illustratesthis process Again the splitting is required to be unambiguous forall input streams to f

Figure 4 Illustration of iter

As an example the function count that counts the number ofpackets in the stream can be specied below

sfun int count = iter (1 sum)

This function splits the stream into single packets and the innerexpression returns 1 for each single packet As a result the wholeiter expression counts how many packets in the stream

35 Aggregation over ParametersAggregation is a common feature that many network monitoringapplications share For example computing the average ow sizeneeds to aggregate the average size across multiple ows NetQREoers high-level aggregation expressions to aggregate functionsover parameters

Aggregation expressions are of the form aggopf | T x wherex is a parameter with its type T and f is an expression where theparameter x appears in it Intuitively the aggregation expressiongoes through all possible values of x and evaluate f using thesubstituted value for x and nally aggregates all valid return valuesusing the aggregation operator aggop Revisiting the example ofcounting distinct source IP addresses in a stream we can use theexpression sumexist(x)10 | IP x

36 Stream CompositionTypically the input stream consists of packets from multiple sourcesand destinations Oftentimes the programmer only wants to handlea stream from a particular source for example NetQRE oers thestream composition operator gtgt that allows to preprocess the streambefore applying another stream function

Stream composition expressions are of the form f gtgt g where f

and g are two stream functions The stream composition allows theprocessing of a stream using the rst stream function f repeatedlyon every prex of the stream and the returned outputs yield a newstream that is then piped as the input stream to a second streamfunction g For example if the stream contains two packets (P1 P2)f is rst applied to P1 as a single-packet stream and then (P1 P2)The output of f on the two substreams is then piped to g

Stream composition is useful when the stream of packets needto be preprocessed in order to lter related packets for furtherprocessing For example counting the number of all packets ina TCP connection c can be specied using filter_tcp(c)gtgt countwhich rst lters all TCP packets in the connection c using thefunction filter_tcp and then counts the ltered stream using countThe function filter_tcp is dened as follows

sfun packet filter_tcp(Conn c) =

[ is_tcp(c)] last

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

Filter functions are convenient and used extensively in our usecases As a short-land NetQRE uses filter(p) for the lter functionthat lters packets satisfying the predicate p For example the abovelter function can be abbreviated as filter(is_tcp(c))

Stream composition can also be used to lter packets accordingto timestamps NetQRE builds in two lter functions based ontimestamps The recent(t) function lters the stream in the recent tseconds and the every(t) function periodically lters the stream inevery t seconds For example recent(5)gtgtcount counts the numberof packets in the recent 5 seconds Since the two build-in lterfunctions are not in the core of NetQRE we only allow the useof time-based ltering outside the high-level NetQRE operationsdescribed above

4 USE CASESWe provide use cases ranging from ow-level trac measurementsto complex examples involving application-level content analysisand dynamic updates Since the high-level constructs in NetQRElanguage is based on regular expressions we can only supportqueries for regular trac patterns Nevertheless we note that thiscovers a wide range of useful queries as we will show in this sectionMore examples can be found in our technical report

41 Flow-level Trac MeasurementsWe highlight two measurement tasks that have been proposedrecently as a means to do ow scheduling and attack detectionHeavy hitter Heavy hitters [19] are ows that consume band-width larger than a threshold T A key step to identify a heavy hitteris to count the size of a ow In this example we consider a ow as asource-destination pair A natural way to specify this functionalityis to rst lter all packets from a source x to a destination y andthen count the size of the ltered stream

sfun int hh(IP x IP y) =

filter(srcip=x dstip=y) gtgt count_size

hh rst lters packets based on the source and destinations IP andsecond the ltered stream is piped into count_size which countsthe size of all packets in the stream Due to space limits we do notshow the denition of count_size which is similar to that of count

Second to alert a heavy hitter in real time to the controller wecan use the following program

sfun action alert_hh =

(hh(lastsrcip lastdstip)gtT)

alert(lastsrcip lastdstip)

The function alert_hh checks for every newly received packet (ielast in the stream) that whether its ow reaches the threshold T andissues an alert correspondingly By further applying the time-basedltering every(5)gtgtalert_hh one can detect heavy hitters in every 5secondsSuper spreader A super spreader [40] is the host that contactsmore than k distinct destinations during a time interval Like theuse case of heavy hitter detection a key step is to count how manydistinct destinations a source x contacted The following NetQREprogram does this counting

sfun int ss(IP x) =

sum exist_pair(x y)10 | IP y

The function exist_pair(xy) checks whether the source-destinationpair (xy) appeared in the stream The ss function uses an aggre-gation function sum to aggregate all possible destinations and thusgives the total number of distinct destination addresses x contacted

Similar to heavy hitter detection we can use stream compositionto lter the input stream based on time as well as trac types Forbrevity of the paper we omit the discussion of these functionalities

42 TCP State MonitoringWe next showcase applications that rely on monitoring the stateswithin a TCP owSYN ood attacks In this example NetQRE is used to detect SYNood attacks by counting the number of incomplete TCP hand-shakes in a time interval We consider an incomplete TCP hand-shake as a packet trace consisting of a SYN packet and a SYNACKpacket with corresponding sequence number and acknowledgenumber but does not have a further ACK packet to complete thehandshake As the key step the following program counts thenumber of incomplete handshakes assuming that the input streamconsists of TCP packets between the same source and destination

sfun int bad_tcp_pat(int x int y) =

concat(syn(x) synack(yx+1) no_ack(y+1))

sfun int incomplete_handshake_num =

sumbad_tcp_pat(xy)1 | int x int y

The function bad_tcp_pat species the pattern of an incompletehandshake there is a SYN packet with a sequence number x in thestream followed by a SYNACK packet with acknowledge numberx+1 and sequence number y and then followed by packets that donot include an ACK with acknowledge number y+1 to complete thehandshake The sum aggregation in incomplete_handshake_num sums upthe number of appearances of such patterns for all x and y Usingthe stream composition of ltering functions based on time andpacket type the complete function can be specied as follows

sfun int syn_flood(Conn c) =

recent (5) gtgt

filter_tcp(c) gtgt

incomplete_handshake_num gtTblock(csrcip)

Completed ows Our next example counts the number of legiti-mate ows that are completed delineated by a SYN at the beginningand ending with a FIN Note that the last iter uses the regular ex-pression which matches a stream ending with a SYN and a FINpackets and no SYN-FIN pairs appear before Therefore the iter

expression splits the entire (ltered) stream into sessions whereeach session contains exactly one complete ow

sfun int count_flow(Conn c) =

filter_tcp(c) gtgt

filter(flag=SYN || flag=FIN) gtgt

iter ([fin =1][ syn =1][ syn =1][ fin =1]1 sum)

Slowloris attacks We describe a detection of the aforementionedSlowloris attack in NetQRE based on computing the average rateof packet transferring in all legitimate TCP connections shownbelow

sfun double conn_rate = iter(tcp_patternrate sum)sfun double avg_rate =

sumfilter_tcp(c) gtgt conn_rate | Conn c

sumcount_flow(c) | Conn c

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

First to compute the rate of a legitimate TCP connection we cansimply specify the pattern of a legitimate TCP connection usingRE and compute its transferring rate (the rate function can bespecied easily and thus not shown here) Then we can iterateall TCP connections given a 5-tuple and compute the aggregatedtransferring rate as shown in the function conn_rate

Second we use sum operation to aggregate transferring rate acrosstrac on all 5-tuples Dividing the aggregated rate by the numberof ows computed using the function dened above gives us theaverage transferring rate over all connections as shown in avg_rate

43 Application-level MonitoringOur nal example revisits the Voice-over-IP (VoIP) usage usecasein sect2

Let us rst focus on the function to monitor the usage of a VoIPcall based on a SIP connection sip_conn and media connection m_connAs introduced in sect2 a VoIP call consists of three phases and amodular programming way is to handle the three phases usingthree stream functions and then compose them using the split

operator The function is shown belowsfun int usage_per_call(Conn sip_conn Conn m_conn

string user string id) =

split(init(iduser m_conn)0call(m_conn)count_size

end(id)0 sum)

Here init call and end are simply the PSREs that capture the pat-terns of each phase Note that the init and the end phase return avalue 0 and only the usage of the call phase is counted as required

Next a programmer can view the trac using the SIP connectionsip_conn and the media connection m_conn as a sequence of callsUsing the iter operator the programmer can easily aggregate theusage of all calls in the trac as shown below

sfun int usage_per_conn(Conn sip_conn Conn m_conn

string user string id) =

filter_sip(sip_conn m_conn user id) gtgt

iter(split(any0 usage_per_call(sip_conn m_conn

user id) sum)sum)

Note that we rst lter all trac that belongs to sip_conn or m_conn inorder to get the stream of interest This ltering may result in non-VoIP trac between two VoIP calls and thus we simply compose thePSRE any with usage_per_call inside the iter expression to handleany trac between two calls

Now we can aggregate the usage for each user across multipleconnections and calls using the aggregation operation and furthercompute the average usage over all users as shown below

sfun int usage_per_user(string user) =

sumusage_per_conn(sip_conn m_conn user id) | Connsip_conn Conn m_conn string id

sfun int average_usage =

avgusage_per_user(u) | string u

Suppose we need to implement a quantitative network policythat sends an alert to the controller if a userrsquos usage is larger thanve times of average usage we can simply specify the policy asbelow

sfun action alert_high_usage(string user) =

(usage_per_user(user) gt 5 average_usage) alert(user

)

5 COMPILATION ALGORITHMWe next describe how to compile a NetQRE program into an e-cient executable that can run with low memory footprint Typicallythe stream of packets is very large thus it is not feasible to store theentire stream of packets Therefore the compiled program shouldevaluate incrementally on each single incoming packet withoutstoring the history of the stream To achieve this goal we have toaddress the following challenges First the compiled program needsto maintain parameters in NetQRE in a succinct way Second inorder to implement the semantics of split and iter these operationshave to be performed in a single online pass over streaming packetsLastly the compiled program needs to handle aggregation expres-sions over parameters In the following subsections we describethe compilation of selected NetQRE expressions

51 Compilation of PSREFirst we describe the algorithm for compiling a PSRE as the basicbuilding block for stream functions Similar to traditional RE aPSRE can be translated to an equivalent nite state machine whichwe refer to as parameterized symbolic automaton (PSA) For exampleconsider the following PSRE [srcip=x] Intuitively this PSREchecks whether there exists a packet with source IP x in the streamIts corresponding PSA is shown in Fig 5

Figure 5 An Example PSA

However PSA cannot be directly updated due to its uninstan-tiated parameters To illustrate this challenge consider a naivesolution which is to instantiate the parameters using all possiblevalues and maintain all the instantiated state machine namelysymbolic automata (SA) for each instantiation of the parametersWhen reading an input packet we update all the instantiated SA inthe standard way However it is not hard to note that this approachis not feasible due to the large space of parameters As an examplethe PSA in Fig 5 with a single IP parameter has up to 232 instan-tiated symbolic automata Whereas at runtime the function onlyneeds to store the source addresses that appeared in the streamwhich can be signicantly less than 232

To address this challenge we propose a new algorithm thatupdates PSA on-demand As the high-level idea suppose we instan-tiated a PSA to a set of SA using all possible instantiations Noticethat even though there is a large space of instantiated SA howevermany of them will keep the same states at runtime given a streamof packets As an example consider feeding a rst packet with thesource IP 10001 to all SA instantiated from the PSA in Figure 5 Itis easy to see that the only SA that will transition to state q1 is theone where x is instantiated by 10001 and all other SA will stay atq0

Therefore the compiled program at runtime maintains guardedstates for the PSA where a guarded state is pair (s ϕ) Here ϕ is

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

a predicate of the parameters in the PSA (which we will refer asa guard) and s is a state in PSA The pair (s ϕ) means that whenreading the input stream all instantiated SA will be at state s if theyare instantiated by the values satisfying ϕ For example initiallythere is only one guarded state maintained for the PSA in Figure 5which is (q0 true) meaning that all instantiated SA is at the initialstate q0 of the PSA

To update the PSA the key step is to update each guarded statedynamically when reading packets from the stream The updatingalgorithm for a guarded state is shown in Algorithm 1 This algo-

Algorithm 1 update_psa((s ϕ) packet)

1 for all transitions originated from s do2 let t be the destinate state and P be the parameterized predi-

cate3 let T be the guard instantiated from P using the packet4 let ϕ prime = ϕ andT5 emit (t ϕ prime) if ϕ false6 end for

rithm iterates through all transitions from the state s and for thepredicate (with parameters) P on the transition it instantiates Pusing the current packet it receives in the stream This step simplyreplaces the function name in P by the return value of it on thepacket The obtained predicateT is a predicate over the parametersin the PSA Finally the algorithm emits a new pair (t ϕ prime) if ϕ prime is notfalse Intuitively this step accounts the fact that the instantiatedSA under ϕ prime will transition to the next state t

Using Algorithm 1 the overall updating algorithm is straight-forward Initially the compiled program only contains a guardedstate (q0 true) where q0 is the initial state of the PSA Every timereading a new packet from the stream we call Algorithm 1 on everyguarded state and the new guarded states consists of all the onesemitted from Algorithm 1 To check the state given concrete valuesfor parameters we simply go through all maintained guarded statesand return the state if the guard is satisedExample Using the example in Figure 5 we illustrate the execu-tion of Algorithm 1 given the pair (q0 true) and the packet withsource IP 10001 Consider the transition from q0 to q1 By sub-stituting srcip in the predicate with the source IP of the packetwe get the guard T which is x=10001 Since ϕ is true now ϕ prime issimply x=10001 At the end the algorithm will emit the pair (q1x=10001) Similarly for the other transition the algorithm emit thepair (q0 x=10001) Therefore there are two new guarded statesnamely (q1 x=10001) and (q0 x=10001) The updated guardedstates account for the fact that 10001 has appeared and thus thecorresponding instantiated SA transitioned to state q1

52 Compilation of split

We next discuss how to compile a split expression which is of theform split(f g aggop) First we highlight the key insight from [9]to compile split without parameters and then describe how togeneralize this idea to compile split with parametersWithout parameters The challenge of compiling split is to splitthe stream dynamically at runtime without revisiting the entirehistory of the stream For example in the split example in sect33 we

need to split the stream at the last appearance of the SYN packet Anatural way to implement the semantics of split is to maintain allcases to split the stream For example Fig 6 shows two cases to splitthe stream consisting of a non-SYN and a SYN packet at runtimefor the example in sect33 Moreover the following two observationsensure that the compiled code only uses a constant space First eachcase can be represented using a triple (sf sд F ) where F is a agindicating whether the stream is split and sf (sд resp) is the stateof f (д resp) on evaluating the prex(sux resp) of the streamSecond the number of maintained (distinct) cases is bounded by aconstant only related to д at any time point

Figure 6 Example run of split

With parameters Similar to PSRE compilation in the generalscenario we need to maintain guarded cases That is we maintaina set of (T ϕ) pairs where T is a triple (sf sд F ) correspondingto a case as dened above and ϕ is a guard It means that if theparameters are instantiated by values satisfying ϕ there is a caseof splitting the stream which can be represented as T

Algorithm 2 update_split((T ϕ) packet)

1 suppose T = (sf sд F )2 if F is false then3 for all emitted (s primef ϕ prime) from update_f((sf ϕ) packet) do4 if f is dened on s primef then

5 emit ((s primef s0д true) ϕ prime) s0д is the initial state of g 6 end if7 emit ((s primef sд false) ϕ prime)8 end for9 else

10 for all emitted (s primeд ϕ prime) from update_g((sд ϕ) packet) do11 emit ((sf s primeд true) ϕ prime)12 end for13 end if

Algorithm 2 shows the algorithm to update a pair (T ϕ) Thealgorithm feeds the input packet to f or g corresponding to whetherthe stream has been split as indicated by F For example whenF is false namely the stream has not been split yet in this casethe algorithm need to update f on the packet (line 2-9) In thiscase for each emitted guarded state (s primef ϕ prime) from the update of fa corresponding guarded case is emitted (line 7) Moreover if f isdened on s primef we may split the stream at the current position andthus emit the guarded case ((s primef s0д true) ϕ prime) (line 4-6) Here s0д isthe initial state of g and the ag F is set to true to indicate thatthe stream is split For the else branch from line 10 to 12 in thealgorithm we need to update (sд ϕ) using grsquos updating procedurewhich may emit a set of guarded states of g Correspondingly theguard needs to be rened

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

Given concrete values for parameters evaluating a split expres-sion is as follows First we check if there exist a guarded case ((sf sд F )ϕ) such that ϕ is satised and both f and g are dened on thestates in the case Then we evaluate the two expressions based ontheir states in the case and nally aggregate the evaluated valuesusing aggop If no such guarded cases exist the split expression isnot dened

53 Compilation of iter

We rst review the key ideas of evaluating an iter expression whenthere are no parameters [9] and then describe how to handle pa-rameters

Consider an iter expression iter(f aggop) without any param-eters Similar to split expressions the iter expression needs tomaintain all possible cases of splitting the input stream Thoughthe number of such cases is only determined by f (thus not relevantto the input stream) how to succinctly represent each case is achallenge Specically unlike a split expression which only splitsthe stream into a prex and a sux an iter expression needs tosplit the stream into multiple substreams such that each one canbe applied to the expression f Therefore naively maintaining allthe states of f for each substream may use a large space as thestream grows Fortunately our choice of the aggregation operatorsallows us to summarizes the application of f to all substreams butthe last one using the aggregated return value For example if aggopis sum we only need to remember the aggregated sum on thesesubstreams Thus we can represent a case as a pair (v sf ) wherev is the aggregated value and sf is the state of f on evaluating thelast substream

In the general scenario with parameters we simply maintain aguarded case as ((v sf ) ϕ) The update of a guarded case ((v sf )ϕ)is shown in Algorithm 3 The evaluation of f given guarded casesis similar to that of a split expression

Algorithm 3 update_iter((T ϕ) packet)

1 suppose T = (v sf )

2 for all emitted (s primef ϕ prime) from update_f((sf ϕ) packet) do3 if f is dened on s primef then4 let v prime be the evaluated value of f on s primef5 emit ((v primeprime s0f ) ϕ prime) where v primeprime=aggop (vv prime) and s0f is the

initial state of f

6 end if7 emit ((v s primef ) ϕ prime)8 end for

54 Compilation of AggregationNow we describe the aggregation function aggopf | T x Followingthe denition of the aggregation function an aggregation functionmaintains guarded states (sf ϕ) To update the aggregation expres-sion we just need to update each guarded state using frsquos updatingprocedure To illustrate the evaluation procedure suppose the ag-gregation operator is sum Given values for the parameters we needto iterate through all guarded states (sf ϕ) If ϕ is satised we eval-uate f on sf say the return value is v and count the number of

values of x which satises ϕ say it is k we sum up v lowast k into theaggregated sum For other aggregation operators the evaluationprocess works similarly

55 Compilation of Stream CompositionLastly we discuss stream composition Recall that stream composi-tion allows to process the input stream using a function f and thenapply the second function g on the processed stream Following itsdenition each time a new packet arrives we need to rst processit with f and the returned value (eg a ltered packet) is then fedinto g

Therefore the stream composition expression fgtgtg needs to main-tain guarded states ((sf sд )ϕ) Algorithm 4 shows the algorithm toupdate a guarded state following the intuition described above Theevaluation of the expression is similar to that of previous discussedexpressions

Algorithm 4 update_comp((S ϕ) packet)

1 suppose S = (sf sд )2 for all emitted (s primef ϕ prime) from update_f((sf ϕ)packet) do3 if f is dened on s primef then4 let ret be the returned packet of f on s primef5 emit ((s primef s primeд ) ϕ primeprime) for all emitted (s primeд ϕ primeprime) from

update_g((sд ϕ prime) ret)6 else7 emit ((s primef sд ) ϕ prime)8 end if9 end for

6 IMPLEMENTATIONWe have implemented a prototype of the NetQRE system in C++which consists of two main components namely the compiler forNetQRE and the NetQRE runtimeNetQRE Compiler The NetQRE compiler implements the com-pilation algorithm described in sect5 The compiler rst generates aC++ program from an input NetQRE program which is then com-piled by the gcc compiler into executable Our compiled programuses a tree data structure to represent predicates over parametersThe choice of tree is driven by its simplicity ease of encoding andlookup performance

In addition to the basic compilation algorithm we include ad-ditional optimizations to the compiler First we use the standardminimization algorithm to minimize the state machine for a regularexpression Second for aggregation expressions with sum and avgwe use an incremental updating algorithm to update the expres-sion we maintain the running sum as the state of the aggregationexpression and we update the sum incrementally when the stateof the inner expression is updatedNetQRE Parallelization The NetQRE compiler can optionallycompile the NetQRE specication into parallelized implementationsbased on the instantiation of parameters in the NetQRE specica-tion For the example described in sect 51 a packet with source IPaddress s can be processed by the hash(s )-th thread which handlesthat instantiation of the parameter x

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

NetQRERuntime The NetQRE runtime includes a packet captureagent implemented using the pcap library [28] Each packet thatarrives at the runtime is parsed and then processed by the compiledNetQRE program as shown in Figure 1 Currently our runtimesupports actions that include sending alert events to a controllerand directly installing a rule on a switch The NetQRE runtime isnot specically optimized for fast packet capture and processingand as future work we plan to explore the use of DPDK [24] Ourcurrent implementation analyzes every packet though orthogonalsampling techniques can be also explored in future

7 EVALUATIONWe evaluate our NetQRE prototype centered around answeringthree key questions First can the NetQRE language express a widerange of quantitative network monitoring applications in a conciseand intuitive manner Second is the generated code ecient interms of throughput and memory footprint Third can NetQRE beused in a real-time monitoring setting that provides rapid mitigationbased on output of quantitative network monitoring applicationsUnless otherwise specied all our experiments are carried out ona cluster of commodity servers each of which has 10-core 26GHzXeon E5-2660V3 Each core has a 256KB L2 cache and shared 25MB L3 cache The memory size is 64 GB The OS is Ubuntu 1404and the kernel version is 420

71 ExpressivenessWe have implemented a set of quantitative network monitoring ap-plications using NetQRE The examples are drawn from a literaturesurvey on quantitative network monitoring applications [13 18 3540 41] Table 1 summarizes the example applications in NetQREthat we use as well as and the lines of code of each NetQRE pro-gram

LoCHeavy Hitter (sect41) 6Super Spreader (sect41) 2Entropy Estimation [40] 6Flow size dist [18] 8Trac change detection [35] 10Count trac [40] 2Completed ows (sect42) 6SYN ood detection (sect42) 9Slowloris detection (sect42) 12Lifetime of connection 8Newly opened connection recently 11 duplicated ACKs 5 VoIP call 7VoIP usage (sect43) 18Key word counting in emails 11DNS tunnel detection [12] 4DNS amplication [20] 4

Table 1 Examplemonitoring applicationsNetQRE supports

We make the following observations First NetQRE is able toexpress a large variety of quantitative network monitoring applica-tions ranging from ow-level trac measurement to application-level quantitative monitoring We validate the correctness of ourapplications by running them and comparing their output withhand-crafted implementations Second programming in NetQRE isconcise All example programs can be specied within 18 lines ofcode in NetQRE This count includes commonly used lter functionsas well as regular expressions which can be built into a library forreference As an interesting comparison we encoded the VoIP calluse case in Bro [33] a well-known intrusion detection system Brorequired 51 lines of code as compared to only 7 in NetQRE How-ever Bro is unable to easily support the VoIP usage case written inNetQRE In particular Bro cannot use patterns to separate packetsinto dierent VoIP sessions for the purposes of counting bandwidthconsumption per session This is not surprising as Brorsquos primaryuse is that of an intrusion detection system We validated the cor-rectness of our implementation on actual SIP trac by comparingBrorsquos output against NetQRErsquos

72 PerformanceOur next set of experiments evaluate the performance of NetQRErsquoscompiled implementations NetQRE runs on a single machine basedon the setup described earlier We use a single core to measureNetQRErsquos performance As a point of comparison we compare eachNetQRE program against an equivalent carefully hand-crafted C++implementation We note that all our hand-crafted implementationsrequire at least 100 lines of code and often times require users toexplicit manage internal state across packets ndash a programming taskthat NetQRE abstracts away

Fig 7 shows the performance of NetQRE implementations (NetQREin the gure) compared with manually optimized implementations(Baseline in the gure) that we spent days optimizing Our examplesinclude heavy hitter super spreader entropy SYN ood detectioncompleted ows count and Slowloris use cases presented in sect4 Asworkload we use a one-minute CAIDA trac trace [7] capturedon a high-speed Internet backbone link which contains 37 millionpackets (ie about 620k packetssec) We replay the trac traceto the implementations Since the trac trace does not containpayloads we report the throughput in million packets per second(MPPS)

We make the following observations First the NetQRE imple-mentations are able to achieve line-rate processing speed withoutbeing the bottleneck In particular for three out of six applicationsthe throughput of NetQRE implementations is about 20MPPS usinga single thread The average packet size of our input trac traceis 888B For a 10Gbps network 14M packets of size 888B can beforwarded per second which is well below the throughput achievedby NetQRE The throughput can be further improved by paralleliza-tion (presented later in this section) Second the compiled NetQREimplementation incurs negligible overhead compared with the man-ually optimized low-level implementation The dierence betweenthe throughput of the compiled NetQRE implementation and thatof the manual implementation is within 9 across all use cases westudied Third NetQRE incurs slightly increase (up to 60) in terms

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

05

10152025

Thro

ughp

ut (M

PPS)

Baseline NetQRE OpenSketch

(a) Throughput

1

10

100

1000

Mem

ory

(MB

) Baseline NetQRE OpenSketch

(b) Memory

Figure 7 Performance comparison

0

10

20

30

40

0 2 4 6 8

Thro

ughp

ut (M

PPS)

Number13 of13 Threads

super spreadersyn floodslowloris

Figure 8 Throughput with paral-lelization

of memory footprint (Fig 7b) compared with the manually opti-mized implementation However the overall memory footprint ofNetQRE remains low We have also run the experiments on anotherCAIDA trac [2] and the comparison result is similarComparison with OpenSketch In addition to comparisons withour manually optimized code we also compare with OpenSketch [40]a popular trac measurement tool based on sketching techniquesGiven OpenSketchrsquos limitation in handling stateful monitoringtasks we only compare the performance of NetQRE implementa-tions on the heavy hitter and super spreader examples with thatof OpenSketch (in software) We use the same CAIDA trac traceas before and follow the default setting of the reference code ofOpenSketch [4] for the two examples The NetQRE compiled imple-mentations signicantly outperform OpenSketch implementationsThe throughput of NetQRE compiled code is 11times and 18times as thatof OpenSketch (Fig 7a) As OpenSketch focuses on optimizingmemory usage it is not surprising that OpenSketch uses less mem-ory than NetQRE implementations Nevertheless we observe thatNetQRE uses only 11 more memory on the super spreader examplethan that of OpenSketch (Fig 7b) Note that we use an open-sourceversion of OpenSketch software that runs on a similar x86 machineas NetQRE in order to have an ldquoapples-to-applesrdquo comparison onsimilar hardware NetQRE may also exploit similar optimizationsperformed by OpenSketch in order to get higher performance ona NetFPGA Exploiting a hardware platform such as NetFPGA inorder to further improve NetQRE performance is an interestingavenue for future workComparison with Bro We further compare with Bro Given thelimitations of Bro in handling the complete VoIP use cases we sim-plify the use case to simply counting the number of VoIP calls (asopposed to bandwidth consumption) per user NetQRE compiledimplementation can nish counting within 1 second while Brotakes about 23 seconds for a SIP trac trace containing 4338 VoIPcalls Both programs output the same results for this use case Webelieve there are at least two reasons for Brorsquos slower performanceFirst Bro is designed for intrusion detection and not aimed atand optimized for monitoring tasks Second unlike NetQRE whichcompiles high-level code to ecient low-level code Bro uses aninterpreter to execute the script written in Brorsquos language whichcould result in considerable overheadParallelization speedup NetQRE compiler automatically com-piles parallelized implementations as described in sect6 In the con-sidered examples parallelization involves sending ows of packets

to dierent NetQRE instances running on dierent threads Fig 8plots the throughput of the NetQRE implementations with dierentnumbers of threads for the super spreader syn ood and slowlorisdetection examples In all examples the NetQRE parallelized codeis able to achieve at least 39times speedup with 8 threads Note thatlinear speedup is not achieved as the trac (in our nite trace) isnot uniformly distributed across all 8 threads This is an artifactof the trace data itself When we include the overhead of the soft-ware load balancer that dispatches packet ows to dierent threadsthe NetQRE implementations achieve at least 26times speedup for allexamples This overhead can be mitigated with a hardware load-balancer Overall we observe that NetQRErsquos throughput is morethan sucient to handle line-rate trac and the NetQRE runtimeitself is not the bottleneck

73 End-to-end ValidationIn the nal set of experiments we validate the end-to-end use ofNetQRE for enforcing quantitative network policies which appliesactions to quantitative network monitoring programs We set upa network of two clients C1 and C2 one server S and one SDNswitch that mirrors trac to a NetQRE runtime running the SYNood detector presented in sect42 In addition a NetQRE controllerbased on POX [29] controls the switch The network is emulatedusing Mininet [25] with link bandwidth set to 100Mbps

In the experimentC1 sends normal trac to S using iperf at rate1Mbps and at the 7th second C2 starts SYN ood attack generatedusing our generator to the same server The attack is detected byNetQRE which generates an alert event to the controller whichsubsequently installs a forwarding rule on the switch to block tracfrom C2 To illustrate the attack detection and blocking Figure 9a(left gure) shows the bandwidth utilization (Mbps) on the serverWe observe that NetQRE successfully blocks C2rsquos attacking tracin real-time

As a second experiment using a similar setup our NetQRE heavyhitter program (sect41) running at a switch-side NetQRE runtimemonitors the trac over a sliding window of 5 seconds and issuesan alert to the controller when detecting a heavy hitter whichfurther blocks the trac As a point of comparison we compareour approach against two alternatives (1) sending all packets to thecontroller which runs an equivalent heavy hitter detection program(2) monitoring the ow counters on the switch to detect heavyhitters as considered in other languages [30] and systems [31] Inthe second alternative the controller reads the counter every 1

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

0 2 4 6 8 10 12 14 16Time (sec)

0

1

2

3

4

5

6B

andw

idth

(Mbp

s)NetQRE

(a) SYN ood

0 2 4 6 8 10 12 14 16Time (sec)

0

2

4

6

8

10

12

14

Ban

dwid

th (M

bps)

NetQRE

stats

forward

(b) Heavy hitter

0 20 40 60 80 100Time (sec)

0

1

2

3

4

5

6

7

Ban

dwid

th (M

bps)

NetQRE

(c) VoIP

Figure 9 Bandwidth utilization (Mbps) at the server

second and compute the bandwidth usage for each ow over asimilar 5 seconds window

Figure 9b (middle gure) plots the bandwidth utilization of theserver The above alternative solutions are labeled as forward andstats respectively Compared with these two approaches our pro-posed approach can detect the heavy hitter and further respond toit in a more timely fashion Moreover compared with the forwardapproach we send signicantly fewer trac to the controller andis a more scalable solution

In our nal experiment we validate the VoIP use case rst pre-sented in our introduction We use a synthetic VoIP trac tracegenerated using SIPp [5] replay a VoIP call (with accompanyingH264 MPEG video) made from a single user via client C2 We re-play the trac at 5Mbps and run iperf on C1 as the backgroundtrac Our NetQRE program enforces a policy that blocks a callafter the user making the call exceeds a SIP bandwidth usage of1875MB (around 30 seconds) We congure the NetQRE programto send an alert to the NetQRE controller which then blocks thetrac Figure 9c (right gure) plots the bandwidth utilization of theserver Again this veries that NetQRE can be used to intercept aSIP session monitor usage on a per-user basis and react to highusage in a timely fashion

74 Summary of EvaluationRevisiting the questions at the beginning of our evaluation weobserve that (1) NetQRE can express a wide range of quantita-tive monitoring applications with a few lines of code (2) achievesline-rate throughput performance which is comparable to that ofcarefully hand-crafted optimized low-level code while signicantlyoutperforms other measurement and IDS tools and (3) can be usedin an SDN setting to monitor network trac and update switchesin real-time in response to specied NetQRE policies

8 DISCUSSIONIn this section we discuss some of NetQRErsquos limitations and alsosome future work

Expressiveness As a language focusing on monitoring policyNetQRE cannot specify general network policies such as servicechaining policies and trac engineering policies As an interesting

future work it might be possible to extend NetQRErsquos actions insupport of more complex policies

Given NetQRErsquos basis on regular expressions NetQRE is notsuited for monitoring tasks that can not be specied using regulartrac patterns Though this is a fundamental limitation of NetQRErsquosexpressiveness we believe that the high-level constructs in NetQREprovides a natural programming abstraction to write tasks withhierarchical regular structures and NetQRE can specify a widerange of monitoring tasks as shown in sect4

Handling packet lossreorderingretransmission For the perfor-mance consideration NetQRE does not buer packets in the streambut processes packets in a streaming fashion Thus if packets inthe stream are lost reordered or retransmitted the query speciedin NetQRE might not be evaluated correctly since the internal statemay transition to a wrong one Current design of NetQRE relies onthe runtime to handle packet loss reordering and retransmission

Network-widemonitoring For the similar reasons discussed abovecurrent deployment model of NetQRE is based on a centralize fash-ion in order to receive all (ordered) packet in the stream of interestFor example a ow counting task has to be deployed at a placewhere all packets in a ow can traverse This is not a fundamentallimitation as one can easily deploy NetQRE in a distributed fashionprovided that the query is insensitive to packet reordering in thestream We leave it as future work to study the decomposition of acentralized NetQRE program into distributed queries

9 RELATEDWORKQuantitative regular expressions NetQRE is inspired by quan-titative regular expressions (QRE) [9] which provides a theoreticalfoundation for regular programming of data streams NetQRE ex-tends QRE in the following ways First NetQRE introduces param-eters in support of queries handling unknown values (eg countingdistinct sources) Second NetQRE proposes aggregation operatorsto compute aggregated values across parameters It is shown insect4 that both extensions are essential to specify a wide range ofinteresting network monitoring policies

StreamQRE [27] is another recently proposed extension to QRENetQRE is dierent from StreamQRE in several aspects First NetQREis specialized to networking applications while StreamQRE focuseson database application Second StreamQRE allows relations as

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

types and supports relational operations such as join and key-basedpartitioning These operations cannot be evaluated eciently in astreaming fashion in general NetQRE instead uses parameters andaggregation over parameters with a focus on ecient evaluationIntrusion detection systems and protocol analyzers Thereare a number of key dierences between NetQRE and intrusiondetection systems (IDS) and protocol analyzers First systems suchas Snort [34] allow regular expression matches on single packetsbut is not perform regular expression matches that span multiplepackets let alone track sessions across packets While Bro [33] sup-ports aspects of NetQRE the scripting language requires complexprocedural programming where one has to maintain states anddata structures as opposed to NetQRErsquos declarative programmingapproach with high-level constructs over packet streams As anIDS Bro does not lend itself naturally to handle some quantitativemonitoring use cases eg the VoIP usage example While HILTI [37]uses Bro and oers abstract machine models for trac analysis itrequires the operator to implement low-level state machinesDomain specic languages in networking There are severalrecent proposal on domain specic languages on networks whichincludes Frenetic [21] Pyretic [30] NetKAT [10] Flowlog [32]Maple [38] Network Datalog [26] To the best of our knowledgenone of these systems support monitoring queries beyond basicow-level counters provided by OpenFlow

Recently proposed SNAP [11] and Sonata [23] oer abstrac-tions for trac monitoring over packet streams but in a morelow-level procedural or SQL-like way NetQRE oers a complemen-tary abstraction on quantitative queries It may be interesting toincorporate NetQRE with these languages in support for complexquantitative queriesStreaming database languagesDatabase-style query languages [1416] provide SQL-like language support for running continuousqueries over data streams While there are constructs to do aggrega-tion over sliding windows they are designed with simple relationalqueries over packet headers or packet counts and cannot handlethe complex queries based on trac patterns that NetQRE supports

10 CONCLUSIONThis paper presents NetQRE a high-level declarative language andpractical toolkit for quantitative network monitoring NetQRE of-fers an intuitive programming model using regular expressionsover streams of packets and can naturally express ow-level aswell as application-level quantitative policies We developed a com-pilation algorithm that can compile NetQRE programs into ecientimperative programs Our evaluation results demonstrate the ex-pressiveness of NetQRE in support a wide range of quantitativenetwork monitoring applications its ability to match carefully opti-mized manually written low level code and outperform equivalentimplements in Bro and OpenSketch A proof-of-concept scenariowith an SDN controller and switches demonstrate NetQRErsquos abilityto support timely and ecient enforcement of quantitative networkpolicies

ACKNOWLEDGEMENTWe would like to thank our shepherd Arjun Guha for his helpfulcomments We also want to thank the anonymous reviewers for

their insightful feedbacks on this work This research was partiallysupported by NSF Expeditions in Computing award CCF 1138996NSF CNS-1513679 and NSF CNS-1218066

REFERENCES[1] Application Layer Packet Classier for Linux httpwwwmcafeecomus

productsnetwork-security-platformaspx[2] CAIDA Trac Trace httpsdatacaidaorgdatasets securityddos-20070804 [3] McAfee Network Security Platform http l7-ltersourceforgenet [4] OpenSketch reference code httpsgithubcomUSC-NSLopensketch[5] SIPp http sippsourceforgenet [6] SSL renegotiation DoS httpswwwietforgmail-archivewebtlscurrent

msg07553html[7] Anonymized 2015 Internet Traces httpsdatacaidaorgdatasetspassive-2015

2015[8] Mohammad Al-Fares Sivasankar Radhakrishnan Barath Raghavan Nelson

Huang and Amin Vahdat Hedera Dynamic Flow Scheduling for Data CenterNetworks In NSDI volume 10 pages 19ndash19 2010

[9] Rajeev Alur Dana Fisman and Mukund Raghothaman Regular Programmingfor Quantitative Properties of Data Streams In 25th European Symposium onProgramming ESOP 2016

[10] Carolyn Jane Anderson Nate Foster Arjun Guha Jean-Baptiste Jeannin DexterKozen Cole Schlesinger and David Walker NetKAT Semantic foundations fornetworks In Proceedings of the 41st annual ACM SIGPLAN-SIGACT symposiumon Principles of programming languages pages 113ndash126 ACM 2014

[11] Mina Tahmasbi Arashloo Yaron Koral Michael Greenberg Jennifer Rexford andDavid Walker Snap Stateful network-wide abstractions for packet processingIn Proceedings of the 2016 ACM SIGCOMM Conference SIGCOMM rsquo16 pages29ndash43 New York NY USA 2016 ACM

[12] Kevin Borders Jonathan Springer and Matthew Burnside Chimera A declarativelanguage for streaming network trac analysis In Proceedings of the 21st USENIXConference on Security Symposium Securityrsquo12 pages 19ndash19 Berkeley CA USA2012 USENIX Association

[13] Varun Chandola Arindam Banerjee and Vipin Kumar Anomaly Detection ASurvey ACM computing surveys (CSUR) 41(3)15 2009

[14] Sirish Chandrasekaran Owen Cooper Amol Deshpande Michael J FranklinJoseph M Hellerstein Wei Hong Sailesh Krishnamurthy Samuel R MaddenFred Reiss and Mehul A Shah TelegraphCQ Continuous Dataow ProcessingIn Proceedings of the 2003 ACM SIGMOD International Conference on Managementof Data SIGMOD rsquo03 pages 668ndash668 New York NY USA 2003 ACM

[15] Benoit Claise Cisco systems NetFlow services export version 9 2004[16] Chuck Cranor Theodore Johnson Oliver Spataschek and Vladislav Shkapenyuk

Gigascope A Stream Database for Network Applications In Proceedings of the2003 ACM SIGMOD International Conference on Management of Data SIGMODrsquo03 pages 647ndash651 New York NY USA 2003 ACM

[17] Luca Deri Open source VoIP trac monitoring In Proceedings of SANE volume2006 2006

[18] Nick Dueld Carsten Lund and Mikkel Thorup Estimating Flow Distributionsfrom Sampled Flow Statistics In Proceedings of the 2003 conference on Applicationstechnologies architectures and protocols for computer communications pages 325ndash336 ACM 2003

[19] Cristian Estan and George Varghese New Directions in Trac Measurement andAccounting volume 32 ACM 2002

[20] Seyed K Fayaz Yoshiaki Tobioka Vyas Sekar and Michael Bailey BohateiFlexible and Elastic DDoS Defense In 24th USENIX Security Symposium (USENIXSecurity 15) pages 817ndash832 Washington DC August 2015 USENIX Association

[21] Nate Foster Rob Harrison Michael J Freedman Christopher Monsanto JenniferRexford Alec Story and David Walker Frenetic A Network ProgrammingLanguage In ACM SIGPLAN Notices volume 46 pages 279ndash291 ACM 2011

[22] Pedro Garcia-Teodoro J Diaz-Verdejo Gabriel Maciaacute-Fernaacutendez and EnriqueVaacutezquez Anomaly-based Network Intrusion Detection Techniques Systemsand Challenges computers amp security 28(1)18ndash28 2009

[23] Arpit Gupta Ruumldiger Birkner Marco Canini Nick Feamster Chris Mac-Stokerand Walter Willinger Network Monitoring As a Streaming Analytics ProblemIn Proceedings of the 15th ACM Workshop on Hot Topics in Networks HotNets rsquo16pages 106ndash112 ACM 2016

[24] DPDK Intel Data Plane Development Kit httpdpdkorg[25] Bob Lantz Brandon Heller and Nick McKeown A Network in a Laptop Rapid

Prototyping for Software-dened Networks In Proceedings of the 9th ACMSIGCOMM Workshop on Hot Topics in Networks Hotnets-IX pages 191ndash196ACM 2010

[26] Boon Thau Loo Tyson Condie Minos Garofalakis David E Gay Joseph MHellerstein Petros Maniatis Raghu Ramakrishnan Timothy Roscoe and IonStoica Declarative Networking CACM 2009

[27] Konstantinos Mamouras Mukund Raghotaman Rajeev Alur Zachary G Ivesand Sanjeev Khanna StreamQRE Modular Specication and Ecient Evaluation

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

of Quantitative Queries over Streaming Data In Proceedings of the 38th ACMSIGPLANConference on Programming Language Design and Implementation ACM2017

[28] Steve McCanne Craig Leres and Van Jacobson Libpcap httpwwwtcpdumporg1989

[29] J Mccauley POX A Python-based Openow Controller 2014[30] Christopher Monsanto Joshua Reich Nate Foster Jennifer Rexford David Walker

et al Composing Software Dened Networks In NSDI pages 1ndash13 2013[31] Masoud Moshref Minlan Yu Ramesh Govindan and Amin Vahdat DREAM

dynamic resource allocation for software-dened measurement In Proceedingsof the 2014 ACM conference on SIGCOMM pages 419ndash430 ACM 2014

[32] Tim Nelson Andrew D Ferguson Michael JG Scheer and Shriram KrishnamurthiTierless Programming and Reasoning for Software-Dened Networks NSDIApr 2014

[33] Vern Paxson Bro A System for Detecting Network Intruders in Real-timeComput Netw 31(23-24)2435ndash2463 December 1999

[34] Martin Roesch et al Snort Lightweight Intrusion Detection for Networks InLISA volume 99 pages 229ndash238 1999

[35] Vyas Sekar Michael K Reiter and Hui Zhang Revisiting the case for a minimalistapproach for network ow monitoring In Proceedings of the 10th ACM SIGCOMM

conference on Internet measurement pages 328ndash341 ACM 2010[36] David Senecal Slow DoS on the rise httpsblogsakamaicom201309

slow-dos-on-the-risehtml[37] Robin Sommer Matthias Vallentin Lorenzo De Carli and Vern Paxson HILTI

An Abstract Execution Environment for Deep Stateful Network Trac AnalysisIn Proceedings of the 2014 Conference on Internet Measurement Conference IMCrsquo14 pages 461ndash474 New York NY USA 2014 ACM

[38] Andreas Voellmy Junchang Wang Y Richard Yang Bryan Ford and Paul HudakMaple Simplifying SDN Programming using Algorithmic Policies In Proceedingsof the ACM SIGCOMM 2013 conference on SIGCOMM pages 87ndash98 ACM 2013

[39] Mea Wang Baochun Li and Zongpeng Li sFlow Towards resource-ecient andagile service federation in service overlay networks In Distributed ComputingSystems 2004 Proceedings 24th International Conference on pages 628ndash635 IEEE2004

[40] Minlan Yu Lavanya Jose and Rui Miao Software Dened Trac Measurementwith OpenSketch In NSDI volume 13 pages 29ndash42 2013

[41] Lihua Yuan Chen-Nee Chuah and Prasant Mohapatra ProgME Towards Pro-grammable Network Measurement IEEEACM Transactions on Networking (TON)19(1)115ndash128 2011

  • Abstract
  • 1 Introduction
  • 2 Overview
    • 21 Motivating Example
    • 22 Alternative Approaches
      • 3 The NetQRE Language
        • 31 Pattern Matching over Streams
        • 32 Conditional Expressions
        • 33 Stream Split
        • 34 Stream Iteration
        • 35 Aggregation over Parameters
        • 36 Stream Composition
          • 4 Use Cases
            • 41 Flow-level Traffic Measurements
            • 42 TCP State Monitoring
            • 43 Application-level Monitoring
              • 5 Compilation Algorithm
                • 51 Compilation of PSRE
                • 52 Compilation of split
                • 53 Compilation of iter
                • 54 Compilation of Aggregation
                • 55 Compilation of Stream Composition
                  • 6 Implementation
                  • 7 Evaluation
                    • 71 Expressiveness
                    • 72 Performance
                    • 73 End-to-end Validation
                    • 74 Summary of Evaluation
                      • 8 Discussion
                      • 9 Related Work
                      • 10 Conclusion
                      • References
Page 6: Quantitative Network Monitoring with NetQREalur/Sigcomm17.pdf · 2017. 6. 26. · In network management today, ... Thau Loo. 2017. Quantitative Network Monitoring with NetQRE. In

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

Filter functions are convenient and used extensively in our usecases As a short-land NetQRE uses filter(p) for the lter functionthat lters packets satisfying the predicate p For example the abovelter function can be abbreviated as filter(is_tcp(c))

Stream composition can also be used to lter packets accordingto timestamps NetQRE builds in two lter functions based ontimestamps The recent(t) function lters the stream in the recent tseconds and the every(t) function periodically lters the stream inevery t seconds For example recent(5)gtgtcount counts the numberof packets in the recent 5 seconds Since the two build-in lterfunctions are not in the core of NetQRE we only allow the useof time-based ltering outside the high-level NetQRE operationsdescribed above

4 USE CASESWe provide use cases ranging from ow-level trac measurementsto complex examples involving application-level content analysisand dynamic updates Since the high-level constructs in NetQRElanguage is based on regular expressions we can only supportqueries for regular trac patterns Nevertheless we note that thiscovers a wide range of useful queries as we will show in this sectionMore examples can be found in our technical report

41 Flow-level Trac MeasurementsWe highlight two measurement tasks that have been proposedrecently as a means to do ow scheduling and attack detectionHeavy hitter Heavy hitters [19] are ows that consume band-width larger than a threshold T A key step to identify a heavy hitteris to count the size of a ow In this example we consider a ow as asource-destination pair A natural way to specify this functionalityis to rst lter all packets from a source x to a destination y andthen count the size of the ltered stream

sfun int hh(IP x IP y) =

filter(srcip=x dstip=y) gtgt count_size

hh rst lters packets based on the source and destinations IP andsecond the ltered stream is piped into count_size which countsthe size of all packets in the stream Due to space limits we do notshow the denition of count_size which is similar to that of count

Second to alert a heavy hitter in real time to the controller wecan use the following program

sfun action alert_hh =

(hh(lastsrcip lastdstip)gtT)

alert(lastsrcip lastdstip)

The function alert_hh checks for every newly received packet (ielast in the stream) that whether its ow reaches the threshold T andissues an alert correspondingly By further applying the time-basedltering every(5)gtgtalert_hh one can detect heavy hitters in every 5secondsSuper spreader A super spreader [40] is the host that contactsmore than k distinct destinations during a time interval Like theuse case of heavy hitter detection a key step is to count how manydistinct destinations a source x contacted The following NetQREprogram does this counting

sfun int ss(IP x) =

sum exist_pair(x y)10 | IP y

The function exist_pair(xy) checks whether the source-destinationpair (xy) appeared in the stream The ss function uses an aggre-gation function sum to aggregate all possible destinations and thusgives the total number of distinct destination addresses x contacted

Similar to heavy hitter detection we can use stream compositionto lter the input stream based on time as well as trac types Forbrevity of the paper we omit the discussion of these functionalities

42 TCP State MonitoringWe next showcase applications that rely on monitoring the stateswithin a TCP owSYN ood attacks In this example NetQRE is used to detect SYNood attacks by counting the number of incomplete TCP hand-shakes in a time interval We consider an incomplete TCP hand-shake as a packet trace consisting of a SYN packet and a SYNACKpacket with corresponding sequence number and acknowledgenumber but does not have a further ACK packet to complete thehandshake As the key step the following program counts thenumber of incomplete handshakes assuming that the input streamconsists of TCP packets between the same source and destination

sfun int bad_tcp_pat(int x int y) =

concat(syn(x) synack(yx+1) no_ack(y+1))

sfun int incomplete_handshake_num =

sumbad_tcp_pat(xy)1 | int x int y

The function bad_tcp_pat species the pattern of an incompletehandshake there is a SYN packet with a sequence number x in thestream followed by a SYNACK packet with acknowledge numberx+1 and sequence number y and then followed by packets that donot include an ACK with acknowledge number y+1 to complete thehandshake The sum aggregation in incomplete_handshake_num sums upthe number of appearances of such patterns for all x and y Usingthe stream composition of ltering functions based on time andpacket type the complete function can be specied as follows

sfun int syn_flood(Conn c) =

recent (5) gtgt

filter_tcp(c) gtgt

incomplete_handshake_num gtTblock(csrcip)

Completed ows Our next example counts the number of legiti-mate ows that are completed delineated by a SYN at the beginningand ending with a FIN Note that the last iter uses the regular ex-pression which matches a stream ending with a SYN and a FINpackets and no SYN-FIN pairs appear before Therefore the iter

expression splits the entire (ltered) stream into sessions whereeach session contains exactly one complete ow

sfun int count_flow(Conn c) =

filter_tcp(c) gtgt

filter(flag=SYN || flag=FIN) gtgt

iter ([fin =1][ syn =1][ syn =1][ fin =1]1 sum)

Slowloris attacks We describe a detection of the aforementionedSlowloris attack in NetQRE based on computing the average rateof packet transferring in all legitimate TCP connections shownbelow

sfun double conn_rate = iter(tcp_patternrate sum)sfun double avg_rate =

sumfilter_tcp(c) gtgt conn_rate | Conn c

sumcount_flow(c) | Conn c

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

First to compute the rate of a legitimate TCP connection we cansimply specify the pattern of a legitimate TCP connection usingRE and compute its transferring rate (the rate function can bespecied easily and thus not shown here) Then we can iterateall TCP connections given a 5-tuple and compute the aggregatedtransferring rate as shown in the function conn_rate

Second we use sum operation to aggregate transferring rate acrosstrac on all 5-tuples Dividing the aggregated rate by the numberof ows computed using the function dened above gives us theaverage transferring rate over all connections as shown in avg_rate

43 Application-level MonitoringOur nal example revisits the Voice-over-IP (VoIP) usage usecasein sect2

Let us rst focus on the function to monitor the usage of a VoIPcall based on a SIP connection sip_conn and media connection m_connAs introduced in sect2 a VoIP call consists of three phases and amodular programming way is to handle the three phases usingthree stream functions and then compose them using the split

operator The function is shown belowsfun int usage_per_call(Conn sip_conn Conn m_conn

string user string id) =

split(init(iduser m_conn)0call(m_conn)count_size

end(id)0 sum)

Here init call and end are simply the PSREs that capture the pat-terns of each phase Note that the init and the end phase return avalue 0 and only the usage of the call phase is counted as required

Next a programmer can view the trac using the SIP connectionsip_conn and the media connection m_conn as a sequence of callsUsing the iter operator the programmer can easily aggregate theusage of all calls in the trac as shown below

sfun int usage_per_conn(Conn sip_conn Conn m_conn

string user string id) =

filter_sip(sip_conn m_conn user id) gtgt

iter(split(any0 usage_per_call(sip_conn m_conn

user id) sum)sum)

Note that we rst lter all trac that belongs to sip_conn or m_conn inorder to get the stream of interest This ltering may result in non-VoIP trac between two VoIP calls and thus we simply compose thePSRE any with usage_per_call inside the iter expression to handleany trac between two calls

Now we can aggregate the usage for each user across multipleconnections and calls using the aggregation operation and furthercompute the average usage over all users as shown below

sfun int usage_per_user(string user) =

sumusage_per_conn(sip_conn m_conn user id) | Connsip_conn Conn m_conn string id

sfun int average_usage =

avgusage_per_user(u) | string u

Suppose we need to implement a quantitative network policythat sends an alert to the controller if a userrsquos usage is larger thanve times of average usage we can simply specify the policy asbelow

sfun action alert_high_usage(string user) =

(usage_per_user(user) gt 5 average_usage) alert(user

)

5 COMPILATION ALGORITHMWe next describe how to compile a NetQRE program into an e-cient executable that can run with low memory footprint Typicallythe stream of packets is very large thus it is not feasible to store theentire stream of packets Therefore the compiled program shouldevaluate incrementally on each single incoming packet withoutstoring the history of the stream To achieve this goal we have toaddress the following challenges First the compiled program needsto maintain parameters in NetQRE in a succinct way Second inorder to implement the semantics of split and iter these operationshave to be performed in a single online pass over streaming packetsLastly the compiled program needs to handle aggregation expres-sions over parameters In the following subsections we describethe compilation of selected NetQRE expressions

51 Compilation of PSREFirst we describe the algorithm for compiling a PSRE as the basicbuilding block for stream functions Similar to traditional RE aPSRE can be translated to an equivalent nite state machine whichwe refer to as parameterized symbolic automaton (PSA) For exampleconsider the following PSRE [srcip=x] Intuitively this PSREchecks whether there exists a packet with source IP x in the streamIts corresponding PSA is shown in Fig 5

Figure 5 An Example PSA

However PSA cannot be directly updated due to its uninstan-tiated parameters To illustrate this challenge consider a naivesolution which is to instantiate the parameters using all possiblevalues and maintain all the instantiated state machine namelysymbolic automata (SA) for each instantiation of the parametersWhen reading an input packet we update all the instantiated SA inthe standard way However it is not hard to note that this approachis not feasible due to the large space of parameters As an examplethe PSA in Fig 5 with a single IP parameter has up to 232 instan-tiated symbolic automata Whereas at runtime the function onlyneeds to store the source addresses that appeared in the streamwhich can be signicantly less than 232

To address this challenge we propose a new algorithm thatupdates PSA on-demand As the high-level idea suppose we instan-tiated a PSA to a set of SA using all possible instantiations Noticethat even though there is a large space of instantiated SA howevermany of them will keep the same states at runtime given a streamof packets As an example consider feeding a rst packet with thesource IP 10001 to all SA instantiated from the PSA in Figure 5 Itis easy to see that the only SA that will transition to state q1 is theone where x is instantiated by 10001 and all other SA will stay atq0

Therefore the compiled program at runtime maintains guardedstates for the PSA where a guarded state is pair (s ϕ) Here ϕ is

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

a predicate of the parameters in the PSA (which we will refer asa guard) and s is a state in PSA The pair (s ϕ) means that whenreading the input stream all instantiated SA will be at state s if theyare instantiated by the values satisfying ϕ For example initiallythere is only one guarded state maintained for the PSA in Figure 5which is (q0 true) meaning that all instantiated SA is at the initialstate q0 of the PSA

To update the PSA the key step is to update each guarded statedynamically when reading packets from the stream The updatingalgorithm for a guarded state is shown in Algorithm 1 This algo-

Algorithm 1 update_psa((s ϕ) packet)

1 for all transitions originated from s do2 let t be the destinate state and P be the parameterized predi-

cate3 let T be the guard instantiated from P using the packet4 let ϕ prime = ϕ andT5 emit (t ϕ prime) if ϕ false6 end for

rithm iterates through all transitions from the state s and for thepredicate (with parameters) P on the transition it instantiates Pusing the current packet it receives in the stream This step simplyreplaces the function name in P by the return value of it on thepacket The obtained predicateT is a predicate over the parametersin the PSA Finally the algorithm emits a new pair (t ϕ prime) if ϕ prime is notfalse Intuitively this step accounts the fact that the instantiatedSA under ϕ prime will transition to the next state t

Using Algorithm 1 the overall updating algorithm is straight-forward Initially the compiled program only contains a guardedstate (q0 true) where q0 is the initial state of the PSA Every timereading a new packet from the stream we call Algorithm 1 on everyguarded state and the new guarded states consists of all the onesemitted from Algorithm 1 To check the state given concrete valuesfor parameters we simply go through all maintained guarded statesand return the state if the guard is satisedExample Using the example in Figure 5 we illustrate the execu-tion of Algorithm 1 given the pair (q0 true) and the packet withsource IP 10001 Consider the transition from q0 to q1 By sub-stituting srcip in the predicate with the source IP of the packetwe get the guard T which is x=10001 Since ϕ is true now ϕ prime issimply x=10001 At the end the algorithm will emit the pair (q1x=10001) Similarly for the other transition the algorithm emit thepair (q0 x=10001) Therefore there are two new guarded statesnamely (q1 x=10001) and (q0 x=10001) The updated guardedstates account for the fact that 10001 has appeared and thus thecorresponding instantiated SA transitioned to state q1

52 Compilation of split

We next discuss how to compile a split expression which is of theform split(f g aggop) First we highlight the key insight from [9]to compile split without parameters and then describe how togeneralize this idea to compile split with parametersWithout parameters The challenge of compiling split is to splitthe stream dynamically at runtime without revisiting the entirehistory of the stream For example in the split example in sect33 we

need to split the stream at the last appearance of the SYN packet Anatural way to implement the semantics of split is to maintain allcases to split the stream For example Fig 6 shows two cases to splitthe stream consisting of a non-SYN and a SYN packet at runtimefor the example in sect33 Moreover the following two observationsensure that the compiled code only uses a constant space First eachcase can be represented using a triple (sf sд F ) where F is a agindicating whether the stream is split and sf (sд resp) is the stateof f (д resp) on evaluating the prex(sux resp) of the streamSecond the number of maintained (distinct) cases is bounded by aconstant only related to д at any time point

Figure 6 Example run of split

With parameters Similar to PSRE compilation in the generalscenario we need to maintain guarded cases That is we maintaina set of (T ϕ) pairs where T is a triple (sf sд F ) correspondingto a case as dened above and ϕ is a guard It means that if theparameters are instantiated by values satisfying ϕ there is a caseof splitting the stream which can be represented as T

Algorithm 2 update_split((T ϕ) packet)

1 suppose T = (sf sд F )2 if F is false then3 for all emitted (s primef ϕ prime) from update_f((sf ϕ) packet) do4 if f is dened on s primef then

5 emit ((s primef s0д true) ϕ prime) s0д is the initial state of g 6 end if7 emit ((s primef sд false) ϕ prime)8 end for9 else

10 for all emitted (s primeд ϕ prime) from update_g((sд ϕ) packet) do11 emit ((sf s primeд true) ϕ prime)12 end for13 end if

Algorithm 2 shows the algorithm to update a pair (T ϕ) Thealgorithm feeds the input packet to f or g corresponding to whetherthe stream has been split as indicated by F For example whenF is false namely the stream has not been split yet in this casethe algorithm need to update f on the packet (line 2-9) In thiscase for each emitted guarded state (s primef ϕ prime) from the update of fa corresponding guarded case is emitted (line 7) Moreover if f isdened on s primef we may split the stream at the current position andthus emit the guarded case ((s primef s0д true) ϕ prime) (line 4-6) Here s0д isthe initial state of g and the ag F is set to true to indicate thatthe stream is split For the else branch from line 10 to 12 in thealgorithm we need to update (sд ϕ) using grsquos updating procedurewhich may emit a set of guarded states of g Correspondingly theguard needs to be rened

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

Given concrete values for parameters evaluating a split expres-sion is as follows First we check if there exist a guarded case ((sf sд F )ϕ) such that ϕ is satised and both f and g are dened on thestates in the case Then we evaluate the two expressions based ontheir states in the case and nally aggregate the evaluated valuesusing aggop If no such guarded cases exist the split expression isnot dened

53 Compilation of iter

We rst review the key ideas of evaluating an iter expression whenthere are no parameters [9] and then describe how to handle pa-rameters

Consider an iter expression iter(f aggop) without any param-eters Similar to split expressions the iter expression needs tomaintain all possible cases of splitting the input stream Thoughthe number of such cases is only determined by f (thus not relevantto the input stream) how to succinctly represent each case is achallenge Specically unlike a split expression which only splitsthe stream into a prex and a sux an iter expression needs tosplit the stream into multiple substreams such that each one canbe applied to the expression f Therefore naively maintaining allthe states of f for each substream may use a large space as thestream grows Fortunately our choice of the aggregation operatorsallows us to summarizes the application of f to all substreams butthe last one using the aggregated return value For example if aggopis sum we only need to remember the aggregated sum on thesesubstreams Thus we can represent a case as a pair (v sf ) wherev is the aggregated value and sf is the state of f on evaluating thelast substream

In the general scenario with parameters we simply maintain aguarded case as ((v sf ) ϕ) The update of a guarded case ((v sf )ϕ)is shown in Algorithm 3 The evaluation of f given guarded casesis similar to that of a split expression

Algorithm 3 update_iter((T ϕ) packet)

1 suppose T = (v sf )

2 for all emitted (s primef ϕ prime) from update_f((sf ϕ) packet) do3 if f is dened on s primef then4 let v prime be the evaluated value of f on s primef5 emit ((v primeprime s0f ) ϕ prime) where v primeprime=aggop (vv prime) and s0f is the

initial state of f

6 end if7 emit ((v s primef ) ϕ prime)8 end for

54 Compilation of AggregationNow we describe the aggregation function aggopf | T x Followingthe denition of the aggregation function an aggregation functionmaintains guarded states (sf ϕ) To update the aggregation expres-sion we just need to update each guarded state using frsquos updatingprocedure To illustrate the evaluation procedure suppose the ag-gregation operator is sum Given values for the parameters we needto iterate through all guarded states (sf ϕ) If ϕ is satised we eval-uate f on sf say the return value is v and count the number of

values of x which satises ϕ say it is k we sum up v lowast k into theaggregated sum For other aggregation operators the evaluationprocess works similarly

55 Compilation of Stream CompositionLastly we discuss stream composition Recall that stream composi-tion allows to process the input stream using a function f and thenapply the second function g on the processed stream Following itsdenition each time a new packet arrives we need to rst processit with f and the returned value (eg a ltered packet) is then fedinto g

Therefore the stream composition expression fgtgtg needs to main-tain guarded states ((sf sд )ϕ) Algorithm 4 shows the algorithm toupdate a guarded state following the intuition described above Theevaluation of the expression is similar to that of previous discussedexpressions

Algorithm 4 update_comp((S ϕ) packet)

1 suppose S = (sf sд )2 for all emitted (s primef ϕ prime) from update_f((sf ϕ)packet) do3 if f is dened on s primef then4 let ret be the returned packet of f on s primef5 emit ((s primef s primeд ) ϕ primeprime) for all emitted (s primeд ϕ primeprime) from

update_g((sд ϕ prime) ret)6 else7 emit ((s primef sд ) ϕ prime)8 end if9 end for

6 IMPLEMENTATIONWe have implemented a prototype of the NetQRE system in C++which consists of two main components namely the compiler forNetQRE and the NetQRE runtimeNetQRE Compiler The NetQRE compiler implements the com-pilation algorithm described in sect5 The compiler rst generates aC++ program from an input NetQRE program which is then com-piled by the gcc compiler into executable Our compiled programuses a tree data structure to represent predicates over parametersThe choice of tree is driven by its simplicity ease of encoding andlookup performance

In addition to the basic compilation algorithm we include ad-ditional optimizations to the compiler First we use the standardminimization algorithm to minimize the state machine for a regularexpression Second for aggregation expressions with sum and avgwe use an incremental updating algorithm to update the expres-sion we maintain the running sum as the state of the aggregationexpression and we update the sum incrementally when the stateof the inner expression is updatedNetQRE Parallelization The NetQRE compiler can optionallycompile the NetQRE specication into parallelized implementationsbased on the instantiation of parameters in the NetQRE specica-tion For the example described in sect 51 a packet with source IPaddress s can be processed by the hash(s )-th thread which handlesthat instantiation of the parameter x

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

NetQRERuntime The NetQRE runtime includes a packet captureagent implemented using the pcap library [28] Each packet thatarrives at the runtime is parsed and then processed by the compiledNetQRE program as shown in Figure 1 Currently our runtimesupports actions that include sending alert events to a controllerand directly installing a rule on a switch The NetQRE runtime isnot specically optimized for fast packet capture and processingand as future work we plan to explore the use of DPDK [24] Ourcurrent implementation analyzes every packet though orthogonalsampling techniques can be also explored in future

7 EVALUATIONWe evaluate our NetQRE prototype centered around answeringthree key questions First can the NetQRE language express a widerange of quantitative network monitoring applications in a conciseand intuitive manner Second is the generated code ecient interms of throughput and memory footprint Third can NetQRE beused in a real-time monitoring setting that provides rapid mitigationbased on output of quantitative network monitoring applicationsUnless otherwise specied all our experiments are carried out ona cluster of commodity servers each of which has 10-core 26GHzXeon E5-2660V3 Each core has a 256KB L2 cache and shared 25MB L3 cache The memory size is 64 GB The OS is Ubuntu 1404and the kernel version is 420

71 ExpressivenessWe have implemented a set of quantitative network monitoring ap-plications using NetQRE The examples are drawn from a literaturesurvey on quantitative network monitoring applications [13 18 3540 41] Table 1 summarizes the example applications in NetQREthat we use as well as and the lines of code of each NetQRE pro-gram

LoCHeavy Hitter (sect41) 6Super Spreader (sect41) 2Entropy Estimation [40] 6Flow size dist [18] 8Trac change detection [35] 10Count trac [40] 2Completed ows (sect42) 6SYN ood detection (sect42) 9Slowloris detection (sect42) 12Lifetime of connection 8Newly opened connection recently 11 duplicated ACKs 5 VoIP call 7VoIP usage (sect43) 18Key word counting in emails 11DNS tunnel detection [12] 4DNS amplication [20] 4

Table 1 Examplemonitoring applicationsNetQRE supports

We make the following observations First NetQRE is able toexpress a large variety of quantitative network monitoring applica-tions ranging from ow-level trac measurement to application-level quantitative monitoring We validate the correctness of ourapplications by running them and comparing their output withhand-crafted implementations Second programming in NetQRE isconcise All example programs can be specied within 18 lines ofcode in NetQRE This count includes commonly used lter functionsas well as regular expressions which can be built into a library forreference As an interesting comparison we encoded the VoIP calluse case in Bro [33] a well-known intrusion detection system Brorequired 51 lines of code as compared to only 7 in NetQRE How-ever Bro is unable to easily support the VoIP usage case written inNetQRE In particular Bro cannot use patterns to separate packetsinto dierent VoIP sessions for the purposes of counting bandwidthconsumption per session This is not surprising as Brorsquos primaryuse is that of an intrusion detection system We validated the cor-rectness of our implementation on actual SIP trac by comparingBrorsquos output against NetQRErsquos

72 PerformanceOur next set of experiments evaluate the performance of NetQRErsquoscompiled implementations NetQRE runs on a single machine basedon the setup described earlier We use a single core to measureNetQRErsquos performance As a point of comparison we compare eachNetQRE program against an equivalent carefully hand-crafted C++implementation We note that all our hand-crafted implementationsrequire at least 100 lines of code and often times require users toexplicit manage internal state across packets ndash a programming taskthat NetQRE abstracts away

Fig 7 shows the performance of NetQRE implementations (NetQREin the gure) compared with manually optimized implementations(Baseline in the gure) that we spent days optimizing Our examplesinclude heavy hitter super spreader entropy SYN ood detectioncompleted ows count and Slowloris use cases presented in sect4 Asworkload we use a one-minute CAIDA trac trace [7] capturedon a high-speed Internet backbone link which contains 37 millionpackets (ie about 620k packetssec) We replay the trac traceto the implementations Since the trac trace does not containpayloads we report the throughput in million packets per second(MPPS)

We make the following observations First the NetQRE imple-mentations are able to achieve line-rate processing speed withoutbeing the bottleneck In particular for three out of six applicationsthe throughput of NetQRE implementations is about 20MPPS usinga single thread The average packet size of our input trac traceis 888B For a 10Gbps network 14M packets of size 888B can beforwarded per second which is well below the throughput achievedby NetQRE The throughput can be further improved by paralleliza-tion (presented later in this section) Second the compiled NetQREimplementation incurs negligible overhead compared with the man-ually optimized low-level implementation The dierence betweenthe throughput of the compiled NetQRE implementation and thatof the manual implementation is within 9 across all use cases westudied Third NetQRE incurs slightly increase (up to 60) in terms

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

05

10152025

Thro

ughp

ut (M

PPS)

Baseline NetQRE OpenSketch

(a) Throughput

1

10

100

1000

Mem

ory

(MB

) Baseline NetQRE OpenSketch

(b) Memory

Figure 7 Performance comparison

0

10

20

30

40

0 2 4 6 8

Thro

ughp

ut (M

PPS)

Number13 of13 Threads

super spreadersyn floodslowloris

Figure 8 Throughput with paral-lelization

of memory footprint (Fig 7b) compared with the manually opti-mized implementation However the overall memory footprint ofNetQRE remains low We have also run the experiments on anotherCAIDA trac [2] and the comparison result is similarComparison with OpenSketch In addition to comparisons withour manually optimized code we also compare with OpenSketch [40]a popular trac measurement tool based on sketching techniquesGiven OpenSketchrsquos limitation in handling stateful monitoringtasks we only compare the performance of NetQRE implementa-tions on the heavy hitter and super spreader examples with thatof OpenSketch (in software) We use the same CAIDA trac traceas before and follow the default setting of the reference code ofOpenSketch [4] for the two examples The NetQRE compiled imple-mentations signicantly outperform OpenSketch implementationsThe throughput of NetQRE compiled code is 11times and 18times as thatof OpenSketch (Fig 7a) As OpenSketch focuses on optimizingmemory usage it is not surprising that OpenSketch uses less mem-ory than NetQRE implementations Nevertheless we observe thatNetQRE uses only 11 more memory on the super spreader examplethan that of OpenSketch (Fig 7b) Note that we use an open-sourceversion of OpenSketch software that runs on a similar x86 machineas NetQRE in order to have an ldquoapples-to-applesrdquo comparison onsimilar hardware NetQRE may also exploit similar optimizationsperformed by OpenSketch in order to get higher performance ona NetFPGA Exploiting a hardware platform such as NetFPGA inorder to further improve NetQRE performance is an interestingavenue for future workComparison with Bro We further compare with Bro Given thelimitations of Bro in handling the complete VoIP use cases we sim-plify the use case to simply counting the number of VoIP calls (asopposed to bandwidth consumption) per user NetQRE compiledimplementation can nish counting within 1 second while Brotakes about 23 seconds for a SIP trac trace containing 4338 VoIPcalls Both programs output the same results for this use case Webelieve there are at least two reasons for Brorsquos slower performanceFirst Bro is designed for intrusion detection and not aimed atand optimized for monitoring tasks Second unlike NetQRE whichcompiles high-level code to ecient low-level code Bro uses aninterpreter to execute the script written in Brorsquos language whichcould result in considerable overheadParallelization speedup NetQRE compiler automatically com-piles parallelized implementations as described in sect6 In the con-sidered examples parallelization involves sending ows of packets

to dierent NetQRE instances running on dierent threads Fig 8plots the throughput of the NetQRE implementations with dierentnumbers of threads for the super spreader syn ood and slowlorisdetection examples In all examples the NetQRE parallelized codeis able to achieve at least 39times speedup with 8 threads Note thatlinear speedup is not achieved as the trac (in our nite trace) isnot uniformly distributed across all 8 threads This is an artifactof the trace data itself When we include the overhead of the soft-ware load balancer that dispatches packet ows to dierent threadsthe NetQRE implementations achieve at least 26times speedup for allexamples This overhead can be mitigated with a hardware load-balancer Overall we observe that NetQRErsquos throughput is morethan sucient to handle line-rate trac and the NetQRE runtimeitself is not the bottleneck

73 End-to-end ValidationIn the nal set of experiments we validate the end-to-end use ofNetQRE for enforcing quantitative network policies which appliesactions to quantitative network monitoring programs We set upa network of two clients C1 and C2 one server S and one SDNswitch that mirrors trac to a NetQRE runtime running the SYNood detector presented in sect42 In addition a NetQRE controllerbased on POX [29] controls the switch The network is emulatedusing Mininet [25] with link bandwidth set to 100Mbps

In the experimentC1 sends normal trac to S using iperf at rate1Mbps and at the 7th second C2 starts SYN ood attack generatedusing our generator to the same server The attack is detected byNetQRE which generates an alert event to the controller whichsubsequently installs a forwarding rule on the switch to block tracfrom C2 To illustrate the attack detection and blocking Figure 9a(left gure) shows the bandwidth utilization (Mbps) on the serverWe observe that NetQRE successfully blocks C2rsquos attacking tracin real-time

As a second experiment using a similar setup our NetQRE heavyhitter program (sect41) running at a switch-side NetQRE runtimemonitors the trac over a sliding window of 5 seconds and issuesan alert to the controller when detecting a heavy hitter whichfurther blocks the trac As a point of comparison we compareour approach against two alternatives (1) sending all packets to thecontroller which runs an equivalent heavy hitter detection program(2) monitoring the ow counters on the switch to detect heavyhitters as considered in other languages [30] and systems [31] Inthe second alternative the controller reads the counter every 1

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

0 2 4 6 8 10 12 14 16Time (sec)

0

1

2

3

4

5

6B

andw

idth

(Mbp

s)NetQRE

(a) SYN ood

0 2 4 6 8 10 12 14 16Time (sec)

0

2

4

6

8

10

12

14

Ban

dwid

th (M

bps)

NetQRE

stats

forward

(b) Heavy hitter

0 20 40 60 80 100Time (sec)

0

1

2

3

4

5

6

7

Ban

dwid

th (M

bps)

NetQRE

(c) VoIP

Figure 9 Bandwidth utilization (Mbps) at the server

second and compute the bandwidth usage for each ow over asimilar 5 seconds window

Figure 9b (middle gure) plots the bandwidth utilization of theserver The above alternative solutions are labeled as forward andstats respectively Compared with these two approaches our pro-posed approach can detect the heavy hitter and further respond toit in a more timely fashion Moreover compared with the forwardapproach we send signicantly fewer trac to the controller andis a more scalable solution

In our nal experiment we validate the VoIP use case rst pre-sented in our introduction We use a synthetic VoIP trac tracegenerated using SIPp [5] replay a VoIP call (with accompanyingH264 MPEG video) made from a single user via client C2 We re-play the trac at 5Mbps and run iperf on C1 as the backgroundtrac Our NetQRE program enforces a policy that blocks a callafter the user making the call exceeds a SIP bandwidth usage of1875MB (around 30 seconds) We congure the NetQRE programto send an alert to the NetQRE controller which then blocks thetrac Figure 9c (right gure) plots the bandwidth utilization of theserver Again this veries that NetQRE can be used to intercept aSIP session monitor usage on a per-user basis and react to highusage in a timely fashion

74 Summary of EvaluationRevisiting the questions at the beginning of our evaluation weobserve that (1) NetQRE can express a wide range of quantita-tive monitoring applications with a few lines of code (2) achievesline-rate throughput performance which is comparable to that ofcarefully hand-crafted optimized low-level code while signicantlyoutperforms other measurement and IDS tools and (3) can be usedin an SDN setting to monitor network trac and update switchesin real-time in response to specied NetQRE policies

8 DISCUSSIONIn this section we discuss some of NetQRErsquos limitations and alsosome future work

Expressiveness As a language focusing on monitoring policyNetQRE cannot specify general network policies such as servicechaining policies and trac engineering policies As an interesting

future work it might be possible to extend NetQRErsquos actions insupport of more complex policies

Given NetQRErsquos basis on regular expressions NetQRE is notsuited for monitoring tasks that can not be specied using regulartrac patterns Though this is a fundamental limitation of NetQRErsquosexpressiveness we believe that the high-level constructs in NetQREprovides a natural programming abstraction to write tasks withhierarchical regular structures and NetQRE can specify a widerange of monitoring tasks as shown in sect4

Handling packet lossreorderingretransmission For the perfor-mance consideration NetQRE does not buer packets in the streambut processes packets in a streaming fashion Thus if packets inthe stream are lost reordered or retransmitted the query speciedin NetQRE might not be evaluated correctly since the internal statemay transition to a wrong one Current design of NetQRE relies onthe runtime to handle packet loss reordering and retransmission

Network-widemonitoring For the similar reasons discussed abovecurrent deployment model of NetQRE is based on a centralize fash-ion in order to receive all (ordered) packet in the stream of interestFor example a ow counting task has to be deployed at a placewhere all packets in a ow can traverse This is not a fundamentallimitation as one can easily deploy NetQRE in a distributed fashionprovided that the query is insensitive to packet reordering in thestream We leave it as future work to study the decomposition of acentralized NetQRE program into distributed queries

9 RELATEDWORKQuantitative regular expressions NetQRE is inspired by quan-titative regular expressions (QRE) [9] which provides a theoreticalfoundation for regular programming of data streams NetQRE ex-tends QRE in the following ways First NetQRE introduces param-eters in support of queries handling unknown values (eg countingdistinct sources) Second NetQRE proposes aggregation operatorsto compute aggregated values across parameters It is shown insect4 that both extensions are essential to specify a wide range ofinteresting network monitoring policies

StreamQRE [27] is another recently proposed extension to QRENetQRE is dierent from StreamQRE in several aspects First NetQREis specialized to networking applications while StreamQRE focuseson database application Second StreamQRE allows relations as

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

types and supports relational operations such as join and key-basedpartitioning These operations cannot be evaluated eciently in astreaming fashion in general NetQRE instead uses parameters andaggregation over parameters with a focus on ecient evaluationIntrusion detection systems and protocol analyzers Thereare a number of key dierences between NetQRE and intrusiondetection systems (IDS) and protocol analyzers First systems suchas Snort [34] allow regular expression matches on single packetsbut is not perform regular expression matches that span multiplepackets let alone track sessions across packets While Bro [33] sup-ports aspects of NetQRE the scripting language requires complexprocedural programming where one has to maintain states anddata structures as opposed to NetQRErsquos declarative programmingapproach with high-level constructs over packet streams As anIDS Bro does not lend itself naturally to handle some quantitativemonitoring use cases eg the VoIP usage example While HILTI [37]uses Bro and oers abstract machine models for trac analysis itrequires the operator to implement low-level state machinesDomain specic languages in networking There are severalrecent proposal on domain specic languages on networks whichincludes Frenetic [21] Pyretic [30] NetKAT [10] Flowlog [32]Maple [38] Network Datalog [26] To the best of our knowledgenone of these systems support monitoring queries beyond basicow-level counters provided by OpenFlow

Recently proposed SNAP [11] and Sonata [23] oer abstrac-tions for trac monitoring over packet streams but in a morelow-level procedural or SQL-like way NetQRE oers a complemen-tary abstraction on quantitative queries It may be interesting toincorporate NetQRE with these languages in support for complexquantitative queriesStreaming database languagesDatabase-style query languages [1416] provide SQL-like language support for running continuousqueries over data streams While there are constructs to do aggrega-tion over sliding windows they are designed with simple relationalqueries over packet headers or packet counts and cannot handlethe complex queries based on trac patterns that NetQRE supports

10 CONCLUSIONThis paper presents NetQRE a high-level declarative language andpractical toolkit for quantitative network monitoring NetQRE of-fers an intuitive programming model using regular expressionsover streams of packets and can naturally express ow-level aswell as application-level quantitative policies We developed a com-pilation algorithm that can compile NetQRE programs into ecientimperative programs Our evaluation results demonstrate the ex-pressiveness of NetQRE in support a wide range of quantitativenetwork monitoring applications its ability to match carefully opti-mized manually written low level code and outperform equivalentimplements in Bro and OpenSketch A proof-of-concept scenariowith an SDN controller and switches demonstrate NetQRErsquos abilityto support timely and ecient enforcement of quantitative networkpolicies

ACKNOWLEDGEMENTWe would like to thank our shepherd Arjun Guha for his helpfulcomments We also want to thank the anonymous reviewers for

their insightful feedbacks on this work This research was partiallysupported by NSF Expeditions in Computing award CCF 1138996NSF CNS-1513679 and NSF CNS-1218066

REFERENCES[1] Application Layer Packet Classier for Linux httpwwwmcafeecomus

productsnetwork-security-platformaspx[2] CAIDA Trac Trace httpsdatacaidaorgdatasets securityddos-20070804 [3] McAfee Network Security Platform http l7-ltersourceforgenet [4] OpenSketch reference code httpsgithubcomUSC-NSLopensketch[5] SIPp http sippsourceforgenet [6] SSL renegotiation DoS httpswwwietforgmail-archivewebtlscurrent

msg07553html[7] Anonymized 2015 Internet Traces httpsdatacaidaorgdatasetspassive-2015

2015[8] Mohammad Al-Fares Sivasankar Radhakrishnan Barath Raghavan Nelson

Huang and Amin Vahdat Hedera Dynamic Flow Scheduling for Data CenterNetworks In NSDI volume 10 pages 19ndash19 2010

[9] Rajeev Alur Dana Fisman and Mukund Raghothaman Regular Programmingfor Quantitative Properties of Data Streams In 25th European Symposium onProgramming ESOP 2016

[10] Carolyn Jane Anderson Nate Foster Arjun Guha Jean-Baptiste Jeannin DexterKozen Cole Schlesinger and David Walker NetKAT Semantic foundations fornetworks In Proceedings of the 41st annual ACM SIGPLAN-SIGACT symposiumon Principles of programming languages pages 113ndash126 ACM 2014

[11] Mina Tahmasbi Arashloo Yaron Koral Michael Greenberg Jennifer Rexford andDavid Walker Snap Stateful network-wide abstractions for packet processingIn Proceedings of the 2016 ACM SIGCOMM Conference SIGCOMM rsquo16 pages29ndash43 New York NY USA 2016 ACM

[12] Kevin Borders Jonathan Springer and Matthew Burnside Chimera A declarativelanguage for streaming network trac analysis In Proceedings of the 21st USENIXConference on Security Symposium Securityrsquo12 pages 19ndash19 Berkeley CA USA2012 USENIX Association

[13] Varun Chandola Arindam Banerjee and Vipin Kumar Anomaly Detection ASurvey ACM computing surveys (CSUR) 41(3)15 2009

[14] Sirish Chandrasekaran Owen Cooper Amol Deshpande Michael J FranklinJoseph M Hellerstein Wei Hong Sailesh Krishnamurthy Samuel R MaddenFred Reiss and Mehul A Shah TelegraphCQ Continuous Dataow ProcessingIn Proceedings of the 2003 ACM SIGMOD International Conference on Managementof Data SIGMOD rsquo03 pages 668ndash668 New York NY USA 2003 ACM

[15] Benoit Claise Cisco systems NetFlow services export version 9 2004[16] Chuck Cranor Theodore Johnson Oliver Spataschek and Vladislav Shkapenyuk

Gigascope A Stream Database for Network Applications In Proceedings of the2003 ACM SIGMOD International Conference on Management of Data SIGMODrsquo03 pages 647ndash651 New York NY USA 2003 ACM

[17] Luca Deri Open source VoIP trac monitoring In Proceedings of SANE volume2006 2006

[18] Nick Dueld Carsten Lund and Mikkel Thorup Estimating Flow Distributionsfrom Sampled Flow Statistics In Proceedings of the 2003 conference on Applicationstechnologies architectures and protocols for computer communications pages 325ndash336 ACM 2003

[19] Cristian Estan and George Varghese New Directions in Trac Measurement andAccounting volume 32 ACM 2002

[20] Seyed K Fayaz Yoshiaki Tobioka Vyas Sekar and Michael Bailey BohateiFlexible and Elastic DDoS Defense In 24th USENIX Security Symposium (USENIXSecurity 15) pages 817ndash832 Washington DC August 2015 USENIX Association

[21] Nate Foster Rob Harrison Michael J Freedman Christopher Monsanto JenniferRexford Alec Story and David Walker Frenetic A Network ProgrammingLanguage In ACM SIGPLAN Notices volume 46 pages 279ndash291 ACM 2011

[22] Pedro Garcia-Teodoro J Diaz-Verdejo Gabriel Maciaacute-Fernaacutendez and EnriqueVaacutezquez Anomaly-based Network Intrusion Detection Techniques Systemsand Challenges computers amp security 28(1)18ndash28 2009

[23] Arpit Gupta Ruumldiger Birkner Marco Canini Nick Feamster Chris Mac-Stokerand Walter Willinger Network Monitoring As a Streaming Analytics ProblemIn Proceedings of the 15th ACM Workshop on Hot Topics in Networks HotNets rsquo16pages 106ndash112 ACM 2016

[24] DPDK Intel Data Plane Development Kit httpdpdkorg[25] Bob Lantz Brandon Heller and Nick McKeown A Network in a Laptop Rapid

Prototyping for Software-dened Networks In Proceedings of the 9th ACMSIGCOMM Workshop on Hot Topics in Networks Hotnets-IX pages 191ndash196ACM 2010

[26] Boon Thau Loo Tyson Condie Minos Garofalakis David E Gay Joseph MHellerstein Petros Maniatis Raghu Ramakrishnan Timothy Roscoe and IonStoica Declarative Networking CACM 2009

[27] Konstantinos Mamouras Mukund Raghotaman Rajeev Alur Zachary G Ivesand Sanjeev Khanna StreamQRE Modular Specication and Ecient Evaluation

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

of Quantitative Queries over Streaming Data In Proceedings of the 38th ACMSIGPLANConference on Programming Language Design and Implementation ACM2017

[28] Steve McCanne Craig Leres and Van Jacobson Libpcap httpwwwtcpdumporg1989

[29] J Mccauley POX A Python-based Openow Controller 2014[30] Christopher Monsanto Joshua Reich Nate Foster Jennifer Rexford David Walker

et al Composing Software Dened Networks In NSDI pages 1ndash13 2013[31] Masoud Moshref Minlan Yu Ramesh Govindan and Amin Vahdat DREAM

dynamic resource allocation for software-dened measurement In Proceedingsof the 2014 ACM conference on SIGCOMM pages 419ndash430 ACM 2014

[32] Tim Nelson Andrew D Ferguson Michael JG Scheer and Shriram KrishnamurthiTierless Programming and Reasoning for Software-Dened Networks NSDIApr 2014

[33] Vern Paxson Bro A System for Detecting Network Intruders in Real-timeComput Netw 31(23-24)2435ndash2463 December 1999

[34] Martin Roesch et al Snort Lightweight Intrusion Detection for Networks InLISA volume 99 pages 229ndash238 1999

[35] Vyas Sekar Michael K Reiter and Hui Zhang Revisiting the case for a minimalistapproach for network ow monitoring In Proceedings of the 10th ACM SIGCOMM

conference on Internet measurement pages 328ndash341 ACM 2010[36] David Senecal Slow DoS on the rise httpsblogsakamaicom201309

slow-dos-on-the-risehtml[37] Robin Sommer Matthias Vallentin Lorenzo De Carli and Vern Paxson HILTI

An Abstract Execution Environment for Deep Stateful Network Trac AnalysisIn Proceedings of the 2014 Conference on Internet Measurement Conference IMCrsquo14 pages 461ndash474 New York NY USA 2014 ACM

[38] Andreas Voellmy Junchang Wang Y Richard Yang Bryan Ford and Paul HudakMaple Simplifying SDN Programming using Algorithmic Policies In Proceedingsof the ACM SIGCOMM 2013 conference on SIGCOMM pages 87ndash98 ACM 2013

[39] Mea Wang Baochun Li and Zongpeng Li sFlow Towards resource-ecient andagile service federation in service overlay networks In Distributed ComputingSystems 2004 Proceedings 24th International Conference on pages 628ndash635 IEEE2004

[40] Minlan Yu Lavanya Jose and Rui Miao Software Dened Trac Measurementwith OpenSketch In NSDI volume 13 pages 29ndash42 2013

[41] Lihua Yuan Chen-Nee Chuah and Prasant Mohapatra ProgME Towards Pro-grammable Network Measurement IEEEACM Transactions on Networking (TON)19(1)115ndash128 2011

  • Abstract
  • 1 Introduction
  • 2 Overview
    • 21 Motivating Example
    • 22 Alternative Approaches
      • 3 The NetQRE Language
        • 31 Pattern Matching over Streams
        • 32 Conditional Expressions
        • 33 Stream Split
        • 34 Stream Iteration
        • 35 Aggregation over Parameters
        • 36 Stream Composition
          • 4 Use Cases
            • 41 Flow-level Traffic Measurements
            • 42 TCP State Monitoring
            • 43 Application-level Monitoring
              • 5 Compilation Algorithm
                • 51 Compilation of PSRE
                • 52 Compilation of split
                • 53 Compilation of iter
                • 54 Compilation of Aggregation
                • 55 Compilation of Stream Composition
                  • 6 Implementation
                  • 7 Evaluation
                    • 71 Expressiveness
                    • 72 Performance
                    • 73 End-to-end Validation
                    • 74 Summary of Evaluation
                      • 8 Discussion
                      • 9 Related Work
                      • 10 Conclusion
                      • References
Page 7: Quantitative Network Monitoring with NetQREalur/Sigcomm17.pdf · 2017. 6. 26. · In network management today, ... Thau Loo. 2017. Quantitative Network Monitoring with NetQRE. In

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

First to compute the rate of a legitimate TCP connection we cansimply specify the pattern of a legitimate TCP connection usingRE and compute its transferring rate (the rate function can bespecied easily and thus not shown here) Then we can iterateall TCP connections given a 5-tuple and compute the aggregatedtransferring rate as shown in the function conn_rate

Second we use sum operation to aggregate transferring rate acrosstrac on all 5-tuples Dividing the aggregated rate by the numberof ows computed using the function dened above gives us theaverage transferring rate over all connections as shown in avg_rate

43 Application-level MonitoringOur nal example revisits the Voice-over-IP (VoIP) usage usecasein sect2

Let us rst focus on the function to monitor the usage of a VoIPcall based on a SIP connection sip_conn and media connection m_connAs introduced in sect2 a VoIP call consists of three phases and amodular programming way is to handle the three phases usingthree stream functions and then compose them using the split

operator The function is shown belowsfun int usage_per_call(Conn sip_conn Conn m_conn

string user string id) =

split(init(iduser m_conn)0call(m_conn)count_size

end(id)0 sum)

Here init call and end are simply the PSREs that capture the pat-terns of each phase Note that the init and the end phase return avalue 0 and only the usage of the call phase is counted as required

Next a programmer can view the trac using the SIP connectionsip_conn and the media connection m_conn as a sequence of callsUsing the iter operator the programmer can easily aggregate theusage of all calls in the trac as shown below

sfun int usage_per_conn(Conn sip_conn Conn m_conn

string user string id) =

filter_sip(sip_conn m_conn user id) gtgt

iter(split(any0 usage_per_call(sip_conn m_conn

user id) sum)sum)

Note that we rst lter all trac that belongs to sip_conn or m_conn inorder to get the stream of interest This ltering may result in non-VoIP trac between two VoIP calls and thus we simply compose thePSRE any with usage_per_call inside the iter expression to handleany trac between two calls

Now we can aggregate the usage for each user across multipleconnections and calls using the aggregation operation and furthercompute the average usage over all users as shown below

sfun int usage_per_user(string user) =

sumusage_per_conn(sip_conn m_conn user id) | Connsip_conn Conn m_conn string id

sfun int average_usage =

avgusage_per_user(u) | string u

Suppose we need to implement a quantitative network policythat sends an alert to the controller if a userrsquos usage is larger thanve times of average usage we can simply specify the policy asbelow

sfun action alert_high_usage(string user) =

(usage_per_user(user) gt 5 average_usage) alert(user

)

5 COMPILATION ALGORITHMWe next describe how to compile a NetQRE program into an e-cient executable that can run with low memory footprint Typicallythe stream of packets is very large thus it is not feasible to store theentire stream of packets Therefore the compiled program shouldevaluate incrementally on each single incoming packet withoutstoring the history of the stream To achieve this goal we have toaddress the following challenges First the compiled program needsto maintain parameters in NetQRE in a succinct way Second inorder to implement the semantics of split and iter these operationshave to be performed in a single online pass over streaming packetsLastly the compiled program needs to handle aggregation expres-sions over parameters In the following subsections we describethe compilation of selected NetQRE expressions

51 Compilation of PSREFirst we describe the algorithm for compiling a PSRE as the basicbuilding block for stream functions Similar to traditional RE aPSRE can be translated to an equivalent nite state machine whichwe refer to as parameterized symbolic automaton (PSA) For exampleconsider the following PSRE [srcip=x] Intuitively this PSREchecks whether there exists a packet with source IP x in the streamIts corresponding PSA is shown in Fig 5

Figure 5 An Example PSA

However PSA cannot be directly updated due to its uninstan-tiated parameters To illustrate this challenge consider a naivesolution which is to instantiate the parameters using all possiblevalues and maintain all the instantiated state machine namelysymbolic automata (SA) for each instantiation of the parametersWhen reading an input packet we update all the instantiated SA inthe standard way However it is not hard to note that this approachis not feasible due to the large space of parameters As an examplethe PSA in Fig 5 with a single IP parameter has up to 232 instan-tiated symbolic automata Whereas at runtime the function onlyneeds to store the source addresses that appeared in the streamwhich can be signicantly less than 232

To address this challenge we propose a new algorithm thatupdates PSA on-demand As the high-level idea suppose we instan-tiated a PSA to a set of SA using all possible instantiations Noticethat even though there is a large space of instantiated SA howevermany of them will keep the same states at runtime given a streamof packets As an example consider feeding a rst packet with thesource IP 10001 to all SA instantiated from the PSA in Figure 5 Itis easy to see that the only SA that will transition to state q1 is theone where x is instantiated by 10001 and all other SA will stay atq0

Therefore the compiled program at runtime maintains guardedstates for the PSA where a guarded state is pair (s ϕ) Here ϕ is

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

a predicate of the parameters in the PSA (which we will refer asa guard) and s is a state in PSA The pair (s ϕ) means that whenreading the input stream all instantiated SA will be at state s if theyare instantiated by the values satisfying ϕ For example initiallythere is only one guarded state maintained for the PSA in Figure 5which is (q0 true) meaning that all instantiated SA is at the initialstate q0 of the PSA

To update the PSA the key step is to update each guarded statedynamically when reading packets from the stream The updatingalgorithm for a guarded state is shown in Algorithm 1 This algo-

Algorithm 1 update_psa((s ϕ) packet)

1 for all transitions originated from s do2 let t be the destinate state and P be the parameterized predi-

cate3 let T be the guard instantiated from P using the packet4 let ϕ prime = ϕ andT5 emit (t ϕ prime) if ϕ false6 end for

rithm iterates through all transitions from the state s and for thepredicate (with parameters) P on the transition it instantiates Pusing the current packet it receives in the stream This step simplyreplaces the function name in P by the return value of it on thepacket The obtained predicateT is a predicate over the parametersin the PSA Finally the algorithm emits a new pair (t ϕ prime) if ϕ prime is notfalse Intuitively this step accounts the fact that the instantiatedSA under ϕ prime will transition to the next state t

Using Algorithm 1 the overall updating algorithm is straight-forward Initially the compiled program only contains a guardedstate (q0 true) where q0 is the initial state of the PSA Every timereading a new packet from the stream we call Algorithm 1 on everyguarded state and the new guarded states consists of all the onesemitted from Algorithm 1 To check the state given concrete valuesfor parameters we simply go through all maintained guarded statesand return the state if the guard is satisedExample Using the example in Figure 5 we illustrate the execu-tion of Algorithm 1 given the pair (q0 true) and the packet withsource IP 10001 Consider the transition from q0 to q1 By sub-stituting srcip in the predicate with the source IP of the packetwe get the guard T which is x=10001 Since ϕ is true now ϕ prime issimply x=10001 At the end the algorithm will emit the pair (q1x=10001) Similarly for the other transition the algorithm emit thepair (q0 x=10001) Therefore there are two new guarded statesnamely (q1 x=10001) and (q0 x=10001) The updated guardedstates account for the fact that 10001 has appeared and thus thecorresponding instantiated SA transitioned to state q1

52 Compilation of split

We next discuss how to compile a split expression which is of theform split(f g aggop) First we highlight the key insight from [9]to compile split without parameters and then describe how togeneralize this idea to compile split with parametersWithout parameters The challenge of compiling split is to splitthe stream dynamically at runtime without revisiting the entirehistory of the stream For example in the split example in sect33 we

need to split the stream at the last appearance of the SYN packet Anatural way to implement the semantics of split is to maintain allcases to split the stream For example Fig 6 shows two cases to splitthe stream consisting of a non-SYN and a SYN packet at runtimefor the example in sect33 Moreover the following two observationsensure that the compiled code only uses a constant space First eachcase can be represented using a triple (sf sд F ) where F is a agindicating whether the stream is split and sf (sд resp) is the stateof f (д resp) on evaluating the prex(sux resp) of the streamSecond the number of maintained (distinct) cases is bounded by aconstant only related to д at any time point

Figure 6 Example run of split

With parameters Similar to PSRE compilation in the generalscenario we need to maintain guarded cases That is we maintaina set of (T ϕ) pairs where T is a triple (sf sд F ) correspondingto a case as dened above and ϕ is a guard It means that if theparameters are instantiated by values satisfying ϕ there is a caseof splitting the stream which can be represented as T

Algorithm 2 update_split((T ϕ) packet)

1 suppose T = (sf sд F )2 if F is false then3 for all emitted (s primef ϕ prime) from update_f((sf ϕ) packet) do4 if f is dened on s primef then

5 emit ((s primef s0д true) ϕ prime) s0д is the initial state of g 6 end if7 emit ((s primef sд false) ϕ prime)8 end for9 else

10 for all emitted (s primeд ϕ prime) from update_g((sд ϕ) packet) do11 emit ((sf s primeд true) ϕ prime)12 end for13 end if

Algorithm 2 shows the algorithm to update a pair (T ϕ) Thealgorithm feeds the input packet to f or g corresponding to whetherthe stream has been split as indicated by F For example whenF is false namely the stream has not been split yet in this casethe algorithm need to update f on the packet (line 2-9) In thiscase for each emitted guarded state (s primef ϕ prime) from the update of fa corresponding guarded case is emitted (line 7) Moreover if f isdened on s primef we may split the stream at the current position andthus emit the guarded case ((s primef s0д true) ϕ prime) (line 4-6) Here s0д isthe initial state of g and the ag F is set to true to indicate thatthe stream is split For the else branch from line 10 to 12 in thealgorithm we need to update (sд ϕ) using grsquos updating procedurewhich may emit a set of guarded states of g Correspondingly theguard needs to be rened

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

Given concrete values for parameters evaluating a split expres-sion is as follows First we check if there exist a guarded case ((sf sд F )ϕ) such that ϕ is satised and both f and g are dened on thestates in the case Then we evaluate the two expressions based ontheir states in the case and nally aggregate the evaluated valuesusing aggop If no such guarded cases exist the split expression isnot dened

53 Compilation of iter

We rst review the key ideas of evaluating an iter expression whenthere are no parameters [9] and then describe how to handle pa-rameters

Consider an iter expression iter(f aggop) without any param-eters Similar to split expressions the iter expression needs tomaintain all possible cases of splitting the input stream Thoughthe number of such cases is only determined by f (thus not relevantto the input stream) how to succinctly represent each case is achallenge Specically unlike a split expression which only splitsthe stream into a prex and a sux an iter expression needs tosplit the stream into multiple substreams such that each one canbe applied to the expression f Therefore naively maintaining allthe states of f for each substream may use a large space as thestream grows Fortunately our choice of the aggregation operatorsallows us to summarizes the application of f to all substreams butthe last one using the aggregated return value For example if aggopis sum we only need to remember the aggregated sum on thesesubstreams Thus we can represent a case as a pair (v sf ) wherev is the aggregated value and sf is the state of f on evaluating thelast substream

In the general scenario with parameters we simply maintain aguarded case as ((v sf ) ϕ) The update of a guarded case ((v sf )ϕ)is shown in Algorithm 3 The evaluation of f given guarded casesis similar to that of a split expression

Algorithm 3 update_iter((T ϕ) packet)

1 suppose T = (v sf )

2 for all emitted (s primef ϕ prime) from update_f((sf ϕ) packet) do3 if f is dened on s primef then4 let v prime be the evaluated value of f on s primef5 emit ((v primeprime s0f ) ϕ prime) where v primeprime=aggop (vv prime) and s0f is the

initial state of f

6 end if7 emit ((v s primef ) ϕ prime)8 end for

54 Compilation of AggregationNow we describe the aggregation function aggopf | T x Followingthe denition of the aggregation function an aggregation functionmaintains guarded states (sf ϕ) To update the aggregation expres-sion we just need to update each guarded state using frsquos updatingprocedure To illustrate the evaluation procedure suppose the ag-gregation operator is sum Given values for the parameters we needto iterate through all guarded states (sf ϕ) If ϕ is satised we eval-uate f on sf say the return value is v and count the number of

values of x which satises ϕ say it is k we sum up v lowast k into theaggregated sum For other aggregation operators the evaluationprocess works similarly

55 Compilation of Stream CompositionLastly we discuss stream composition Recall that stream composi-tion allows to process the input stream using a function f and thenapply the second function g on the processed stream Following itsdenition each time a new packet arrives we need to rst processit with f and the returned value (eg a ltered packet) is then fedinto g

Therefore the stream composition expression fgtgtg needs to main-tain guarded states ((sf sд )ϕ) Algorithm 4 shows the algorithm toupdate a guarded state following the intuition described above Theevaluation of the expression is similar to that of previous discussedexpressions

Algorithm 4 update_comp((S ϕ) packet)

1 suppose S = (sf sд )2 for all emitted (s primef ϕ prime) from update_f((sf ϕ)packet) do3 if f is dened on s primef then4 let ret be the returned packet of f on s primef5 emit ((s primef s primeд ) ϕ primeprime) for all emitted (s primeд ϕ primeprime) from

update_g((sд ϕ prime) ret)6 else7 emit ((s primef sд ) ϕ prime)8 end if9 end for

6 IMPLEMENTATIONWe have implemented a prototype of the NetQRE system in C++which consists of two main components namely the compiler forNetQRE and the NetQRE runtimeNetQRE Compiler The NetQRE compiler implements the com-pilation algorithm described in sect5 The compiler rst generates aC++ program from an input NetQRE program which is then com-piled by the gcc compiler into executable Our compiled programuses a tree data structure to represent predicates over parametersThe choice of tree is driven by its simplicity ease of encoding andlookup performance

In addition to the basic compilation algorithm we include ad-ditional optimizations to the compiler First we use the standardminimization algorithm to minimize the state machine for a regularexpression Second for aggregation expressions with sum and avgwe use an incremental updating algorithm to update the expres-sion we maintain the running sum as the state of the aggregationexpression and we update the sum incrementally when the stateof the inner expression is updatedNetQRE Parallelization The NetQRE compiler can optionallycompile the NetQRE specication into parallelized implementationsbased on the instantiation of parameters in the NetQRE specica-tion For the example described in sect 51 a packet with source IPaddress s can be processed by the hash(s )-th thread which handlesthat instantiation of the parameter x

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

NetQRERuntime The NetQRE runtime includes a packet captureagent implemented using the pcap library [28] Each packet thatarrives at the runtime is parsed and then processed by the compiledNetQRE program as shown in Figure 1 Currently our runtimesupports actions that include sending alert events to a controllerand directly installing a rule on a switch The NetQRE runtime isnot specically optimized for fast packet capture and processingand as future work we plan to explore the use of DPDK [24] Ourcurrent implementation analyzes every packet though orthogonalsampling techniques can be also explored in future

7 EVALUATIONWe evaluate our NetQRE prototype centered around answeringthree key questions First can the NetQRE language express a widerange of quantitative network monitoring applications in a conciseand intuitive manner Second is the generated code ecient interms of throughput and memory footprint Third can NetQRE beused in a real-time monitoring setting that provides rapid mitigationbased on output of quantitative network monitoring applicationsUnless otherwise specied all our experiments are carried out ona cluster of commodity servers each of which has 10-core 26GHzXeon E5-2660V3 Each core has a 256KB L2 cache and shared 25MB L3 cache The memory size is 64 GB The OS is Ubuntu 1404and the kernel version is 420

71 ExpressivenessWe have implemented a set of quantitative network monitoring ap-plications using NetQRE The examples are drawn from a literaturesurvey on quantitative network monitoring applications [13 18 3540 41] Table 1 summarizes the example applications in NetQREthat we use as well as and the lines of code of each NetQRE pro-gram

LoCHeavy Hitter (sect41) 6Super Spreader (sect41) 2Entropy Estimation [40] 6Flow size dist [18] 8Trac change detection [35] 10Count trac [40] 2Completed ows (sect42) 6SYN ood detection (sect42) 9Slowloris detection (sect42) 12Lifetime of connection 8Newly opened connection recently 11 duplicated ACKs 5 VoIP call 7VoIP usage (sect43) 18Key word counting in emails 11DNS tunnel detection [12] 4DNS amplication [20] 4

Table 1 Examplemonitoring applicationsNetQRE supports

We make the following observations First NetQRE is able toexpress a large variety of quantitative network monitoring applica-tions ranging from ow-level trac measurement to application-level quantitative monitoring We validate the correctness of ourapplications by running them and comparing their output withhand-crafted implementations Second programming in NetQRE isconcise All example programs can be specied within 18 lines ofcode in NetQRE This count includes commonly used lter functionsas well as regular expressions which can be built into a library forreference As an interesting comparison we encoded the VoIP calluse case in Bro [33] a well-known intrusion detection system Brorequired 51 lines of code as compared to only 7 in NetQRE How-ever Bro is unable to easily support the VoIP usage case written inNetQRE In particular Bro cannot use patterns to separate packetsinto dierent VoIP sessions for the purposes of counting bandwidthconsumption per session This is not surprising as Brorsquos primaryuse is that of an intrusion detection system We validated the cor-rectness of our implementation on actual SIP trac by comparingBrorsquos output against NetQRErsquos

72 PerformanceOur next set of experiments evaluate the performance of NetQRErsquoscompiled implementations NetQRE runs on a single machine basedon the setup described earlier We use a single core to measureNetQRErsquos performance As a point of comparison we compare eachNetQRE program against an equivalent carefully hand-crafted C++implementation We note that all our hand-crafted implementationsrequire at least 100 lines of code and often times require users toexplicit manage internal state across packets ndash a programming taskthat NetQRE abstracts away

Fig 7 shows the performance of NetQRE implementations (NetQREin the gure) compared with manually optimized implementations(Baseline in the gure) that we spent days optimizing Our examplesinclude heavy hitter super spreader entropy SYN ood detectioncompleted ows count and Slowloris use cases presented in sect4 Asworkload we use a one-minute CAIDA trac trace [7] capturedon a high-speed Internet backbone link which contains 37 millionpackets (ie about 620k packetssec) We replay the trac traceto the implementations Since the trac trace does not containpayloads we report the throughput in million packets per second(MPPS)

We make the following observations First the NetQRE imple-mentations are able to achieve line-rate processing speed withoutbeing the bottleneck In particular for three out of six applicationsthe throughput of NetQRE implementations is about 20MPPS usinga single thread The average packet size of our input trac traceis 888B For a 10Gbps network 14M packets of size 888B can beforwarded per second which is well below the throughput achievedby NetQRE The throughput can be further improved by paralleliza-tion (presented later in this section) Second the compiled NetQREimplementation incurs negligible overhead compared with the man-ually optimized low-level implementation The dierence betweenthe throughput of the compiled NetQRE implementation and thatof the manual implementation is within 9 across all use cases westudied Third NetQRE incurs slightly increase (up to 60) in terms

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

05

10152025

Thro

ughp

ut (M

PPS)

Baseline NetQRE OpenSketch

(a) Throughput

1

10

100

1000

Mem

ory

(MB

) Baseline NetQRE OpenSketch

(b) Memory

Figure 7 Performance comparison

0

10

20

30

40

0 2 4 6 8

Thro

ughp

ut (M

PPS)

Number13 of13 Threads

super spreadersyn floodslowloris

Figure 8 Throughput with paral-lelization

of memory footprint (Fig 7b) compared with the manually opti-mized implementation However the overall memory footprint ofNetQRE remains low We have also run the experiments on anotherCAIDA trac [2] and the comparison result is similarComparison with OpenSketch In addition to comparisons withour manually optimized code we also compare with OpenSketch [40]a popular trac measurement tool based on sketching techniquesGiven OpenSketchrsquos limitation in handling stateful monitoringtasks we only compare the performance of NetQRE implementa-tions on the heavy hitter and super spreader examples with thatof OpenSketch (in software) We use the same CAIDA trac traceas before and follow the default setting of the reference code ofOpenSketch [4] for the two examples The NetQRE compiled imple-mentations signicantly outperform OpenSketch implementationsThe throughput of NetQRE compiled code is 11times and 18times as thatof OpenSketch (Fig 7a) As OpenSketch focuses on optimizingmemory usage it is not surprising that OpenSketch uses less mem-ory than NetQRE implementations Nevertheless we observe thatNetQRE uses only 11 more memory on the super spreader examplethan that of OpenSketch (Fig 7b) Note that we use an open-sourceversion of OpenSketch software that runs on a similar x86 machineas NetQRE in order to have an ldquoapples-to-applesrdquo comparison onsimilar hardware NetQRE may also exploit similar optimizationsperformed by OpenSketch in order to get higher performance ona NetFPGA Exploiting a hardware platform such as NetFPGA inorder to further improve NetQRE performance is an interestingavenue for future workComparison with Bro We further compare with Bro Given thelimitations of Bro in handling the complete VoIP use cases we sim-plify the use case to simply counting the number of VoIP calls (asopposed to bandwidth consumption) per user NetQRE compiledimplementation can nish counting within 1 second while Brotakes about 23 seconds for a SIP trac trace containing 4338 VoIPcalls Both programs output the same results for this use case Webelieve there are at least two reasons for Brorsquos slower performanceFirst Bro is designed for intrusion detection and not aimed atand optimized for monitoring tasks Second unlike NetQRE whichcompiles high-level code to ecient low-level code Bro uses aninterpreter to execute the script written in Brorsquos language whichcould result in considerable overheadParallelization speedup NetQRE compiler automatically com-piles parallelized implementations as described in sect6 In the con-sidered examples parallelization involves sending ows of packets

to dierent NetQRE instances running on dierent threads Fig 8plots the throughput of the NetQRE implementations with dierentnumbers of threads for the super spreader syn ood and slowlorisdetection examples In all examples the NetQRE parallelized codeis able to achieve at least 39times speedup with 8 threads Note thatlinear speedup is not achieved as the trac (in our nite trace) isnot uniformly distributed across all 8 threads This is an artifactof the trace data itself When we include the overhead of the soft-ware load balancer that dispatches packet ows to dierent threadsthe NetQRE implementations achieve at least 26times speedup for allexamples This overhead can be mitigated with a hardware load-balancer Overall we observe that NetQRErsquos throughput is morethan sucient to handle line-rate trac and the NetQRE runtimeitself is not the bottleneck

73 End-to-end ValidationIn the nal set of experiments we validate the end-to-end use ofNetQRE for enforcing quantitative network policies which appliesactions to quantitative network monitoring programs We set upa network of two clients C1 and C2 one server S and one SDNswitch that mirrors trac to a NetQRE runtime running the SYNood detector presented in sect42 In addition a NetQRE controllerbased on POX [29] controls the switch The network is emulatedusing Mininet [25] with link bandwidth set to 100Mbps

In the experimentC1 sends normal trac to S using iperf at rate1Mbps and at the 7th second C2 starts SYN ood attack generatedusing our generator to the same server The attack is detected byNetQRE which generates an alert event to the controller whichsubsequently installs a forwarding rule on the switch to block tracfrom C2 To illustrate the attack detection and blocking Figure 9a(left gure) shows the bandwidth utilization (Mbps) on the serverWe observe that NetQRE successfully blocks C2rsquos attacking tracin real-time

As a second experiment using a similar setup our NetQRE heavyhitter program (sect41) running at a switch-side NetQRE runtimemonitors the trac over a sliding window of 5 seconds and issuesan alert to the controller when detecting a heavy hitter whichfurther blocks the trac As a point of comparison we compareour approach against two alternatives (1) sending all packets to thecontroller which runs an equivalent heavy hitter detection program(2) monitoring the ow counters on the switch to detect heavyhitters as considered in other languages [30] and systems [31] Inthe second alternative the controller reads the counter every 1

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

0 2 4 6 8 10 12 14 16Time (sec)

0

1

2

3

4

5

6B

andw

idth

(Mbp

s)NetQRE

(a) SYN ood

0 2 4 6 8 10 12 14 16Time (sec)

0

2

4

6

8

10

12

14

Ban

dwid

th (M

bps)

NetQRE

stats

forward

(b) Heavy hitter

0 20 40 60 80 100Time (sec)

0

1

2

3

4

5

6

7

Ban

dwid

th (M

bps)

NetQRE

(c) VoIP

Figure 9 Bandwidth utilization (Mbps) at the server

second and compute the bandwidth usage for each ow over asimilar 5 seconds window

Figure 9b (middle gure) plots the bandwidth utilization of theserver The above alternative solutions are labeled as forward andstats respectively Compared with these two approaches our pro-posed approach can detect the heavy hitter and further respond toit in a more timely fashion Moreover compared with the forwardapproach we send signicantly fewer trac to the controller andis a more scalable solution

In our nal experiment we validate the VoIP use case rst pre-sented in our introduction We use a synthetic VoIP trac tracegenerated using SIPp [5] replay a VoIP call (with accompanyingH264 MPEG video) made from a single user via client C2 We re-play the trac at 5Mbps and run iperf on C1 as the backgroundtrac Our NetQRE program enforces a policy that blocks a callafter the user making the call exceeds a SIP bandwidth usage of1875MB (around 30 seconds) We congure the NetQRE programto send an alert to the NetQRE controller which then blocks thetrac Figure 9c (right gure) plots the bandwidth utilization of theserver Again this veries that NetQRE can be used to intercept aSIP session monitor usage on a per-user basis and react to highusage in a timely fashion

74 Summary of EvaluationRevisiting the questions at the beginning of our evaluation weobserve that (1) NetQRE can express a wide range of quantita-tive monitoring applications with a few lines of code (2) achievesline-rate throughput performance which is comparable to that ofcarefully hand-crafted optimized low-level code while signicantlyoutperforms other measurement and IDS tools and (3) can be usedin an SDN setting to monitor network trac and update switchesin real-time in response to specied NetQRE policies

8 DISCUSSIONIn this section we discuss some of NetQRErsquos limitations and alsosome future work

Expressiveness As a language focusing on monitoring policyNetQRE cannot specify general network policies such as servicechaining policies and trac engineering policies As an interesting

future work it might be possible to extend NetQRErsquos actions insupport of more complex policies

Given NetQRErsquos basis on regular expressions NetQRE is notsuited for monitoring tasks that can not be specied using regulartrac patterns Though this is a fundamental limitation of NetQRErsquosexpressiveness we believe that the high-level constructs in NetQREprovides a natural programming abstraction to write tasks withhierarchical regular structures and NetQRE can specify a widerange of monitoring tasks as shown in sect4

Handling packet lossreorderingretransmission For the perfor-mance consideration NetQRE does not buer packets in the streambut processes packets in a streaming fashion Thus if packets inthe stream are lost reordered or retransmitted the query speciedin NetQRE might not be evaluated correctly since the internal statemay transition to a wrong one Current design of NetQRE relies onthe runtime to handle packet loss reordering and retransmission

Network-widemonitoring For the similar reasons discussed abovecurrent deployment model of NetQRE is based on a centralize fash-ion in order to receive all (ordered) packet in the stream of interestFor example a ow counting task has to be deployed at a placewhere all packets in a ow can traverse This is not a fundamentallimitation as one can easily deploy NetQRE in a distributed fashionprovided that the query is insensitive to packet reordering in thestream We leave it as future work to study the decomposition of acentralized NetQRE program into distributed queries

9 RELATEDWORKQuantitative regular expressions NetQRE is inspired by quan-titative regular expressions (QRE) [9] which provides a theoreticalfoundation for regular programming of data streams NetQRE ex-tends QRE in the following ways First NetQRE introduces param-eters in support of queries handling unknown values (eg countingdistinct sources) Second NetQRE proposes aggregation operatorsto compute aggregated values across parameters It is shown insect4 that both extensions are essential to specify a wide range ofinteresting network monitoring policies

StreamQRE [27] is another recently proposed extension to QRENetQRE is dierent from StreamQRE in several aspects First NetQREis specialized to networking applications while StreamQRE focuseson database application Second StreamQRE allows relations as

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

types and supports relational operations such as join and key-basedpartitioning These operations cannot be evaluated eciently in astreaming fashion in general NetQRE instead uses parameters andaggregation over parameters with a focus on ecient evaluationIntrusion detection systems and protocol analyzers Thereare a number of key dierences between NetQRE and intrusiondetection systems (IDS) and protocol analyzers First systems suchas Snort [34] allow regular expression matches on single packetsbut is not perform regular expression matches that span multiplepackets let alone track sessions across packets While Bro [33] sup-ports aspects of NetQRE the scripting language requires complexprocedural programming where one has to maintain states anddata structures as opposed to NetQRErsquos declarative programmingapproach with high-level constructs over packet streams As anIDS Bro does not lend itself naturally to handle some quantitativemonitoring use cases eg the VoIP usage example While HILTI [37]uses Bro and oers abstract machine models for trac analysis itrequires the operator to implement low-level state machinesDomain specic languages in networking There are severalrecent proposal on domain specic languages on networks whichincludes Frenetic [21] Pyretic [30] NetKAT [10] Flowlog [32]Maple [38] Network Datalog [26] To the best of our knowledgenone of these systems support monitoring queries beyond basicow-level counters provided by OpenFlow

Recently proposed SNAP [11] and Sonata [23] oer abstrac-tions for trac monitoring over packet streams but in a morelow-level procedural or SQL-like way NetQRE oers a complemen-tary abstraction on quantitative queries It may be interesting toincorporate NetQRE with these languages in support for complexquantitative queriesStreaming database languagesDatabase-style query languages [1416] provide SQL-like language support for running continuousqueries over data streams While there are constructs to do aggrega-tion over sliding windows they are designed with simple relationalqueries over packet headers or packet counts and cannot handlethe complex queries based on trac patterns that NetQRE supports

10 CONCLUSIONThis paper presents NetQRE a high-level declarative language andpractical toolkit for quantitative network monitoring NetQRE of-fers an intuitive programming model using regular expressionsover streams of packets and can naturally express ow-level aswell as application-level quantitative policies We developed a com-pilation algorithm that can compile NetQRE programs into ecientimperative programs Our evaluation results demonstrate the ex-pressiveness of NetQRE in support a wide range of quantitativenetwork monitoring applications its ability to match carefully opti-mized manually written low level code and outperform equivalentimplements in Bro and OpenSketch A proof-of-concept scenariowith an SDN controller and switches demonstrate NetQRErsquos abilityto support timely and ecient enforcement of quantitative networkpolicies

ACKNOWLEDGEMENTWe would like to thank our shepherd Arjun Guha for his helpfulcomments We also want to thank the anonymous reviewers for

their insightful feedbacks on this work This research was partiallysupported by NSF Expeditions in Computing award CCF 1138996NSF CNS-1513679 and NSF CNS-1218066

REFERENCES[1] Application Layer Packet Classier for Linux httpwwwmcafeecomus

productsnetwork-security-platformaspx[2] CAIDA Trac Trace httpsdatacaidaorgdatasets securityddos-20070804 [3] McAfee Network Security Platform http l7-ltersourceforgenet [4] OpenSketch reference code httpsgithubcomUSC-NSLopensketch[5] SIPp http sippsourceforgenet [6] SSL renegotiation DoS httpswwwietforgmail-archivewebtlscurrent

msg07553html[7] Anonymized 2015 Internet Traces httpsdatacaidaorgdatasetspassive-2015

2015[8] Mohammad Al-Fares Sivasankar Radhakrishnan Barath Raghavan Nelson

Huang and Amin Vahdat Hedera Dynamic Flow Scheduling for Data CenterNetworks In NSDI volume 10 pages 19ndash19 2010

[9] Rajeev Alur Dana Fisman and Mukund Raghothaman Regular Programmingfor Quantitative Properties of Data Streams In 25th European Symposium onProgramming ESOP 2016

[10] Carolyn Jane Anderson Nate Foster Arjun Guha Jean-Baptiste Jeannin DexterKozen Cole Schlesinger and David Walker NetKAT Semantic foundations fornetworks In Proceedings of the 41st annual ACM SIGPLAN-SIGACT symposiumon Principles of programming languages pages 113ndash126 ACM 2014

[11] Mina Tahmasbi Arashloo Yaron Koral Michael Greenberg Jennifer Rexford andDavid Walker Snap Stateful network-wide abstractions for packet processingIn Proceedings of the 2016 ACM SIGCOMM Conference SIGCOMM rsquo16 pages29ndash43 New York NY USA 2016 ACM

[12] Kevin Borders Jonathan Springer and Matthew Burnside Chimera A declarativelanguage for streaming network trac analysis In Proceedings of the 21st USENIXConference on Security Symposium Securityrsquo12 pages 19ndash19 Berkeley CA USA2012 USENIX Association

[13] Varun Chandola Arindam Banerjee and Vipin Kumar Anomaly Detection ASurvey ACM computing surveys (CSUR) 41(3)15 2009

[14] Sirish Chandrasekaran Owen Cooper Amol Deshpande Michael J FranklinJoseph M Hellerstein Wei Hong Sailesh Krishnamurthy Samuel R MaddenFred Reiss and Mehul A Shah TelegraphCQ Continuous Dataow ProcessingIn Proceedings of the 2003 ACM SIGMOD International Conference on Managementof Data SIGMOD rsquo03 pages 668ndash668 New York NY USA 2003 ACM

[15] Benoit Claise Cisco systems NetFlow services export version 9 2004[16] Chuck Cranor Theodore Johnson Oliver Spataschek and Vladislav Shkapenyuk

Gigascope A Stream Database for Network Applications In Proceedings of the2003 ACM SIGMOD International Conference on Management of Data SIGMODrsquo03 pages 647ndash651 New York NY USA 2003 ACM

[17] Luca Deri Open source VoIP trac monitoring In Proceedings of SANE volume2006 2006

[18] Nick Dueld Carsten Lund and Mikkel Thorup Estimating Flow Distributionsfrom Sampled Flow Statistics In Proceedings of the 2003 conference on Applicationstechnologies architectures and protocols for computer communications pages 325ndash336 ACM 2003

[19] Cristian Estan and George Varghese New Directions in Trac Measurement andAccounting volume 32 ACM 2002

[20] Seyed K Fayaz Yoshiaki Tobioka Vyas Sekar and Michael Bailey BohateiFlexible and Elastic DDoS Defense In 24th USENIX Security Symposium (USENIXSecurity 15) pages 817ndash832 Washington DC August 2015 USENIX Association

[21] Nate Foster Rob Harrison Michael J Freedman Christopher Monsanto JenniferRexford Alec Story and David Walker Frenetic A Network ProgrammingLanguage In ACM SIGPLAN Notices volume 46 pages 279ndash291 ACM 2011

[22] Pedro Garcia-Teodoro J Diaz-Verdejo Gabriel Maciaacute-Fernaacutendez and EnriqueVaacutezquez Anomaly-based Network Intrusion Detection Techniques Systemsand Challenges computers amp security 28(1)18ndash28 2009

[23] Arpit Gupta Ruumldiger Birkner Marco Canini Nick Feamster Chris Mac-Stokerand Walter Willinger Network Monitoring As a Streaming Analytics ProblemIn Proceedings of the 15th ACM Workshop on Hot Topics in Networks HotNets rsquo16pages 106ndash112 ACM 2016

[24] DPDK Intel Data Plane Development Kit httpdpdkorg[25] Bob Lantz Brandon Heller and Nick McKeown A Network in a Laptop Rapid

Prototyping for Software-dened Networks In Proceedings of the 9th ACMSIGCOMM Workshop on Hot Topics in Networks Hotnets-IX pages 191ndash196ACM 2010

[26] Boon Thau Loo Tyson Condie Minos Garofalakis David E Gay Joseph MHellerstein Petros Maniatis Raghu Ramakrishnan Timothy Roscoe and IonStoica Declarative Networking CACM 2009

[27] Konstantinos Mamouras Mukund Raghotaman Rajeev Alur Zachary G Ivesand Sanjeev Khanna StreamQRE Modular Specication and Ecient Evaluation

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

of Quantitative Queries over Streaming Data In Proceedings of the 38th ACMSIGPLANConference on Programming Language Design and Implementation ACM2017

[28] Steve McCanne Craig Leres and Van Jacobson Libpcap httpwwwtcpdumporg1989

[29] J Mccauley POX A Python-based Openow Controller 2014[30] Christopher Monsanto Joshua Reich Nate Foster Jennifer Rexford David Walker

et al Composing Software Dened Networks In NSDI pages 1ndash13 2013[31] Masoud Moshref Minlan Yu Ramesh Govindan and Amin Vahdat DREAM

dynamic resource allocation for software-dened measurement In Proceedingsof the 2014 ACM conference on SIGCOMM pages 419ndash430 ACM 2014

[32] Tim Nelson Andrew D Ferguson Michael JG Scheer and Shriram KrishnamurthiTierless Programming and Reasoning for Software-Dened Networks NSDIApr 2014

[33] Vern Paxson Bro A System for Detecting Network Intruders in Real-timeComput Netw 31(23-24)2435ndash2463 December 1999

[34] Martin Roesch et al Snort Lightweight Intrusion Detection for Networks InLISA volume 99 pages 229ndash238 1999

[35] Vyas Sekar Michael K Reiter and Hui Zhang Revisiting the case for a minimalistapproach for network ow monitoring In Proceedings of the 10th ACM SIGCOMM

conference on Internet measurement pages 328ndash341 ACM 2010[36] David Senecal Slow DoS on the rise httpsblogsakamaicom201309

slow-dos-on-the-risehtml[37] Robin Sommer Matthias Vallentin Lorenzo De Carli and Vern Paxson HILTI

An Abstract Execution Environment for Deep Stateful Network Trac AnalysisIn Proceedings of the 2014 Conference on Internet Measurement Conference IMCrsquo14 pages 461ndash474 New York NY USA 2014 ACM

[38] Andreas Voellmy Junchang Wang Y Richard Yang Bryan Ford and Paul HudakMaple Simplifying SDN Programming using Algorithmic Policies In Proceedingsof the ACM SIGCOMM 2013 conference on SIGCOMM pages 87ndash98 ACM 2013

[39] Mea Wang Baochun Li and Zongpeng Li sFlow Towards resource-ecient andagile service federation in service overlay networks In Distributed ComputingSystems 2004 Proceedings 24th International Conference on pages 628ndash635 IEEE2004

[40] Minlan Yu Lavanya Jose and Rui Miao Software Dened Trac Measurementwith OpenSketch In NSDI volume 13 pages 29ndash42 2013

[41] Lihua Yuan Chen-Nee Chuah and Prasant Mohapatra ProgME Towards Pro-grammable Network Measurement IEEEACM Transactions on Networking (TON)19(1)115ndash128 2011

  • Abstract
  • 1 Introduction
  • 2 Overview
    • 21 Motivating Example
    • 22 Alternative Approaches
      • 3 The NetQRE Language
        • 31 Pattern Matching over Streams
        • 32 Conditional Expressions
        • 33 Stream Split
        • 34 Stream Iteration
        • 35 Aggregation over Parameters
        • 36 Stream Composition
          • 4 Use Cases
            • 41 Flow-level Traffic Measurements
            • 42 TCP State Monitoring
            • 43 Application-level Monitoring
              • 5 Compilation Algorithm
                • 51 Compilation of PSRE
                • 52 Compilation of split
                • 53 Compilation of iter
                • 54 Compilation of Aggregation
                • 55 Compilation of Stream Composition
                  • 6 Implementation
                  • 7 Evaluation
                    • 71 Expressiveness
                    • 72 Performance
                    • 73 End-to-end Validation
                    • 74 Summary of Evaluation
                      • 8 Discussion
                      • 9 Related Work
                      • 10 Conclusion
                      • References
Page 8: Quantitative Network Monitoring with NetQREalur/Sigcomm17.pdf · 2017. 6. 26. · In network management today, ... Thau Loo. 2017. Quantitative Network Monitoring with NetQRE. In

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

a predicate of the parameters in the PSA (which we will refer asa guard) and s is a state in PSA The pair (s ϕ) means that whenreading the input stream all instantiated SA will be at state s if theyare instantiated by the values satisfying ϕ For example initiallythere is only one guarded state maintained for the PSA in Figure 5which is (q0 true) meaning that all instantiated SA is at the initialstate q0 of the PSA

To update the PSA the key step is to update each guarded statedynamically when reading packets from the stream The updatingalgorithm for a guarded state is shown in Algorithm 1 This algo-

Algorithm 1 update_psa((s ϕ) packet)

1 for all transitions originated from s do2 let t be the destinate state and P be the parameterized predi-

cate3 let T be the guard instantiated from P using the packet4 let ϕ prime = ϕ andT5 emit (t ϕ prime) if ϕ false6 end for

rithm iterates through all transitions from the state s and for thepredicate (with parameters) P on the transition it instantiates Pusing the current packet it receives in the stream This step simplyreplaces the function name in P by the return value of it on thepacket The obtained predicateT is a predicate over the parametersin the PSA Finally the algorithm emits a new pair (t ϕ prime) if ϕ prime is notfalse Intuitively this step accounts the fact that the instantiatedSA under ϕ prime will transition to the next state t

Using Algorithm 1 the overall updating algorithm is straight-forward Initially the compiled program only contains a guardedstate (q0 true) where q0 is the initial state of the PSA Every timereading a new packet from the stream we call Algorithm 1 on everyguarded state and the new guarded states consists of all the onesemitted from Algorithm 1 To check the state given concrete valuesfor parameters we simply go through all maintained guarded statesand return the state if the guard is satisedExample Using the example in Figure 5 we illustrate the execu-tion of Algorithm 1 given the pair (q0 true) and the packet withsource IP 10001 Consider the transition from q0 to q1 By sub-stituting srcip in the predicate with the source IP of the packetwe get the guard T which is x=10001 Since ϕ is true now ϕ prime issimply x=10001 At the end the algorithm will emit the pair (q1x=10001) Similarly for the other transition the algorithm emit thepair (q0 x=10001) Therefore there are two new guarded statesnamely (q1 x=10001) and (q0 x=10001) The updated guardedstates account for the fact that 10001 has appeared and thus thecorresponding instantiated SA transitioned to state q1

52 Compilation of split

We next discuss how to compile a split expression which is of theform split(f g aggop) First we highlight the key insight from [9]to compile split without parameters and then describe how togeneralize this idea to compile split with parametersWithout parameters The challenge of compiling split is to splitthe stream dynamically at runtime without revisiting the entirehistory of the stream For example in the split example in sect33 we

need to split the stream at the last appearance of the SYN packet Anatural way to implement the semantics of split is to maintain allcases to split the stream For example Fig 6 shows two cases to splitthe stream consisting of a non-SYN and a SYN packet at runtimefor the example in sect33 Moreover the following two observationsensure that the compiled code only uses a constant space First eachcase can be represented using a triple (sf sд F ) where F is a agindicating whether the stream is split and sf (sд resp) is the stateof f (д resp) on evaluating the prex(sux resp) of the streamSecond the number of maintained (distinct) cases is bounded by aconstant only related to д at any time point

Figure 6 Example run of split

With parameters Similar to PSRE compilation in the generalscenario we need to maintain guarded cases That is we maintaina set of (T ϕ) pairs where T is a triple (sf sд F ) correspondingto a case as dened above and ϕ is a guard It means that if theparameters are instantiated by values satisfying ϕ there is a caseof splitting the stream which can be represented as T

Algorithm 2 update_split((T ϕ) packet)

1 suppose T = (sf sд F )2 if F is false then3 for all emitted (s primef ϕ prime) from update_f((sf ϕ) packet) do4 if f is dened on s primef then

5 emit ((s primef s0д true) ϕ prime) s0д is the initial state of g 6 end if7 emit ((s primef sд false) ϕ prime)8 end for9 else

10 for all emitted (s primeд ϕ prime) from update_g((sд ϕ) packet) do11 emit ((sf s primeд true) ϕ prime)12 end for13 end if

Algorithm 2 shows the algorithm to update a pair (T ϕ) Thealgorithm feeds the input packet to f or g corresponding to whetherthe stream has been split as indicated by F For example whenF is false namely the stream has not been split yet in this casethe algorithm need to update f on the packet (line 2-9) In thiscase for each emitted guarded state (s primef ϕ prime) from the update of fa corresponding guarded case is emitted (line 7) Moreover if f isdened on s primef we may split the stream at the current position andthus emit the guarded case ((s primef s0д true) ϕ prime) (line 4-6) Here s0д isthe initial state of g and the ag F is set to true to indicate thatthe stream is split For the else branch from line 10 to 12 in thealgorithm we need to update (sд ϕ) using grsquos updating procedurewhich may emit a set of guarded states of g Correspondingly theguard needs to be rened

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

Given concrete values for parameters evaluating a split expres-sion is as follows First we check if there exist a guarded case ((sf sд F )ϕ) such that ϕ is satised and both f and g are dened on thestates in the case Then we evaluate the two expressions based ontheir states in the case and nally aggregate the evaluated valuesusing aggop If no such guarded cases exist the split expression isnot dened

53 Compilation of iter

We rst review the key ideas of evaluating an iter expression whenthere are no parameters [9] and then describe how to handle pa-rameters

Consider an iter expression iter(f aggop) without any param-eters Similar to split expressions the iter expression needs tomaintain all possible cases of splitting the input stream Thoughthe number of such cases is only determined by f (thus not relevantto the input stream) how to succinctly represent each case is achallenge Specically unlike a split expression which only splitsthe stream into a prex and a sux an iter expression needs tosplit the stream into multiple substreams such that each one canbe applied to the expression f Therefore naively maintaining allthe states of f for each substream may use a large space as thestream grows Fortunately our choice of the aggregation operatorsallows us to summarizes the application of f to all substreams butthe last one using the aggregated return value For example if aggopis sum we only need to remember the aggregated sum on thesesubstreams Thus we can represent a case as a pair (v sf ) wherev is the aggregated value and sf is the state of f on evaluating thelast substream

In the general scenario with parameters we simply maintain aguarded case as ((v sf ) ϕ) The update of a guarded case ((v sf )ϕ)is shown in Algorithm 3 The evaluation of f given guarded casesis similar to that of a split expression

Algorithm 3 update_iter((T ϕ) packet)

1 suppose T = (v sf )

2 for all emitted (s primef ϕ prime) from update_f((sf ϕ) packet) do3 if f is dened on s primef then4 let v prime be the evaluated value of f on s primef5 emit ((v primeprime s0f ) ϕ prime) where v primeprime=aggop (vv prime) and s0f is the

initial state of f

6 end if7 emit ((v s primef ) ϕ prime)8 end for

54 Compilation of AggregationNow we describe the aggregation function aggopf | T x Followingthe denition of the aggregation function an aggregation functionmaintains guarded states (sf ϕ) To update the aggregation expres-sion we just need to update each guarded state using frsquos updatingprocedure To illustrate the evaluation procedure suppose the ag-gregation operator is sum Given values for the parameters we needto iterate through all guarded states (sf ϕ) If ϕ is satised we eval-uate f on sf say the return value is v and count the number of

values of x which satises ϕ say it is k we sum up v lowast k into theaggregated sum For other aggregation operators the evaluationprocess works similarly

55 Compilation of Stream CompositionLastly we discuss stream composition Recall that stream composi-tion allows to process the input stream using a function f and thenapply the second function g on the processed stream Following itsdenition each time a new packet arrives we need to rst processit with f and the returned value (eg a ltered packet) is then fedinto g

Therefore the stream composition expression fgtgtg needs to main-tain guarded states ((sf sд )ϕ) Algorithm 4 shows the algorithm toupdate a guarded state following the intuition described above Theevaluation of the expression is similar to that of previous discussedexpressions

Algorithm 4 update_comp((S ϕ) packet)

1 suppose S = (sf sд )2 for all emitted (s primef ϕ prime) from update_f((sf ϕ)packet) do3 if f is dened on s primef then4 let ret be the returned packet of f on s primef5 emit ((s primef s primeд ) ϕ primeprime) for all emitted (s primeд ϕ primeprime) from

update_g((sд ϕ prime) ret)6 else7 emit ((s primef sд ) ϕ prime)8 end if9 end for

6 IMPLEMENTATIONWe have implemented a prototype of the NetQRE system in C++which consists of two main components namely the compiler forNetQRE and the NetQRE runtimeNetQRE Compiler The NetQRE compiler implements the com-pilation algorithm described in sect5 The compiler rst generates aC++ program from an input NetQRE program which is then com-piled by the gcc compiler into executable Our compiled programuses a tree data structure to represent predicates over parametersThe choice of tree is driven by its simplicity ease of encoding andlookup performance

In addition to the basic compilation algorithm we include ad-ditional optimizations to the compiler First we use the standardminimization algorithm to minimize the state machine for a regularexpression Second for aggregation expressions with sum and avgwe use an incremental updating algorithm to update the expres-sion we maintain the running sum as the state of the aggregationexpression and we update the sum incrementally when the stateof the inner expression is updatedNetQRE Parallelization The NetQRE compiler can optionallycompile the NetQRE specication into parallelized implementationsbased on the instantiation of parameters in the NetQRE specica-tion For the example described in sect 51 a packet with source IPaddress s can be processed by the hash(s )-th thread which handlesthat instantiation of the parameter x

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

NetQRERuntime The NetQRE runtime includes a packet captureagent implemented using the pcap library [28] Each packet thatarrives at the runtime is parsed and then processed by the compiledNetQRE program as shown in Figure 1 Currently our runtimesupports actions that include sending alert events to a controllerand directly installing a rule on a switch The NetQRE runtime isnot specically optimized for fast packet capture and processingand as future work we plan to explore the use of DPDK [24] Ourcurrent implementation analyzes every packet though orthogonalsampling techniques can be also explored in future

7 EVALUATIONWe evaluate our NetQRE prototype centered around answeringthree key questions First can the NetQRE language express a widerange of quantitative network monitoring applications in a conciseand intuitive manner Second is the generated code ecient interms of throughput and memory footprint Third can NetQRE beused in a real-time monitoring setting that provides rapid mitigationbased on output of quantitative network monitoring applicationsUnless otherwise specied all our experiments are carried out ona cluster of commodity servers each of which has 10-core 26GHzXeon E5-2660V3 Each core has a 256KB L2 cache and shared 25MB L3 cache The memory size is 64 GB The OS is Ubuntu 1404and the kernel version is 420

71 ExpressivenessWe have implemented a set of quantitative network monitoring ap-plications using NetQRE The examples are drawn from a literaturesurvey on quantitative network monitoring applications [13 18 3540 41] Table 1 summarizes the example applications in NetQREthat we use as well as and the lines of code of each NetQRE pro-gram

LoCHeavy Hitter (sect41) 6Super Spreader (sect41) 2Entropy Estimation [40] 6Flow size dist [18] 8Trac change detection [35] 10Count trac [40] 2Completed ows (sect42) 6SYN ood detection (sect42) 9Slowloris detection (sect42) 12Lifetime of connection 8Newly opened connection recently 11 duplicated ACKs 5 VoIP call 7VoIP usage (sect43) 18Key word counting in emails 11DNS tunnel detection [12] 4DNS amplication [20] 4

Table 1 Examplemonitoring applicationsNetQRE supports

We make the following observations First NetQRE is able toexpress a large variety of quantitative network monitoring applica-tions ranging from ow-level trac measurement to application-level quantitative monitoring We validate the correctness of ourapplications by running them and comparing their output withhand-crafted implementations Second programming in NetQRE isconcise All example programs can be specied within 18 lines ofcode in NetQRE This count includes commonly used lter functionsas well as regular expressions which can be built into a library forreference As an interesting comparison we encoded the VoIP calluse case in Bro [33] a well-known intrusion detection system Brorequired 51 lines of code as compared to only 7 in NetQRE How-ever Bro is unable to easily support the VoIP usage case written inNetQRE In particular Bro cannot use patterns to separate packetsinto dierent VoIP sessions for the purposes of counting bandwidthconsumption per session This is not surprising as Brorsquos primaryuse is that of an intrusion detection system We validated the cor-rectness of our implementation on actual SIP trac by comparingBrorsquos output against NetQRErsquos

72 PerformanceOur next set of experiments evaluate the performance of NetQRErsquoscompiled implementations NetQRE runs on a single machine basedon the setup described earlier We use a single core to measureNetQRErsquos performance As a point of comparison we compare eachNetQRE program against an equivalent carefully hand-crafted C++implementation We note that all our hand-crafted implementationsrequire at least 100 lines of code and often times require users toexplicit manage internal state across packets ndash a programming taskthat NetQRE abstracts away

Fig 7 shows the performance of NetQRE implementations (NetQREin the gure) compared with manually optimized implementations(Baseline in the gure) that we spent days optimizing Our examplesinclude heavy hitter super spreader entropy SYN ood detectioncompleted ows count and Slowloris use cases presented in sect4 Asworkload we use a one-minute CAIDA trac trace [7] capturedon a high-speed Internet backbone link which contains 37 millionpackets (ie about 620k packetssec) We replay the trac traceto the implementations Since the trac trace does not containpayloads we report the throughput in million packets per second(MPPS)

We make the following observations First the NetQRE imple-mentations are able to achieve line-rate processing speed withoutbeing the bottleneck In particular for three out of six applicationsthe throughput of NetQRE implementations is about 20MPPS usinga single thread The average packet size of our input trac traceis 888B For a 10Gbps network 14M packets of size 888B can beforwarded per second which is well below the throughput achievedby NetQRE The throughput can be further improved by paralleliza-tion (presented later in this section) Second the compiled NetQREimplementation incurs negligible overhead compared with the man-ually optimized low-level implementation The dierence betweenthe throughput of the compiled NetQRE implementation and thatof the manual implementation is within 9 across all use cases westudied Third NetQRE incurs slightly increase (up to 60) in terms

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

05

10152025

Thro

ughp

ut (M

PPS)

Baseline NetQRE OpenSketch

(a) Throughput

1

10

100

1000

Mem

ory

(MB

) Baseline NetQRE OpenSketch

(b) Memory

Figure 7 Performance comparison

0

10

20

30

40

0 2 4 6 8

Thro

ughp

ut (M

PPS)

Number13 of13 Threads

super spreadersyn floodslowloris

Figure 8 Throughput with paral-lelization

of memory footprint (Fig 7b) compared with the manually opti-mized implementation However the overall memory footprint ofNetQRE remains low We have also run the experiments on anotherCAIDA trac [2] and the comparison result is similarComparison with OpenSketch In addition to comparisons withour manually optimized code we also compare with OpenSketch [40]a popular trac measurement tool based on sketching techniquesGiven OpenSketchrsquos limitation in handling stateful monitoringtasks we only compare the performance of NetQRE implementa-tions on the heavy hitter and super spreader examples with thatof OpenSketch (in software) We use the same CAIDA trac traceas before and follow the default setting of the reference code ofOpenSketch [4] for the two examples The NetQRE compiled imple-mentations signicantly outperform OpenSketch implementationsThe throughput of NetQRE compiled code is 11times and 18times as thatof OpenSketch (Fig 7a) As OpenSketch focuses on optimizingmemory usage it is not surprising that OpenSketch uses less mem-ory than NetQRE implementations Nevertheless we observe thatNetQRE uses only 11 more memory on the super spreader examplethan that of OpenSketch (Fig 7b) Note that we use an open-sourceversion of OpenSketch software that runs on a similar x86 machineas NetQRE in order to have an ldquoapples-to-applesrdquo comparison onsimilar hardware NetQRE may also exploit similar optimizationsperformed by OpenSketch in order to get higher performance ona NetFPGA Exploiting a hardware platform such as NetFPGA inorder to further improve NetQRE performance is an interestingavenue for future workComparison with Bro We further compare with Bro Given thelimitations of Bro in handling the complete VoIP use cases we sim-plify the use case to simply counting the number of VoIP calls (asopposed to bandwidth consumption) per user NetQRE compiledimplementation can nish counting within 1 second while Brotakes about 23 seconds for a SIP trac trace containing 4338 VoIPcalls Both programs output the same results for this use case Webelieve there are at least two reasons for Brorsquos slower performanceFirst Bro is designed for intrusion detection and not aimed atand optimized for monitoring tasks Second unlike NetQRE whichcompiles high-level code to ecient low-level code Bro uses aninterpreter to execute the script written in Brorsquos language whichcould result in considerable overheadParallelization speedup NetQRE compiler automatically com-piles parallelized implementations as described in sect6 In the con-sidered examples parallelization involves sending ows of packets

to dierent NetQRE instances running on dierent threads Fig 8plots the throughput of the NetQRE implementations with dierentnumbers of threads for the super spreader syn ood and slowlorisdetection examples In all examples the NetQRE parallelized codeis able to achieve at least 39times speedup with 8 threads Note thatlinear speedup is not achieved as the trac (in our nite trace) isnot uniformly distributed across all 8 threads This is an artifactof the trace data itself When we include the overhead of the soft-ware load balancer that dispatches packet ows to dierent threadsthe NetQRE implementations achieve at least 26times speedup for allexamples This overhead can be mitigated with a hardware load-balancer Overall we observe that NetQRErsquos throughput is morethan sucient to handle line-rate trac and the NetQRE runtimeitself is not the bottleneck

73 End-to-end ValidationIn the nal set of experiments we validate the end-to-end use ofNetQRE for enforcing quantitative network policies which appliesactions to quantitative network monitoring programs We set upa network of two clients C1 and C2 one server S and one SDNswitch that mirrors trac to a NetQRE runtime running the SYNood detector presented in sect42 In addition a NetQRE controllerbased on POX [29] controls the switch The network is emulatedusing Mininet [25] with link bandwidth set to 100Mbps

In the experimentC1 sends normal trac to S using iperf at rate1Mbps and at the 7th second C2 starts SYN ood attack generatedusing our generator to the same server The attack is detected byNetQRE which generates an alert event to the controller whichsubsequently installs a forwarding rule on the switch to block tracfrom C2 To illustrate the attack detection and blocking Figure 9a(left gure) shows the bandwidth utilization (Mbps) on the serverWe observe that NetQRE successfully blocks C2rsquos attacking tracin real-time

As a second experiment using a similar setup our NetQRE heavyhitter program (sect41) running at a switch-side NetQRE runtimemonitors the trac over a sliding window of 5 seconds and issuesan alert to the controller when detecting a heavy hitter whichfurther blocks the trac As a point of comparison we compareour approach against two alternatives (1) sending all packets to thecontroller which runs an equivalent heavy hitter detection program(2) monitoring the ow counters on the switch to detect heavyhitters as considered in other languages [30] and systems [31] Inthe second alternative the controller reads the counter every 1

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

0 2 4 6 8 10 12 14 16Time (sec)

0

1

2

3

4

5

6B

andw

idth

(Mbp

s)NetQRE

(a) SYN ood

0 2 4 6 8 10 12 14 16Time (sec)

0

2

4

6

8

10

12

14

Ban

dwid

th (M

bps)

NetQRE

stats

forward

(b) Heavy hitter

0 20 40 60 80 100Time (sec)

0

1

2

3

4

5

6

7

Ban

dwid

th (M

bps)

NetQRE

(c) VoIP

Figure 9 Bandwidth utilization (Mbps) at the server

second and compute the bandwidth usage for each ow over asimilar 5 seconds window

Figure 9b (middle gure) plots the bandwidth utilization of theserver The above alternative solutions are labeled as forward andstats respectively Compared with these two approaches our pro-posed approach can detect the heavy hitter and further respond toit in a more timely fashion Moreover compared with the forwardapproach we send signicantly fewer trac to the controller andis a more scalable solution

In our nal experiment we validate the VoIP use case rst pre-sented in our introduction We use a synthetic VoIP trac tracegenerated using SIPp [5] replay a VoIP call (with accompanyingH264 MPEG video) made from a single user via client C2 We re-play the trac at 5Mbps and run iperf on C1 as the backgroundtrac Our NetQRE program enforces a policy that blocks a callafter the user making the call exceeds a SIP bandwidth usage of1875MB (around 30 seconds) We congure the NetQRE programto send an alert to the NetQRE controller which then blocks thetrac Figure 9c (right gure) plots the bandwidth utilization of theserver Again this veries that NetQRE can be used to intercept aSIP session monitor usage on a per-user basis and react to highusage in a timely fashion

74 Summary of EvaluationRevisiting the questions at the beginning of our evaluation weobserve that (1) NetQRE can express a wide range of quantita-tive monitoring applications with a few lines of code (2) achievesline-rate throughput performance which is comparable to that ofcarefully hand-crafted optimized low-level code while signicantlyoutperforms other measurement and IDS tools and (3) can be usedin an SDN setting to monitor network trac and update switchesin real-time in response to specied NetQRE policies

8 DISCUSSIONIn this section we discuss some of NetQRErsquos limitations and alsosome future work

Expressiveness As a language focusing on monitoring policyNetQRE cannot specify general network policies such as servicechaining policies and trac engineering policies As an interesting

future work it might be possible to extend NetQRErsquos actions insupport of more complex policies

Given NetQRErsquos basis on regular expressions NetQRE is notsuited for monitoring tasks that can not be specied using regulartrac patterns Though this is a fundamental limitation of NetQRErsquosexpressiveness we believe that the high-level constructs in NetQREprovides a natural programming abstraction to write tasks withhierarchical regular structures and NetQRE can specify a widerange of monitoring tasks as shown in sect4

Handling packet lossreorderingretransmission For the perfor-mance consideration NetQRE does not buer packets in the streambut processes packets in a streaming fashion Thus if packets inthe stream are lost reordered or retransmitted the query speciedin NetQRE might not be evaluated correctly since the internal statemay transition to a wrong one Current design of NetQRE relies onthe runtime to handle packet loss reordering and retransmission

Network-widemonitoring For the similar reasons discussed abovecurrent deployment model of NetQRE is based on a centralize fash-ion in order to receive all (ordered) packet in the stream of interestFor example a ow counting task has to be deployed at a placewhere all packets in a ow can traverse This is not a fundamentallimitation as one can easily deploy NetQRE in a distributed fashionprovided that the query is insensitive to packet reordering in thestream We leave it as future work to study the decomposition of acentralized NetQRE program into distributed queries

9 RELATEDWORKQuantitative regular expressions NetQRE is inspired by quan-titative regular expressions (QRE) [9] which provides a theoreticalfoundation for regular programming of data streams NetQRE ex-tends QRE in the following ways First NetQRE introduces param-eters in support of queries handling unknown values (eg countingdistinct sources) Second NetQRE proposes aggregation operatorsto compute aggregated values across parameters It is shown insect4 that both extensions are essential to specify a wide range ofinteresting network monitoring policies

StreamQRE [27] is another recently proposed extension to QRENetQRE is dierent from StreamQRE in several aspects First NetQREis specialized to networking applications while StreamQRE focuseson database application Second StreamQRE allows relations as

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

types and supports relational operations such as join and key-basedpartitioning These operations cannot be evaluated eciently in astreaming fashion in general NetQRE instead uses parameters andaggregation over parameters with a focus on ecient evaluationIntrusion detection systems and protocol analyzers Thereare a number of key dierences between NetQRE and intrusiondetection systems (IDS) and protocol analyzers First systems suchas Snort [34] allow regular expression matches on single packetsbut is not perform regular expression matches that span multiplepackets let alone track sessions across packets While Bro [33] sup-ports aspects of NetQRE the scripting language requires complexprocedural programming where one has to maintain states anddata structures as opposed to NetQRErsquos declarative programmingapproach with high-level constructs over packet streams As anIDS Bro does not lend itself naturally to handle some quantitativemonitoring use cases eg the VoIP usage example While HILTI [37]uses Bro and oers abstract machine models for trac analysis itrequires the operator to implement low-level state machinesDomain specic languages in networking There are severalrecent proposal on domain specic languages on networks whichincludes Frenetic [21] Pyretic [30] NetKAT [10] Flowlog [32]Maple [38] Network Datalog [26] To the best of our knowledgenone of these systems support monitoring queries beyond basicow-level counters provided by OpenFlow

Recently proposed SNAP [11] and Sonata [23] oer abstrac-tions for trac monitoring over packet streams but in a morelow-level procedural or SQL-like way NetQRE oers a complemen-tary abstraction on quantitative queries It may be interesting toincorporate NetQRE with these languages in support for complexquantitative queriesStreaming database languagesDatabase-style query languages [1416] provide SQL-like language support for running continuousqueries over data streams While there are constructs to do aggrega-tion over sliding windows they are designed with simple relationalqueries over packet headers or packet counts and cannot handlethe complex queries based on trac patterns that NetQRE supports

10 CONCLUSIONThis paper presents NetQRE a high-level declarative language andpractical toolkit for quantitative network monitoring NetQRE of-fers an intuitive programming model using regular expressionsover streams of packets and can naturally express ow-level aswell as application-level quantitative policies We developed a com-pilation algorithm that can compile NetQRE programs into ecientimperative programs Our evaluation results demonstrate the ex-pressiveness of NetQRE in support a wide range of quantitativenetwork monitoring applications its ability to match carefully opti-mized manually written low level code and outperform equivalentimplements in Bro and OpenSketch A proof-of-concept scenariowith an SDN controller and switches demonstrate NetQRErsquos abilityto support timely and ecient enforcement of quantitative networkpolicies

ACKNOWLEDGEMENTWe would like to thank our shepherd Arjun Guha for his helpfulcomments We also want to thank the anonymous reviewers for

their insightful feedbacks on this work This research was partiallysupported by NSF Expeditions in Computing award CCF 1138996NSF CNS-1513679 and NSF CNS-1218066

REFERENCES[1] Application Layer Packet Classier for Linux httpwwwmcafeecomus

productsnetwork-security-platformaspx[2] CAIDA Trac Trace httpsdatacaidaorgdatasets securityddos-20070804 [3] McAfee Network Security Platform http l7-ltersourceforgenet [4] OpenSketch reference code httpsgithubcomUSC-NSLopensketch[5] SIPp http sippsourceforgenet [6] SSL renegotiation DoS httpswwwietforgmail-archivewebtlscurrent

msg07553html[7] Anonymized 2015 Internet Traces httpsdatacaidaorgdatasetspassive-2015

2015[8] Mohammad Al-Fares Sivasankar Radhakrishnan Barath Raghavan Nelson

Huang and Amin Vahdat Hedera Dynamic Flow Scheduling for Data CenterNetworks In NSDI volume 10 pages 19ndash19 2010

[9] Rajeev Alur Dana Fisman and Mukund Raghothaman Regular Programmingfor Quantitative Properties of Data Streams In 25th European Symposium onProgramming ESOP 2016

[10] Carolyn Jane Anderson Nate Foster Arjun Guha Jean-Baptiste Jeannin DexterKozen Cole Schlesinger and David Walker NetKAT Semantic foundations fornetworks In Proceedings of the 41st annual ACM SIGPLAN-SIGACT symposiumon Principles of programming languages pages 113ndash126 ACM 2014

[11] Mina Tahmasbi Arashloo Yaron Koral Michael Greenberg Jennifer Rexford andDavid Walker Snap Stateful network-wide abstractions for packet processingIn Proceedings of the 2016 ACM SIGCOMM Conference SIGCOMM rsquo16 pages29ndash43 New York NY USA 2016 ACM

[12] Kevin Borders Jonathan Springer and Matthew Burnside Chimera A declarativelanguage for streaming network trac analysis In Proceedings of the 21st USENIXConference on Security Symposium Securityrsquo12 pages 19ndash19 Berkeley CA USA2012 USENIX Association

[13] Varun Chandola Arindam Banerjee and Vipin Kumar Anomaly Detection ASurvey ACM computing surveys (CSUR) 41(3)15 2009

[14] Sirish Chandrasekaran Owen Cooper Amol Deshpande Michael J FranklinJoseph M Hellerstein Wei Hong Sailesh Krishnamurthy Samuel R MaddenFred Reiss and Mehul A Shah TelegraphCQ Continuous Dataow ProcessingIn Proceedings of the 2003 ACM SIGMOD International Conference on Managementof Data SIGMOD rsquo03 pages 668ndash668 New York NY USA 2003 ACM

[15] Benoit Claise Cisco systems NetFlow services export version 9 2004[16] Chuck Cranor Theodore Johnson Oliver Spataschek and Vladislav Shkapenyuk

Gigascope A Stream Database for Network Applications In Proceedings of the2003 ACM SIGMOD International Conference on Management of Data SIGMODrsquo03 pages 647ndash651 New York NY USA 2003 ACM

[17] Luca Deri Open source VoIP trac monitoring In Proceedings of SANE volume2006 2006

[18] Nick Dueld Carsten Lund and Mikkel Thorup Estimating Flow Distributionsfrom Sampled Flow Statistics In Proceedings of the 2003 conference on Applicationstechnologies architectures and protocols for computer communications pages 325ndash336 ACM 2003

[19] Cristian Estan and George Varghese New Directions in Trac Measurement andAccounting volume 32 ACM 2002

[20] Seyed K Fayaz Yoshiaki Tobioka Vyas Sekar and Michael Bailey BohateiFlexible and Elastic DDoS Defense In 24th USENIX Security Symposium (USENIXSecurity 15) pages 817ndash832 Washington DC August 2015 USENIX Association

[21] Nate Foster Rob Harrison Michael J Freedman Christopher Monsanto JenniferRexford Alec Story and David Walker Frenetic A Network ProgrammingLanguage In ACM SIGPLAN Notices volume 46 pages 279ndash291 ACM 2011

[22] Pedro Garcia-Teodoro J Diaz-Verdejo Gabriel Maciaacute-Fernaacutendez and EnriqueVaacutezquez Anomaly-based Network Intrusion Detection Techniques Systemsand Challenges computers amp security 28(1)18ndash28 2009

[23] Arpit Gupta Ruumldiger Birkner Marco Canini Nick Feamster Chris Mac-Stokerand Walter Willinger Network Monitoring As a Streaming Analytics ProblemIn Proceedings of the 15th ACM Workshop on Hot Topics in Networks HotNets rsquo16pages 106ndash112 ACM 2016

[24] DPDK Intel Data Plane Development Kit httpdpdkorg[25] Bob Lantz Brandon Heller and Nick McKeown A Network in a Laptop Rapid

Prototyping for Software-dened Networks In Proceedings of the 9th ACMSIGCOMM Workshop on Hot Topics in Networks Hotnets-IX pages 191ndash196ACM 2010

[26] Boon Thau Loo Tyson Condie Minos Garofalakis David E Gay Joseph MHellerstein Petros Maniatis Raghu Ramakrishnan Timothy Roscoe and IonStoica Declarative Networking CACM 2009

[27] Konstantinos Mamouras Mukund Raghotaman Rajeev Alur Zachary G Ivesand Sanjeev Khanna StreamQRE Modular Specication and Ecient Evaluation

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

of Quantitative Queries over Streaming Data In Proceedings of the 38th ACMSIGPLANConference on Programming Language Design and Implementation ACM2017

[28] Steve McCanne Craig Leres and Van Jacobson Libpcap httpwwwtcpdumporg1989

[29] J Mccauley POX A Python-based Openow Controller 2014[30] Christopher Monsanto Joshua Reich Nate Foster Jennifer Rexford David Walker

et al Composing Software Dened Networks In NSDI pages 1ndash13 2013[31] Masoud Moshref Minlan Yu Ramesh Govindan and Amin Vahdat DREAM

dynamic resource allocation for software-dened measurement In Proceedingsof the 2014 ACM conference on SIGCOMM pages 419ndash430 ACM 2014

[32] Tim Nelson Andrew D Ferguson Michael JG Scheer and Shriram KrishnamurthiTierless Programming and Reasoning for Software-Dened Networks NSDIApr 2014

[33] Vern Paxson Bro A System for Detecting Network Intruders in Real-timeComput Netw 31(23-24)2435ndash2463 December 1999

[34] Martin Roesch et al Snort Lightweight Intrusion Detection for Networks InLISA volume 99 pages 229ndash238 1999

[35] Vyas Sekar Michael K Reiter and Hui Zhang Revisiting the case for a minimalistapproach for network ow monitoring In Proceedings of the 10th ACM SIGCOMM

conference on Internet measurement pages 328ndash341 ACM 2010[36] David Senecal Slow DoS on the rise httpsblogsakamaicom201309

slow-dos-on-the-risehtml[37] Robin Sommer Matthias Vallentin Lorenzo De Carli and Vern Paxson HILTI

An Abstract Execution Environment for Deep Stateful Network Trac AnalysisIn Proceedings of the 2014 Conference on Internet Measurement Conference IMCrsquo14 pages 461ndash474 New York NY USA 2014 ACM

[38] Andreas Voellmy Junchang Wang Y Richard Yang Bryan Ford and Paul HudakMaple Simplifying SDN Programming using Algorithmic Policies In Proceedingsof the ACM SIGCOMM 2013 conference on SIGCOMM pages 87ndash98 ACM 2013

[39] Mea Wang Baochun Li and Zongpeng Li sFlow Towards resource-ecient andagile service federation in service overlay networks In Distributed ComputingSystems 2004 Proceedings 24th International Conference on pages 628ndash635 IEEE2004

[40] Minlan Yu Lavanya Jose and Rui Miao Software Dened Trac Measurementwith OpenSketch In NSDI volume 13 pages 29ndash42 2013

[41] Lihua Yuan Chen-Nee Chuah and Prasant Mohapatra ProgME Towards Pro-grammable Network Measurement IEEEACM Transactions on Networking (TON)19(1)115ndash128 2011

  • Abstract
  • 1 Introduction
  • 2 Overview
    • 21 Motivating Example
    • 22 Alternative Approaches
      • 3 The NetQRE Language
        • 31 Pattern Matching over Streams
        • 32 Conditional Expressions
        • 33 Stream Split
        • 34 Stream Iteration
        • 35 Aggregation over Parameters
        • 36 Stream Composition
          • 4 Use Cases
            • 41 Flow-level Traffic Measurements
            • 42 TCP State Monitoring
            • 43 Application-level Monitoring
              • 5 Compilation Algorithm
                • 51 Compilation of PSRE
                • 52 Compilation of split
                • 53 Compilation of iter
                • 54 Compilation of Aggregation
                • 55 Compilation of Stream Composition
                  • 6 Implementation
                  • 7 Evaluation
                    • 71 Expressiveness
                    • 72 Performance
                    • 73 End-to-end Validation
                    • 74 Summary of Evaluation
                      • 8 Discussion
                      • 9 Related Work
                      • 10 Conclusion
                      • References
Page 9: Quantitative Network Monitoring with NetQREalur/Sigcomm17.pdf · 2017. 6. 26. · In network management today, ... Thau Loo. 2017. Quantitative Network Monitoring with NetQRE. In

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

Given concrete values for parameters evaluating a split expres-sion is as follows First we check if there exist a guarded case ((sf sд F )ϕ) such that ϕ is satised and both f and g are dened on thestates in the case Then we evaluate the two expressions based ontheir states in the case and nally aggregate the evaluated valuesusing aggop If no such guarded cases exist the split expression isnot dened

53 Compilation of iter

We rst review the key ideas of evaluating an iter expression whenthere are no parameters [9] and then describe how to handle pa-rameters

Consider an iter expression iter(f aggop) without any param-eters Similar to split expressions the iter expression needs tomaintain all possible cases of splitting the input stream Thoughthe number of such cases is only determined by f (thus not relevantto the input stream) how to succinctly represent each case is achallenge Specically unlike a split expression which only splitsthe stream into a prex and a sux an iter expression needs tosplit the stream into multiple substreams such that each one canbe applied to the expression f Therefore naively maintaining allthe states of f for each substream may use a large space as thestream grows Fortunately our choice of the aggregation operatorsallows us to summarizes the application of f to all substreams butthe last one using the aggregated return value For example if aggopis sum we only need to remember the aggregated sum on thesesubstreams Thus we can represent a case as a pair (v sf ) wherev is the aggregated value and sf is the state of f on evaluating thelast substream

In the general scenario with parameters we simply maintain aguarded case as ((v sf ) ϕ) The update of a guarded case ((v sf )ϕ)is shown in Algorithm 3 The evaluation of f given guarded casesis similar to that of a split expression

Algorithm 3 update_iter((T ϕ) packet)

1 suppose T = (v sf )

2 for all emitted (s primef ϕ prime) from update_f((sf ϕ) packet) do3 if f is dened on s primef then4 let v prime be the evaluated value of f on s primef5 emit ((v primeprime s0f ) ϕ prime) where v primeprime=aggop (vv prime) and s0f is the

initial state of f

6 end if7 emit ((v s primef ) ϕ prime)8 end for

54 Compilation of AggregationNow we describe the aggregation function aggopf | T x Followingthe denition of the aggregation function an aggregation functionmaintains guarded states (sf ϕ) To update the aggregation expres-sion we just need to update each guarded state using frsquos updatingprocedure To illustrate the evaluation procedure suppose the ag-gregation operator is sum Given values for the parameters we needto iterate through all guarded states (sf ϕ) If ϕ is satised we eval-uate f on sf say the return value is v and count the number of

values of x which satises ϕ say it is k we sum up v lowast k into theaggregated sum For other aggregation operators the evaluationprocess works similarly

55 Compilation of Stream CompositionLastly we discuss stream composition Recall that stream composi-tion allows to process the input stream using a function f and thenapply the second function g on the processed stream Following itsdenition each time a new packet arrives we need to rst processit with f and the returned value (eg a ltered packet) is then fedinto g

Therefore the stream composition expression fgtgtg needs to main-tain guarded states ((sf sд )ϕ) Algorithm 4 shows the algorithm toupdate a guarded state following the intuition described above Theevaluation of the expression is similar to that of previous discussedexpressions

Algorithm 4 update_comp((S ϕ) packet)

1 suppose S = (sf sд )2 for all emitted (s primef ϕ prime) from update_f((sf ϕ)packet) do3 if f is dened on s primef then4 let ret be the returned packet of f on s primef5 emit ((s primef s primeд ) ϕ primeprime) for all emitted (s primeд ϕ primeprime) from

update_g((sд ϕ prime) ret)6 else7 emit ((s primef sд ) ϕ prime)8 end if9 end for

6 IMPLEMENTATIONWe have implemented a prototype of the NetQRE system in C++which consists of two main components namely the compiler forNetQRE and the NetQRE runtimeNetQRE Compiler The NetQRE compiler implements the com-pilation algorithm described in sect5 The compiler rst generates aC++ program from an input NetQRE program which is then com-piled by the gcc compiler into executable Our compiled programuses a tree data structure to represent predicates over parametersThe choice of tree is driven by its simplicity ease of encoding andlookup performance

In addition to the basic compilation algorithm we include ad-ditional optimizations to the compiler First we use the standardminimization algorithm to minimize the state machine for a regularexpression Second for aggregation expressions with sum and avgwe use an incremental updating algorithm to update the expres-sion we maintain the running sum as the state of the aggregationexpression and we update the sum incrementally when the stateof the inner expression is updatedNetQRE Parallelization The NetQRE compiler can optionallycompile the NetQRE specication into parallelized implementationsbased on the instantiation of parameters in the NetQRE specica-tion For the example described in sect 51 a packet with source IPaddress s can be processed by the hash(s )-th thread which handlesthat instantiation of the parameter x

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

NetQRERuntime The NetQRE runtime includes a packet captureagent implemented using the pcap library [28] Each packet thatarrives at the runtime is parsed and then processed by the compiledNetQRE program as shown in Figure 1 Currently our runtimesupports actions that include sending alert events to a controllerand directly installing a rule on a switch The NetQRE runtime isnot specically optimized for fast packet capture and processingand as future work we plan to explore the use of DPDK [24] Ourcurrent implementation analyzes every packet though orthogonalsampling techniques can be also explored in future

7 EVALUATIONWe evaluate our NetQRE prototype centered around answeringthree key questions First can the NetQRE language express a widerange of quantitative network monitoring applications in a conciseand intuitive manner Second is the generated code ecient interms of throughput and memory footprint Third can NetQRE beused in a real-time monitoring setting that provides rapid mitigationbased on output of quantitative network monitoring applicationsUnless otherwise specied all our experiments are carried out ona cluster of commodity servers each of which has 10-core 26GHzXeon E5-2660V3 Each core has a 256KB L2 cache and shared 25MB L3 cache The memory size is 64 GB The OS is Ubuntu 1404and the kernel version is 420

71 ExpressivenessWe have implemented a set of quantitative network monitoring ap-plications using NetQRE The examples are drawn from a literaturesurvey on quantitative network monitoring applications [13 18 3540 41] Table 1 summarizes the example applications in NetQREthat we use as well as and the lines of code of each NetQRE pro-gram

LoCHeavy Hitter (sect41) 6Super Spreader (sect41) 2Entropy Estimation [40] 6Flow size dist [18] 8Trac change detection [35] 10Count trac [40] 2Completed ows (sect42) 6SYN ood detection (sect42) 9Slowloris detection (sect42) 12Lifetime of connection 8Newly opened connection recently 11 duplicated ACKs 5 VoIP call 7VoIP usage (sect43) 18Key word counting in emails 11DNS tunnel detection [12] 4DNS amplication [20] 4

Table 1 Examplemonitoring applicationsNetQRE supports

We make the following observations First NetQRE is able toexpress a large variety of quantitative network monitoring applica-tions ranging from ow-level trac measurement to application-level quantitative monitoring We validate the correctness of ourapplications by running them and comparing their output withhand-crafted implementations Second programming in NetQRE isconcise All example programs can be specied within 18 lines ofcode in NetQRE This count includes commonly used lter functionsas well as regular expressions which can be built into a library forreference As an interesting comparison we encoded the VoIP calluse case in Bro [33] a well-known intrusion detection system Brorequired 51 lines of code as compared to only 7 in NetQRE How-ever Bro is unable to easily support the VoIP usage case written inNetQRE In particular Bro cannot use patterns to separate packetsinto dierent VoIP sessions for the purposes of counting bandwidthconsumption per session This is not surprising as Brorsquos primaryuse is that of an intrusion detection system We validated the cor-rectness of our implementation on actual SIP trac by comparingBrorsquos output against NetQRErsquos

72 PerformanceOur next set of experiments evaluate the performance of NetQRErsquoscompiled implementations NetQRE runs on a single machine basedon the setup described earlier We use a single core to measureNetQRErsquos performance As a point of comparison we compare eachNetQRE program against an equivalent carefully hand-crafted C++implementation We note that all our hand-crafted implementationsrequire at least 100 lines of code and often times require users toexplicit manage internal state across packets ndash a programming taskthat NetQRE abstracts away

Fig 7 shows the performance of NetQRE implementations (NetQREin the gure) compared with manually optimized implementations(Baseline in the gure) that we spent days optimizing Our examplesinclude heavy hitter super spreader entropy SYN ood detectioncompleted ows count and Slowloris use cases presented in sect4 Asworkload we use a one-minute CAIDA trac trace [7] capturedon a high-speed Internet backbone link which contains 37 millionpackets (ie about 620k packetssec) We replay the trac traceto the implementations Since the trac trace does not containpayloads we report the throughput in million packets per second(MPPS)

We make the following observations First the NetQRE imple-mentations are able to achieve line-rate processing speed withoutbeing the bottleneck In particular for three out of six applicationsthe throughput of NetQRE implementations is about 20MPPS usinga single thread The average packet size of our input trac traceis 888B For a 10Gbps network 14M packets of size 888B can beforwarded per second which is well below the throughput achievedby NetQRE The throughput can be further improved by paralleliza-tion (presented later in this section) Second the compiled NetQREimplementation incurs negligible overhead compared with the man-ually optimized low-level implementation The dierence betweenthe throughput of the compiled NetQRE implementation and thatof the manual implementation is within 9 across all use cases westudied Third NetQRE incurs slightly increase (up to 60) in terms

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

05

10152025

Thro

ughp

ut (M

PPS)

Baseline NetQRE OpenSketch

(a) Throughput

1

10

100

1000

Mem

ory

(MB

) Baseline NetQRE OpenSketch

(b) Memory

Figure 7 Performance comparison

0

10

20

30

40

0 2 4 6 8

Thro

ughp

ut (M

PPS)

Number13 of13 Threads

super spreadersyn floodslowloris

Figure 8 Throughput with paral-lelization

of memory footprint (Fig 7b) compared with the manually opti-mized implementation However the overall memory footprint ofNetQRE remains low We have also run the experiments on anotherCAIDA trac [2] and the comparison result is similarComparison with OpenSketch In addition to comparisons withour manually optimized code we also compare with OpenSketch [40]a popular trac measurement tool based on sketching techniquesGiven OpenSketchrsquos limitation in handling stateful monitoringtasks we only compare the performance of NetQRE implementa-tions on the heavy hitter and super spreader examples with thatof OpenSketch (in software) We use the same CAIDA trac traceas before and follow the default setting of the reference code ofOpenSketch [4] for the two examples The NetQRE compiled imple-mentations signicantly outperform OpenSketch implementationsThe throughput of NetQRE compiled code is 11times and 18times as thatof OpenSketch (Fig 7a) As OpenSketch focuses on optimizingmemory usage it is not surprising that OpenSketch uses less mem-ory than NetQRE implementations Nevertheless we observe thatNetQRE uses only 11 more memory on the super spreader examplethan that of OpenSketch (Fig 7b) Note that we use an open-sourceversion of OpenSketch software that runs on a similar x86 machineas NetQRE in order to have an ldquoapples-to-applesrdquo comparison onsimilar hardware NetQRE may also exploit similar optimizationsperformed by OpenSketch in order to get higher performance ona NetFPGA Exploiting a hardware platform such as NetFPGA inorder to further improve NetQRE performance is an interestingavenue for future workComparison with Bro We further compare with Bro Given thelimitations of Bro in handling the complete VoIP use cases we sim-plify the use case to simply counting the number of VoIP calls (asopposed to bandwidth consumption) per user NetQRE compiledimplementation can nish counting within 1 second while Brotakes about 23 seconds for a SIP trac trace containing 4338 VoIPcalls Both programs output the same results for this use case Webelieve there are at least two reasons for Brorsquos slower performanceFirst Bro is designed for intrusion detection and not aimed atand optimized for monitoring tasks Second unlike NetQRE whichcompiles high-level code to ecient low-level code Bro uses aninterpreter to execute the script written in Brorsquos language whichcould result in considerable overheadParallelization speedup NetQRE compiler automatically com-piles parallelized implementations as described in sect6 In the con-sidered examples parallelization involves sending ows of packets

to dierent NetQRE instances running on dierent threads Fig 8plots the throughput of the NetQRE implementations with dierentnumbers of threads for the super spreader syn ood and slowlorisdetection examples In all examples the NetQRE parallelized codeis able to achieve at least 39times speedup with 8 threads Note thatlinear speedup is not achieved as the trac (in our nite trace) isnot uniformly distributed across all 8 threads This is an artifactof the trace data itself When we include the overhead of the soft-ware load balancer that dispatches packet ows to dierent threadsthe NetQRE implementations achieve at least 26times speedup for allexamples This overhead can be mitigated with a hardware load-balancer Overall we observe that NetQRErsquos throughput is morethan sucient to handle line-rate trac and the NetQRE runtimeitself is not the bottleneck

73 End-to-end ValidationIn the nal set of experiments we validate the end-to-end use ofNetQRE for enforcing quantitative network policies which appliesactions to quantitative network monitoring programs We set upa network of two clients C1 and C2 one server S and one SDNswitch that mirrors trac to a NetQRE runtime running the SYNood detector presented in sect42 In addition a NetQRE controllerbased on POX [29] controls the switch The network is emulatedusing Mininet [25] with link bandwidth set to 100Mbps

In the experimentC1 sends normal trac to S using iperf at rate1Mbps and at the 7th second C2 starts SYN ood attack generatedusing our generator to the same server The attack is detected byNetQRE which generates an alert event to the controller whichsubsequently installs a forwarding rule on the switch to block tracfrom C2 To illustrate the attack detection and blocking Figure 9a(left gure) shows the bandwidth utilization (Mbps) on the serverWe observe that NetQRE successfully blocks C2rsquos attacking tracin real-time

As a second experiment using a similar setup our NetQRE heavyhitter program (sect41) running at a switch-side NetQRE runtimemonitors the trac over a sliding window of 5 seconds and issuesan alert to the controller when detecting a heavy hitter whichfurther blocks the trac As a point of comparison we compareour approach against two alternatives (1) sending all packets to thecontroller which runs an equivalent heavy hitter detection program(2) monitoring the ow counters on the switch to detect heavyhitters as considered in other languages [30] and systems [31] Inthe second alternative the controller reads the counter every 1

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

0 2 4 6 8 10 12 14 16Time (sec)

0

1

2

3

4

5

6B

andw

idth

(Mbp

s)NetQRE

(a) SYN ood

0 2 4 6 8 10 12 14 16Time (sec)

0

2

4

6

8

10

12

14

Ban

dwid

th (M

bps)

NetQRE

stats

forward

(b) Heavy hitter

0 20 40 60 80 100Time (sec)

0

1

2

3

4

5

6

7

Ban

dwid

th (M

bps)

NetQRE

(c) VoIP

Figure 9 Bandwidth utilization (Mbps) at the server

second and compute the bandwidth usage for each ow over asimilar 5 seconds window

Figure 9b (middle gure) plots the bandwidth utilization of theserver The above alternative solutions are labeled as forward andstats respectively Compared with these two approaches our pro-posed approach can detect the heavy hitter and further respond toit in a more timely fashion Moreover compared with the forwardapproach we send signicantly fewer trac to the controller andis a more scalable solution

In our nal experiment we validate the VoIP use case rst pre-sented in our introduction We use a synthetic VoIP trac tracegenerated using SIPp [5] replay a VoIP call (with accompanyingH264 MPEG video) made from a single user via client C2 We re-play the trac at 5Mbps and run iperf on C1 as the backgroundtrac Our NetQRE program enforces a policy that blocks a callafter the user making the call exceeds a SIP bandwidth usage of1875MB (around 30 seconds) We congure the NetQRE programto send an alert to the NetQRE controller which then blocks thetrac Figure 9c (right gure) plots the bandwidth utilization of theserver Again this veries that NetQRE can be used to intercept aSIP session monitor usage on a per-user basis and react to highusage in a timely fashion

74 Summary of EvaluationRevisiting the questions at the beginning of our evaluation weobserve that (1) NetQRE can express a wide range of quantita-tive monitoring applications with a few lines of code (2) achievesline-rate throughput performance which is comparable to that ofcarefully hand-crafted optimized low-level code while signicantlyoutperforms other measurement and IDS tools and (3) can be usedin an SDN setting to monitor network trac and update switchesin real-time in response to specied NetQRE policies

8 DISCUSSIONIn this section we discuss some of NetQRErsquos limitations and alsosome future work

Expressiveness As a language focusing on monitoring policyNetQRE cannot specify general network policies such as servicechaining policies and trac engineering policies As an interesting

future work it might be possible to extend NetQRErsquos actions insupport of more complex policies

Given NetQRErsquos basis on regular expressions NetQRE is notsuited for monitoring tasks that can not be specied using regulartrac patterns Though this is a fundamental limitation of NetQRErsquosexpressiveness we believe that the high-level constructs in NetQREprovides a natural programming abstraction to write tasks withhierarchical regular structures and NetQRE can specify a widerange of monitoring tasks as shown in sect4

Handling packet lossreorderingretransmission For the perfor-mance consideration NetQRE does not buer packets in the streambut processes packets in a streaming fashion Thus if packets inthe stream are lost reordered or retransmitted the query speciedin NetQRE might not be evaluated correctly since the internal statemay transition to a wrong one Current design of NetQRE relies onthe runtime to handle packet loss reordering and retransmission

Network-widemonitoring For the similar reasons discussed abovecurrent deployment model of NetQRE is based on a centralize fash-ion in order to receive all (ordered) packet in the stream of interestFor example a ow counting task has to be deployed at a placewhere all packets in a ow can traverse This is not a fundamentallimitation as one can easily deploy NetQRE in a distributed fashionprovided that the query is insensitive to packet reordering in thestream We leave it as future work to study the decomposition of acentralized NetQRE program into distributed queries

9 RELATEDWORKQuantitative regular expressions NetQRE is inspired by quan-titative regular expressions (QRE) [9] which provides a theoreticalfoundation for regular programming of data streams NetQRE ex-tends QRE in the following ways First NetQRE introduces param-eters in support of queries handling unknown values (eg countingdistinct sources) Second NetQRE proposes aggregation operatorsto compute aggregated values across parameters It is shown insect4 that both extensions are essential to specify a wide range ofinteresting network monitoring policies

StreamQRE [27] is another recently proposed extension to QRENetQRE is dierent from StreamQRE in several aspects First NetQREis specialized to networking applications while StreamQRE focuseson database application Second StreamQRE allows relations as

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

types and supports relational operations such as join and key-basedpartitioning These operations cannot be evaluated eciently in astreaming fashion in general NetQRE instead uses parameters andaggregation over parameters with a focus on ecient evaluationIntrusion detection systems and protocol analyzers Thereare a number of key dierences between NetQRE and intrusiondetection systems (IDS) and protocol analyzers First systems suchas Snort [34] allow regular expression matches on single packetsbut is not perform regular expression matches that span multiplepackets let alone track sessions across packets While Bro [33] sup-ports aspects of NetQRE the scripting language requires complexprocedural programming where one has to maintain states anddata structures as opposed to NetQRErsquos declarative programmingapproach with high-level constructs over packet streams As anIDS Bro does not lend itself naturally to handle some quantitativemonitoring use cases eg the VoIP usage example While HILTI [37]uses Bro and oers abstract machine models for trac analysis itrequires the operator to implement low-level state machinesDomain specic languages in networking There are severalrecent proposal on domain specic languages on networks whichincludes Frenetic [21] Pyretic [30] NetKAT [10] Flowlog [32]Maple [38] Network Datalog [26] To the best of our knowledgenone of these systems support monitoring queries beyond basicow-level counters provided by OpenFlow

Recently proposed SNAP [11] and Sonata [23] oer abstrac-tions for trac monitoring over packet streams but in a morelow-level procedural or SQL-like way NetQRE oers a complemen-tary abstraction on quantitative queries It may be interesting toincorporate NetQRE with these languages in support for complexquantitative queriesStreaming database languagesDatabase-style query languages [1416] provide SQL-like language support for running continuousqueries over data streams While there are constructs to do aggrega-tion over sliding windows they are designed with simple relationalqueries over packet headers or packet counts and cannot handlethe complex queries based on trac patterns that NetQRE supports

10 CONCLUSIONThis paper presents NetQRE a high-level declarative language andpractical toolkit for quantitative network monitoring NetQRE of-fers an intuitive programming model using regular expressionsover streams of packets and can naturally express ow-level aswell as application-level quantitative policies We developed a com-pilation algorithm that can compile NetQRE programs into ecientimperative programs Our evaluation results demonstrate the ex-pressiveness of NetQRE in support a wide range of quantitativenetwork monitoring applications its ability to match carefully opti-mized manually written low level code and outperform equivalentimplements in Bro and OpenSketch A proof-of-concept scenariowith an SDN controller and switches demonstrate NetQRErsquos abilityto support timely and ecient enforcement of quantitative networkpolicies

ACKNOWLEDGEMENTWe would like to thank our shepherd Arjun Guha for his helpfulcomments We also want to thank the anonymous reviewers for

their insightful feedbacks on this work This research was partiallysupported by NSF Expeditions in Computing award CCF 1138996NSF CNS-1513679 and NSF CNS-1218066

REFERENCES[1] Application Layer Packet Classier for Linux httpwwwmcafeecomus

productsnetwork-security-platformaspx[2] CAIDA Trac Trace httpsdatacaidaorgdatasets securityddos-20070804 [3] McAfee Network Security Platform http l7-ltersourceforgenet [4] OpenSketch reference code httpsgithubcomUSC-NSLopensketch[5] SIPp http sippsourceforgenet [6] SSL renegotiation DoS httpswwwietforgmail-archivewebtlscurrent

msg07553html[7] Anonymized 2015 Internet Traces httpsdatacaidaorgdatasetspassive-2015

2015[8] Mohammad Al-Fares Sivasankar Radhakrishnan Barath Raghavan Nelson

Huang and Amin Vahdat Hedera Dynamic Flow Scheduling for Data CenterNetworks In NSDI volume 10 pages 19ndash19 2010

[9] Rajeev Alur Dana Fisman and Mukund Raghothaman Regular Programmingfor Quantitative Properties of Data Streams In 25th European Symposium onProgramming ESOP 2016

[10] Carolyn Jane Anderson Nate Foster Arjun Guha Jean-Baptiste Jeannin DexterKozen Cole Schlesinger and David Walker NetKAT Semantic foundations fornetworks In Proceedings of the 41st annual ACM SIGPLAN-SIGACT symposiumon Principles of programming languages pages 113ndash126 ACM 2014

[11] Mina Tahmasbi Arashloo Yaron Koral Michael Greenberg Jennifer Rexford andDavid Walker Snap Stateful network-wide abstractions for packet processingIn Proceedings of the 2016 ACM SIGCOMM Conference SIGCOMM rsquo16 pages29ndash43 New York NY USA 2016 ACM

[12] Kevin Borders Jonathan Springer and Matthew Burnside Chimera A declarativelanguage for streaming network trac analysis In Proceedings of the 21st USENIXConference on Security Symposium Securityrsquo12 pages 19ndash19 Berkeley CA USA2012 USENIX Association

[13] Varun Chandola Arindam Banerjee and Vipin Kumar Anomaly Detection ASurvey ACM computing surveys (CSUR) 41(3)15 2009

[14] Sirish Chandrasekaran Owen Cooper Amol Deshpande Michael J FranklinJoseph M Hellerstein Wei Hong Sailesh Krishnamurthy Samuel R MaddenFred Reiss and Mehul A Shah TelegraphCQ Continuous Dataow ProcessingIn Proceedings of the 2003 ACM SIGMOD International Conference on Managementof Data SIGMOD rsquo03 pages 668ndash668 New York NY USA 2003 ACM

[15] Benoit Claise Cisco systems NetFlow services export version 9 2004[16] Chuck Cranor Theodore Johnson Oliver Spataschek and Vladislav Shkapenyuk

Gigascope A Stream Database for Network Applications In Proceedings of the2003 ACM SIGMOD International Conference on Management of Data SIGMODrsquo03 pages 647ndash651 New York NY USA 2003 ACM

[17] Luca Deri Open source VoIP trac monitoring In Proceedings of SANE volume2006 2006

[18] Nick Dueld Carsten Lund and Mikkel Thorup Estimating Flow Distributionsfrom Sampled Flow Statistics In Proceedings of the 2003 conference on Applicationstechnologies architectures and protocols for computer communications pages 325ndash336 ACM 2003

[19] Cristian Estan and George Varghese New Directions in Trac Measurement andAccounting volume 32 ACM 2002

[20] Seyed K Fayaz Yoshiaki Tobioka Vyas Sekar and Michael Bailey BohateiFlexible and Elastic DDoS Defense In 24th USENIX Security Symposium (USENIXSecurity 15) pages 817ndash832 Washington DC August 2015 USENIX Association

[21] Nate Foster Rob Harrison Michael J Freedman Christopher Monsanto JenniferRexford Alec Story and David Walker Frenetic A Network ProgrammingLanguage In ACM SIGPLAN Notices volume 46 pages 279ndash291 ACM 2011

[22] Pedro Garcia-Teodoro J Diaz-Verdejo Gabriel Maciaacute-Fernaacutendez and EnriqueVaacutezquez Anomaly-based Network Intrusion Detection Techniques Systemsand Challenges computers amp security 28(1)18ndash28 2009

[23] Arpit Gupta Ruumldiger Birkner Marco Canini Nick Feamster Chris Mac-Stokerand Walter Willinger Network Monitoring As a Streaming Analytics ProblemIn Proceedings of the 15th ACM Workshop on Hot Topics in Networks HotNets rsquo16pages 106ndash112 ACM 2016

[24] DPDK Intel Data Plane Development Kit httpdpdkorg[25] Bob Lantz Brandon Heller and Nick McKeown A Network in a Laptop Rapid

Prototyping for Software-dened Networks In Proceedings of the 9th ACMSIGCOMM Workshop on Hot Topics in Networks Hotnets-IX pages 191ndash196ACM 2010

[26] Boon Thau Loo Tyson Condie Minos Garofalakis David E Gay Joseph MHellerstein Petros Maniatis Raghu Ramakrishnan Timothy Roscoe and IonStoica Declarative Networking CACM 2009

[27] Konstantinos Mamouras Mukund Raghotaman Rajeev Alur Zachary G Ivesand Sanjeev Khanna StreamQRE Modular Specication and Ecient Evaluation

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

of Quantitative Queries over Streaming Data In Proceedings of the 38th ACMSIGPLANConference on Programming Language Design and Implementation ACM2017

[28] Steve McCanne Craig Leres and Van Jacobson Libpcap httpwwwtcpdumporg1989

[29] J Mccauley POX A Python-based Openow Controller 2014[30] Christopher Monsanto Joshua Reich Nate Foster Jennifer Rexford David Walker

et al Composing Software Dened Networks In NSDI pages 1ndash13 2013[31] Masoud Moshref Minlan Yu Ramesh Govindan and Amin Vahdat DREAM

dynamic resource allocation for software-dened measurement In Proceedingsof the 2014 ACM conference on SIGCOMM pages 419ndash430 ACM 2014

[32] Tim Nelson Andrew D Ferguson Michael JG Scheer and Shriram KrishnamurthiTierless Programming and Reasoning for Software-Dened Networks NSDIApr 2014

[33] Vern Paxson Bro A System for Detecting Network Intruders in Real-timeComput Netw 31(23-24)2435ndash2463 December 1999

[34] Martin Roesch et al Snort Lightweight Intrusion Detection for Networks InLISA volume 99 pages 229ndash238 1999

[35] Vyas Sekar Michael K Reiter and Hui Zhang Revisiting the case for a minimalistapproach for network ow monitoring In Proceedings of the 10th ACM SIGCOMM

conference on Internet measurement pages 328ndash341 ACM 2010[36] David Senecal Slow DoS on the rise httpsblogsakamaicom201309

slow-dos-on-the-risehtml[37] Robin Sommer Matthias Vallentin Lorenzo De Carli and Vern Paxson HILTI

An Abstract Execution Environment for Deep Stateful Network Trac AnalysisIn Proceedings of the 2014 Conference on Internet Measurement Conference IMCrsquo14 pages 461ndash474 New York NY USA 2014 ACM

[38] Andreas Voellmy Junchang Wang Y Richard Yang Bryan Ford and Paul HudakMaple Simplifying SDN Programming using Algorithmic Policies In Proceedingsof the ACM SIGCOMM 2013 conference on SIGCOMM pages 87ndash98 ACM 2013

[39] Mea Wang Baochun Li and Zongpeng Li sFlow Towards resource-ecient andagile service federation in service overlay networks In Distributed ComputingSystems 2004 Proceedings 24th International Conference on pages 628ndash635 IEEE2004

[40] Minlan Yu Lavanya Jose and Rui Miao Software Dened Trac Measurementwith OpenSketch In NSDI volume 13 pages 29ndash42 2013

[41] Lihua Yuan Chen-Nee Chuah and Prasant Mohapatra ProgME Towards Pro-grammable Network Measurement IEEEACM Transactions on Networking (TON)19(1)115ndash128 2011

  • Abstract
  • 1 Introduction
  • 2 Overview
    • 21 Motivating Example
    • 22 Alternative Approaches
      • 3 The NetQRE Language
        • 31 Pattern Matching over Streams
        • 32 Conditional Expressions
        • 33 Stream Split
        • 34 Stream Iteration
        • 35 Aggregation over Parameters
        • 36 Stream Composition
          • 4 Use Cases
            • 41 Flow-level Traffic Measurements
            • 42 TCP State Monitoring
            • 43 Application-level Monitoring
              • 5 Compilation Algorithm
                • 51 Compilation of PSRE
                • 52 Compilation of split
                • 53 Compilation of iter
                • 54 Compilation of Aggregation
                • 55 Compilation of Stream Composition
                  • 6 Implementation
                  • 7 Evaluation
                    • 71 Expressiveness
                    • 72 Performance
                    • 73 End-to-end Validation
                    • 74 Summary of Evaluation
                      • 8 Discussion
                      • 9 Related Work
                      • 10 Conclusion
                      • References
Page 10: Quantitative Network Monitoring with NetQREalur/Sigcomm17.pdf · 2017. 6. 26. · In network management today, ... Thau Loo. 2017. Quantitative Network Monitoring with NetQRE. In

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

NetQRERuntime The NetQRE runtime includes a packet captureagent implemented using the pcap library [28] Each packet thatarrives at the runtime is parsed and then processed by the compiledNetQRE program as shown in Figure 1 Currently our runtimesupports actions that include sending alert events to a controllerand directly installing a rule on a switch The NetQRE runtime isnot specically optimized for fast packet capture and processingand as future work we plan to explore the use of DPDK [24] Ourcurrent implementation analyzes every packet though orthogonalsampling techniques can be also explored in future

7 EVALUATIONWe evaluate our NetQRE prototype centered around answeringthree key questions First can the NetQRE language express a widerange of quantitative network monitoring applications in a conciseand intuitive manner Second is the generated code ecient interms of throughput and memory footprint Third can NetQRE beused in a real-time monitoring setting that provides rapid mitigationbased on output of quantitative network monitoring applicationsUnless otherwise specied all our experiments are carried out ona cluster of commodity servers each of which has 10-core 26GHzXeon E5-2660V3 Each core has a 256KB L2 cache and shared 25MB L3 cache The memory size is 64 GB The OS is Ubuntu 1404and the kernel version is 420

71 ExpressivenessWe have implemented a set of quantitative network monitoring ap-plications using NetQRE The examples are drawn from a literaturesurvey on quantitative network monitoring applications [13 18 3540 41] Table 1 summarizes the example applications in NetQREthat we use as well as and the lines of code of each NetQRE pro-gram

LoCHeavy Hitter (sect41) 6Super Spreader (sect41) 2Entropy Estimation [40] 6Flow size dist [18] 8Trac change detection [35] 10Count trac [40] 2Completed ows (sect42) 6SYN ood detection (sect42) 9Slowloris detection (sect42) 12Lifetime of connection 8Newly opened connection recently 11 duplicated ACKs 5 VoIP call 7VoIP usage (sect43) 18Key word counting in emails 11DNS tunnel detection [12] 4DNS amplication [20] 4

Table 1 Examplemonitoring applicationsNetQRE supports

We make the following observations First NetQRE is able toexpress a large variety of quantitative network monitoring applica-tions ranging from ow-level trac measurement to application-level quantitative monitoring We validate the correctness of ourapplications by running them and comparing their output withhand-crafted implementations Second programming in NetQRE isconcise All example programs can be specied within 18 lines ofcode in NetQRE This count includes commonly used lter functionsas well as regular expressions which can be built into a library forreference As an interesting comparison we encoded the VoIP calluse case in Bro [33] a well-known intrusion detection system Brorequired 51 lines of code as compared to only 7 in NetQRE How-ever Bro is unable to easily support the VoIP usage case written inNetQRE In particular Bro cannot use patterns to separate packetsinto dierent VoIP sessions for the purposes of counting bandwidthconsumption per session This is not surprising as Brorsquos primaryuse is that of an intrusion detection system We validated the cor-rectness of our implementation on actual SIP trac by comparingBrorsquos output against NetQRErsquos

72 PerformanceOur next set of experiments evaluate the performance of NetQRErsquoscompiled implementations NetQRE runs on a single machine basedon the setup described earlier We use a single core to measureNetQRErsquos performance As a point of comparison we compare eachNetQRE program against an equivalent carefully hand-crafted C++implementation We note that all our hand-crafted implementationsrequire at least 100 lines of code and often times require users toexplicit manage internal state across packets ndash a programming taskthat NetQRE abstracts away

Fig 7 shows the performance of NetQRE implementations (NetQREin the gure) compared with manually optimized implementations(Baseline in the gure) that we spent days optimizing Our examplesinclude heavy hitter super spreader entropy SYN ood detectioncompleted ows count and Slowloris use cases presented in sect4 Asworkload we use a one-minute CAIDA trac trace [7] capturedon a high-speed Internet backbone link which contains 37 millionpackets (ie about 620k packetssec) We replay the trac traceto the implementations Since the trac trace does not containpayloads we report the throughput in million packets per second(MPPS)

We make the following observations First the NetQRE imple-mentations are able to achieve line-rate processing speed withoutbeing the bottleneck In particular for three out of six applicationsthe throughput of NetQRE implementations is about 20MPPS usinga single thread The average packet size of our input trac traceis 888B For a 10Gbps network 14M packets of size 888B can beforwarded per second which is well below the throughput achievedby NetQRE The throughput can be further improved by paralleliza-tion (presented later in this section) Second the compiled NetQREimplementation incurs negligible overhead compared with the man-ually optimized low-level implementation The dierence betweenthe throughput of the compiled NetQRE implementation and thatof the manual implementation is within 9 across all use cases westudied Third NetQRE incurs slightly increase (up to 60) in terms

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

05

10152025

Thro

ughp

ut (M

PPS)

Baseline NetQRE OpenSketch

(a) Throughput

1

10

100

1000

Mem

ory

(MB

) Baseline NetQRE OpenSketch

(b) Memory

Figure 7 Performance comparison

0

10

20

30

40

0 2 4 6 8

Thro

ughp

ut (M

PPS)

Number13 of13 Threads

super spreadersyn floodslowloris

Figure 8 Throughput with paral-lelization

of memory footprint (Fig 7b) compared with the manually opti-mized implementation However the overall memory footprint ofNetQRE remains low We have also run the experiments on anotherCAIDA trac [2] and the comparison result is similarComparison with OpenSketch In addition to comparisons withour manually optimized code we also compare with OpenSketch [40]a popular trac measurement tool based on sketching techniquesGiven OpenSketchrsquos limitation in handling stateful monitoringtasks we only compare the performance of NetQRE implementa-tions on the heavy hitter and super spreader examples with thatof OpenSketch (in software) We use the same CAIDA trac traceas before and follow the default setting of the reference code ofOpenSketch [4] for the two examples The NetQRE compiled imple-mentations signicantly outperform OpenSketch implementationsThe throughput of NetQRE compiled code is 11times and 18times as thatof OpenSketch (Fig 7a) As OpenSketch focuses on optimizingmemory usage it is not surprising that OpenSketch uses less mem-ory than NetQRE implementations Nevertheless we observe thatNetQRE uses only 11 more memory on the super spreader examplethan that of OpenSketch (Fig 7b) Note that we use an open-sourceversion of OpenSketch software that runs on a similar x86 machineas NetQRE in order to have an ldquoapples-to-applesrdquo comparison onsimilar hardware NetQRE may also exploit similar optimizationsperformed by OpenSketch in order to get higher performance ona NetFPGA Exploiting a hardware platform such as NetFPGA inorder to further improve NetQRE performance is an interestingavenue for future workComparison with Bro We further compare with Bro Given thelimitations of Bro in handling the complete VoIP use cases we sim-plify the use case to simply counting the number of VoIP calls (asopposed to bandwidth consumption) per user NetQRE compiledimplementation can nish counting within 1 second while Brotakes about 23 seconds for a SIP trac trace containing 4338 VoIPcalls Both programs output the same results for this use case Webelieve there are at least two reasons for Brorsquos slower performanceFirst Bro is designed for intrusion detection and not aimed atand optimized for monitoring tasks Second unlike NetQRE whichcompiles high-level code to ecient low-level code Bro uses aninterpreter to execute the script written in Brorsquos language whichcould result in considerable overheadParallelization speedup NetQRE compiler automatically com-piles parallelized implementations as described in sect6 In the con-sidered examples parallelization involves sending ows of packets

to dierent NetQRE instances running on dierent threads Fig 8plots the throughput of the NetQRE implementations with dierentnumbers of threads for the super spreader syn ood and slowlorisdetection examples In all examples the NetQRE parallelized codeis able to achieve at least 39times speedup with 8 threads Note thatlinear speedup is not achieved as the trac (in our nite trace) isnot uniformly distributed across all 8 threads This is an artifactof the trace data itself When we include the overhead of the soft-ware load balancer that dispatches packet ows to dierent threadsthe NetQRE implementations achieve at least 26times speedup for allexamples This overhead can be mitigated with a hardware load-balancer Overall we observe that NetQRErsquos throughput is morethan sucient to handle line-rate trac and the NetQRE runtimeitself is not the bottleneck

73 End-to-end ValidationIn the nal set of experiments we validate the end-to-end use ofNetQRE for enforcing quantitative network policies which appliesactions to quantitative network monitoring programs We set upa network of two clients C1 and C2 one server S and one SDNswitch that mirrors trac to a NetQRE runtime running the SYNood detector presented in sect42 In addition a NetQRE controllerbased on POX [29] controls the switch The network is emulatedusing Mininet [25] with link bandwidth set to 100Mbps

In the experimentC1 sends normal trac to S using iperf at rate1Mbps and at the 7th second C2 starts SYN ood attack generatedusing our generator to the same server The attack is detected byNetQRE which generates an alert event to the controller whichsubsequently installs a forwarding rule on the switch to block tracfrom C2 To illustrate the attack detection and blocking Figure 9a(left gure) shows the bandwidth utilization (Mbps) on the serverWe observe that NetQRE successfully blocks C2rsquos attacking tracin real-time

As a second experiment using a similar setup our NetQRE heavyhitter program (sect41) running at a switch-side NetQRE runtimemonitors the trac over a sliding window of 5 seconds and issuesan alert to the controller when detecting a heavy hitter whichfurther blocks the trac As a point of comparison we compareour approach against two alternatives (1) sending all packets to thecontroller which runs an equivalent heavy hitter detection program(2) monitoring the ow counters on the switch to detect heavyhitters as considered in other languages [30] and systems [31] Inthe second alternative the controller reads the counter every 1

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

0 2 4 6 8 10 12 14 16Time (sec)

0

1

2

3

4

5

6B

andw

idth

(Mbp

s)NetQRE

(a) SYN ood

0 2 4 6 8 10 12 14 16Time (sec)

0

2

4

6

8

10

12

14

Ban

dwid

th (M

bps)

NetQRE

stats

forward

(b) Heavy hitter

0 20 40 60 80 100Time (sec)

0

1

2

3

4

5

6

7

Ban

dwid

th (M

bps)

NetQRE

(c) VoIP

Figure 9 Bandwidth utilization (Mbps) at the server

second and compute the bandwidth usage for each ow over asimilar 5 seconds window

Figure 9b (middle gure) plots the bandwidth utilization of theserver The above alternative solutions are labeled as forward andstats respectively Compared with these two approaches our pro-posed approach can detect the heavy hitter and further respond toit in a more timely fashion Moreover compared with the forwardapproach we send signicantly fewer trac to the controller andis a more scalable solution

In our nal experiment we validate the VoIP use case rst pre-sented in our introduction We use a synthetic VoIP trac tracegenerated using SIPp [5] replay a VoIP call (with accompanyingH264 MPEG video) made from a single user via client C2 We re-play the trac at 5Mbps and run iperf on C1 as the backgroundtrac Our NetQRE program enforces a policy that blocks a callafter the user making the call exceeds a SIP bandwidth usage of1875MB (around 30 seconds) We congure the NetQRE programto send an alert to the NetQRE controller which then blocks thetrac Figure 9c (right gure) plots the bandwidth utilization of theserver Again this veries that NetQRE can be used to intercept aSIP session monitor usage on a per-user basis and react to highusage in a timely fashion

74 Summary of EvaluationRevisiting the questions at the beginning of our evaluation weobserve that (1) NetQRE can express a wide range of quantita-tive monitoring applications with a few lines of code (2) achievesline-rate throughput performance which is comparable to that ofcarefully hand-crafted optimized low-level code while signicantlyoutperforms other measurement and IDS tools and (3) can be usedin an SDN setting to monitor network trac and update switchesin real-time in response to specied NetQRE policies

8 DISCUSSIONIn this section we discuss some of NetQRErsquos limitations and alsosome future work

Expressiveness As a language focusing on monitoring policyNetQRE cannot specify general network policies such as servicechaining policies and trac engineering policies As an interesting

future work it might be possible to extend NetQRErsquos actions insupport of more complex policies

Given NetQRErsquos basis on regular expressions NetQRE is notsuited for monitoring tasks that can not be specied using regulartrac patterns Though this is a fundamental limitation of NetQRErsquosexpressiveness we believe that the high-level constructs in NetQREprovides a natural programming abstraction to write tasks withhierarchical regular structures and NetQRE can specify a widerange of monitoring tasks as shown in sect4

Handling packet lossreorderingretransmission For the perfor-mance consideration NetQRE does not buer packets in the streambut processes packets in a streaming fashion Thus if packets inthe stream are lost reordered or retransmitted the query speciedin NetQRE might not be evaluated correctly since the internal statemay transition to a wrong one Current design of NetQRE relies onthe runtime to handle packet loss reordering and retransmission

Network-widemonitoring For the similar reasons discussed abovecurrent deployment model of NetQRE is based on a centralize fash-ion in order to receive all (ordered) packet in the stream of interestFor example a ow counting task has to be deployed at a placewhere all packets in a ow can traverse This is not a fundamentallimitation as one can easily deploy NetQRE in a distributed fashionprovided that the query is insensitive to packet reordering in thestream We leave it as future work to study the decomposition of acentralized NetQRE program into distributed queries

9 RELATEDWORKQuantitative regular expressions NetQRE is inspired by quan-titative regular expressions (QRE) [9] which provides a theoreticalfoundation for regular programming of data streams NetQRE ex-tends QRE in the following ways First NetQRE introduces param-eters in support of queries handling unknown values (eg countingdistinct sources) Second NetQRE proposes aggregation operatorsto compute aggregated values across parameters It is shown insect4 that both extensions are essential to specify a wide range ofinteresting network monitoring policies

StreamQRE [27] is another recently proposed extension to QRENetQRE is dierent from StreamQRE in several aspects First NetQREis specialized to networking applications while StreamQRE focuseson database application Second StreamQRE allows relations as

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

types and supports relational operations such as join and key-basedpartitioning These operations cannot be evaluated eciently in astreaming fashion in general NetQRE instead uses parameters andaggregation over parameters with a focus on ecient evaluationIntrusion detection systems and protocol analyzers Thereare a number of key dierences between NetQRE and intrusiondetection systems (IDS) and protocol analyzers First systems suchas Snort [34] allow regular expression matches on single packetsbut is not perform regular expression matches that span multiplepackets let alone track sessions across packets While Bro [33] sup-ports aspects of NetQRE the scripting language requires complexprocedural programming where one has to maintain states anddata structures as opposed to NetQRErsquos declarative programmingapproach with high-level constructs over packet streams As anIDS Bro does not lend itself naturally to handle some quantitativemonitoring use cases eg the VoIP usage example While HILTI [37]uses Bro and oers abstract machine models for trac analysis itrequires the operator to implement low-level state machinesDomain specic languages in networking There are severalrecent proposal on domain specic languages on networks whichincludes Frenetic [21] Pyretic [30] NetKAT [10] Flowlog [32]Maple [38] Network Datalog [26] To the best of our knowledgenone of these systems support monitoring queries beyond basicow-level counters provided by OpenFlow

Recently proposed SNAP [11] and Sonata [23] oer abstrac-tions for trac monitoring over packet streams but in a morelow-level procedural or SQL-like way NetQRE oers a complemen-tary abstraction on quantitative queries It may be interesting toincorporate NetQRE with these languages in support for complexquantitative queriesStreaming database languagesDatabase-style query languages [1416] provide SQL-like language support for running continuousqueries over data streams While there are constructs to do aggrega-tion over sliding windows they are designed with simple relationalqueries over packet headers or packet counts and cannot handlethe complex queries based on trac patterns that NetQRE supports

10 CONCLUSIONThis paper presents NetQRE a high-level declarative language andpractical toolkit for quantitative network monitoring NetQRE of-fers an intuitive programming model using regular expressionsover streams of packets and can naturally express ow-level aswell as application-level quantitative policies We developed a com-pilation algorithm that can compile NetQRE programs into ecientimperative programs Our evaluation results demonstrate the ex-pressiveness of NetQRE in support a wide range of quantitativenetwork monitoring applications its ability to match carefully opti-mized manually written low level code and outperform equivalentimplements in Bro and OpenSketch A proof-of-concept scenariowith an SDN controller and switches demonstrate NetQRErsquos abilityto support timely and ecient enforcement of quantitative networkpolicies

ACKNOWLEDGEMENTWe would like to thank our shepherd Arjun Guha for his helpfulcomments We also want to thank the anonymous reviewers for

their insightful feedbacks on this work This research was partiallysupported by NSF Expeditions in Computing award CCF 1138996NSF CNS-1513679 and NSF CNS-1218066

REFERENCES[1] Application Layer Packet Classier for Linux httpwwwmcafeecomus

productsnetwork-security-platformaspx[2] CAIDA Trac Trace httpsdatacaidaorgdatasets securityddos-20070804 [3] McAfee Network Security Platform http l7-ltersourceforgenet [4] OpenSketch reference code httpsgithubcomUSC-NSLopensketch[5] SIPp http sippsourceforgenet [6] SSL renegotiation DoS httpswwwietforgmail-archivewebtlscurrent

msg07553html[7] Anonymized 2015 Internet Traces httpsdatacaidaorgdatasetspassive-2015

2015[8] Mohammad Al-Fares Sivasankar Radhakrishnan Barath Raghavan Nelson

Huang and Amin Vahdat Hedera Dynamic Flow Scheduling for Data CenterNetworks In NSDI volume 10 pages 19ndash19 2010

[9] Rajeev Alur Dana Fisman and Mukund Raghothaman Regular Programmingfor Quantitative Properties of Data Streams In 25th European Symposium onProgramming ESOP 2016

[10] Carolyn Jane Anderson Nate Foster Arjun Guha Jean-Baptiste Jeannin DexterKozen Cole Schlesinger and David Walker NetKAT Semantic foundations fornetworks In Proceedings of the 41st annual ACM SIGPLAN-SIGACT symposiumon Principles of programming languages pages 113ndash126 ACM 2014

[11] Mina Tahmasbi Arashloo Yaron Koral Michael Greenberg Jennifer Rexford andDavid Walker Snap Stateful network-wide abstractions for packet processingIn Proceedings of the 2016 ACM SIGCOMM Conference SIGCOMM rsquo16 pages29ndash43 New York NY USA 2016 ACM

[12] Kevin Borders Jonathan Springer and Matthew Burnside Chimera A declarativelanguage for streaming network trac analysis In Proceedings of the 21st USENIXConference on Security Symposium Securityrsquo12 pages 19ndash19 Berkeley CA USA2012 USENIX Association

[13] Varun Chandola Arindam Banerjee and Vipin Kumar Anomaly Detection ASurvey ACM computing surveys (CSUR) 41(3)15 2009

[14] Sirish Chandrasekaran Owen Cooper Amol Deshpande Michael J FranklinJoseph M Hellerstein Wei Hong Sailesh Krishnamurthy Samuel R MaddenFred Reiss and Mehul A Shah TelegraphCQ Continuous Dataow ProcessingIn Proceedings of the 2003 ACM SIGMOD International Conference on Managementof Data SIGMOD rsquo03 pages 668ndash668 New York NY USA 2003 ACM

[15] Benoit Claise Cisco systems NetFlow services export version 9 2004[16] Chuck Cranor Theodore Johnson Oliver Spataschek and Vladislav Shkapenyuk

Gigascope A Stream Database for Network Applications In Proceedings of the2003 ACM SIGMOD International Conference on Management of Data SIGMODrsquo03 pages 647ndash651 New York NY USA 2003 ACM

[17] Luca Deri Open source VoIP trac monitoring In Proceedings of SANE volume2006 2006

[18] Nick Dueld Carsten Lund and Mikkel Thorup Estimating Flow Distributionsfrom Sampled Flow Statistics In Proceedings of the 2003 conference on Applicationstechnologies architectures and protocols for computer communications pages 325ndash336 ACM 2003

[19] Cristian Estan and George Varghese New Directions in Trac Measurement andAccounting volume 32 ACM 2002

[20] Seyed K Fayaz Yoshiaki Tobioka Vyas Sekar and Michael Bailey BohateiFlexible and Elastic DDoS Defense In 24th USENIX Security Symposium (USENIXSecurity 15) pages 817ndash832 Washington DC August 2015 USENIX Association

[21] Nate Foster Rob Harrison Michael J Freedman Christopher Monsanto JenniferRexford Alec Story and David Walker Frenetic A Network ProgrammingLanguage In ACM SIGPLAN Notices volume 46 pages 279ndash291 ACM 2011

[22] Pedro Garcia-Teodoro J Diaz-Verdejo Gabriel Maciaacute-Fernaacutendez and EnriqueVaacutezquez Anomaly-based Network Intrusion Detection Techniques Systemsand Challenges computers amp security 28(1)18ndash28 2009

[23] Arpit Gupta Ruumldiger Birkner Marco Canini Nick Feamster Chris Mac-Stokerand Walter Willinger Network Monitoring As a Streaming Analytics ProblemIn Proceedings of the 15th ACM Workshop on Hot Topics in Networks HotNets rsquo16pages 106ndash112 ACM 2016

[24] DPDK Intel Data Plane Development Kit httpdpdkorg[25] Bob Lantz Brandon Heller and Nick McKeown A Network in a Laptop Rapid

Prototyping for Software-dened Networks In Proceedings of the 9th ACMSIGCOMM Workshop on Hot Topics in Networks Hotnets-IX pages 191ndash196ACM 2010

[26] Boon Thau Loo Tyson Condie Minos Garofalakis David E Gay Joseph MHellerstein Petros Maniatis Raghu Ramakrishnan Timothy Roscoe and IonStoica Declarative Networking CACM 2009

[27] Konstantinos Mamouras Mukund Raghotaman Rajeev Alur Zachary G Ivesand Sanjeev Khanna StreamQRE Modular Specication and Ecient Evaluation

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

of Quantitative Queries over Streaming Data In Proceedings of the 38th ACMSIGPLANConference on Programming Language Design and Implementation ACM2017

[28] Steve McCanne Craig Leres and Van Jacobson Libpcap httpwwwtcpdumporg1989

[29] J Mccauley POX A Python-based Openow Controller 2014[30] Christopher Monsanto Joshua Reich Nate Foster Jennifer Rexford David Walker

et al Composing Software Dened Networks In NSDI pages 1ndash13 2013[31] Masoud Moshref Minlan Yu Ramesh Govindan and Amin Vahdat DREAM

dynamic resource allocation for software-dened measurement In Proceedingsof the 2014 ACM conference on SIGCOMM pages 419ndash430 ACM 2014

[32] Tim Nelson Andrew D Ferguson Michael JG Scheer and Shriram KrishnamurthiTierless Programming and Reasoning for Software-Dened Networks NSDIApr 2014

[33] Vern Paxson Bro A System for Detecting Network Intruders in Real-timeComput Netw 31(23-24)2435ndash2463 December 1999

[34] Martin Roesch et al Snort Lightweight Intrusion Detection for Networks InLISA volume 99 pages 229ndash238 1999

[35] Vyas Sekar Michael K Reiter and Hui Zhang Revisiting the case for a minimalistapproach for network ow monitoring In Proceedings of the 10th ACM SIGCOMM

conference on Internet measurement pages 328ndash341 ACM 2010[36] David Senecal Slow DoS on the rise httpsblogsakamaicom201309

slow-dos-on-the-risehtml[37] Robin Sommer Matthias Vallentin Lorenzo De Carli and Vern Paxson HILTI

An Abstract Execution Environment for Deep Stateful Network Trac AnalysisIn Proceedings of the 2014 Conference on Internet Measurement Conference IMCrsquo14 pages 461ndash474 New York NY USA 2014 ACM

[38] Andreas Voellmy Junchang Wang Y Richard Yang Bryan Ford and Paul HudakMaple Simplifying SDN Programming using Algorithmic Policies In Proceedingsof the ACM SIGCOMM 2013 conference on SIGCOMM pages 87ndash98 ACM 2013

[39] Mea Wang Baochun Li and Zongpeng Li sFlow Towards resource-ecient andagile service federation in service overlay networks In Distributed ComputingSystems 2004 Proceedings 24th International Conference on pages 628ndash635 IEEE2004

[40] Minlan Yu Lavanya Jose and Rui Miao Software Dened Trac Measurementwith OpenSketch In NSDI volume 13 pages 29ndash42 2013

[41] Lihua Yuan Chen-Nee Chuah and Prasant Mohapatra ProgME Towards Pro-grammable Network Measurement IEEEACM Transactions on Networking (TON)19(1)115ndash128 2011

  • Abstract
  • 1 Introduction
  • 2 Overview
    • 21 Motivating Example
    • 22 Alternative Approaches
      • 3 The NetQRE Language
        • 31 Pattern Matching over Streams
        • 32 Conditional Expressions
        • 33 Stream Split
        • 34 Stream Iteration
        • 35 Aggregation over Parameters
        • 36 Stream Composition
          • 4 Use Cases
            • 41 Flow-level Traffic Measurements
            • 42 TCP State Monitoring
            • 43 Application-level Monitoring
              • 5 Compilation Algorithm
                • 51 Compilation of PSRE
                • 52 Compilation of split
                • 53 Compilation of iter
                • 54 Compilation of Aggregation
                • 55 Compilation of Stream Composition
                  • 6 Implementation
                  • 7 Evaluation
                    • 71 Expressiveness
                    • 72 Performance
                    • 73 End-to-end Validation
                    • 74 Summary of Evaluation
                      • 8 Discussion
                      • 9 Related Work
                      • 10 Conclusion
                      • References
Page 11: Quantitative Network Monitoring with NetQREalur/Sigcomm17.pdf · 2017. 6. 26. · In network management today, ... Thau Loo. 2017. Quantitative Network Monitoring with NetQRE. In

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

05

10152025

Thro

ughp

ut (M

PPS)

Baseline NetQRE OpenSketch

(a) Throughput

1

10

100

1000

Mem

ory

(MB

) Baseline NetQRE OpenSketch

(b) Memory

Figure 7 Performance comparison

0

10

20

30

40

0 2 4 6 8

Thro

ughp

ut (M

PPS)

Number13 of13 Threads

super spreadersyn floodslowloris

Figure 8 Throughput with paral-lelization

of memory footprint (Fig 7b) compared with the manually opti-mized implementation However the overall memory footprint ofNetQRE remains low We have also run the experiments on anotherCAIDA trac [2] and the comparison result is similarComparison with OpenSketch In addition to comparisons withour manually optimized code we also compare with OpenSketch [40]a popular trac measurement tool based on sketching techniquesGiven OpenSketchrsquos limitation in handling stateful monitoringtasks we only compare the performance of NetQRE implementa-tions on the heavy hitter and super spreader examples with thatof OpenSketch (in software) We use the same CAIDA trac traceas before and follow the default setting of the reference code ofOpenSketch [4] for the two examples The NetQRE compiled imple-mentations signicantly outperform OpenSketch implementationsThe throughput of NetQRE compiled code is 11times and 18times as thatof OpenSketch (Fig 7a) As OpenSketch focuses on optimizingmemory usage it is not surprising that OpenSketch uses less mem-ory than NetQRE implementations Nevertheless we observe thatNetQRE uses only 11 more memory on the super spreader examplethan that of OpenSketch (Fig 7b) Note that we use an open-sourceversion of OpenSketch software that runs on a similar x86 machineas NetQRE in order to have an ldquoapples-to-applesrdquo comparison onsimilar hardware NetQRE may also exploit similar optimizationsperformed by OpenSketch in order to get higher performance ona NetFPGA Exploiting a hardware platform such as NetFPGA inorder to further improve NetQRE performance is an interestingavenue for future workComparison with Bro We further compare with Bro Given thelimitations of Bro in handling the complete VoIP use cases we sim-plify the use case to simply counting the number of VoIP calls (asopposed to bandwidth consumption) per user NetQRE compiledimplementation can nish counting within 1 second while Brotakes about 23 seconds for a SIP trac trace containing 4338 VoIPcalls Both programs output the same results for this use case Webelieve there are at least two reasons for Brorsquos slower performanceFirst Bro is designed for intrusion detection and not aimed atand optimized for monitoring tasks Second unlike NetQRE whichcompiles high-level code to ecient low-level code Bro uses aninterpreter to execute the script written in Brorsquos language whichcould result in considerable overheadParallelization speedup NetQRE compiler automatically com-piles parallelized implementations as described in sect6 In the con-sidered examples parallelization involves sending ows of packets

to dierent NetQRE instances running on dierent threads Fig 8plots the throughput of the NetQRE implementations with dierentnumbers of threads for the super spreader syn ood and slowlorisdetection examples In all examples the NetQRE parallelized codeis able to achieve at least 39times speedup with 8 threads Note thatlinear speedup is not achieved as the trac (in our nite trace) isnot uniformly distributed across all 8 threads This is an artifactof the trace data itself When we include the overhead of the soft-ware load balancer that dispatches packet ows to dierent threadsthe NetQRE implementations achieve at least 26times speedup for allexamples This overhead can be mitigated with a hardware load-balancer Overall we observe that NetQRErsquos throughput is morethan sucient to handle line-rate trac and the NetQRE runtimeitself is not the bottleneck

73 End-to-end ValidationIn the nal set of experiments we validate the end-to-end use ofNetQRE for enforcing quantitative network policies which appliesactions to quantitative network monitoring programs We set upa network of two clients C1 and C2 one server S and one SDNswitch that mirrors trac to a NetQRE runtime running the SYNood detector presented in sect42 In addition a NetQRE controllerbased on POX [29] controls the switch The network is emulatedusing Mininet [25] with link bandwidth set to 100Mbps

In the experimentC1 sends normal trac to S using iperf at rate1Mbps and at the 7th second C2 starts SYN ood attack generatedusing our generator to the same server The attack is detected byNetQRE which generates an alert event to the controller whichsubsequently installs a forwarding rule on the switch to block tracfrom C2 To illustrate the attack detection and blocking Figure 9a(left gure) shows the bandwidth utilization (Mbps) on the serverWe observe that NetQRE successfully blocks C2rsquos attacking tracin real-time

As a second experiment using a similar setup our NetQRE heavyhitter program (sect41) running at a switch-side NetQRE runtimemonitors the trac over a sliding window of 5 seconds and issuesan alert to the controller when detecting a heavy hitter whichfurther blocks the trac As a point of comparison we compareour approach against two alternatives (1) sending all packets to thecontroller which runs an equivalent heavy hitter detection program(2) monitoring the ow counters on the switch to detect heavyhitters as considered in other languages [30] and systems [31] Inthe second alternative the controller reads the counter every 1

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

0 2 4 6 8 10 12 14 16Time (sec)

0

1

2

3

4

5

6B

andw

idth

(Mbp

s)NetQRE

(a) SYN ood

0 2 4 6 8 10 12 14 16Time (sec)

0

2

4

6

8

10

12

14

Ban

dwid

th (M

bps)

NetQRE

stats

forward

(b) Heavy hitter

0 20 40 60 80 100Time (sec)

0

1

2

3

4

5

6

7

Ban

dwid

th (M

bps)

NetQRE

(c) VoIP

Figure 9 Bandwidth utilization (Mbps) at the server

second and compute the bandwidth usage for each ow over asimilar 5 seconds window

Figure 9b (middle gure) plots the bandwidth utilization of theserver The above alternative solutions are labeled as forward andstats respectively Compared with these two approaches our pro-posed approach can detect the heavy hitter and further respond toit in a more timely fashion Moreover compared with the forwardapproach we send signicantly fewer trac to the controller andis a more scalable solution

In our nal experiment we validate the VoIP use case rst pre-sented in our introduction We use a synthetic VoIP trac tracegenerated using SIPp [5] replay a VoIP call (with accompanyingH264 MPEG video) made from a single user via client C2 We re-play the trac at 5Mbps and run iperf on C1 as the backgroundtrac Our NetQRE program enforces a policy that blocks a callafter the user making the call exceeds a SIP bandwidth usage of1875MB (around 30 seconds) We congure the NetQRE programto send an alert to the NetQRE controller which then blocks thetrac Figure 9c (right gure) plots the bandwidth utilization of theserver Again this veries that NetQRE can be used to intercept aSIP session monitor usage on a per-user basis and react to highusage in a timely fashion

74 Summary of EvaluationRevisiting the questions at the beginning of our evaluation weobserve that (1) NetQRE can express a wide range of quantita-tive monitoring applications with a few lines of code (2) achievesline-rate throughput performance which is comparable to that ofcarefully hand-crafted optimized low-level code while signicantlyoutperforms other measurement and IDS tools and (3) can be usedin an SDN setting to monitor network trac and update switchesin real-time in response to specied NetQRE policies

8 DISCUSSIONIn this section we discuss some of NetQRErsquos limitations and alsosome future work

Expressiveness As a language focusing on monitoring policyNetQRE cannot specify general network policies such as servicechaining policies and trac engineering policies As an interesting

future work it might be possible to extend NetQRErsquos actions insupport of more complex policies

Given NetQRErsquos basis on regular expressions NetQRE is notsuited for monitoring tasks that can not be specied using regulartrac patterns Though this is a fundamental limitation of NetQRErsquosexpressiveness we believe that the high-level constructs in NetQREprovides a natural programming abstraction to write tasks withhierarchical regular structures and NetQRE can specify a widerange of monitoring tasks as shown in sect4

Handling packet lossreorderingretransmission For the perfor-mance consideration NetQRE does not buer packets in the streambut processes packets in a streaming fashion Thus if packets inthe stream are lost reordered or retransmitted the query speciedin NetQRE might not be evaluated correctly since the internal statemay transition to a wrong one Current design of NetQRE relies onthe runtime to handle packet loss reordering and retransmission

Network-widemonitoring For the similar reasons discussed abovecurrent deployment model of NetQRE is based on a centralize fash-ion in order to receive all (ordered) packet in the stream of interestFor example a ow counting task has to be deployed at a placewhere all packets in a ow can traverse This is not a fundamentallimitation as one can easily deploy NetQRE in a distributed fashionprovided that the query is insensitive to packet reordering in thestream We leave it as future work to study the decomposition of acentralized NetQRE program into distributed queries

9 RELATEDWORKQuantitative regular expressions NetQRE is inspired by quan-titative regular expressions (QRE) [9] which provides a theoreticalfoundation for regular programming of data streams NetQRE ex-tends QRE in the following ways First NetQRE introduces param-eters in support of queries handling unknown values (eg countingdistinct sources) Second NetQRE proposes aggregation operatorsto compute aggregated values across parameters It is shown insect4 that both extensions are essential to specify a wide range ofinteresting network monitoring policies

StreamQRE [27] is another recently proposed extension to QRENetQRE is dierent from StreamQRE in several aspects First NetQREis specialized to networking applications while StreamQRE focuseson database application Second StreamQRE allows relations as

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

types and supports relational operations such as join and key-basedpartitioning These operations cannot be evaluated eciently in astreaming fashion in general NetQRE instead uses parameters andaggregation over parameters with a focus on ecient evaluationIntrusion detection systems and protocol analyzers Thereare a number of key dierences between NetQRE and intrusiondetection systems (IDS) and protocol analyzers First systems suchas Snort [34] allow regular expression matches on single packetsbut is not perform regular expression matches that span multiplepackets let alone track sessions across packets While Bro [33] sup-ports aspects of NetQRE the scripting language requires complexprocedural programming where one has to maintain states anddata structures as opposed to NetQRErsquos declarative programmingapproach with high-level constructs over packet streams As anIDS Bro does not lend itself naturally to handle some quantitativemonitoring use cases eg the VoIP usage example While HILTI [37]uses Bro and oers abstract machine models for trac analysis itrequires the operator to implement low-level state machinesDomain specic languages in networking There are severalrecent proposal on domain specic languages on networks whichincludes Frenetic [21] Pyretic [30] NetKAT [10] Flowlog [32]Maple [38] Network Datalog [26] To the best of our knowledgenone of these systems support monitoring queries beyond basicow-level counters provided by OpenFlow

Recently proposed SNAP [11] and Sonata [23] oer abstrac-tions for trac monitoring over packet streams but in a morelow-level procedural or SQL-like way NetQRE oers a complemen-tary abstraction on quantitative queries It may be interesting toincorporate NetQRE with these languages in support for complexquantitative queriesStreaming database languagesDatabase-style query languages [1416] provide SQL-like language support for running continuousqueries over data streams While there are constructs to do aggrega-tion over sliding windows they are designed with simple relationalqueries over packet headers or packet counts and cannot handlethe complex queries based on trac patterns that NetQRE supports

10 CONCLUSIONThis paper presents NetQRE a high-level declarative language andpractical toolkit for quantitative network monitoring NetQRE of-fers an intuitive programming model using regular expressionsover streams of packets and can naturally express ow-level aswell as application-level quantitative policies We developed a com-pilation algorithm that can compile NetQRE programs into ecientimperative programs Our evaluation results demonstrate the ex-pressiveness of NetQRE in support a wide range of quantitativenetwork monitoring applications its ability to match carefully opti-mized manually written low level code and outperform equivalentimplements in Bro and OpenSketch A proof-of-concept scenariowith an SDN controller and switches demonstrate NetQRErsquos abilityto support timely and ecient enforcement of quantitative networkpolicies

ACKNOWLEDGEMENTWe would like to thank our shepherd Arjun Guha for his helpfulcomments We also want to thank the anonymous reviewers for

their insightful feedbacks on this work This research was partiallysupported by NSF Expeditions in Computing award CCF 1138996NSF CNS-1513679 and NSF CNS-1218066

REFERENCES[1] Application Layer Packet Classier for Linux httpwwwmcafeecomus

productsnetwork-security-platformaspx[2] CAIDA Trac Trace httpsdatacaidaorgdatasets securityddos-20070804 [3] McAfee Network Security Platform http l7-ltersourceforgenet [4] OpenSketch reference code httpsgithubcomUSC-NSLopensketch[5] SIPp http sippsourceforgenet [6] SSL renegotiation DoS httpswwwietforgmail-archivewebtlscurrent

msg07553html[7] Anonymized 2015 Internet Traces httpsdatacaidaorgdatasetspassive-2015

2015[8] Mohammad Al-Fares Sivasankar Radhakrishnan Barath Raghavan Nelson

Huang and Amin Vahdat Hedera Dynamic Flow Scheduling for Data CenterNetworks In NSDI volume 10 pages 19ndash19 2010

[9] Rajeev Alur Dana Fisman and Mukund Raghothaman Regular Programmingfor Quantitative Properties of Data Streams In 25th European Symposium onProgramming ESOP 2016

[10] Carolyn Jane Anderson Nate Foster Arjun Guha Jean-Baptiste Jeannin DexterKozen Cole Schlesinger and David Walker NetKAT Semantic foundations fornetworks In Proceedings of the 41st annual ACM SIGPLAN-SIGACT symposiumon Principles of programming languages pages 113ndash126 ACM 2014

[11] Mina Tahmasbi Arashloo Yaron Koral Michael Greenberg Jennifer Rexford andDavid Walker Snap Stateful network-wide abstractions for packet processingIn Proceedings of the 2016 ACM SIGCOMM Conference SIGCOMM rsquo16 pages29ndash43 New York NY USA 2016 ACM

[12] Kevin Borders Jonathan Springer and Matthew Burnside Chimera A declarativelanguage for streaming network trac analysis In Proceedings of the 21st USENIXConference on Security Symposium Securityrsquo12 pages 19ndash19 Berkeley CA USA2012 USENIX Association

[13] Varun Chandola Arindam Banerjee and Vipin Kumar Anomaly Detection ASurvey ACM computing surveys (CSUR) 41(3)15 2009

[14] Sirish Chandrasekaran Owen Cooper Amol Deshpande Michael J FranklinJoseph M Hellerstein Wei Hong Sailesh Krishnamurthy Samuel R MaddenFred Reiss and Mehul A Shah TelegraphCQ Continuous Dataow ProcessingIn Proceedings of the 2003 ACM SIGMOD International Conference on Managementof Data SIGMOD rsquo03 pages 668ndash668 New York NY USA 2003 ACM

[15] Benoit Claise Cisco systems NetFlow services export version 9 2004[16] Chuck Cranor Theodore Johnson Oliver Spataschek and Vladislav Shkapenyuk

Gigascope A Stream Database for Network Applications In Proceedings of the2003 ACM SIGMOD International Conference on Management of Data SIGMODrsquo03 pages 647ndash651 New York NY USA 2003 ACM

[17] Luca Deri Open source VoIP trac monitoring In Proceedings of SANE volume2006 2006

[18] Nick Dueld Carsten Lund and Mikkel Thorup Estimating Flow Distributionsfrom Sampled Flow Statistics In Proceedings of the 2003 conference on Applicationstechnologies architectures and protocols for computer communications pages 325ndash336 ACM 2003

[19] Cristian Estan and George Varghese New Directions in Trac Measurement andAccounting volume 32 ACM 2002

[20] Seyed K Fayaz Yoshiaki Tobioka Vyas Sekar and Michael Bailey BohateiFlexible and Elastic DDoS Defense In 24th USENIX Security Symposium (USENIXSecurity 15) pages 817ndash832 Washington DC August 2015 USENIX Association

[21] Nate Foster Rob Harrison Michael J Freedman Christopher Monsanto JenniferRexford Alec Story and David Walker Frenetic A Network ProgrammingLanguage In ACM SIGPLAN Notices volume 46 pages 279ndash291 ACM 2011

[22] Pedro Garcia-Teodoro J Diaz-Verdejo Gabriel Maciaacute-Fernaacutendez and EnriqueVaacutezquez Anomaly-based Network Intrusion Detection Techniques Systemsand Challenges computers amp security 28(1)18ndash28 2009

[23] Arpit Gupta Ruumldiger Birkner Marco Canini Nick Feamster Chris Mac-Stokerand Walter Willinger Network Monitoring As a Streaming Analytics ProblemIn Proceedings of the 15th ACM Workshop on Hot Topics in Networks HotNets rsquo16pages 106ndash112 ACM 2016

[24] DPDK Intel Data Plane Development Kit httpdpdkorg[25] Bob Lantz Brandon Heller and Nick McKeown A Network in a Laptop Rapid

Prototyping for Software-dened Networks In Proceedings of the 9th ACMSIGCOMM Workshop on Hot Topics in Networks Hotnets-IX pages 191ndash196ACM 2010

[26] Boon Thau Loo Tyson Condie Minos Garofalakis David E Gay Joseph MHellerstein Petros Maniatis Raghu Ramakrishnan Timothy Roscoe and IonStoica Declarative Networking CACM 2009

[27] Konstantinos Mamouras Mukund Raghotaman Rajeev Alur Zachary G Ivesand Sanjeev Khanna StreamQRE Modular Specication and Ecient Evaluation

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

of Quantitative Queries over Streaming Data In Proceedings of the 38th ACMSIGPLANConference on Programming Language Design and Implementation ACM2017

[28] Steve McCanne Craig Leres and Van Jacobson Libpcap httpwwwtcpdumporg1989

[29] J Mccauley POX A Python-based Openow Controller 2014[30] Christopher Monsanto Joshua Reich Nate Foster Jennifer Rexford David Walker

et al Composing Software Dened Networks In NSDI pages 1ndash13 2013[31] Masoud Moshref Minlan Yu Ramesh Govindan and Amin Vahdat DREAM

dynamic resource allocation for software-dened measurement In Proceedingsof the 2014 ACM conference on SIGCOMM pages 419ndash430 ACM 2014

[32] Tim Nelson Andrew D Ferguson Michael JG Scheer and Shriram KrishnamurthiTierless Programming and Reasoning for Software-Dened Networks NSDIApr 2014

[33] Vern Paxson Bro A System for Detecting Network Intruders in Real-timeComput Netw 31(23-24)2435ndash2463 December 1999

[34] Martin Roesch et al Snort Lightweight Intrusion Detection for Networks InLISA volume 99 pages 229ndash238 1999

[35] Vyas Sekar Michael K Reiter and Hui Zhang Revisiting the case for a minimalistapproach for network ow monitoring In Proceedings of the 10th ACM SIGCOMM

conference on Internet measurement pages 328ndash341 ACM 2010[36] David Senecal Slow DoS on the rise httpsblogsakamaicom201309

slow-dos-on-the-risehtml[37] Robin Sommer Matthias Vallentin Lorenzo De Carli and Vern Paxson HILTI

An Abstract Execution Environment for Deep Stateful Network Trac AnalysisIn Proceedings of the 2014 Conference on Internet Measurement Conference IMCrsquo14 pages 461ndash474 New York NY USA 2014 ACM

[38] Andreas Voellmy Junchang Wang Y Richard Yang Bryan Ford and Paul HudakMaple Simplifying SDN Programming using Algorithmic Policies In Proceedingsof the ACM SIGCOMM 2013 conference on SIGCOMM pages 87ndash98 ACM 2013

[39] Mea Wang Baochun Li and Zongpeng Li sFlow Towards resource-ecient andagile service federation in service overlay networks In Distributed ComputingSystems 2004 Proceedings 24th International Conference on pages 628ndash635 IEEE2004

[40] Minlan Yu Lavanya Jose and Rui Miao Software Dened Trac Measurementwith OpenSketch In NSDI volume 13 pages 29ndash42 2013

[41] Lihua Yuan Chen-Nee Chuah and Prasant Mohapatra ProgME Towards Pro-grammable Network Measurement IEEEACM Transactions on Networking (TON)19(1)115ndash128 2011

  • Abstract
  • 1 Introduction
  • 2 Overview
    • 21 Motivating Example
    • 22 Alternative Approaches
      • 3 The NetQRE Language
        • 31 Pattern Matching over Streams
        • 32 Conditional Expressions
        • 33 Stream Split
        • 34 Stream Iteration
        • 35 Aggregation over Parameters
        • 36 Stream Composition
          • 4 Use Cases
            • 41 Flow-level Traffic Measurements
            • 42 TCP State Monitoring
            • 43 Application-level Monitoring
              • 5 Compilation Algorithm
                • 51 Compilation of PSRE
                • 52 Compilation of split
                • 53 Compilation of iter
                • 54 Compilation of Aggregation
                • 55 Compilation of Stream Composition
                  • 6 Implementation
                  • 7 Evaluation
                    • 71 Expressiveness
                    • 72 Performance
                    • 73 End-to-end Validation
                    • 74 Summary of Evaluation
                      • 8 Discussion
                      • 9 Related Work
                      • 10 Conclusion
                      • References
Page 12: Quantitative Network Monitoring with NetQREalur/Sigcomm17.pdf · 2017. 6. 26. · In network management today, ... Thau Loo. 2017. Quantitative Network Monitoring with NetQRE. In

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

0 2 4 6 8 10 12 14 16Time (sec)

0

1

2

3

4

5

6B

andw

idth

(Mbp

s)NetQRE

(a) SYN ood

0 2 4 6 8 10 12 14 16Time (sec)

0

2

4

6

8

10

12

14

Ban

dwid

th (M

bps)

NetQRE

stats

forward

(b) Heavy hitter

0 20 40 60 80 100Time (sec)

0

1

2

3

4

5

6

7

Ban

dwid

th (M

bps)

NetQRE

(c) VoIP

Figure 9 Bandwidth utilization (Mbps) at the server

second and compute the bandwidth usage for each ow over asimilar 5 seconds window

Figure 9b (middle gure) plots the bandwidth utilization of theserver The above alternative solutions are labeled as forward andstats respectively Compared with these two approaches our pro-posed approach can detect the heavy hitter and further respond toit in a more timely fashion Moreover compared with the forwardapproach we send signicantly fewer trac to the controller andis a more scalable solution

In our nal experiment we validate the VoIP use case rst pre-sented in our introduction We use a synthetic VoIP trac tracegenerated using SIPp [5] replay a VoIP call (with accompanyingH264 MPEG video) made from a single user via client C2 We re-play the trac at 5Mbps and run iperf on C1 as the backgroundtrac Our NetQRE program enforces a policy that blocks a callafter the user making the call exceeds a SIP bandwidth usage of1875MB (around 30 seconds) We congure the NetQRE programto send an alert to the NetQRE controller which then blocks thetrac Figure 9c (right gure) plots the bandwidth utilization of theserver Again this veries that NetQRE can be used to intercept aSIP session monitor usage on a per-user basis and react to highusage in a timely fashion

74 Summary of EvaluationRevisiting the questions at the beginning of our evaluation weobserve that (1) NetQRE can express a wide range of quantita-tive monitoring applications with a few lines of code (2) achievesline-rate throughput performance which is comparable to that ofcarefully hand-crafted optimized low-level code while signicantlyoutperforms other measurement and IDS tools and (3) can be usedin an SDN setting to monitor network trac and update switchesin real-time in response to specied NetQRE policies

8 DISCUSSIONIn this section we discuss some of NetQRErsquos limitations and alsosome future work

Expressiveness As a language focusing on monitoring policyNetQRE cannot specify general network policies such as servicechaining policies and trac engineering policies As an interesting

future work it might be possible to extend NetQRErsquos actions insupport of more complex policies

Given NetQRErsquos basis on regular expressions NetQRE is notsuited for monitoring tasks that can not be specied using regulartrac patterns Though this is a fundamental limitation of NetQRErsquosexpressiveness we believe that the high-level constructs in NetQREprovides a natural programming abstraction to write tasks withhierarchical regular structures and NetQRE can specify a widerange of monitoring tasks as shown in sect4

Handling packet lossreorderingretransmission For the perfor-mance consideration NetQRE does not buer packets in the streambut processes packets in a streaming fashion Thus if packets inthe stream are lost reordered or retransmitted the query speciedin NetQRE might not be evaluated correctly since the internal statemay transition to a wrong one Current design of NetQRE relies onthe runtime to handle packet loss reordering and retransmission

Network-widemonitoring For the similar reasons discussed abovecurrent deployment model of NetQRE is based on a centralize fash-ion in order to receive all (ordered) packet in the stream of interestFor example a ow counting task has to be deployed at a placewhere all packets in a ow can traverse This is not a fundamentallimitation as one can easily deploy NetQRE in a distributed fashionprovided that the query is insensitive to packet reordering in thestream We leave it as future work to study the decomposition of acentralized NetQRE program into distributed queries

9 RELATEDWORKQuantitative regular expressions NetQRE is inspired by quan-titative regular expressions (QRE) [9] which provides a theoreticalfoundation for regular programming of data streams NetQRE ex-tends QRE in the following ways First NetQRE introduces param-eters in support of queries handling unknown values (eg countingdistinct sources) Second NetQRE proposes aggregation operatorsto compute aggregated values across parameters It is shown insect4 that both extensions are essential to specify a wide range ofinteresting network monitoring policies

StreamQRE [27] is another recently proposed extension to QRENetQRE is dierent from StreamQRE in several aspects First NetQREis specialized to networking applications while StreamQRE focuseson database application Second StreamQRE allows relations as

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

types and supports relational operations such as join and key-basedpartitioning These operations cannot be evaluated eciently in astreaming fashion in general NetQRE instead uses parameters andaggregation over parameters with a focus on ecient evaluationIntrusion detection systems and protocol analyzers Thereare a number of key dierences between NetQRE and intrusiondetection systems (IDS) and protocol analyzers First systems suchas Snort [34] allow regular expression matches on single packetsbut is not perform regular expression matches that span multiplepackets let alone track sessions across packets While Bro [33] sup-ports aspects of NetQRE the scripting language requires complexprocedural programming where one has to maintain states anddata structures as opposed to NetQRErsquos declarative programmingapproach with high-level constructs over packet streams As anIDS Bro does not lend itself naturally to handle some quantitativemonitoring use cases eg the VoIP usage example While HILTI [37]uses Bro and oers abstract machine models for trac analysis itrequires the operator to implement low-level state machinesDomain specic languages in networking There are severalrecent proposal on domain specic languages on networks whichincludes Frenetic [21] Pyretic [30] NetKAT [10] Flowlog [32]Maple [38] Network Datalog [26] To the best of our knowledgenone of these systems support monitoring queries beyond basicow-level counters provided by OpenFlow

Recently proposed SNAP [11] and Sonata [23] oer abstrac-tions for trac monitoring over packet streams but in a morelow-level procedural or SQL-like way NetQRE oers a complemen-tary abstraction on quantitative queries It may be interesting toincorporate NetQRE with these languages in support for complexquantitative queriesStreaming database languagesDatabase-style query languages [1416] provide SQL-like language support for running continuousqueries over data streams While there are constructs to do aggrega-tion over sliding windows they are designed with simple relationalqueries over packet headers or packet counts and cannot handlethe complex queries based on trac patterns that NetQRE supports

10 CONCLUSIONThis paper presents NetQRE a high-level declarative language andpractical toolkit for quantitative network monitoring NetQRE of-fers an intuitive programming model using regular expressionsover streams of packets and can naturally express ow-level aswell as application-level quantitative policies We developed a com-pilation algorithm that can compile NetQRE programs into ecientimperative programs Our evaluation results demonstrate the ex-pressiveness of NetQRE in support a wide range of quantitativenetwork monitoring applications its ability to match carefully opti-mized manually written low level code and outperform equivalentimplements in Bro and OpenSketch A proof-of-concept scenariowith an SDN controller and switches demonstrate NetQRErsquos abilityto support timely and ecient enforcement of quantitative networkpolicies

ACKNOWLEDGEMENTWe would like to thank our shepherd Arjun Guha for his helpfulcomments We also want to thank the anonymous reviewers for

their insightful feedbacks on this work This research was partiallysupported by NSF Expeditions in Computing award CCF 1138996NSF CNS-1513679 and NSF CNS-1218066

REFERENCES[1] Application Layer Packet Classier for Linux httpwwwmcafeecomus

productsnetwork-security-platformaspx[2] CAIDA Trac Trace httpsdatacaidaorgdatasets securityddos-20070804 [3] McAfee Network Security Platform http l7-ltersourceforgenet [4] OpenSketch reference code httpsgithubcomUSC-NSLopensketch[5] SIPp http sippsourceforgenet [6] SSL renegotiation DoS httpswwwietforgmail-archivewebtlscurrent

msg07553html[7] Anonymized 2015 Internet Traces httpsdatacaidaorgdatasetspassive-2015

2015[8] Mohammad Al-Fares Sivasankar Radhakrishnan Barath Raghavan Nelson

Huang and Amin Vahdat Hedera Dynamic Flow Scheduling for Data CenterNetworks In NSDI volume 10 pages 19ndash19 2010

[9] Rajeev Alur Dana Fisman and Mukund Raghothaman Regular Programmingfor Quantitative Properties of Data Streams In 25th European Symposium onProgramming ESOP 2016

[10] Carolyn Jane Anderson Nate Foster Arjun Guha Jean-Baptiste Jeannin DexterKozen Cole Schlesinger and David Walker NetKAT Semantic foundations fornetworks In Proceedings of the 41st annual ACM SIGPLAN-SIGACT symposiumon Principles of programming languages pages 113ndash126 ACM 2014

[11] Mina Tahmasbi Arashloo Yaron Koral Michael Greenberg Jennifer Rexford andDavid Walker Snap Stateful network-wide abstractions for packet processingIn Proceedings of the 2016 ACM SIGCOMM Conference SIGCOMM rsquo16 pages29ndash43 New York NY USA 2016 ACM

[12] Kevin Borders Jonathan Springer and Matthew Burnside Chimera A declarativelanguage for streaming network trac analysis In Proceedings of the 21st USENIXConference on Security Symposium Securityrsquo12 pages 19ndash19 Berkeley CA USA2012 USENIX Association

[13] Varun Chandola Arindam Banerjee and Vipin Kumar Anomaly Detection ASurvey ACM computing surveys (CSUR) 41(3)15 2009

[14] Sirish Chandrasekaran Owen Cooper Amol Deshpande Michael J FranklinJoseph M Hellerstein Wei Hong Sailesh Krishnamurthy Samuel R MaddenFred Reiss and Mehul A Shah TelegraphCQ Continuous Dataow ProcessingIn Proceedings of the 2003 ACM SIGMOD International Conference on Managementof Data SIGMOD rsquo03 pages 668ndash668 New York NY USA 2003 ACM

[15] Benoit Claise Cisco systems NetFlow services export version 9 2004[16] Chuck Cranor Theodore Johnson Oliver Spataschek and Vladislav Shkapenyuk

Gigascope A Stream Database for Network Applications In Proceedings of the2003 ACM SIGMOD International Conference on Management of Data SIGMODrsquo03 pages 647ndash651 New York NY USA 2003 ACM

[17] Luca Deri Open source VoIP trac monitoring In Proceedings of SANE volume2006 2006

[18] Nick Dueld Carsten Lund and Mikkel Thorup Estimating Flow Distributionsfrom Sampled Flow Statistics In Proceedings of the 2003 conference on Applicationstechnologies architectures and protocols for computer communications pages 325ndash336 ACM 2003

[19] Cristian Estan and George Varghese New Directions in Trac Measurement andAccounting volume 32 ACM 2002

[20] Seyed K Fayaz Yoshiaki Tobioka Vyas Sekar and Michael Bailey BohateiFlexible and Elastic DDoS Defense In 24th USENIX Security Symposium (USENIXSecurity 15) pages 817ndash832 Washington DC August 2015 USENIX Association

[21] Nate Foster Rob Harrison Michael J Freedman Christopher Monsanto JenniferRexford Alec Story and David Walker Frenetic A Network ProgrammingLanguage In ACM SIGPLAN Notices volume 46 pages 279ndash291 ACM 2011

[22] Pedro Garcia-Teodoro J Diaz-Verdejo Gabriel Maciaacute-Fernaacutendez and EnriqueVaacutezquez Anomaly-based Network Intrusion Detection Techniques Systemsand Challenges computers amp security 28(1)18ndash28 2009

[23] Arpit Gupta Ruumldiger Birkner Marco Canini Nick Feamster Chris Mac-Stokerand Walter Willinger Network Monitoring As a Streaming Analytics ProblemIn Proceedings of the 15th ACM Workshop on Hot Topics in Networks HotNets rsquo16pages 106ndash112 ACM 2016

[24] DPDK Intel Data Plane Development Kit httpdpdkorg[25] Bob Lantz Brandon Heller and Nick McKeown A Network in a Laptop Rapid

Prototyping for Software-dened Networks In Proceedings of the 9th ACMSIGCOMM Workshop on Hot Topics in Networks Hotnets-IX pages 191ndash196ACM 2010

[26] Boon Thau Loo Tyson Condie Minos Garofalakis David E Gay Joseph MHellerstein Petros Maniatis Raghu Ramakrishnan Timothy Roscoe and IonStoica Declarative Networking CACM 2009

[27] Konstantinos Mamouras Mukund Raghotaman Rajeev Alur Zachary G Ivesand Sanjeev Khanna StreamQRE Modular Specication and Ecient Evaluation

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

of Quantitative Queries over Streaming Data In Proceedings of the 38th ACMSIGPLANConference on Programming Language Design and Implementation ACM2017

[28] Steve McCanne Craig Leres and Van Jacobson Libpcap httpwwwtcpdumporg1989

[29] J Mccauley POX A Python-based Openow Controller 2014[30] Christopher Monsanto Joshua Reich Nate Foster Jennifer Rexford David Walker

et al Composing Software Dened Networks In NSDI pages 1ndash13 2013[31] Masoud Moshref Minlan Yu Ramesh Govindan and Amin Vahdat DREAM

dynamic resource allocation for software-dened measurement In Proceedingsof the 2014 ACM conference on SIGCOMM pages 419ndash430 ACM 2014

[32] Tim Nelson Andrew D Ferguson Michael JG Scheer and Shriram KrishnamurthiTierless Programming and Reasoning for Software-Dened Networks NSDIApr 2014

[33] Vern Paxson Bro A System for Detecting Network Intruders in Real-timeComput Netw 31(23-24)2435ndash2463 December 1999

[34] Martin Roesch et al Snort Lightweight Intrusion Detection for Networks InLISA volume 99 pages 229ndash238 1999

[35] Vyas Sekar Michael K Reiter and Hui Zhang Revisiting the case for a minimalistapproach for network ow monitoring In Proceedings of the 10th ACM SIGCOMM

conference on Internet measurement pages 328ndash341 ACM 2010[36] David Senecal Slow DoS on the rise httpsblogsakamaicom201309

slow-dos-on-the-risehtml[37] Robin Sommer Matthias Vallentin Lorenzo De Carli and Vern Paxson HILTI

An Abstract Execution Environment for Deep Stateful Network Trac AnalysisIn Proceedings of the 2014 Conference on Internet Measurement Conference IMCrsquo14 pages 461ndash474 New York NY USA 2014 ACM

[38] Andreas Voellmy Junchang Wang Y Richard Yang Bryan Ford and Paul HudakMaple Simplifying SDN Programming using Algorithmic Policies In Proceedingsof the ACM SIGCOMM 2013 conference on SIGCOMM pages 87ndash98 ACM 2013

[39] Mea Wang Baochun Li and Zongpeng Li sFlow Towards resource-ecient andagile service federation in service overlay networks In Distributed ComputingSystems 2004 Proceedings 24th International Conference on pages 628ndash635 IEEE2004

[40] Minlan Yu Lavanya Jose and Rui Miao Software Dened Trac Measurementwith OpenSketch In NSDI volume 13 pages 29ndash42 2013

[41] Lihua Yuan Chen-Nee Chuah and Prasant Mohapatra ProgME Towards Pro-grammable Network Measurement IEEEACM Transactions on Networking (TON)19(1)115ndash128 2011

  • Abstract
  • 1 Introduction
  • 2 Overview
    • 21 Motivating Example
    • 22 Alternative Approaches
      • 3 The NetQRE Language
        • 31 Pattern Matching over Streams
        • 32 Conditional Expressions
        • 33 Stream Split
        • 34 Stream Iteration
        • 35 Aggregation over Parameters
        • 36 Stream Composition
          • 4 Use Cases
            • 41 Flow-level Traffic Measurements
            • 42 TCP State Monitoring
            • 43 Application-level Monitoring
              • 5 Compilation Algorithm
                • 51 Compilation of PSRE
                • 52 Compilation of split
                • 53 Compilation of iter
                • 54 Compilation of Aggregation
                • 55 Compilation of Stream Composition
                  • 6 Implementation
                  • 7 Evaluation
                    • 71 Expressiveness
                    • 72 Performance
                    • 73 End-to-end Validation
                    • 74 Summary of Evaluation
                      • 8 Discussion
                      • 9 Related Work
                      • 10 Conclusion
                      • References
Page 13: Quantitative Network Monitoring with NetQREalur/Sigcomm17.pdf · 2017. 6. 26. · In network management today, ... Thau Loo. 2017. Quantitative Network Monitoring with NetQRE. In

antitative Network Monitoring with NetQRE SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA

types and supports relational operations such as join and key-basedpartitioning These operations cannot be evaluated eciently in astreaming fashion in general NetQRE instead uses parameters andaggregation over parameters with a focus on ecient evaluationIntrusion detection systems and protocol analyzers Thereare a number of key dierences between NetQRE and intrusiondetection systems (IDS) and protocol analyzers First systems suchas Snort [34] allow regular expression matches on single packetsbut is not perform regular expression matches that span multiplepackets let alone track sessions across packets While Bro [33] sup-ports aspects of NetQRE the scripting language requires complexprocedural programming where one has to maintain states anddata structures as opposed to NetQRErsquos declarative programmingapproach with high-level constructs over packet streams As anIDS Bro does not lend itself naturally to handle some quantitativemonitoring use cases eg the VoIP usage example While HILTI [37]uses Bro and oers abstract machine models for trac analysis itrequires the operator to implement low-level state machinesDomain specic languages in networking There are severalrecent proposal on domain specic languages on networks whichincludes Frenetic [21] Pyretic [30] NetKAT [10] Flowlog [32]Maple [38] Network Datalog [26] To the best of our knowledgenone of these systems support monitoring queries beyond basicow-level counters provided by OpenFlow

Recently proposed SNAP [11] and Sonata [23] oer abstrac-tions for trac monitoring over packet streams but in a morelow-level procedural or SQL-like way NetQRE oers a complemen-tary abstraction on quantitative queries It may be interesting toincorporate NetQRE with these languages in support for complexquantitative queriesStreaming database languagesDatabase-style query languages [1416] provide SQL-like language support for running continuousqueries over data streams While there are constructs to do aggrega-tion over sliding windows they are designed with simple relationalqueries over packet headers or packet counts and cannot handlethe complex queries based on trac patterns that NetQRE supports

10 CONCLUSIONThis paper presents NetQRE a high-level declarative language andpractical toolkit for quantitative network monitoring NetQRE of-fers an intuitive programming model using regular expressionsover streams of packets and can naturally express ow-level aswell as application-level quantitative policies We developed a com-pilation algorithm that can compile NetQRE programs into ecientimperative programs Our evaluation results demonstrate the ex-pressiveness of NetQRE in support a wide range of quantitativenetwork monitoring applications its ability to match carefully opti-mized manually written low level code and outperform equivalentimplements in Bro and OpenSketch A proof-of-concept scenariowith an SDN controller and switches demonstrate NetQRErsquos abilityto support timely and ecient enforcement of quantitative networkpolicies

ACKNOWLEDGEMENTWe would like to thank our shepherd Arjun Guha for his helpfulcomments We also want to thank the anonymous reviewers for

their insightful feedbacks on this work This research was partiallysupported by NSF Expeditions in Computing award CCF 1138996NSF CNS-1513679 and NSF CNS-1218066

REFERENCES[1] Application Layer Packet Classier for Linux httpwwwmcafeecomus

productsnetwork-security-platformaspx[2] CAIDA Trac Trace httpsdatacaidaorgdatasets securityddos-20070804 [3] McAfee Network Security Platform http l7-ltersourceforgenet [4] OpenSketch reference code httpsgithubcomUSC-NSLopensketch[5] SIPp http sippsourceforgenet [6] SSL renegotiation DoS httpswwwietforgmail-archivewebtlscurrent

msg07553html[7] Anonymized 2015 Internet Traces httpsdatacaidaorgdatasetspassive-2015

2015[8] Mohammad Al-Fares Sivasankar Radhakrishnan Barath Raghavan Nelson

Huang and Amin Vahdat Hedera Dynamic Flow Scheduling for Data CenterNetworks In NSDI volume 10 pages 19ndash19 2010

[9] Rajeev Alur Dana Fisman and Mukund Raghothaman Regular Programmingfor Quantitative Properties of Data Streams In 25th European Symposium onProgramming ESOP 2016

[10] Carolyn Jane Anderson Nate Foster Arjun Guha Jean-Baptiste Jeannin DexterKozen Cole Schlesinger and David Walker NetKAT Semantic foundations fornetworks In Proceedings of the 41st annual ACM SIGPLAN-SIGACT symposiumon Principles of programming languages pages 113ndash126 ACM 2014

[11] Mina Tahmasbi Arashloo Yaron Koral Michael Greenberg Jennifer Rexford andDavid Walker Snap Stateful network-wide abstractions for packet processingIn Proceedings of the 2016 ACM SIGCOMM Conference SIGCOMM rsquo16 pages29ndash43 New York NY USA 2016 ACM

[12] Kevin Borders Jonathan Springer and Matthew Burnside Chimera A declarativelanguage for streaming network trac analysis In Proceedings of the 21st USENIXConference on Security Symposium Securityrsquo12 pages 19ndash19 Berkeley CA USA2012 USENIX Association

[13] Varun Chandola Arindam Banerjee and Vipin Kumar Anomaly Detection ASurvey ACM computing surveys (CSUR) 41(3)15 2009

[14] Sirish Chandrasekaran Owen Cooper Amol Deshpande Michael J FranklinJoseph M Hellerstein Wei Hong Sailesh Krishnamurthy Samuel R MaddenFred Reiss and Mehul A Shah TelegraphCQ Continuous Dataow ProcessingIn Proceedings of the 2003 ACM SIGMOD International Conference on Managementof Data SIGMOD rsquo03 pages 668ndash668 New York NY USA 2003 ACM

[15] Benoit Claise Cisco systems NetFlow services export version 9 2004[16] Chuck Cranor Theodore Johnson Oliver Spataschek and Vladislav Shkapenyuk

Gigascope A Stream Database for Network Applications In Proceedings of the2003 ACM SIGMOD International Conference on Management of Data SIGMODrsquo03 pages 647ndash651 New York NY USA 2003 ACM

[17] Luca Deri Open source VoIP trac monitoring In Proceedings of SANE volume2006 2006

[18] Nick Dueld Carsten Lund and Mikkel Thorup Estimating Flow Distributionsfrom Sampled Flow Statistics In Proceedings of the 2003 conference on Applicationstechnologies architectures and protocols for computer communications pages 325ndash336 ACM 2003

[19] Cristian Estan and George Varghese New Directions in Trac Measurement andAccounting volume 32 ACM 2002

[20] Seyed K Fayaz Yoshiaki Tobioka Vyas Sekar and Michael Bailey BohateiFlexible and Elastic DDoS Defense In 24th USENIX Security Symposium (USENIXSecurity 15) pages 817ndash832 Washington DC August 2015 USENIX Association

[21] Nate Foster Rob Harrison Michael J Freedman Christopher Monsanto JenniferRexford Alec Story and David Walker Frenetic A Network ProgrammingLanguage In ACM SIGPLAN Notices volume 46 pages 279ndash291 ACM 2011

[22] Pedro Garcia-Teodoro J Diaz-Verdejo Gabriel Maciaacute-Fernaacutendez and EnriqueVaacutezquez Anomaly-based Network Intrusion Detection Techniques Systemsand Challenges computers amp security 28(1)18ndash28 2009

[23] Arpit Gupta Ruumldiger Birkner Marco Canini Nick Feamster Chris Mac-Stokerand Walter Willinger Network Monitoring As a Streaming Analytics ProblemIn Proceedings of the 15th ACM Workshop on Hot Topics in Networks HotNets rsquo16pages 106ndash112 ACM 2016

[24] DPDK Intel Data Plane Development Kit httpdpdkorg[25] Bob Lantz Brandon Heller and Nick McKeown A Network in a Laptop Rapid

Prototyping for Software-dened Networks In Proceedings of the 9th ACMSIGCOMM Workshop on Hot Topics in Networks Hotnets-IX pages 191ndash196ACM 2010

[26] Boon Thau Loo Tyson Condie Minos Garofalakis David E Gay Joseph MHellerstein Petros Maniatis Raghu Ramakrishnan Timothy Roscoe and IonStoica Declarative Networking CACM 2009

[27] Konstantinos Mamouras Mukund Raghotaman Rajeev Alur Zachary G Ivesand Sanjeev Khanna StreamQRE Modular Specication and Ecient Evaluation

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

of Quantitative Queries over Streaming Data In Proceedings of the 38th ACMSIGPLANConference on Programming Language Design and Implementation ACM2017

[28] Steve McCanne Craig Leres and Van Jacobson Libpcap httpwwwtcpdumporg1989

[29] J Mccauley POX A Python-based Openow Controller 2014[30] Christopher Monsanto Joshua Reich Nate Foster Jennifer Rexford David Walker

et al Composing Software Dened Networks In NSDI pages 1ndash13 2013[31] Masoud Moshref Minlan Yu Ramesh Govindan and Amin Vahdat DREAM

dynamic resource allocation for software-dened measurement In Proceedingsof the 2014 ACM conference on SIGCOMM pages 419ndash430 ACM 2014

[32] Tim Nelson Andrew D Ferguson Michael JG Scheer and Shriram KrishnamurthiTierless Programming and Reasoning for Software-Dened Networks NSDIApr 2014

[33] Vern Paxson Bro A System for Detecting Network Intruders in Real-timeComput Netw 31(23-24)2435ndash2463 December 1999

[34] Martin Roesch et al Snort Lightweight Intrusion Detection for Networks InLISA volume 99 pages 229ndash238 1999

[35] Vyas Sekar Michael K Reiter and Hui Zhang Revisiting the case for a minimalistapproach for network ow monitoring In Proceedings of the 10th ACM SIGCOMM

conference on Internet measurement pages 328ndash341 ACM 2010[36] David Senecal Slow DoS on the rise httpsblogsakamaicom201309

slow-dos-on-the-risehtml[37] Robin Sommer Matthias Vallentin Lorenzo De Carli and Vern Paxson HILTI

An Abstract Execution Environment for Deep Stateful Network Trac AnalysisIn Proceedings of the 2014 Conference on Internet Measurement Conference IMCrsquo14 pages 461ndash474 New York NY USA 2014 ACM

[38] Andreas Voellmy Junchang Wang Y Richard Yang Bryan Ford and Paul HudakMaple Simplifying SDN Programming using Algorithmic Policies In Proceedingsof the ACM SIGCOMM 2013 conference on SIGCOMM pages 87ndash98 ACM 2013

[39] Mea Wang Baochun Li and Zongpeng Li sFlow Towards resource-ecient andagile service federation in service overlay networks In Distributed ComputingSystems 2004 Proceedings 24th International Conference on pages 628ndash635 IEEE2004

[40] Minlan Yu Lavanya Jose and Rui Miao Software Dened Trac Measurementwith OpenSketch In NSDI volume 13 pages 29ndash42 2013

[41] Lihua Yuan Chen-Nee Chuah and Prasant Mohapatra ProgME Towards Pro-grammable Network Measurement IEEEACM Transactions on Networking (TON)19(1)115ndash128 2011

  • Abstract
  • 1 Introduction
  • 2 Overview
    • 21 Motivating Example
    • 22 Alternative Approaches
      • 3 The NetQRE Language
        • 31 Pattern Matching over Streams
        • 32 Conditional Expressions
        • 33 Stream Split
        • 34 Stream Iteration
        • 35 Aggregation over Parameters
        • 36 Stream Composition
          • 4 Use Cases
            • 41 Flow-level Traffic Measurements
            • 42 TCP State Monitoring
            • 43 Application-level Monitoring
              • 5 Compilation Algorithm
                • 51 Compilation of PSRE
                • 52 Compilation of split
                • 53 Compilation of iter
                • 54 Compilation of Aggregation
                • 55 Compilation of Stream Composition
                  • 6 Implementation
                  • 7 Evaluation
                    • 71 Expressiveness
                    • 72 Performance
                    • 73 End-to-end Validation
                    • 74 Summary of Evaluation
                      • 8 Discussion
                      • 9 Related Work
                      • 10 Conclusion
                      • References
Page 14: Quantitative Network Monitoring with NetQREalur/Sigcomm17.pdf · 2017. 6. 26. · In network management today, ... Thau Loo. 2017. Quantitative Network Monitoring with NetQRE. In

SIGCOMM rsquo17 August 21-25 2017 Los Angeles CA USA Y Yuan D Lin A Mishra S Marwaha R Alur and BT Loo

of Quantitative Queries over Streaming Data In Proceedings of the 38th ACMSIGPLANConference on Programming Language Design and Implementation ACM2017

[28] Steve McCanne Craig Leres and Van Jacobson Libpcap httpwwwtcpdumporg1989

[29] J Mccauley POX A Python-based Openow Controller 2014[30] Christopher Monsanto Joshua Reich Nate Foster Jennifer Rexford David Walker

et al Composing Software Dened Networks In NSDI pages 1ndash13 2013[31] Masoud Moshref Minlan Yu Ramesh Govindan and Amin Vahdat DREAM

dynamic resource allocation for software-dened measurement In Proceedingsof the 2014 ACM conference on SIGCOMM pages 419ndash430 ACM 2014

[32] Tim Nelson Andrew D Ferguson Michael JG Scheer and Shriram KrishnamurthiTierless Programming and Reasoning for Software-Dened Networks NSDIApr 2014

[33] Vern Paxson Bro A System for Detecting Network Intruders in Real-timeComput Netw 31(23-24)2435ndash2463 December 1999

[34] Martin Roesch et al Snort Lightweight Intrusion Detection for Networks InLISA volume 99 pages 229ndash238 1999

[35] Vyas Sekar Michael K Reiter and Hui Zhang Revisiting the case for a minimalistapproach for network ow monitoring In Proceedings of the 10th ACM SIGCOMM

conference on Internet measurement pages 328ndash341 ACM 2010[36] David Senecal Slow DoS on the rise httpsblogsakamaicom201309

slow-dos-on-the-risehtml[37] Robin Sommer Matthias Vallentin Lorenzo De Carli and Vern Paxson HILTI

An Abstract Execution Environment for Deep Stateful Network Trac AnalysisIn Proceedings of the 2014 Conference on Internet Measurement Conference IMCrsquo14 pages 461ndash474 New York NY USA 2014 ACM

[38] Andreas Voellmy Junchang Wang Y Richard Yang Bryan Ford and Paul HudakMaple Simplifying SDN Programming using Algorithmic Policies In Proceedingsof the ACM SIGCOMM 2013 conference on SIGCOMM pages 87ndash98 ACM 2013

[39] Mea Wang Baochun Li and Zongpeng Li sFlow Towards resource-ecient andagile service federation in service overlay networks In Distributed ComputingSystems 2004 Proceedings 24th International Conference on pages 628ndash635 IEEE2004

[40] Minlan Yu Lavanya Jose and Rui Miao Software Dened Trac Measurementwith OpenSketch In NSDI volume 13 pages 29ndash42 2013

[41] Lihua Yuan Chen-Nee Chuah and Prasant Mohapatra ProgME Towards Pro-grammable Network Measurement IEEEACM Transactions on Networking (TON)19(1)115ndash128 2011

  • Abstract
  • 1 Introduction
  • 2 Overview
    • 21 Motivating Example
    • 22 Alternative Approaches
      • 3 The NetQRE Language
        • 31 Pattern Matching over Streams
        • 32 Conditional Expressions
        • 33 Stream Split
        • 34 Stream Iteration
        • 35 Aggregation over Parameters
        • 36 Stream Composition
          • 4 Use Cases
            • 41 Flow-level Traffic Measurements
            • 42 TCP State Monitoring
            • 43 Application-level Monitoring
              • 5 Compilation Algorithm
                • 51 Compilation of PSRE
                • 52 Compilation of split
                • 53 Compilation of iter
                • 54 Compilation of Aggregation
                • 55 Compilation of Stream Composition
                  • 6 Implementation
                  • 7 Evaluation
                    • 71 Expressiveness
                    • 72 Performance
                    • 73 End-to-end Validation
                    • 74 Summary of Evaluation
                      • 8 Discussion
                      • 9 Related Work
                      • 10 Conclusion
                      • References