the agilent protocol encoder (ape) · the agilent protocol encoder (ape) 3. to be tailored to the...

The Agilent Protocol Encoder (APE)

KEVIN MITCHELL

Agilent Laboratories

The Agilent Protocol Encoder (APE) was designed as a test-bench to explore various aspects of protocol specification. In

addition to designing a domain-specific language for specifying communication protocols, the project built an integrated devel-

opment environment (IDE) to assist in the construction and debugging of such specifications. The concept of action-directeddecoding was developed to allow a range of different decoders to be constructed from the same specification. The compiler

can generate code for the APE virtual machine (APE VM), with an instruction set tailored to the task of protocol decoding.

A firmware implementation of this VM has also been developed, along with a more traditional back-end that generates C++code. This paper summarizes the main contributions of the APE project. A companion report describes the design of the APE

virtual machine and its implementation in firmware.

Key Words and Phrases: Domain-specific languages, protocols, integrated development environments, action-directed decoding

ACKNOWLEDGMENTS

A project of this size is a team effort. The author would particularly like to thank Tony Kirkham who, in addition

to being responsible for the APE virtual machine, contributed to many other areas of the system. Bill Huynh did a

great job of merging the APE debugger into the NetBeans debugging framework. Graham Pollock and Ji Liang also

contributed to the early stages of this work. The project was managed by Kathy Graham.

1. INTRODUCTION

The original purpose of the APE project was to develop

• a new language for specifying communications protocols

• a tool chain to support the manipulation of specifications written in this language

• translators into high-quality decoders and encoders, both in software and firmware.

This paper describes the APE language, the motivation behind its design, the principle achievements ofthe project, and the areas that have been identified for future exploration. But first, let’s start at thebeginning. . .

protocol. A set of formal rules describing how to transmit data, especially across a network. Lowlevel protocols define the electrical and physical standards to be observed, bit- and byte-orderingand the transmission and error detection and correction of the bit stream. High level protocolsdeal with the data formatting, including the syntax of messages, character sets, sequencing ofmessages etc.

Protocol messages are typically structured in a hierarchical fashion, but the communication links on whichthey are transmitted are linear in nature. The protocol specification defines how such messages are encodedinto a linear stream, and decoded back into the original representation by the receiver.

Author’s address: K. Mitchell, Agilent Labs Scotland, South Queensferry, Scotland EH30 9TGc© 2009 Agilent Technologies

Agilent Restricted Agilent Technical Report, No. AGL-2009-1, May 2009, Pages 1–46.

“Teflon,” if used herein, means “fluoropolymer” or “PTFE.” Such use was improvident, as Teflon® is a registered trademark of E.I. du Pont de Nemours and Co. in the US and elsewhere used as a brand of fluoropolymer additive, coating or resin and not a finished product.

simmerma

Typewritten Text

simmerma

Typewritten Text

simmerma

Typewritten Text

simmerma

Typewritten Text

simmerma

Typewritten Text

simmerma

Typewritten Text

simmerma

Typewritten Text

simmerma

Typewritten Text

simmerma

Typewritten Text

simmerma

Typewritten Text

simmerma

Typewritten Text

Aug.30, 2013 - Article is no longer restricted

simmerma

Typewritten Text

simmerma

Typewritten Text

simmerma

Typewritten Text

simmerma

Typewritten Text

simmerma

Typewritten Text

simmerma

Typewritten Text

2 · Kevin Mitchell

Some protocols, such as SMTP[Klensin 2001] and SIP[Rosenberg et al. 2002], are text-based. They aredistinguished by being human-readable, and in many cases free-format in the sense that white-space is notsignificant. The task of decoding such protocols, at least at the syntactic level, is very similar to that ofparsing a conventional programming language. Existing compilation tools, such as YACC[Johnson 1979],Bison[Donnelly and Stallman 2006] and ANTLR[Parr 2007], can be readily adapted to the decoding task.Other protocols are XML-based, with SOAP[Gudgin et al. 2007] being a good example of this style. Manytools exist for manipulating XML documents, and these can be used for manipulating XML-based protocols.Where space is important, for example, when transmitting data across a link where you wish to conservebandwidth, a binary protocol is usually more appropriate. Such protocols attempt to compress the datainto a small number of bits, and are usually not human-readable without mechanical assistance. Bit-levelprotocols are widespread, ranging from the various IP stacks used in the Internet to the commonly usedencodings for ASN.1-based protocols. The APE is intended to address the problem of describing, encodingand decoding bit-level protocols.

2. BACKGROUND

In the beginning protocol decoders were typically written by hand. Conceptually this is quite a simple task,but it is time-consuming and error-prone. Over time people developed support libraries to assist in this task.The stucture of the code naturally mimicked the structure of the protocol being decoded, and it quicklybecame apparent that this task could be mechanized if the protocols were specified in a machine-readablefashion. Over the years Agilent has developed a number of such tools including PDL, PAL[Protocol ToolsTeam 2002], Honda[Long 2002], PDU Builder and MSG Util. Given this long list of alternatives, whydevelop yet another language? The majority of the examples cited are procedural in nature. They describehow to decode a protocol, rather than simply describing the abstract structure of the protocol’s data units.This tends to lead to overly verbose specifications where the user is forced to specify details that could beinferred by a compiler. Furthermore, generating encoders from such specifications can also be problematic.One of the initial goals of the APE project was to develop a more declarative way of specifying protocols.This should enable the same specification to be used for both encoding and decoding. By constructing adomain-specific language for the task, coupled with a compiler incorporating techniques like type inferenceand generic dispatch, the resulting specifications should be more concise and flexible than their proceduralcounterparts.

Writing the protocol specification is only part of the story. Most people develop programs using interactivedevelopment environments. These IDEs provide a number of advantages over traditional command-linetools. These advantages include on-the-fly parsing to detect syntax errors, language-aware editors thatperform code-folding and syntax highlighting, powerful navigator windows to quickly navigate around thedifferent declarations within a file, and between files, language-specific wizards to automatcally generatecode, integrated debuggers, with source-level debugging, and so on. Unfortunately, when developing protocolspecifications many of the existing tools require the users to go back to the old command-line style of programdevelopment. All the advantages of using an IDE are equally applicable to the task of protocol development;we just need an IDE that supports this task.

Traditional tools typically adopt a one-size-fits-all approach to decoding. Their decoding actions arelargely independent of the intended use of the decoder. But decoding a stream of Ethernet packets todetermine the mix of protocols encapsulated within these packets potentially involves a lot less effort thandecoding the same packet stream to display in something like a WireShark browser[Lamping et al. 2008].Another goal of the APE project was to develop techniques to allow the decoders produced by the compilerAgilent Technical Report, No. AGL-2009-1, May 2009. Agilent Restricted

The Agilent Protocol Encoder (APE) · 3

to be tailored to the task at hand.

3. LANGUAGE

In this report we give the reader a flavour of the APE language that should be sufficient to understand themain ideas behind the design. A more detailed description of the language can be found on the main APEweb site[Mitchell and Kirkham 2009].

3.1 PacketTypes

The original inspiration for the APE language was the work done on PacketTypes by McCann and Chandraat Bell Labs in 1999-2000[McCann and Chandra 2000]. Their language was both expressive and concise,althought this conciseness often led to ambiguities within the specification. The language had the followingsalient features:

• Packet descriptions are expressed as types• The fundamental operation on packets is checking their membership in a type.• Layering of protocols is expressed as successive specialization on types.• Refinement of types creates new types, a facility useful for packet classification.

The starting point for the PacketTypes work was the C programming language. The hand-written decodersthat this work intended to replace were written in C, and so the typical protocol implementor was alreadyfamiliar with this language. Building on the notation for C structures was therefore a natural first step.However, the type system of a language such as C is not expressive enough to capture all of the structuraldependencies in a typical bit-level PDU. For example, in the case of an IP PDU we might start by defining

nybble := bit[4];

short := bit[16];

long := bit[32];

IP_PDU := {

nybble version;

nybble ihl;

byte tos;

short totallength;

...

byte protocol;

short cksum;

long src;

long dest;

ip_option options[];

bytestring payload;

} ...

This type can be interpreted as imposing a structure on packets, but without additional constraints it allowsmany sequences of bits that are not valid IP packets. The necessary constraints appear in a where clausefollowing the sequence, as in:

IP_PDU := {

...

} where {

version#value = 0x04;

options#numbytes = ihl#value * 4 - 20;

payload#numbytes = totallength#value - ihl#value * 4;

}

Agilent Restricted Agilent Technical Report, No. AGL-2009-1, May 2009.

4 · Kevin Mitchell

The overlay...with refinement constraint allows us to merge two type specifications by embedding onewithin the other, as is done when one protocol is encapsulated within another, lower-layer protocol. Forexample, the specification of the encapsulation of a User Datagram Protocol (UDP) frame within an IPpacket may be specified as:

UDPinIP :> IP_PDU where {

protocol#value = 17;

overlay payload with UDP_PDU;

}

assuming that the UDP_PDU type was appropriately specified elsewhere.The use of constraints leads to natural and concise specifications in many cases. However, the language

is also very ambiguous[Mitchell 2002]. One of our initial goals was to remedy some of the deficiencies of thelanguage, whilst retaining the flavour of this approach.

3.2 APE Packet Types

The philosophy underpinning the design of the APE language is similar to that introducted in PacketTypes,although the syntax is less influenced by the C language. So in the APE language the previous examplewould be written as follows.

IPv4 =

( header: (

version: UInt4<4>

ihl: UInt4

tos: Type_Of_Service

totallength: UInt16

...

protocol: UInt8 as IP_Protocol

cksum: Bit[16]

src: Bit[32]

dest: Bit[32]

options: IP_Option[]

) where num32 == ihl

payload: IP_Payload<header.protocol>

) where numBytes == header.totallength;

The details of this definition are not important at this point; we are simply trying to give the reader a flavourof the language. Just as in PacketTypes, protocols are described by types where the structure of each typemimics the structure of the PDU it is describing, and where constraints are used to resolve ambiguities. Thesimplest packet type is Bit which matches a single bit in the PDU. Compound types can be constructedusing sequential composition (indicated by juxtaposition), alternation, and iteration. In this respect thesyntax has a lot in common with regular expressions. But types can be named, and referenced from othertypes, allowing recursive types to be constructed. Subcomponents can also be named, and then referred toelsewhere within the type.

3.3 Constraints

Each type reference has a number of attributes associated with it. Given a type reference T the expressionT#attr denotes the value of the attr attribute of this reference. Type references are matched against bitsequences, and so the most frequently used attribute is size which indicates the number of bits that matched,or should match, against the reference. A constraint can restrict the value of this attribute to a ranges ofvalues, most typically a single integer, or can use the value of this attribute to constrain other parts ofthe specification. In our example the (user-defined) expression numBytes is interpreted as an abbreviationfor numBytes(this) which, in turn, is interpreted as idiv(this#size, 8). When used in the expressionnumBytes == header.totallength this has the effect of constraining the type to which the constraint isAgilent Technical Report, No. AGL-2009-1, May 2009. Agilent Restricted


attached to have a size which is an integral number of bytes, whose length (in bytes) is equal to the valueof the totallength field. This example also introduces another attribute, value. By default, the valueattribute for a type reference is the bit sequence against which the reference was matched. However, theuser can also specify an explicit value, as in

UInt8 = Bit[8] where value is unsigned;

Such a type matches eight bits, but the value of a reference to such a type is the unsigned integer representedby these bits. Other constraints include

address. the bit address of the start of the type reference in the PDU.count. the number of iterations in an iteration type referencechoice. the index of the alternation chosen in an alternation typeremaining. the number of bits remaining in the current context at the start of the type reference.

One way of viewing these constraints and attributes is as an attribute grammar[Knuth 1990]. We attemptto find a derivation tree from the grammar such that the fringe of the tree matches the bit sequenceconstituting the PDU being matched, the attributes of all nodes are defined by their positions within thetree, and where all the constraints over these attributes are satisfied. Whilst such a view encapsulates whatit means for a bit sequence to be accepted by such a specification, it doesn’t help us find valid derivation treesin an efficient fashion. The challenge, from a compilation viewpoint, is to find a deterministic decodingstrategy that will accept all bit sequences that satisfy the specification, and reject all those that don’t.

3.4 Generic Types

Consider trying to specify the IPv4 protocol. The type of the payload field clearly depends on the value ofthe protocol field. But this mapping from value to type has to be extensible to allow new protocols to beadded later. Ideally we should not have to alter the file containing the definition of the IPv4 protocol justbecause we want to handle a new IPv4-encapsulated protocol.

The approach adopted by the PacketTypes language was to use overlays. In our view that has a couple ofdisadvantages. First, in the original definition of the IP_PDU type there is no indication that the protocoland payload fields are linked in any way. It is only when the reader encounters the UDPinIP type that thislinkage becomes apparent. In addition, the overlay mechanism can be viewed as decoding a bit sequencetwice, the first time as a simple bit sequence, and then subsequently as a UDP frame, or whatever typeis imposed by the overlay. As we will discover later, the APE allows user-defined actions to be embeddedwithin our packet types. If we allowed some form of overlay mechanism it would significantly complicatethe definition of what actions get executed during a decode, and in what order.

In the design of the APE we decided to take our inspiration from the design of the multimethod-dispatchmechanism in languages such as Dylan[Shalit 1996] and Cecil[Chambers 1992]. We believe our approachmakes the relationship between value and type clear, and naturally supports extensible mappings, both atlinking and runtime. We start by defining

generic IP_Payload<protocol: Int>;

This declares IP_Payload as a generic type, i.e. a type whose definition depends on the value of theargument, an integer in this case. We then define

IPv4 =

( header:

( ...

protocol: UInt8

...

) where ...


) where ...;


6 · Kevin Mitchell

In this specification the structure of the payload field is constrained by the value of the protocol field.However, the mapping from integers to payload types is still not specified at this point. To define thismapping the language allows us to declare specializations of a generic type. These specializations constrainthe argument(s) of the generic to subtypes of the original argument type(s), and then provide a definitionof the type to associate with these arguments. We view primitive types such as Int as sets of values, andthen treat subtypes as subsets, and constants as singleton sets. With this interpretation we can now define

IP_Payload< 6> = TCP;

IP_Payload<17> = UDP;

and so on. Here we view 6 as an abbreviation for the set {6} which is a subset, and hence subtype, of theset/type Int. Although fairly meaningless in this particular example, we could also define specializationssuch as

IP_Payload<1000..2000> = T;

This would be used whenever the protocol value was in the range 1000..2000 and there were no otherspecializations that were more specific.

3.4.1 Most-specific specializers. Clearly for such an approach to be deterministic our specifications must bewritten in a way that for every possible argument to the generic type there must be a unique most-specificspecialization. Therefore if we defined

IP_Payload<30..50> = T1;


then we would have an ambiguity unless there was also a specialization such as


to deal with the overlap between the two ranges. If the protocol field had the value 41 then we wouldview the payload as having type T3. Why? Because 40.50 is a subtype, when types are viewed as sets, ofboth 30..50 and 40..60, and so is more specific than either of them. In the most common case where thespecializations are singleton values then it is easy to ensure unique most-specific specializers. But in morecomplex cases, involving ranges and multiple arguments, it can be more challenging to ensure uniqueness.

The APE language supports a simple module structure. Types are defined within the context of a module.An import declaration allows modules to be imported into other modules, and the corresponding exportdeclaration allows selected types to be visible outside the module. The development environment compileseach module independently. However, a decoder is generated for a project, which contains a set of modules.The specializations for each generic type can be spread across multiple modules within the project. Thedispatching code required to support each generic type can therefore not be generated on a per-modulebasis. The compiler invokes a linker phase after all the modules in a project have been compiled. Thelinker is reponsible for analysing all the available specializations for a generic, determining the most-specificspecializations, and generating dispatching code for the generic type.

There is one further complication that should be noted at this point. In some cases the mapping fromvalues to types may need to be modified at run-time. A good example is where dynamic port-mapping isused. An exchange of signalling messages may establish a relationship between an IP port and a particularprotocol. To support such dynamic behaviour the generic dispatch mechanism needs to allow the runtimesystem to alter its behaviour. A simple strategy is to attach a map from values to types (or pointers totheir runtime representations) to the dispatching code. The map is checked for entries first, and if no entryis found then the statically-compiled dispatching code is then executed as before. To avoid this overheadwhen there is no need to support dynamic modifications the generic type must be tagged with a dynamickeyword modifier when such behaviour is required.Agilent Technical Report, No. AGL-2009-1, May 2009. Agilent Restricted


3.5 Extensible Enumerations

In our previous example the value of the protocol field was an integer in the range 0..255, and the generictype IP_Payload also took an integer as argument. This is perhaps not as clear as we would like. A neaterapproach might be to define an enumeration type that allowed us to use symbolic names instead of constantssuch as 6 and 17 in our specifications. The protocol field could then be defined to provide values of thistype, and the generic could be specialized on this type.

The basic idea of the enumeration mechanism is similar to the enumerations in languages such as C++and Java, i.e. it defines a type/set with named values. However, where it differs is that in some cases we wantthe set of named values to be extensible. So, for example, in the IPv4 module we define an IP_Protocolenumeration. We could, at that point, specify all the possible values of this enumeration. But this wouldembed into this module some information about lots of other protocols, i.e. their IP protocol number. Nowin some cases this might be want you want. For example, it is natural to define all values for the IP serviceprecedence field within the IPv4 module, in which case you could define the type as1

datatype IP_Precedence = {

routine = 0,

priority = 1,

immediate = 2,

flash = 3,

flash_override = 4,

critic_ecp = 5,

internetwork_control = 6,

network_control = 7

};

Type_Of_Service = (

precedence: UInt<3> as IP_Precedence

delay: ...

...

);

This approach works well when all the elements of the enumeration can be specified at the point where thetype is first introduced. However, in the case of the protocol field an alternative approach might be to definethe type within the IPv4 module, but allow enumeration bindings to be added to it in other modules. Werefer to such declarations as extensible enumerations. We need a mechanism for defining which enumerationsare extensible, some way of constraining the range of permitted values, and a syntax for adding new bindingsto the enumeration. We deal with each of these in turn.

To indicate that an enumeration is extensible we simply attach an ellipsis to the end of the definition, asin

datatype IP_Protocol = {

ip=4,

...

};

In most cases we wish to constrain the permitted values of the enumeration, and we can use a constraint toperform this task.

datatype IP_Protocol = { ... } where in 0..255;

To add an additional binding to the enumeration we use the following syntax.

datatype IP_Protocol += { tcp = 6 };

1APE types form two distinct classes. Packet types are used to define the structure of protocol elements. The attribute values

range over a disjoint class of data types, including Int, Bool, and user-defined enumerations.


8 · Kevin Mitchell

Note that this definition can appear at the point where the TCP protocol is defined, for example in a TCPmodule. It does not have to be defined in the IPv4 module. When the compiler builds a decoder it collectstogether all the modules in the current project. Depending on the mix of modules chosen the set of bindingsfor each enumeration can potentially vary.

Now whether such an approach is preferable to the non-extensible version is partly a matter of taste. Doesthe knowledge that TCP is mapped to value 6 in the IPv4 protocol field “belong” to the IPv4 module, to theTCP module, or to an auxiliary TCP_over_IP module, for example? If a new protocol comes along that canalso be carried over IP, do we need to go back in and change the IPv4 module, or should we just be able toadd ourselves to this enumeration from the new module? We need to gain more experience of writing APEspecifications before coming to any firm conclusions about which approach works best in practice.

3.6 Higher-order types

Most APE types either take no parameters, or are parameterized on abstract types such as integers, Booleansor bit sequences. There are occasions, however, when we require a bit more flexibility. Consider the followingexamples:

TA = num: UInt8

values: A[num];

TB = num: UInt8

values: B[num];

TC = num: UInt8

values: C[num];

There is clearly a common pattern being repeated here. It might be clearer to define a single type parame-terized on the type being iterated over. We can write such a definition using a higher-order type, i.e. a typethat takes another type as parameter.

NV<T> = num: UInt8

values: T[num];

TA = NV<A>;

TB = NV<B>;

TC = NV<C>;

What if the type being passed as argument is also parameterized? The TLV type in the WiMAX specifica-tion[IEEE 2009] illustrates this need, and the APE syntax required to specify it.

WiMaxTLV< WiMaxTLV_Value<t:Int> > = (

type: UInt8

len: WiMaxTLV_Length

val: WiMaxTLV_Value<type> where size == (len * 8)

);

We can parameterize the WiMaxTLV type on any type that takes an integer as argument.

3.7 Coercions

In the APE we can coerce between different types using the e as T syntax, for example to explicitly coercefrom an Int value to an enumeration type. Normally the type T is a datatype. But we can also makesense of such a coercion in the case where e is a value of type Bits, and T is a packet type. Indeed youcan imagine the whole decoding process as starting with the execution of the expression e as T , where e isthe bit sequence representing the PDU and T is the root packet type to be used to decode the PDU. Theinteresting thing about this construct is that there is nothing stopping you evaluating such an expressionin the middle of another decode. Of course the challenge is then to define what such an expression does.Agilent Technical Report, No. AGL-2009-1, May 2009. Agilent Restricted


But let’s delay worrying about that for now, and just consider how we might use such an expression in aspecification.

Suppose we have a protocol consisting of a collection of blocks, where each block has some data andthen two “pointers” to other blocks. We assume the blocks can be in an arbitrary order, possibly with gapsbetween them, and with no distinguishing bit sequence at the start of each block. Furthermore, we assumethat the first block is at the start of the PDU. Decoding a packet containing an encapsulated TIFF file[Adobe1992] might require us to handle a layout similiar to this one. How can we express such a protocol in theAPE?

We define a new module, plus either import or define some basic types such as Byte and UInt8. We thenadd the following definitions to the module, where the type PDU is the root type.

Pointer<bs: Bits> =

(ptr: UInt8) where value = bs as T_at_offset<ptr, bs>;

T<bs: Bits> =

length : UInt8

payload : Byte[length]

left : Pointer<bs>

right : Pointer<bs>

;

T_at_offset<offset: Int, bs: Bits> =

prefix: Byte[offset]

t: T<bs>

suffix: Byte[]

;

PDU = (bs: Bit[]) where value = bs as T_at_offset<0, bs>;

What is going on here? We start by decoding the PDU as a sequence of bits. This decode will obviously bevery quick as we aren’t trying to impose any structure on the bit sequence. It will essentially just skip over allthe bits in the PDU in a single leap without processing them in any way. However, it also allows us to assignthe name bs to the PDU bit sequence. We then evaluate the expression bs as T_at_offset<0, bs>, whichwill result in the decode of this bit sequence against the type T_at_offset<0, bs> at some later point.

The type T_at_offset is intended to capture the structure of an instance of type T at some specified byteoffset in the PDU. It imposes the following structure on a bit sequence.

If the offset is 0, as it is in our case, the prefix will be empty, the type T will be matched against the startof the bit sequence, and the suffix will soak up any bits that remain after T has been decoded. The type Thas the following structure:


10 · Kevin Mitchell

The field left is of type Pointer. It matches against a byte, interprets this as an unsigned integer,and also evaluates an expression that will result in the decode of the original bit sequence against the typeT_at_offset<ptr, bs>. In other words we will decode another instance of T, but at a different offset inthe PDU to the original one. We could obviously decode a different type at this offset as well with a smallchange to the specification.

To summarize, the basic trick at the specification level is to use the bs as T expression to repeatedlyprocess a bit sequence to “follow” the pointers and decode each individual block in turn. At the level of thespecification the solution is reasonably concise. Ideally we would like to parameterize the types Pointerand T_at_offset on the type T so we can express this pattern once, and then reuse it in similar situationselsewhere. At present the APE compiler cannot cope with mutually-recursive types that contain typeparameters, and so we cannot do this.

Whilst, syntactically, we have a mechanism for describing ”non-linear” decoding problems, the challengeis to define the execution semantics of bs as T in a way that provides a satisfactory solution to the problemof building decoders from such a specification. Imagine a decode engine as having a queue of pending decodetasks, where a decode task consists of a bit sequence to be decoded, and a packet type to decode it against.We start by adding our PDU and root type as a new task and start the decoder engine running. When weencounter an expression such as bs as T we create a new task, add it to the queue, and then continue ourcurrent decode. When the current task is finished we pick the next task off the queue, if there is one, decodeit, and keep going until the queue is empty. To avoid loops we record which tasks have been completed,and skip a task if it is a duplicate of a previously queued task. “Duplicate” in this context means the samebit sequence, the same packet type, and identical arguments to this type. Each new task would result ina new decode tree being added to the browser at the top-level of the decode tree display, as described inSection 4. The result, in our example, would be a sequence of decodes of type T. The APE IDE usesproperties (Section 5.3) to hide selected fields in a decode tree. Using this ability we would hide the initialdecode to clean up the display, and also the prefix and suffix fields. But ideally we want to do better thanthis. What value should be returned from an expression such as bs as T? Suppose we introduce a newopaque datatype that represents a decode task ID of some form. When we add a new decode task to thequeue, as a result of the bs as T expression, we return the ID of this task as the value of this expression.Such values in the decode tree view could potentially be interpreted as hyperlinks, allowing you to navigatearound the structures that have been decoded.

There is one problem associated with this technique that we don’t currently have a good answer for,namely how would we construct encoders from such specifications? That challenge is left for future work.

3.8 Type-checking

As mentioned previously, the APE type world is split into packet types and abstract data types. APEexpressions are built from constants and packet type attributes. These attributes have types, and thesetypes are constrained to be abstract data types. In other words the language does not allow an attribute tohave a value whose type is an instance of a packet type. For those readers expecting the decode process toproduce some form of decode tree, with packet type fields having as value the corresponding decode tree, thisbehavior might be unexpected. However, to support action-directed decoding (see Section 5) such behavioris essential. The types of some attributes are fixed. For example, the size, count and choice attributeswill always have values of type Int. But the value attribute does not have such restrictions. By default thevalue will be the bit sequence that was matched against this type reference, and will have type Bits. But theuser can override this behavior by explicitly defining an expression for the value in a constraint. The type ofthis attribute can therefore be any datatype, either built-in or user-defined. The compiler uses unification-based type inference[Pierce 2004] to deduce the appropriate type for such attributes and the expressionsthat use them. The compiler also uses typing information to allow some expressions to be simplified. Forexample, an expression of function type is assumed to be implicitly applied to this, the field to which theconstraint containing this expression is attached. Similarly if an expression is deduced to have a packet type,for example a field reference, then the compiler converts this to a value attribute expression for this type.Such heuristics were illustrated in Section 3.3. The expression numBytes == header.totallength wouldAgilent Technical Report, No. AGL-2009-1, May 2009. Agilent Restricted


be expanded by the compiler to numBytes(this#value) == header.totallength#value for example. Thecompiler could also use type inference to deduce the types of the arguments to packet type declarations.However, at present we do not do so, preferring the clarity of explicitly typed parameters to the concisenessof implicitly typed ones.

Type-checking bit sequences presents a particular challenge. The type Bits is used for bit sequencesof arbitrary length. However, in many cases we can deduce the length of a bit sequence at compile time.Furthermore, during the decode process the representation used to hold a bit sequence may depend on thesize of the bit sequence, and the alignment in the PDU (Section 8.1). For example, a field that is alwayssixteen bits in length, and guaranteed to start on a byte boundary, can be simply represented by a pointer tothe data. In contrast, a field whose length or alignment in the PDU is unknown at compile time will requirea more complex representation. To capture such information the compiler introduces a family of specializedbit sequence types of the form Bitsn. For example, the type Bits5 is used for bit sequences whose lengthis guaranteed to be always five bits long.

The user does not have to use the more refined bits types in a specification. Instead the compiler adoptsa two-pass approach to type-checking bit sequences. In the first pass all bit sequences are assigned the typeBits. Once the compiler has propagated size information through-out the specification, and has deduced thesize constraints for each field, a second type-checking pass is performed. This allows the compiler to assignmore specific fixed-length bit types to some expressions, together with the appropriate coercion functionsbetween expressions. A good example of the need for such coercions is where the value of an alternation isrequired, and the values of the branches of the alternation are all bit sequences but of differing lengths. Therefined bit sequence typing information is then used by the various back-ends to generate the appropriaterepresentations for each bit sequence.

3.9 Alignment analysis

The compiler also performs an alignment analysis. By default the language assumes that a type can bematched against a bit sequence starting on an arbitrary bit alignment. Whilst this allows very flexibledecoders to be produced, they will not be particularly efficient, and in many cases this is overly general. Forexample, do we really want to decode an IPv4 frame that starts three bits into a byte? The language allowsthe user to specify alignment constraints on types, using a similar notation to that adopted by Click[Kohleret al. 2000]. Specifying the starting alignment for every type would be very tedious. Instead the userattaches alignment information to the ”top-level” types in a module, typically those that are exported,and the compiler then propagates this information through-out the specification. The refined bit sequencetyping, when used in conjunction with the alignment information, allows each back-end to choose the mostappropriate representation of each bit sequence it manipulates.

4. THE APE IDE

Gone are the days when a simple command-line compiler interface was all that the average developer wanted.Integrated development environments such as Visual Studio, Eclipse and NetBeans provide a whole suiteof tools to increase developer productivity. Unfortunately protocol specification tools have not kept pace.There is no fundamental justification for this, other than it being a much smaller market. Almost allthe facilities that help you develop mainstream programming language applications are also applicable toprotocol specification development. These include on-the-fly syntax-checking, navigation tools, language-aware editors, integrated source-level debuggers and so on.

One of the aims of the APE project was to explore ways of increasing productivity. Agilent spends a lot ofmoney employing people to write and maintain protocol specifications, predominantly in PAL and Honda.Anything we can do to speed up this process has the potential to save a lot of money. The APE languagewas designed to reduce a lot of the syntactic clutter and overspecification present in many specificationlanguages, pushing more of the work onto the compiler. But there is a limit to how much speed-up indevelopment time you can get by changing the language alone. Equally important, in our opinion, is thedevelopment environment in which you create these specifications. It was our intention to provide a moderndevelopment environment for the APE, with facilities similar to those encountered in conventional IDEsAgilent Restricted Agilent Technical Report, No. AGL-2009-1, May 2009.


Fig. 1. The APE IDE

such as Visual Studio and NetBeans.Writing an IDE from scratch is a large task. Indeed to begin with we started down this path, intending

to write a small and simple environment to support APE development. Unfortunately it quickly becameapparent that users would expect and demand far more features in the IDE than we had the resources toprovide. Many modern IDEs are extensible, allowing support for new languages to be quickly added. Butthis then raised the question of which IDE to support. Our intention was to provide a number of differentback-ends to the compiler, targetting different languages. So there was no single IDE we could assume allour users would be using. Furthermore, we envisaged the decoders produced by the APE as forming onlypart of a much bigger application. Switching backwards and forwards between two full-blown IDEs couldquickly become irritating and confusing to users. Fortunately there was a middle-ground option available.Some IDEs, including NetBeans and Eclipse, are built on top of a platform, a collection of modules providingthe core functionality of the IDE. Users are free to build their own applications on top of this platform,allowing them to decide the mix of features they require to support their application.

The APE IDE is built on top of the NetBeans Platform[Boudreau et al. 2007]. An example layout of theIDE is shown in Figure 1. The main pane in this view shows an APE module describing an Ethernet framebeing edited. The pane is tabbed, allowing the user to easily switch between modules. Furthermore, thepane can be ”undocked” from the main window into one or more auxiliary windows, a useful feature whenyou wish to exploit multiple displays. Indeed, thanks to the support provided by NetBeans Platform, thearrangement of all the panes within the IDE can be altered to support the user’s preferences.

The top-left pane show the list of opened projects, and the modules within these projects. It also shows thecurrently selected primary task and the back-end to use for generating code, in this example the DecodeTreetask and the APE VM back-end. The pane on the bottom-left of the window displays a navigator view ofthe currently selected module. It displays the types and values defined in the module, groups generic typesand their specializations together, and highlights those entries that are exported from the module. TheAgilent Technical Report, No. AGL-2009-1, May 2009. Agilent Restricted


Fig. 2. The APE Debugger

toolbar at the bottom of this pane allows the entries to be sorted alphabetically or by source location, andcan also hide private declarations and actions.

As described in Section 3.9, the compiler propagates alignment information through-out the declarations.Alignment errors, where the user expects a type to be decoded starting on a particular boundary, but wherethe compiler fails to deduce this fact, can be hard to spot. The decoder will function correctly, albeitslower than expected. Types that are inferred to start on arbitrary bit boundaries are highlighted in thenavigator view, allowing the user to quickly identify problems, and add additional alignment constraintswhen appropriate. The entries for packet type declarations in the navigator view can be expanded tonavigate the internal structure, the fields and subfields, of the type. Selecting an entry in the navigator viewdisplays the properties associated with the entry, see Section 5.3, in the properties inspector shown on theright-hand side of the window.

In addition to providing the usual editing functions, including seach/replace and control key short-cuts,the source editor is aware of the APE syntax. The system performs on-the-fly parsing of the APE codewhenever the user pauses typing. Syntax errors are displayed in the editor panel, allowing errors to bequickly identified and fixed. Code folding and syntax colouring are also supported. When generating codefor the APE virtual machine the generated code can be displayed in the Target tab of the source view.

Developing a specification that compiles without errors is only part of the challenge. We must also checkthat the specification can decode our packets successfully, and imposes the expected structure on theseAgilent Restricted Agilent Technical Report, No. AGL-2009-1, May 2009.


packets. This is another area where the provision of a IDE can really help. The IDE contains an integratedsource-level debugger. The user can load a file containing a sequence of packets, and then run the decoderagainst each packet. Individual packets can be selected and the execution of the decoder stepped through,both at the source level and the virtual machine code level. The user can also set breakpoints both inthe source code and the VM code. Figure 2 illustrates the APE debugger in action. The top-right paneallows the user to select the packet to be debugged. The pane underneath this displays the decode treethat was produced when decoding the packet. The pane at the bottom-right allows the user to see where inthe PDU the decoder has currently reached, together with information about the current stack values andbreakpoints. In this example a breakpoint has been set on the padding field, and the decoder has haltedexecution at that point. The usual debugging commands (go, step over, step into) are available, and shownin the tool-bar at the top of the window.

One of the intentions of the APE, which we expand on in the next section, was to encourage greater sharingof specifications between different groups. The IDE provides a subversion-based interface to a specificationrepository[Pilato et al. 2008], allowing the user to browse the available specifications, and load them intotheir projects.

5. ACTION-DIRECTED DECODING

The purpose of decoding within the APE system is to execute actions. These are fragments of application-specific code that are interleaved with the fields defining a protocol specification. Conventional parsergenerators, such as YACC[Johnson 1979] and ANTLR[Parr 2007], adopt a similar approach. They allowactions to be embedded within a language grammar, and then arrange for these actions to be executedas part of the parsing process. By analogy, an APE decoder could construct a tree, representing thehierarchical structure of the PDU, and execute the actions as they were encountered during the decodingprocess. However, this is not the only possible execution strategy, or even the most desirable one. Tounderstand why we must examine the differences between text-based parsing and the decoding of “bit-level”communication protocols.

A parser must usually process all of the input text to construct a parse tree. Even where only parts ofthis tree may be required by an application, the nature of text-based input can make it hard to skip overportions of the input text that are of no relevance. In contrast, bit-level protocols freqently employ fixedlength fields, or type-length-value (TLV) encodings for the fields. In such a setting it becomes much easierto jump over subcomponents that are of no interest to the application. Furthemore, applications that onlyrequire a partial decode of packet data are surprisingly common. Packet filtering, classification and indexingare typical examples of such tasks.

If every decoder had to construct a complete decode tree, simply executing actions as they were encoun-tered during the construction process, then this would impose an unacceptable overhead on those applicationsthat required access to only a small subset of each PDU. For this reason the APE adopts a rather unusualapproach to protocol decoding. We take the view that the only purpose of performing a decode is to executethe actions embedded within a specification. If a part of the PDU does not need to be decoded to supportthe current action set then it should be skipped over. In the extreme case, where the specification containsno embedded actions, the packet may be skipped in it’s entirety. We use the term action-directed decodingto describe this approach to protocol decoding.

The action-directed approach can yield very different decoders from the same specification depending onthe actions embedded within it. For example, our IPv4 example might be augmented with a single actionthat uses the value of the protocol field to gather statistics about the relative frequencies of the differentprotocols carried by a link. An action-directed decoder for such an example could skip the analysis ofalmost all the fields within the IP packet. At the other extreme, a PDU browser that displayed the detailedstructure of a packet would require a full decode of the PDU. Such behaviour could be driven by actionsthat accessed all the fields within the PDU.

Although it might seem like action-directed decoding is simply a different way of viewing the decodingprocess, its influence permeates to the heart of the language design. To allow the “thinest” of decodes,where we only perform the minimum amount of work to support the actions, it is important to avoid theAgilent Technical Report, No. AGL-2009-1, May 2009. Agilent Restricted


construction of any form of decode tree. This, in turn, requires careful definition of field visibility within aspecification to avoid creating a situation where part of the decode has to create a partial decode tree justin case another part of the specification contains an action that needs to reference it later. If the languagedesign is too liberal then we may not be able to avoid creating such structures. If the design is too restrictivethen it may be hard, or even impossible, to express the necessary constraints to define some protocols ofinterest. The language design therefore requires a delicate balancing act between these two extremes. Onepoint perhaps deserves further clarification. Whilst the decoder itself is designed to avoid the constructionof a decode tree, this does not imply that an APE decoder cannot construct such a tree for a PDU. It simplyimplies that such construction must be performed by actions, perhaps mechanically generated, rather thanbeing an essential and implicit part of every decoder.

That different applications might require access to different parts of a PDU seems quite natural. But ifeach application had to take a copy of the protocol specification, and then augment it with application-specific actions, we would quickly encounter problems. Our assumption is that a protocol specificationwould be developed, and debugged, prior to the insertion of application-specific actions. But bugs maybe encountered later, and specifications natually evolve over time. Incorporating such changes in multipleapplication-specific versions of the protocol source would be both tedious and error-prone. Although weenvisage actions to be relatively small chunks of code, particularly when designed for real-time execution,they can quickly obscure the structure of the underlying protocol, further hindering the maintenance of theprotocol specification.

The APE addresses this problem by storing the actions and the specification in separate files. The compileris passed a protocol specification and the application for which we require a customized decoder. Thecompiler takes care of merging the specification and the actions for the selected application at compile time.The APE development environment (IDE) allows the user to edit actions within the context of the mergeddocument. However, when such a hybrid document is saved the actions are split and stored indepenentlyof the specification. Furthermore, the editor can mark the specification portion of the document as beingread-only, avoiding accidental changes to a shared specification during the application development process.The merging process is illustrated in Figure 3, which also emphasises how the resulting decoder can vary insize and speed depending on the selected action set. The view of the merged document, as it appears withinthe APE IDE, can be seen in Figure 4. The grey portions of text in this example are read-only, showingthat only the code for the actions is editable.

5.1 Tasks

As noted previously, there are potentially many different decoders that could be constructed from the sameprotocol specification, each one differing in the amount of detail that is extracted from the same PDU.We could take the view that there is a set of actions associated with each application, but this course-grained approach will prevent valuable opportunities for sharing. For example, an application might requirepackets to be filtered before further processing. But a task like filtering may be common amongst multipleapplications. We therefore associate actions with tasks, and an application then uses one or more tasksto build a customized decoder. Tasks may depend on other tasks, forming a directed acyclic dependencystructure. This allows us to construct tasks whose purpose is simply to name, or group, another collection oftasks. This grouping allows us, without loss of generality, to specify a single task when building a decoder.We refer to the task we use for this purpose as the primary task. We can view the primary task as analogousto an application, but in practice the decoder + actions is likely to form only a small part of the application.

Clearly actions are expected to be executable fragments of code. But what language should these fragmentsbe written in? To answer this question we must first understand how the decoders themselves will beimplemented. The APE development system supports multiple back-ends. By this we mean that we areable to generate decoders for a variety of different languages. We might generate code for a low-level languagelike C, or for an existing specification language such as PAL. By default the APE implementation generatescode for an abstract machine, the APE VM. This is similar in spirit to the Java Virtual Machine[Lindholmand Yellin 1996], but specialized for the task of protocol decoding. The APE debugger uses this virtualmachine to build a source-level debugger, a crucial component when developing a new specification. WhilstAgilent Restricted Agilent Technical Report, No. AGL-2009-1, May 2009.


Spec

ifica

tion

APE Compiler

Fat decode,e.g. protocol trace

Thin decode,e.g. protocol pie chart

Task Aaction set

Task Baction set

A B

Specification Repository

Fig. 3. Fat and thin decodes from the same specification.

a software implementation of this VM is naturally slow, at least when compared to a decoder written in C,we have also developed an implementation in firmware[Kirkham 2009]. The current APE implementationcan also generate decoders written in C++.

Just as there are multiple different back-ends for the APE, the intention is that there will also be multiplelanguages the actions can be written in. For example, when using a C back-end it may be natural toembed actions written in C within the specification. These actions would then be interleaved with the codegenerated for the decoder. If the decoder takes the form of code for the APE VM then we could also writeactions in C. In this case, in a software implementation of the APE VM, we might use something akin toJava’s JNI interface[Liang 1999] to call the actions. In the case of an FPGA implementation of the VM, forexample in a Xilinx Virtex 4, we might implement the actions on the embedded PowerPC.

For small actions, and in most cases we expect the actions to be small particularly when decoding at highspeed, the overhead of calling between the APE VM processor and an auxiliary general-purpose CPU maybe unacceptably high. In such cases we might be better off writing the actions as VM code sequences. Thissituation is analogous to embedding C actions in a C back-end, except in this case we are embedding VMcode sequences representing actions into a VM code sequence performing the decoding. Of course it wouldbe unrealistic to expect the user to write such sequences directly, just as users rarely write in assembler anymore. However, we can imagine defining a variety of simple domain-specific languages, along with translatersinto APE VM code. Typical examples might be languages for packet filtering, populating flow-specific datastructures, and adding indexes to a database. Taken to extremes, it might even be useful to define actionsthat perform further decoding. For example, it might be possible to define a translator from PAL to APEVM instructions. Given such a translator it would be possible to embed PAL code, as actions, to performdecoding of fields that were treated as opaque bit sequences by the APE. Why might this be useful? Well, itwould allow languages such as PAL to exploit high-performance implementations of the APE VM. It mightalso allow the APE to support a wider range of protocols.

For an action to perform a useful task it must be able to access the current decoding context and retrieveAgilent Technical Report, No. AGL-2009-1, May 2009. Agilent Restricted


Fig. 4. A merged document within the APE IDE.

the attributes of fields decoded within the scope of the action. Given the wide range of action languageswe would like to support it seems unreasonable to expect the APE compiler to understand the detailedstructure of each of them. Our approach is to embed simple metavariable expressions, such as $protocoland $payload#size, within the action text. Each occurrence represents a reference to part of the decoderstate, for example the value of a decoded field. The compiler searches for metavariable occurrences andreplaces each one by an expression that accesses these field attributes at runtime. The exact form of thereplacement expression will depend on the action language and the currently selected back-end. Clearly notall combinations of back-end and action languages will be supported, partly because of time constraints, andalso because some combinations are of little practical use. One of the properties of a task is the languagein which the actions associated with the task will be written in, and the compiler ensures that the selectedtasks are compatible with the chosen compiler back-end.

When prototyping applications using the VM back-end it would be tedious to have to keep writing domain-specific languages for all the different tasks we might wish to perform. And writing actions in a language likeC would create its own problems in a prototyping environment due to the requirement to write a separatefile with the C actions, load a C compiler to convert them to a DLL, and then load the DLL into the VMenvironment. An alternative approach is to use a scripting language, such as BeanShell[Niemeyer 2005], toAgilent Restricted Agilent Technical Report, No. AGL-2009-1, May 2009.


IPv4 =

( header: ( ...

ihl: UInt4

...

totallength: UInt16

...

protocol: UInt8 as IP_Protocol

« Hello:

System.out.println("Hello " + $protocol);

»...

) where numBytes = ihl*4


) where numBytes = header.totallength;

Fig. 5. A simple BeanShell action.

write our actions. Whilst clearly not practical for high-speed real-time action-directed decoding, it can beuseful for offline analysis of packet data. Using BeanShell actions we can write a simple “Hello Protocol”example, shown in Figure 5, that illustrates the use of actions and metavariables.

5.2 Thin and Fat Decodes

The “Hello Protocol” application was an example of a thin decode. In theory we just needed to access asingle field, at a fixed offset, to be able to execute the action. In practice we may need to do a bit morework. For example, the IPv4 PDU would typically be carried in the payload of an Ethernet packet, and sowe would have to decode part of the Ethernet frame to determine the type of the payload, and to decode theIPv4 type if the Ethernet type field had the value 0x0800. Furthermore, without further hints, the compilercannot guarantee there won’t be any further actions embedded within the IP payload field. So the decodermay need to perform additional steps to explore this possibility. The exact behaviour of the decoder istherefore not quite as simple as we might at first think. But we can safely assume that many of the fieldswithin the IPv4 header would be skipped over by such a decoder. Another rather more plausible example ofa thin decode is a packet filter where we examine one or more fields within the PDU and reject the packetif certain criteria are not met.

In a fat decode all, or most, of the fields need to be decoded. An example might be conformance testingwhere we wish to check that all of the packet fields are well-formed. An Ethereal-style packet browser alsorequires a full-decode of a packet, although in this case it can be done lazily. Some applications may requiremultiple decoders to be constructed. For example, we might use a packet filter to reduce the incomingpacket rate to an acceptable level, and then couple this to an offline decoder that performs a more detailedinspection of the remaining packet data. Our case study in Section 6 is an example of such an approach.

As mentioned earlier, the APE IDE encourages the clean separation of specification development fromapplication-specific task development. The initial “Core” task does not allow actions to be inserted. Whenthis task is selected as the primary task the user is allowed to edit all parts of the specification. The intentionis that the user will start in this mode, developing the specification and checking it’s correctness using theAPE browser and debugger. Once the user is confident the specification is correct the user can then startdeveloping application-specific tasks. The tasks editor is shown in Figure 6. When the user selects a newtask as their primary task then the editor switches to a mode where the specification part of the documentbecomes read-only; only the text of the actions can be edited. The intention is to allow the same specificationto be shared between multiple application task developers without any danger of the shared specificationbeing inadvertantly altered.

5.3 Properties

Actions drive the decoding process. However, this does not mean that these actions must be writtenexplicitly by the user. Indeed in many cases writing actions is a tedious task that could better be performedAgilent Technical Report, No. AGL-2009-1, May 2009. Agilent Restricted


Fig. 6. Defining task properties

automatically. For example, imagine annotating a specification with actions to construct decode trees. Thestructure of these decode trees is based on the structure of the specifications themselves. A compiler cantherefore generate suitable class definitions to represent such trees, and then automatically insert actions togenerate instances of these classes. In practice, however, the result is unlikely to be very satisfactory. A usermay wish to suppress some fields from the decode tree as they contain no semantic value; their presence inthe PDU is simply to help drive the decoding process. Length and choice determinants are good examplesof such fields. A user may also wish to change the class names of the generated classes.

Fortunately the APE provides a simple but powerful mechanism to support such requirements and manyothers. A set of properties can be associated with each task. For predefined tasks these properties are alsopredefined. For user-defined tasks the user is able to define custom property sets. A property consists of

• a name,• a type,• a description,• a scope,• and a default value

The name, description and default value should be self-explanatory. The property type can be a primitivetype such as String, Integer or Boolean. The type can also be defined as a simple enumeration. The propertyeditor, visible in the right-hand pane of Figure 7, uses the type information to select the appropriate propertyvalue editor. For example, a property of Boolean type will display a check box that can be used to alterthe value. An enumeration will use a Combo Box editor to allow the user to select the required value fromamongst the pemitted values.

Some properties only make sense when attached to an individual field. Other properties are designed tobe attached to types, whilst others are intended to be attached to each module. The scope of a propertyindicates which contexts, module, declaration, or field, a property value can be attached to. Figure 6 showssome properties being defined for the Example task. When an element such as a module, declaration or fieldis selected in the navigator view, the corresponding properties are displayed in the property view, groupedby task. Tasks may depend on other tasks, and every task will (implicitly) depend on the Core task, so ingeneral there may be multiple groups of properties displayed in the property panel.

Some properties are predefined, namely those associated with predefined tasks. For example, the Core taskhas an alignment property defined on types that allows the user to specify the desired starting alignmentAgilent Restricted Agilent Technical Report, No. AGL-2009-1, May 2009.


Fig. 7. Properties for the PDB task

for the type (see Section 3.9). There is also a uri property to allow the user to provide a link to someexternal documentation relevant to the module, type or field. This will typically be a link to a section inthe formal specification of the protocol being defined. A menu attached to the navigator panel allows theuser to display such documentation if the URI is defined.

PAL specifications frequently start with a stylized header defining meta-data associated with the specifi-cation. Such information can be used to populate a protocol database. The APE defines a PDB task wheresuch information can be specified in a more natural interface, as illustrated in Figure 7.

Some properties are automatically defined. For example, the APE supports “JavaDoc” style commentsof the form /** ... */ and /// .... These comment strings are extracted from the source and madeavailable as description properties associated with the Documentation task. This task also defines a formatproperty, allowing the user fine-grained control of the display of each field. The intention is for propertyvalues to be available to actions, although this is not implemented in the current prototype.

6. AN EXAMPLE

By this stage the reader should have developed some familiarity with the APE language, the IDE thatsupports it, and the concepts of task and action-directed decoding. However, what might be less clear ishow it all fits together; the “bigger picture”. This section describes a small example that will hopefully helpto remedy this situation.

6.1 Preprocessing

The first phase of our example takes a stream of packets, and discards those that are of no interest to us. Theremaining packets are written to an HDF5 packet table[hdfgroup.org 2007b; 2007a]. To allow these packetsAgilent Technical Report, No. AGL-2009-1, May 2009. Agilent Restricted


Filter Archive Index

Object Database

HDF5 Packet Store

Packet Stream

Fig. 8. The filter/archive/index task.

Object Database

HDF5 Packet Store

PacketSelector

PacketBrowser

Fig. 9. The fat task.

to be efficiently queried at a later time we also write some details of each packet to an object database.We will generate a thin decode for the component that performs the filtering and indexing, illustrated inFigure 8. Having archived the packet data we can use the database to query for packets of interest andperform more in-depth analysis of them. In our example the “in-depth” analysis simply consists of viewingthem in a packet browser. This process is illustrated in Figure 9. Note that the example illustrates the useof both thin and fat decodes.

We assume we already have specifications for Ethernet and IPv4; our two decoders will be driven fromthese specification, but use different tasks. We start our example with the design of the thin decoder. Wesplit the problem into a primary task, and three dependent tasks. Whilst unnecessarily complex for a simpleexample like this, it provides an opportunity to illustrate how larger tasks might be segmented into separateshareable subtasks.

Figure 10 illustrates how the Preprocessor task depends on the Filter, Archive, and Index tasks. Invokingthe APE compiler with Preprocessor specified as the primary task will merge in all the actions from thesesubtasks. In this example we will not associate any actions with the Preprocessor task itself. Its role issimply to group the other tasks. Although in our example we expect the archiving actions to be invokedafter the filtering has taken place we choose to not create a dependency between these tasks. After all, inanother application we may wish to archive an unfiltered packet stream. In this example we use the APEAgilent Restricted Agilent Technical Report, No. AGL-2009-1, May 2009.


Preprocess

Filter Archive Index

Fig. 10. The Preprocessor task.

VM as our build target with the actions written using BeanShell. A similar example could be constructedusing the C++ back-end, with actions written in C++.

6.2 Filtering

To keep the example simple we use a trivial filtering criterion. We only want UDP packets; anything elseshould be rejected. The APE BeanShell environment contains a reject method that can be used to haltdecoding of the current packet. Using this method our filter can be written as

...protocol: ...« Filter: if ($protocol != $IP_Protocol.udp) reject(); »...

6.3 Archiving

The next step in our processing is to write any packets that pass the filter to an HDF5 packet store. In thisexample we assume the environment that initiated the decoder has already established a suitable databaseconnection, and created a packet table. Our action simply has to add a selected portion of the decodedpacket to the table. The following action achieves this task.

...payload: ...« Archive: idx = writeToHDF5( $header..payload ); »...

The compiler will replace the metavariable $header..payload by an expression that, at runtime, will rep-resent the bit sequence from the start of the header field to the end of the payload field.

6.4 Indexing

A filtered stream of packets can now be written to a packet store, but how do we get them back again? Insome cases we may just want to access them sequentially. Examining them using an Ethereal-style offlinebrowser would be an example of this style of retrieval. In more interesting applications we may need tosearch for packets that meet certain criteria. To address this need we augment our application with a taskthat writes a precis of each packet to an Object Database, in this example db4o[Grehan 2006]. The exactform of the precis is unimportant. In practice it would depend on the nature of the queries we intendedto perform. For example, we might include a reference to the previous precis in a flow in applicationsthat required efficient traversal of packet flows. In our example we simply write the source and destinationaddresses, and the value of the protocol field, to the database. For such records to be useful we must alsorecord a link to the corresponding entry in the HDF5 packet table containing the packet from which thisprecis was constructed. The Archive task stores a packet ID as part of the HDF5 packet insertion action.The Index task can add this value to the precis. However, for this to be well-defined it is important thatAgilent Technical Report, No. AGL-2009-1, May 2009. Agilent Restricted


the packet is inserted into the packet table prior to writing the precis to the database. If the two actionsare attached to different fields then the field ordering can be used to ensure the actions are executed in thecorrect order. However, if we wish to attach the actions to the same field then we must record the fact thatthe Index task depends on the Archive task in the definition of this task. The compiler will then ensure theactions associated with these tasks are executed in an order that respects the dependency constraints. Thetask dependencies can be set in the Tasks Editor.

...payload: ...« Index: setIndex($header.protocol, $header.src, $header.dest, idx); »...

To complete the first part of the example we construct a primary task, Preprocess, that depends on thethree tasks we have just defined. Running the compiler with Preprocess selected as the primary task willresult in the construction of a decoder that filters, archives and indexes a packet stream. The resultingdecoder is optimized for the actions it has to support. The actions have no interest in the details of thepayload, the IP options, or most of the other fields within the IPv4 header. The decoder will therefore skipover these fields without attempting to interpret them. At present it can be hard to visualize what mustbe decoded, and what can be skipped, without examining the assembly code generated by the compiler, atask not to be recommended for the feint-hearted! Eventually we plan to add support for colour-coding aview of the original specification so we can quickly see what fields are directly used within the actions, whatfields must be decoded to determine the sizes and offsets of those fields, and the decoding of these may leadto further dependencies and so on. It would also be useful to colour-code the decoded PDUs in a similarfashion. In complex cases referring to a single field within an action may cause an avalanche of dependenciesthat result in the complete PDU having to be decoded. In practice such extreme behaviour is unlikely tooccur, but providing feedback on the relationship between actions and the resulting decoding optimizationsshould help avoid unexpected performance issues.

6.5 Analysis

We now consider the second part of our example. The APE development environment provides a browserfor displaying the structure of packets, as determined by an APE decoder. Figure 2 shows the browser inaction. The browser is similar in nature to an Ethereal-style browser, but it can also handle partial decodesproduced by thin decoders. In practice this just means the browser must be able to cope with bit sequencesrepresenting sequences of fields that have not been decoded. If we attached the browser to the decoderproduced by the Preprocessor task we would find that many of the fields, for example the payload andoptions fields, would not have a value in the displayed decode tree. Contrast this behaviour to an etherealdisplay, where every field that was present in the packet would be visible in the display.

What if we want to see all the fields within the PDU? The IDE contains a predefined task, FullDecode,whose sole purpose is to touch every field within the PDU. In other words it creates synthetic actions thataccess every attribute of every field within the PDU, which has the side-effect of forcing all the fields to bedecoded. We can build a decoder using this task, and then use the browser on this decoder to see a fulldecode of each packet. This decoder is an example of a fat decoder, in contrast to the thin decoder wegenerated for our preprocessor.

The browser can examine packets from a variety of sources. In addition to being able to examine conven-tional packet dump files it can also retrieve packets from an HDF5 packet table. We can therefore use thebrowser, coupled with our fat decoder, to examine the packets written to the packet table by our preproces-sor. However, there may be millions of such packets, although this is unlikely given the interpretive natureof our current VM implementation. Trying to find packets of interest within such a table by sequentialscanning is not very practical. Fortunately the IDE includes a BeanShell console window that can be usedto write and execute BeanShell expressions. Furthermore, the browser is scriptable, and so we can write ascript to retrieve the packet indices of all packets that meet a selected criterion from our db4o database.These indices can then be used to retrieve the raw packets from the HDF5 packet file for further decodingAgilent Restricted Agilent Technical Report, No. AGL-2009-1, May 2009.


and analysis. In our example the subsequent process simply consists of performing a full decode and viewingthe results in the browser. But it should be easy to see how we might extrapolate the same ideas to morerealistic applications, and more efficient implementations of these ideas.

Our example, whilst simple, illustrates many of the key ideas of the action-directed approach to decoding.BeanShell actions show great promise for offline exploratory browsing and packet processing. However,support for other action languages is also important as BeanShell is not a practical solution in a real-time orfirmware setting. The C++ back-end supports actions written in C++. It would also make sense to define afew small domain-specific action languages for common tasks such as filtering. This would allows the actionsto be embedded directly within the decoding code rather than having to be called from it. In addition, theuse of such languages allows actions using them to be easily retargetted at different back-ends. In some casesit may be more appropriate for a task to specify task-specific properties and then let the compiler convertthese settings into actions. A good example is the accumulation of field information within some form of“breakout” structure. Different tasks may wish to collect the values of different fields, and some tasks mayrequire access to the same fields. Using properties to generate the actions indirectly allows the requirementsof all the tasks to be merged without the individual tasks having to collaborate, or even be aware of eachother.

Although not obvious from our example, the decoder produced for the Preprocessor task performs slightlymore decoding than might be expected. The cause of this additional work is the generic type defining theIP packet’s payload. Modules such as TCP and UDP add their own specializations to this type, enablingthe decoding of the payload to vary depending on the value of the protocol field. To enable a scaleablesolution modules must be compiled without knowing all the potential specializations of generic types thatmight be added. So when compiling the IPv4 type we cannot simply assume that we can skip decodingthe payload. Other modules may define some additional specializations of the IP Payload type that haveembedded actions for our Preprocessor task, or one of its subtasks. To ensure we don’t accidentally skipexecuting such actions we must perform the generic dispatch on the payload type. Of course if no othermodules are loaded then the payload will be decoded as an uninterpreted sequence of bytes. And if othermodules do add specializations but contain no embedded actions for these tasks then they will also treatthe payload as an uninterpreted sequence. So the overhead is small compared to performing a full decode,but still larger than we might have expected. To avoid this additional overhead the compiler needs moreinformation. We could perform a “full-program” analysis of all the specification modules making up anapplication, allowing the compiler to deduce all the static specializations of each generic type. Even thisapproach is not guaranteed to be safe when modules can be loaded dynamically. Other approaches are alsopossible. For example, in the current implementation we can add additional constraints to a task, using theTasks Editor, to limit the modules the task can add actions to. In our example if the compiler could deducethat the Filter/Archive/Index/Preprocess tasks could only be added to the IPv4 module then it could safelyassume the generic dispatch could be optimized away.

7. HANDLING SIZE CONSTRAINTS

Communications protocols frequently contain fields of varying lengths, particularly when compact encodingsare required for efficiency reasons, or where the nature of the data being transmitted does not allow naturaland small upper bounds to be imposed on a field’s size. A frequent strategy, adopted in many protocols, isto use the Type/Length/Value (TLV) pattern to describe subfields. The Trip Message type in Figure 11illustrates this approach, and it is easy to see how the length can be determined before we start decodingthe context being constrained, the payload field in this case. In other cases, for example the IPv4 headeralso shown in Figure 11, the length field forms part of the context being constrained. Whereever we havevariable-sized fields it is natural to talk about the number of bits left to be consumed, and the decodingprocess will often be driven by such information. However, as should be clear from our initial examples,there may often be multiple size constraints in a protocol specification, and these may nest, as in the twosize constraints on the IPv4 type. This suggests that the notion of “bits remaining” should be interpretedrelative to some context, and this context will frequently be more refined, or specific, than simply the bitsmaking up the PDU as a whole.Agilent Technical Report, No. AGL-2009-1, May 2009. Agilent Restricted


Trip_Message =

length: UInt16

type: Trip_Message_Type

payload: Trip_Payload<type>

where numBytes = (length - 3);

IPv4 =

( header: ( ...

ihl: UInt4

...

totallength: UInt16

...

protocol: UInt8

...

) where numBytes = ihl


) where numBytes = header.totallength;

Fig. 11. Size Constraint Examples

In some scenarios we may know the overall size of a packet prior to the start of the decoding process. Forexample, we may be reading packets from a dump file, or from a buffer provided by an interface card. Inother cases the length is not known in advance. A good example of this situation would be a packet decoderrunning in firmware, consuming bits as they arrived off the wire. Here we may not know the actual packetlength until we encounter a terminator of some form at the end of the bit sequence. Such details suggestit is important to distinguish between logical and physical size constraints. Knowing that, logically, thereshould be n bits remaining to be read does not always imply that, physically, these bits are available. Wherethe packet size is known, in advance, we may be able to blur these distinctions, and reduce the size checksrequired. In other cases we must be more circumspect, using the logical sizes as a guide, but not placingcomplete trust in them. Of course the primary reason for the logical view to diverge from the physical oneis due to errors, either because of a faulty PDU being generated in the first place, or transmission errors.Any mechanism for tracking field sizes must be able to properly detect and process a field or packet that islonger or shorter than its specified size.

In this section we describe a mechanism for tracking the sizes of fields within the APE. The approach,based on remainder stacks, handles nested constraints in a natural and efficient fashion. Field over/under-runs are caught early in the decoding process, and we suggest optimizations that minimize the number ofsize tests a decoder must perform.

7.1 The Remainder Stack

The remainder stack is constructed from a sequence of pairs of natural numbers. We use S to denote theset of remainder stacks, σ to range over elements of this set, and ε : S to denote the empty stack. Theconstructor • : (N× N)× S→ S is used to build non-empty elements of S.

In general a stack σ will have the form

pn • (pn−1 • (. . . (p1 • ε) . . .))

where n ≥ 0. We use #σ to denote n, the size of this stack, and σi to denote pi , the ith element of thestack. The top of the stack, pn can therefore be represented by the expression σ#σ. We also use the syntax〈pn , pn−1, . . . , p1〉 to represent such sequences, with the degenerate form 〈〉 providing an alternative syntaxfor the empty stack.

Each pair pi = (ri ,mi) in a stack contains two counts. The first of these, ri , is used to keep track of thebits left to be consumed in the context represented by pi . The second count, mi , records the maximum bitsize of this context. This suggests that each stack σ : S must satisfy the following invariants:

• ∀ 1 ≤ i ≤ #σ • mi ≥ ri ≥ 0• ∀ 1 < i ≤ #σ • mi ≤ ri−1

The first invariant simply states that the number of bits remaining to be consumed cannot exceed themaximum number of bits available for this context. Furthermore, we cannot consume more than this limit,and so ri cannot be negative. Contexts nest, and so the maximum number of bits available to a contextcannot exceed the remaining bits for the parent context. This is captured by the second invariant.Agilent Restricted Agilent Technical Report, No. AGL-2009-1, May 2009.


We define the following operations on remainder stacks.

Definition 7.1.

consume: N× S→ S.consume(n, (r ,m) • σ) = (r − n,m) • σ iff n ≤ r

new: S→ S.new((r ,m) • σ) = (r , r) • (r ,m) • σ

set: N× S→ S.set(n, (r ,m) • σ) = (n + r −m,n) • σ iff m − r ≤ n ≤ m

push: N× S→ S.push(n, (r ,m) • σ) = (n,n) • (r ,m) • σ iff n ≤ r

pop: S→ S.pop((r ,m) • (r ′,m ′) • σ) = (r ′ −m + r ,m ′) • σ

In a stack of nested contexts, each bit consumed should, at least logically, decrement the remainder count ineach stack entry. To avoid this expense, the operations are designed to only maintain an accurate remaindercount for the top entry in the stack. The definition of pop ensures that the remainder count in the newlyexposed top entry is updated to its correct value.

Theorem 7.2. The side-conditions on the operations are sufficient to preserve the stack invariants.

Whilst the general form of the pop operation is required for size constraints with upper bounds, wherean exact bound is known we expect the remainder to be 0 at the end of the context. For convenience weintroduce a specialization of the pop operation for this common case.

Definition 7.3.

popz: S→ S.popz((0,m) • (r ′,m ′) • σ) = (r ′ −m,m ′) • σ

This operation is a simple specialization of the pop operation, and so also preserves the stack invariants.

Theorem 7.4.

(1 ) new o9 set(n) ≡ push(n)

(2 ) push(n) o9 consume(n) o

9 pop() ≡ consume(n)

7.2 Examples

Our strategy for using remainder stacks is simple. A context is indicated in the source specification by a sizeconstraint on a field. The code generated to decode the field will start with a new instruction, establishinga fresh context. Consider the simple case of a constraint having the form . . . where numBits = exp. Theexpression exp may contain references to other fields, and so the value denoted by this expression can onlybe computed once all these fields have been decoded. We must therefore determine the point in the decodingprocess where the size becomes defined, and insert a set instruction at this point to update the entry toits correct value. Note that between the new and set points the remainder stack only contains an upperbound on the size of the context. Finally, the sequence for compiling the field will be terminated by a popzinstruction. In the simplest scenario the value of exp is known prior to the start of the context. Figure 12illustrates this case. Although not shown explicitly in this example, we assume the instructions generated todecode each field will call consume operations as the bit sequence is traversed; we will return to this pointlater. The instruction sequence shown in this example is non-optimal, as Theorem 7.4 allows us to replacethe new o

9 set sequence by a single push instruction.A more complicated situation occurs when the value of the expression is not known at the start of the

context. Figure 13 illustrates this scenario, and illustrates why, in general, we need separate new and setoperations.Agilent Technical Report, No. AGL-2009-1, May 2009. Agilent Restricted


P1 = a: T

( b: Bit[] ) where numBits = a;

“code for a” o9

new o9

set(value(a)) o9

“code for b” o9

popz o9

Fig. 12. An example where the size is known before the context starts

P2 = ( a: T

b: Bit[] ) where numBits = a;

new o9

“code for a” o9

set(value(a)) o9

“code for b” o9

popz o9

Fig. 13. An example where the size is determined within the context

P3 = ( h: (x: T1

y: T2

z: Bit[]) where numBits = y

b: Bit[] ) where numBits = h.x;

new o9

new o9

“code for x” o9

“code for y” o9

set(value(y)) o9

“code for z” o9

popz o9

set(value(x)) o9

“code for b” o9

popz o9

Fig. 14. An example where the size is determined within an inner context

In some cases the point where the size constraint first becomes defined may be inside another context.This situation is illustrated in Figure 14. There are two approaches we might adopt in such cases. We couldweaken the stack model, defining operations that updated counts within the body of the stack. This wouldbe messy, complicating many of the other stack operations. The alternative is to delay performing the setoperation until the context has been re-exposed at the top of the stack. This is the approach illustrated inthe figure, where the value of the outermost constraint is known as soon as field h.x has been decoded, butthe corresponding set operation is only emitted after we have finished decoding field h.

In the preceding examples we have focussed on equality constraints. If we simply have an upper boundon the size, for example a constraint of the form . . . where numBits <= exp, then the popz instructionswould be replaced by pop instructions. A popz instruction fails if the decoding process doesn’t consumethe expected number of bits, whereas the pop instruction simply ensures we don’t consume more than thisnumber of bits.

As mentioned previously, the instructions to decode each field must perform consume operations as thebits in the PDU are traversed. We can do this explicitly, by inserting consume instructions into the code.Or we can do this implicitly, where each instruction that advances the buffer pointer performs the actionsAgilent Restricted Agilent Technical Report, No. AGL-2009-1, May 2009.


required by the consume operation as well. The explicit approach has the advantage of allowing checks tobe combined. Consider Figure 14 again. If types T1 and T2 are of fixed sizes, n1 and n2 say, then in theexplicit approach these might produce two consume instructions, consume(n1) and consume(n2). Thefirst of these tests to see if there are n1 bits available for consumption, and the second tests to see if thereare n2 bits additionally available for consumption. Furthermore, the instructions that actually consume thebits, and move the buffer pointer, do not need to check the buffer size as they are “guarded” by the consumeinstructions.

We are free to move consume instructions to earlier points in the instruction sequence, as long as theinstructions they are moving past do not refer to the remainder stack. Furthemore, two adjacent consumeinstructions can be combined into a single one. So in our example we may frequently be able to simplifythe instruction sequence to generate a single consume(n1 + n2) instruction. This results in a smaller codesequence, and we only need to perform a single test, rather than two. Furthermore, if a packet is invalid thenthis fact is detected earlier in the decoding process. Of course we cannot remove all consume instructions,and so after the optimization process we will be left with consume instructions adjacent to non-checkinginstructions that advance the buffer pointer. Suppose we have the sequence consume(n) o

9 i(n), for someinstruction i . This is where the implicit approach becomes useful, because if we have a variant of i thatalso performs the actions of consume(n) then we can replace two instructions by the variant, potentiallyspeeding up the decoding process.

7.3 The Initial Stack

The operations, as currently presented, provide no mechanism for creating the initial stack. Both new andpush act on non-empty stacks, and so starting with the empty stack as our initial configuration would notwork. We create the initial stack in one of two ways, depending on the context in which the decoder isoperating. Frequently we know, in advance, the total packet size. For example, the packet may have beenbuffered in an interface card before being passed to the decoder. Suppose this size is N . In this case we startthe decoding process with a stack of the form 〈(N ,N )〉. As the remainder size of each context is propagatedas an upper bound to each child context, we can ensure that the decoder won’t read past the end of thisbuffer. At the end of the decoding process we would expect the stack to be in the state 〈(0,N )〉. If this isnot the case then it suggests the buffer is longer than expected, usually indicating a decoding error of someform.

There are situations where the size cannot be determined in advance, for example when decoding a packetas the bits are read off the wire. In such cases we start in the state 〈(M ,M )〉, where M is a number guaranteedto exceed the maximum size of the packet. The final state will then be of the form 〈(M ′,M )〉, where thepacket size is given by M −M ′. Of course in such a case we may not know whether this was the intendedpacket size. But if an expression within the packet indicates the packet size then we can do better than this.At the start of the packet we would generate a new instruction, as the packet starts a new context due tothe size constraint. This would produce a stack of the form 〈(M ,M ), (M ,M )〉. Once the field indicating theactual size, n, has been decoded the set instruction would result in the stack 〈(n ′,n), (M ,M )〉.where n ′ isthe number of bits remaining to be consumed. Finally, there would be a popz instruction at the end of thepacket, which would fail if we hadn’t consumed n bits.

Note that in this scenario the remainder stack is keeping track of the logical size constraints. But this isnot tied to the physical size, unlike in the first case. This implies an instruction to consume bits may stillfail even though a “guarding” consume instruction has succeeded. Although instructions in this settingmust proceed more cautiously, the implementation is more likely to be firmware-based at this level, allowingthese checks to be performed in parallel with the normal decoding process.

7.4 Failures

One of the primary reasons for implementing a remainder stack is to track field over-runs and under-runs.That is, we must ensure that the decoding instructions for a field consume no more or less bits than indicatedby the constraints on the field. Furthermore, the constraints may be either directly attached to the field, orindirectly imposed by the surrounding context. As mentioned previously, decoding when an overall size isAgilent Technical Report, No. AGL-2009-1, May 2009. Agilent Restricted


not known in advance requires explicit checks as we consume each bit (or perhaps chunk of bits). The casewhere a size is known is rather more interesting, as here we wish to minimize the number of size checks wemust perform whilst decoding the constituent fields. If the stack is initialized with an accurate size, thenthe stack operations, when deployed in the fashion outlined above, ensure that we cannot consume morebits than are available. Furthermore, the definition of popz ensures that we consume exactly the correctnumber of bits when an exact bound is known.

Let us consider Figure 14 again. At the point where we start decoding field x we do not know how manybits are allocated to this field. However, the sizes inherited from the enclosing context, via the new operation,ensure that we cannot exceed the buffer size without violating the side-condition on the consume operation.So either the code for x will succeed, or the (implicit or explicit) consume operation(s) associated with xwill fail, indicating a field over-run. Suppose the operation succeeds. We now reach the code for field y,where a similar situation applies. If the decode for field y succeeds we reach the first set instruction. Atthis point we may have already consumed more bits than are indicated by field y, and the side-condition onset would fail, indicating another field overflow. If the set operation succeeded, along with the decode of z,then we would reach the first popz instruction. If the remainder count was non-zero at this point it wouldindicate that the fields x, y and z consumed less bits than were required by the constraint, indicating a fieldunder-flow.

Continuing the example, when the second set instruction is reached we may find we have already consumedtoo many bits, generating another failure. Otherwise, we continue by decoding z and then check to see ifthe remainder count is 0, and triggering an under-run failure if not. This example illustrates how thepositioning of the stack operations in the instruction stream, along with the associated side-conditions onthese instructions, can be used to catch errors associated with field over and under-runs in a timely andefficient fashion.

7.5 Attributes

Fields in the APE have various attributes, such as their size and number of elements, that can be constrainedto guide the decoding process, or used in constraints for other fields. An important factor when decidingon how to decode a field will often be how much space remains for this field in the current context. Thissuggests that some notion of remaining space should be reflected back into the language, presumably asan additional attribute. The remaining attribute records/constrains the number of bits left in the currentcontext at the start of decoding the field. Such an attribute can be supported, at the compiled code level,by a remaining instruction, defined by

Definition 7.5.

remaining: S→ N.remaining((r ,m) • σ) = r

The value returned by remaining is unreliable when executed between a new and the corresponding set.It merely returns an upper bound on the size, as indicated by the enclosing context. Reflecting such a valuein a remaining attribute would be confusing. We should therefore treat such an attribute as being undefinedprior to the set instruction. There is potential here for a smart compiler to warn about such uses.

Where possible we would like to coalesce consume operations for efficiency reasons, by moving thempast other instructions. The remaining operation is an example of an instruction that would block suchmovement, as it examines the state of the remainder stack. Performing a consume instruction before theremaining will clearly generate a different result than executing it after the remaining. Fortunately thenumber of places where a remaining instruction is likely to appear should be relatively rare, and so shouldhave minimal impact on the consume optimization.

The APE VM maintains a separate remainder stack to make it easier to identify decoding problems.However, there is no need for a separate stack as the remainder stack can be embedded within the main callstack. The code generated by the C++ back-end, described in the next section, adopts this approach.Agilent Restricted Agilent Technical Report, No. AGL-2009-1, May 2009.


8. BUILD TARGETS

The current implementation of the APE compiler can generate decoders for two different target languages,the APE VM and C++. The APE VM is a simple stack-based virtual machine with an instruction setoptimized for the task of packet decoding. A companion report[Kirkham 2009] describes this machine, alongwith its implementation in software and firmware. In this section we describe some aspects of the C++back-end.

By default, APE decoders do not generate decode trees. Instead the compiler infers which attributes ofwhich fields are required to support the current set of actions. Of course we may need to decode other fieldsin order to determine where the fields of interest start, and their length. And these fields may, in turn,require other fields to be decoded. After this dependency analysis has completed we are left with a list ofrequired attributes for each field. The decoder generated by the C++ back-end produces code to constructthe values of each of these attributes, and then embeds the code for the user actions in the appopriate places.

8.1 Bit sequences

A decoder imposes a hierarchical structure on the bit sequence representing a PDU. At the leaves of this treeare subsequences of bits, or other values that have been constructed from such subsequences. The efficientrepresentation and manipulation of bit sequences is therefore an important aspect of the design of a decoder.Some bit sequences have lengths that are known at compile time, whilst others have a variable length. Somebit sequences have a known alignment, for example, being guaranteed to start on a byte offset. In other casesthe alignment may be unknown. Clearly using the same representation for a bit sequence of length 16 that isguaranteed to start on a byte boundary and a bit sequence of variable length and unknown alignment wouldlead to an inefficient representation for the first of these bit sequences. The C++ decoder runtime definesa variety of different bit sequence representations, and the C++ backend uses the information inferred bythe compiler for each bit sequence to choose the most appropriate representation to use.

8.2 Decode states

The decoding runtime uses a decode state object to record the current state of the decoding process. Thisobject includes (a pointer to) the PDU being processed, together with the current position within this PDU.Just as there are choices in how best to represent a bit sequence there is also a similar choice involvingthe tracking of the current position. If the decoder starts on a byte boundary, and consumes fields thatare an integral number of bytes in length, then the position can be represented by a simple byte pointer.However, we cannot always rely on this. We may encounter fields that have sizes less than a byte, or whoselength is not known and is not guaranteed to be an integral number of bytes. In such cases the positionafter decoding the field cannot be represented as a simple byte pointer, and we also need to record the bitoffset within the byte. The C++ decoder runtime defines both a BitDecodeState and a ByteDecodeState tocapture these choices. Typically the decoder will start decoding using a ByteDecodeState. At some point itwill reach a point where the inferred information about the next field(s) requires the runtime to switch tousing a BitDecodeState. Further along in the decoding process we may reach a field which is guaranteed tostart on a byte boundary, and be an integral number of bytes in length, and the decoder will then switchback to using the ByteDecodeState.

Unlike the APE VM, the C++ decoder does not maintain a separate remainder stack. We exploit thefact that the remainder stack frames nest within the call stack frames and keep track of the remainder stackdetails in local variables and the decode state. To avoid continually switching between ByteDecodeStatesand BitDecodeStates the compiler attempts to aggregate field accesses in some situations. For example,suppose we are currently using a ByteDecodeState and reach a point where we must decode fields a, b andc, with lengths 1, 2 and 5. We could convert to a BitDecodeState and then decode the three fields beforeconverting back to a ByteDecodeState. The remainder stack would also need to be updated after each field.However, a better strategy is to aggregate these fields into a“fictional”field, abc, of length 8 bits. This allowsus to avoid the use of the BitDecodeState, and we only have to perform a single update of the remainderstack. We can then unpack abc into the fields a, b and c without needing to update the remainder stack ordecode state.Agilent Technical Report, No. AGL-2009-1, May 2009. Agilent Restricted


Packet types are implemented as C++ functions, and take a DecodeState instance as a parameter whichis modified during the decode to reflect the state when the method exits. The choice of BitDecodeStateor ByteDecodeState for this parameter therefore depends on not only the inferred starting alignment ofthe type, but also its overall size. For example, a type might be guaranteed to start on a byte boundary,but only be 3 bits long. Although the position in the PDU at the start of the type could be representedby a ByteDecodeState, the final position could not be, and so the function would take an instance ofBitDecodeState as argument.

The APE language allows packet types to be parameterized by other packet types. In the C++ decoderthis is implemented by (decoder)functions being passed as arguments to other functions. The choice ofrepresentation for the decode state arguments complicates this model somewhat. The compiler analyzes thecontext in which the “template” parameter is used, and then infers the appropriate typing for this parameter.However, this is not sufficient. For example, a template type T may be used in another type in a positionwhere the alignment is byte aligned, and the instance is constrained to an integral number of bytes. We mighttherefore deduce that the function corresponding to T will be parameterized on a ByteDecodeState. Supposenow that we attempt to supply a type S as the parameter, where this type takes a BitDecodeState. Withoutfurther actions the C++ compiler would detect this and reject the example. This would be unfortunate,as from the user’s perspective there is nothing wrong. The solution in such cases is to generate a wrapperfunction that takes a ByteDecodeState, converts it to a BitDecodeState, calls the function corresponding toS, and then updates the original ByteDecodeState with the contents of the BitDecodeState.

8.3 Scoping and namespaces

The scoping of fields within the APE does not map easily to the scope of names within C++. In somecases we have to model a field name by the path to the field from the root of the type to avoid ambiguities.However, if we did this everywhere then the field names would start to become very verbose. Whilst thisis not a problem for a compiler, it would undermine our goal of producing readable code. The compilertherefore only uses the path for a field name when necessary to avoid ambiguities. On some occasions thecompiler must declare the attributes for a field in the C++ code earlier than the field appears in the APEtype to ensure the values are still available for use by subsequent actions or constraints. Typical examplesare where actions refer to previously defined sequences or alternations.

The scoping of APE enumerations is another area that doesn’t align with C++ without a little ingenuity.Not surprisingly the compiler uses a different representation for extensible and non-extensible enumerations.It is tempting to use C++ enums for the non-extensible enumerations. Unfortunately the scoping of C++enums prevents the simple approach from working. For example, in the APE it is acceptable to have twodifferent enumeration types using the same name for a binding. A simple mapping to C++ enums wouldgenerate a compiler error in such cases. We therefore map enumerations to namespaces containing C++enums to keep the declaration spaces independent of each other.

8.4 Generic Dispatch

The APE language uses a generic dispatch mechanism to model the case where the structure of one fielddepends on the value of another field. The IPv4 payload field is a good example of such a type. We usea generic type to define this field, and then define a specialization for each different protocol that can beencapsulated as a payload in this type. This raises the question of what code should we generate for ageneric type. At an abstract level the function corresponding to the generic will examine the arguments tothe type and then attempt to find the most specific specialization of the type for these arguments and thencontinue decoding using this type. Unfortunately the specializations can be spread across multiple modules,and so at the point where we are compiling the generic type we do not have enough information to build thisfunction. One approach would be to perform a “whole program” compilation, where all the information willbe available. But this is unlikely to scale well. The solution adopted by the APE compiler is to generate a linkfile for each module containing information about the generic types contained in the module, together withdetails of any specializations defined in the module. This information includes the specialization argumentsand the name of the function that was generated for this specialization. Once all the changed modules haveAgilent Restricted Agilent Technical Report, No. AGL-2009-1, May 2009.


been recompiled the system runs a separate linker phase. The linker loads in all the link files, and then buildsthe dispatching code for each generic type it encounters. It then writes the results to another C++ file thatcan be compiled and linked with the rest of the application. The linker is responsible for ensuring that forall combinations of arguments there is a unique most-specific specialization. It also attempts to generateoptimal code for the dispatch. In many cases, particularly involving generics with multiple arguments, theremay be many ways of generating a decision tree. We can test the arguments in different orders, we canuse case statements or conditionals in some cases, and so on. The compiler generates code for a varierty ofdifferent alternatives, and then uses a cost function to choose a code sequence that minimizes this cost.

8.5 String templates

One of the challenges encountered when generating source code mechanically is to make it readable. Whilstwe do not expect users to modify this code they may want to inspect it to ensure their embedded actions areexpanded correctly. The more the generated code can look like the code in the rest of the project the easierthis task becomes. The compiler uses the StringTemplate library[Parr 2006] to assist in this process. Thelayout of the generated code can be easily customized by changing the templates used to generate it. Changesmade to the built-in templates will be shared between all users and tasks that use the application. Templatesmay also be overridden for individual tasks. This feature allows additional task-specific declarations to beadded to the classes generated by the compiler for each module.

8.6 The DecodeTree Task

The compiler can also generate classes to represent decode trees for each specification. This behavior istriggered by the selection of the DecodeTree task as the primary task. The intention is for the compilerto also generate actions to instantiate such trees although this is not currently implemented. If the userwishes to embed application-specific methods within such a class it makes more sense to use a derived classrather than modifying the String Templates used to generate the class. For example, suppose the compilergenerates a class C to represent the decode tree that will be constructed for instances of packet type C. Theuser can then define a class MyC, which inherits from C, and then add the support methods to the new class.Note that the new methods cannot be added directly to C as this class will be rebuilt whenever the decoderis constructed for this task. There is one further step we must follow for this approach to work. If othermechanically-generated classes embed instances of C then the code generated by the compiler to create suchinstances must use MyC, not C, to construct these instances. But how does the compiler know this? Weuse the properties mechanism. When the DecodeTree task is selected the properties predefined for this taskallow the user to override the class names to use, together with the base class from which all the decode treeclasses are derived.

Some fields in a type provide no semantic value; their sole purpose is to provide information to drivethe decoding process. Length and choice determinants are good examples of such fields. Once a decodetree has been constructed their role is over, and storing them explicitly in the decode tree simply duplicatesinformation. Furthermore, such duplication becomes a considerable hindrance when you wish to use the sametrees to drive a packet encoder. The eventual intention is to detect and suppress such fields from appearingin the decode tree automatically. At present this step has to be performed manually, using Boolean fieldproperties to indicate whether a field should be omitted from the decode tree.

9. PARTIAL DECODING

The APE compiler includes many heuristics that it uses to search for a deterministic decoder for eachspecification. Describing each one in turn would quickly exhaust the patience of the reader. Instead we pickone of the more complicated and novel ones to give a flavor of the challenges involved. This section is, byits very nature, rather more technical than the rest of the report. The material is not required by othersections in the report and so the reader can safely skip it if time is short.

Given two types, T1 and T2, representing two different structures, the combined type T = T1 | T2 canbe used to denote a choice between these structures. That is, a bit sequence matches against T if it matchesagainst either T1 or T2. To avoid the possibility of non-determinism, and a need for back-tracking, it mustAgilent Technical Report, No. AGL-2009-1, May 2009. Agilent Restricted


be possible to decide between these two alternatives prior to the start of decoding either of them. We referto this decision as a choice determinate. In many cases the inference task is simple. For example, we mightwrite

C = a: UInt4 // Four bits representing an unsigned integer

c: ( c1: T1 where a = 0

| c2: T2 where a > 0 );

In this example the field c consists of a choice between fields c1 and c2. The constraints on the individualconstituents of the choice lead naturally to the automatic construction of a choice determinate. Indeed inthis case there seems little to be gained by attaching constraints to the alternatives rather than writing anexplicit determinate in the first place. However, now consider the following example.

T1 = 0b0 a: Bit[7];

T2 = 0b1 b: Bit[15];

T = T1 | T2;

We allow literal bit sequences, such as 0b0 and 0xFF, to be used as types; such types match the bitsequences defined by the literals. The type T1 therefore matches any bit sequence starting with a 0 followedby an additional seven bits. Type T2 matches against any sixteen bit sequence that starts with a 1. TypeT is then built as an alternation of these types. The choice determinate for T must examine the first bit inthe sequence to decide whether to decode the bits using type T1 or T2.

A compiler could infer such a determinate automatically by calculating, for each type, those bits whosevalues are known and whose offset from the start of the type is also known. So, for example, for type T1we know that the first bit must be a 1. Similarly, in type T3, shown below, we know that the bit sequence1111 starts at offset eight in the type. In contrast, although in type T4 we know that 1111 appears as asubsequence, the offset of this subsequence is not known at compile time, as the length for field b can vary.

T3 = a: Bit[8]

0xF

c: Bit[4];

T4 = a: UInt4

b: Bit[a]

0xF;

We refer to a pair of an offset and a bit sequence as a probe. Given a type T, and a probe (o, b) associatedwith T, a decoder can probe the bit sequence starting at offset o; if the prefix of this bit sequence matchesb then we can conclude the type T may match at this point. We say “may” because there might be otherconstraints that prevent the match. A probe will therefore typically be used to rule out the possibility of atype matching, rather than guaranteeing a match. A type may contain multiple constant bit sequences atknown offsets, and so in general we can calculate a probe set for each type.

Suppose we calculate a probe set for each type in an alternation. If two or more alternatives have probesthat overlap, and the bit sequences in the overlap are not identical, we can use this information to choosebetween these alternatives. In the case of type T described earlier we see that T1 has the probe set {(0, 0)}and T2 has {(0, 1)}. Both sets contain a probe that examines bit 0, and the literals are disjoint. We canuse this information to build a choice determinate that tests the first bit to decide on which alternative tochoose.

Now consider a slightly more complex example, taken from an ASN.1/BER specification.

Octet = Bit[8];

ShortDefiniteLength =

( 0b0 short: UInt<7> ) where value := short;



LongDefiniteLength =

( 0b1 length: UInt<7> where < 127

long: Octet[length]

) where value := unsigned(long);

DefiniteLength = ShortDefiniteLength | LongDefiniteLength;

IndefiniteLength = 0b1 0b0000000;

V = ...;

TV =

( length: DefiniteLength

v: V where numBytes = length

) where value is v

| ( IndefiniteLength

v: V

0x0000

) where value is v

;

Before proceeding further, some additional explanation of the syntax used in this example is required. Thetype UInt<7> represents a field of seven bits representing an unsigned integer. Types can have values, and thevalue of a type is set using the value is . . . syntax. So, for example, the value of type ShortDefiniteLengthis set to the value of the short field within this type which, in turn, is set to the unsigned value of the bitsequence making up this field.

We start by considering the type DefiniteLength. This consists of two alternatives, and the techniqueoutlined previously can be used to infer a choice determinate for this type. But now consider the typeTV, which also consists of an alternation. We quickly discover that our simple probe-based approach doesnot work in this case. For the type IndefiniteLength we can construct the probe set {(0,10000000)}.But the type DefiniteLength has an empty probe set as it consists of an alternation where the alternativeshave no common probe set elements. This point perhaps requires further explanation. The constituentcomponents of the alternation making up DefiniteLength have non-empty probe sets; that is what allowsus to distinguish between these alternatives. But at this point we are trying to construct a probe set for thetype DefiniteLength itself. As this type is an alternation we must take the “intersection” of the probe setsfor the alternatives. Consider the following example:

A = 0b011 | 0b1111;

The probe set for the first alternative will be {(0,011)} and for the second {(0,1111)}. The probe set forthe alternation will then be the intersection of these probes, i.e. {(1, 11)}. This tells us that at offset 1we can guarantee to see the bit pattern 11 in any instance of type A, no matter what alternative is present.2

In the case of DefiniteLength we are not guaranteed to see any fixed bit pattern and so the probe set isempty. Unfortunately this fact prevents us from using our probe-based technique to resolve the ambiguitybetween the alternatives in type TV. We could give up at this point, requiring the user to specify an explicitchoice determinate. But can we be smarter?

In our example we have one alternative with a non-empty probe set, and one other alternative with anempty probe set. In general we may have the situation where we have more than one other alternative, andthe probe sets may not be empty, but are simply non-overlapping. By this we mean that either the probesrefer to different portions of the bit sequence, or else constrain any common bits to the same values. Clearlysuch bits can’t be used to discriminate between the choices. For simplicity we will deal with the case wherethere is just one other alternative; generalizing to more alternatives is straightforward.

2Actually it might be simpler to view the probes as having a single bit, not a bit sequence, so the notion of intersection is

simpler to define. Then just note that an optimization is to use bit sequences, splitting as necessary.

Agilent Technical Report, No. AGL-2009-1, May 2009. Agilent Restricted


Our technique proceeds as follows. We define a partial bit sequence to be one where the value of the i th

bit, for any i , is either 0, 1, or ? (unknown). Any probe set can be used to build a partial bit sequence.Those bits at offsets specified in the probe set are set to their constrained values, and all other bits are leftundefined. Note that all partial bit sequences can be viewed as having an infinite length. In our examplethe type IndefiniteLength would generate the partial bit sequence

10000000????????....

The second alternative in type TV would produce this same partial bit sequence. We assume type V hasvariable length, and therefore the anonymous field 0x0000 is not at a fixed offset from the start of the type,and therefore does not contribute to the probe set.

Our approach relies on defining what it means to decode a partial bit sequence using a packet type. Ifwe can prove that decoding the first alternative in TV will always fail when using the partial bit sequenceassociated with the second alternative then we can use the corresponding probe set to discriminate betweenthem.

Decoding a bit sequence using a packet type can lead to failures. The bit sequence may be too short, aconstant bit field may not match, or a constraint may not hold, for example. Partial bit sequences introduceanother source of failures. Matching a literal bit sequence against a subsequence of a partial sequenceillustrates the problem. The subsequence may be completely defined, in which case the match may succeedor fail. But the subsequence may also have unknown bits, in which case the match will also fail. However,the two failures are rather different in nature. In the first case we may conclude the decode will always failat runtime. Whereas in the partial case the decode may fail, but may succeed, depending on the value ofthe unknown bits. We therefore distinguish between decode failures and partial failures. Decode failuresprovide useful information. If a type T, when matched against a partial bit sequence, produces a decodefailure then we can conclude that, no matter what instantiation of the unknown bits we choose, the type willfail to decode the resulting bit sequence. We can use such knowledge to construct our choice determinate.On the other hand, a partial failure provides no useful information. Because of this, when a partial failureoccurs we keep decoding subsequent fields, if possible, in the hope we will either reach the end of the type,or generate a decode failure.

This process is best illustrated by an example, the type TV introduced earlier. We have already ascertainedthat the first alternative has an empty probe set, and the second a non-empty one. The non-empty probe setyields the partial bit sequence 10000000????????... which we try to match against the first alternative, aninstance of DefiniteLength. Expanding this type reference we discover we need to resolve an alternation.Fortunately we can easily construct a choice determinate for this case, and this involves using the firstbit as a probe. The first bit of our partial bit sequence is well-defined, and so we can conclude we mustdecode the sequence using the type LongDefiniteLength. Unsurprisingly, the first bit matches, and thenthe subsequent seven bits are consumed as part of the length field. The field’s value, an unsigned integer,is well-defined, as all the bits making up this field have known values. We test to see if this value, 0, isless than 127. It is, and so we keep going. The next field is a sequence of octets of zero length, which istrivial to decode. We then compute the value of the LongDefiniteLength type, which involves convertingthe bit sequence matched against the field long as an unsigned integer. But this field didn’t consume anybits during the decode, and so this function call will result in a decoding failure. We can therefore concludethat it is safe to use the probe set associated with IndefiniteLength to choose between the alternativesfor TV.

In the previous example the type TV depended on the type DefiniteLength, but this dependency wasnot mutual. We could therefore compute the choice determinate for DefiniteLength prior to consideringthe type TV. The compiler topologically sorts the declarations using their dependency graph, and use thisordering when resolving determinates. In some cases we may have mutually recursive alternations wherewe may need to know the choice determinate whilst decoding to calculate the choice determinate! We cantreat such a situation as another source of partial failure. We shouldn’t treat this case as a decode failurebecause the final type definitions, assuming we can resolve the choice determinate in some other way, maybe well-formed, and any closed instantiation of the partial bit sequence may be successfully decoded againstAgilent Restricted Agilent Technical Report, No. AGL-2009-1, May 2009.


such types.Suppose we modified the example so that the value expression explicitly checked for the empty sequence

case.


( 0b1 length: UInt<7> where < 127

long: Octet[length]

) where value is (length > 0) ? unsigned(long) : 0;

In the modified example the partial decoding will succeed, as we only evaluate the function unsigned whenthe length is > 0. The compiler will therefore conclude that there is an ambiguity in the alternation withinTV. More precisely, the compiler will not be able to use the proposed technique to resolve the choice.

Although our original definition is unambiguous, the reason the length field in a LongDefiniteLengthcannot be zero is not immediately obvious. We could make this restriction more explicit by adding anadditional constraint, as in


( 0b1 length: UInt<7> where > 0, < 127

long: Octet[length]

) where value is ((length > 0) ? unsigned(long) : 0);

Our partial decode will now fail at the point where the constraints on length are checked, and so the choicedeterminate can again be constructed.

We now consider some examples that illustrate how the decoding process should continue for as long aspossible when partial failures occur. Consider the alternation in the following specification:

X = a: UInt<8> where > 0

b: UInt<8> where > 0

;

Y = 0x0

a: Bit[4]

0x00

;

Z = X | Y;

There are two probes for Y, and so we create the partial bit sequence

0000????00000000????????....

Our technique requires matching type X against this bit sequence. We start by decoding field a, whichmatches against the bit sequence 0000????. The value of field a is clearly undefined, as it is built from a bitsequence containing undefined bits. The success, or failure, of the constraint on this field is therefore alsoundefined. However, at this point we can still carry on decoding and so proceed to field b. This also matchesagainst 8 bits, 00000000. In this case the value is well defined, and the constraint on the field results in adecode failure.

Suppose we modify the example to

X = a: UInt<8> where > 0

b: Bit[a]

c: UInt<8> where > 0

;

In this type we cannot skip past the field b, as the size of this field depends on a, whose value is undefined.We must therefore be conservative, and assume the bit sequence could be decoded against X, for someAgilent Technical Report, No. AGL-2009-1, May 2009. Agilent Restricted


suitable substitution of the unknown bits. The compiler therefore cannot resolve the ambiguity in type Zfor this modified version of X.

Generalizing our technique to the case where there are more than two alternatives is simple. We pick analternative with a non-empty probe set, and then attempt to use partial decoding to distinguish this typefrom the other alternatives. If all the decodes result in partial failures then the technique is inapplicable,and we must find another way of inferring a choice determinate. However, if some of the decodes resultin decode failures then we can partition our original set of alternatives into two sets, those that producedecode failures, and those that do not plus the alternative we chose originally. We can then attempt tocompute choice determinates for each of these sets, this step is obviously trivial for singleton sets, and thencombine the results to form a choice determinate for the original alternation. If we get stuck then we pickone of the other alternatives with a non-empty probe set, and try again. If all the choices are exhausted weconclude the alternation is not amenable to our proposed technique. In that case we must either use someother technique, unspecified here, to resolve the ambiguity, or require the user, as a last resort, to explicitlyspecify the choice determinate.

10. PAL SUPPORT IN THE APE IDE

The APE IDE is built around the concept of a project, a collection of protocol specification files. It has beenextended to recognize directories containing collections of PAL files[Protocol Tools Team 2002]. At presentwe reuse the existing PAL build scripts, and so the locations of these directories are limited to those usedin the manual build process. Starting the IDE, and opening an existing PAL project (directory) results inthe following display:

The pane at the top-left simply displays the specification files associated with the current project. In thecase of a PAL project these will all be .PF or .PFI files. The navigator view shown in the bottom-left paneAgilent Restricted Agilent Technical Report, No. AGL-2009-1, May 2009.


Fig. 15. An example of on-the-fly syntax checking.

displays the logical structure of the selected file, i.e. the variables, the blocks, the fields within these blocks,and so on. The views are connected, in the sense that changing the file selection within the project viewwill change the view in the navigation pane.

The source view uses syntax highlighting to emphasise the structure of the source files. It also performson-the-fly syntax checking to help identify simple syntax errors early on in the specification process. Fig-ure 15 illustrates what happens when we introduce a simple syntax error, deleting the “S” at the end ofthe “CONTENTS” keyword. Given the structure of the PAL grammar it is not always possible to identify theprecise source of any error and so, as in this case, the phrase being highlighted simply gives an indication ofthe area in which the error has occurred. Fixing the error, by typing the missing character, will cause theerror annotation, and highlighting, to disappear without requiring any additional user interaction.

The source editor also supports code-folding. In the prototype support for PAL we have used this to allowapplication blocks to be collapsed so they don’t obscure the stucture of the blocks. Figure 16 illustrates thisprocess. Individual application blocks can be collapsed and expanded using the controls at the left-hand sideof the pane. The menus also allow all the application blocks to be expanded or contracted in a single action.Finally, hovering the mouse over a collapsed block causes the hidden text to be displayed temporarily, asillustrated in Figure 17.

Having developed a specification you then want to do something with it. Not being PAL developers, wecurrently only have a superficial understanding of the details of the PAL build process, and in particular howit is configured and tailored for different applications. The prototype IDE reflects this partial understanding,simply allowing a target environment to be selected using the project options menu, shown in Figure 18.Having set the build target an individual file can be compiled by using the right-mouse menu when in thesouce pane. The IDE initiates the appropriate build script and displays the results in the output pane, bydefault displayed at the bottom of the window. The IDE looks for any warnings and errors within thisoutput and annotates the source file(s) with the appropriate messages. Furthermore, the output is modifiedto include hyperlinks to the source locations indicated by the messages. Figure 19 illustrates this process.The whole project can be compiled using the Build menu.

10.1 Issues

Unlike the IDE support for the APE language, where custom code has been written from scratch, for thePAL prototype we used a tool called Schliemann[Jancura 2008] to construct the support for PAL. Thisenables us to produce a proof-of-concept very quickly. The downside is that it imposes a number of designconstraints on us that are difficult to work around. In particular, the parser it constructs is fairly weak andslow. The PAL grammar, at least in the form described in the language specification, is difficult to parse inits current form. The grammar we are currently using is still ambiguous in various areas. However, if theAgilent Technical Report, No. AGL-2009-1, May 2009. Agilent Restricted


Fig. 16. Code-folding application blocks.

prototype was developed further it may make more sense to build a custom parser for the language, ratherthan using the parser generated by the Schliemann framework.

The PAL specifications make use of the C preprocessor in some of the specifications. This causes problemsin an IDE. Even the simple task of syntax coloring can be confused by the presence of macros. But on-the-flyparsing has even more problems. In theory it might be possible to perform on-the-fly macro-expansion aspart of the behind-the-scenes parsing process. But even if we did this it would be difficult to see how toreflect some parsing errors back into the source view. Trying to analyze a source file containing macros,without macro-expanding, is also problematic. When a fragment of specification is encountered within amacro it is typically unclear what phrase in the language such a fragment represents, and therefore how tostart parsing it. The approach adopted by the prototype is to simply skip over macro definitions. Even thisis not sufficient, as calls to these macro definitions can be sprinkled throughout the specification. We haveextended the grammar to allow macro calls in the places where they are typically used within a specification,but it would be easy to write an obscure specification that made use of macros so extensively that the IDEgot confused.

Macros cause a particular problem for error messages. At present it appears that some error messagesreported by the PAL compiler indicate their location in terms of the original source file. But other errormessages use the location relative to the preprocessed version of the file. It is difficult for the IDE to mapsuch location information back to the original source position. Furthermore, the IDE, and presumably theuser, cannot tell which messages use which type of location information, so it is bound to get confused someof the time. This results in the IDE occasionally jumping to the wrong location when following a link, orannotating the wrong line in the source file. To improve this aspect of the IDE would require the PALcompiler to be fixed. The compiler also issues messages using a number of different patterns. In some casesthe file and line/column information are displayed, in others just the line number. In some cases the errormessage is on the line preceding the location information, and in other cases it follows this line. The IDEhas heuristics to try and guess the information it requires for the four or five different patterns we haveAgilent Restricted Agilent Technical Report, No. AGL-2009-1, May 2009.


Fig. 17. Auto-expanding application blocks.

Fig. 18. Configuring the build options.



Fig. 19. Message hyperlinks, and annotated source files.

encountered so far. But there may well be other patterns that would confuse the IDE. If the prototype wasto be developed further it would make sense to rationalize this aspect of the compiler to make it easier tointegrate with an IDE.

The most significant problem with the prototype is its performance. For small files it performs well.Unfortunately some PAL files are nearly 45,000 lines long. Performing on-the-fly parsing of such files isalways going to be problematic. But doing this with a slow parser, on the main GUI thread, brings thesystem to a halt for a few minutes, and makes it unusable. The final version of Schliemann will performthis task in a background thread, which will help a bit. But we may only get acceptable behaviour handlingsuch huge source files by using a faster parser, and perhaps switching off on-the-fly parsing when files exceeda size threshold.

11. FUTURE WORK

Whilst the APE project has covered a lot of ground there are still many areas requiring further exploration.In this section we briefly describe some of them.Agilent Restricted Agilent Technical Report, No. AGL-2009-1, May 2009.


11.1 Text-based protocols

The main focus of the APE project has been the support of binary protocols. These allow thin decodes tobe produced using the technique of action-directed decoding. However, there are other types of protocolcommonly in use that cannot be ignored. Text-based protocols are an important and increasingly usedmechanism for data transmission. Conventional parser generators such as ANTLR[Parr 2007] can generateefficient code to parse languages described by formal grammars, and the technology has been refined overmany years. Text-based protocols that use formal grammars to define their structure can be processedby such tools to generate efficient decoders. More recently there has been increasing use of XML-basedtechnologies, spawning a large number of tools to process such data. There seems little point in extendingthe APE to implement yet another conventional parser generator or XML parser. However, in many scenariosa text-based protocol will be embedded within a bit-level protocol, and so it may make sense for the APEto be aware of text-based protocols to some extent. We could provide wrappers around an existing parsergenerator, e.g. ANTLR, and such text-based APE modules could be edited within the APE IDE in asimilar fashion to existing bit-level APE modules. Compiling a text-based module would delegate, behindthe scenes, to the external parser generator, but the APE would take care of the glue necessary for therun-time to call between bit-level decoding and text-based parsing. A prototype of this approach has beenpartially implemented within the APE.

Most text-based protocols are far more constrained in their structures than the typical programminglanguage, and so in many cases they do not require the full power of a conventional parser generator todecode them. Another approach that might be capable of handling an interesting subset of text-basedprotocols is the work on Ragel[Thurston 2009]. The gap between such a language and the APE is smallerthan that between the APE and something like ANTLR, and so there is perhaps more scope for combiningthe best of both approaches without compromising the elegance of either of them.

11.2 ASN.1

Another important class of protocols are those written using the ASN.1 notation[Larmouth 1999]. EachASN.1 specification can generate multiple output formats depending on the encoding rules that are used,for example Basic Encoding Rules (BER) and Packet Encoding Rules (PER). Given an ASN.1 specification,and a choice of encoding rule, it is possible to generate an APE specification for this combination. We havedone this by hand for some simple examples. However, to produce a practical system we would need tobuild an ASN.1 compiler that targeted the APE or else add an additional back-end to an existing compilerto perform this task. Unfortunately the only open-source implementations of ASN.1 provide only partialsupport for the specifications defining mainstream ASN.1-based protocols. Those compilers that do fullysupport the ASN.1 language are commercial applications for which we have no access to the source. Thehope is that the open-source compilers will gradually mature to the point where they can cope with realASN.1-based protocols. If this happens then it may be more feasible to provide an ASN.1 front-end to theAPE system.

11.3 ChipScope

A traditional view of a protocol decoder assumes that the packets will either be read off the wire, frommemory, or from a disk. However, as systems become more complex the data sent across buses withina device becomes more complex and provides another potential source of packet data. Products such asChipScope[Intel 2009] allow the developer to extract and save signals from a bus within an FPGA design.In one of our examples we captured such data, generated a suitable APE specification to decode it, and thenused the APE Browser to examine the decoded data. A tighter integration of these tools could provide avaluable tool to firmware developers.

11.4 Other build products

The APE could potentially produce a large number of different build products. These products fall into twomain categories, dynamic and static. By dynamic we mean build products that are driven by the decodersgenerated by the APE. The simplest dynamic build products are probably filters. For example, we canAgilent Technical Report, No. AGL-2009-1, May 2009. Agilent Restricted


simply insert actions at various points in a specification that check predicates on one or more decoded fieldsand then either reject or accept the packet. Our HDF5 example illustrated this process. Clearly if an existingapplication used a two stage approach of online filtering followed by offline analysis of the filtered streamthen it would be reasonably easy to integrate the APE into such an application. Dynamic build products useactions to interact with the decoder. But these actions do not have to be written explicitly. For example, theAPE supports task-specific properties as well as actions. We could therefore imagine attaching propertiesto selected fields indicating that we want to filter on them. The compiler could then automatically generatethe appropriate actions along with a “driving” application that allowed patterns of acceptable field valuesto be specified for example. This choice between explicitly specified actions vs. implicitly generated oneswill probably turn out to be a common theme. Initially, for a new type of build product, we would writeexplicit actions to prototype the application. If the build product is likely to be used repeatedly, on lotsof different protocols, then it will probably make sense, at some later point, to automate the generation ofthese actions, where this generation may be driven simply by the task itself or by task-specific properties.Given we have BeanShell, a Java scripting language, embedded in the APE we could even imagine allowingthe users themselves to write scripts that generate actions.

Another common style of application is likely to be one where a decode tree is constructed for a PDU,and then application-specific methods are used to traverse this decode tree. The simplest form of such anapplication is probably a PDU browser, such as Wireshark[Lamping et al. 2008]. This application uses aplug-in architecture for adding new protocols, and an APE backend could potentially generate code for suchplugins.

Some applications require a decoder to produce a summary of fields of interest in some form of structure.In many cases it should be feasible to generate simple actions to populate elements of such structures. Indeedone might go one step further. You could imagine tagging various fields of interest, using properties, and thecompiler could then potentially determine the best structures to hold the values of all these fields, togetherwith the actions to populate these structures. Once again the best migration strategy seems to be to writeactions explicitly initially, and to then automate the process once the pattern becomes clear.

The build products we have been describing are all dynamic in the sense of producing, explicitly orimplicitly, actions that are called by the decoder (and are used to build the decoder using the action-directedphilosophy). But there are other build products that could be generated that are independent of APE actionsand the APE decoder. For example, PAL specifications often include metadata as comments at the start ofa specification. Such protocol database entries are more naturally stored as task-specific properties in theAPE, where we get the advantage of type-checking, pop-up menus etc. Field-specific documentation canalso be written using properties. It would be simple to use such properties to populate databases, generateinformational tool-tips in browsers etc. We already do some of this within the IDE. But clearly we couldalso populate application-specific data sinks in a similar fashion. Some applications may be driven by tablesindicating field offsets, lengths, types etc, where the offsets are relative to the start of the PDU, or relativeto some other field etc. A protocol-independent decoder could use such information to analyse packets,extract fields of interest etc. Given a description of the tables required, and the desired content of thesetables, it would be relatively straightforward to construct such tables from APE specifications, at least forthose protocols that are amenable to such an approach. A final example of a static build product wouldbe a backend that generated code suitable for input to existing tools such as PAL and Honda. This wouldclearly be the simplest way of integrating the APE into existing PAL or Honda-based applications.

11.5 Encoding

In the case of decoding we start with a bit sequence and create a hierarchical structure that matches thissequence. Of course, in the case of the APE we avoid building this structure explicitly, unless we writeactions to do this, instead traversing the types in a hierarchical fashion. Or, looking at it another way, thecall graph will end up mirroring the structure of the associated implicit decode tree. Given that encoding isthe inverse of decoding, our starting point for an encode is to construct a hierarchical representation of thedesired semantic content of the packet. Now whether we actually build an explicit tree, or simply executea call graph that mimics this structure, is just one of our choices. For some protocols it is possible for theAgilent Restricted Agilent Technical Report, No. AGL-2009-1, May 2009.


calls to directly populate the bits in the PDU, avoiding the need to build an explicit hierarchical structureprior to encoding. But in the general case it is difficult to do this, and we will need to build a tree and thentraverse it to construct the PDU. The APE can already generate classes to represent decode trees, and thesecould be extended to support the encoding task. This sounds deceptively simple, but there are a number ofissues which complicate this process.

The first question is which fields in a type should be represented as children in such an explicit decode tree.For example, consider something really simple such as an IPv4 header. We could demand the user builds anode for each field within this header. But there seems little point in constructing a node representing theversion field, and then checking it has the correct value. Instead the system could simply synthesise sucha node. The header length field is another good example. Requiring the user to provide an explicit lengthwould be both unnecessary and error-prone. The run-time should be able to synthesize the appropriate fieldvalue at the point where the tree is encoded. Another good example is where a field is optional, and anotherbit field records whether the optional field is present or not. In that case we might simply use a null value,in the case of C++, to indicate an absent field. The encoder should then use this value to set the bit fieldappropriately. A CRC field can also be synthesized. These are all examples of a general process. Givena “full” decode tree we should be able to analyze the constraints, and structure of the types, to determinewhich fields can be derived from the structure, or values, of other fields. We can imagine pruning suchelements from the decode tree, and it is the resulting tree that we want the user to build. As with manyAPE-related things, the analysis required to determine whether a field is derivable can be very simple insome cases, but almost intractable in others. However, such analysis is essential if we want to provide anatural interface to building PDUs.

Another issue is more problematic. Consider something like an ASN.1/BER protocol where there can bemultiple ways of encoding an integer value such as a length. Ideally a user would like to just specify thelength as an integer, assuming it cannot be inferred, and then have the system take care of building the treefragment, or bit sequence, representing this value. But this may be ambiguous, i.e. there may be multiplevalid representations that can all be decoded to produce this value. This isn’t a problem for decoding, but itis for encoding. We could try to minimize the encoded length as a tie-breaker, but that’s adhoc, and in somecases it may still be intractable. We essentially have to take the computation describing how to interpret abit sequence as an abstract value and “run it backwards”. For simple cases it is easy to do this, e.g. wherethe type is something like UInt8. But in general this problem can exhaust the ingenuity of any compiler,and we would need to implement an alternative scheme for such situations. We could force the user tobuild such trees manually for the tricky cases. Alternatively we could allow the user to embed “encoding”actions within a specification to support this, i.e. code fragments that would take an integer and build theappropriate type representation, passing the buck to the user for these tricky cases.

So far we have been considering things that make the system easier to use. But is there anything akin toaction-directed decoding that can be applied in such a setting? Suppose we wanted to generate a lot of testpackets, at high speed, where each one had a lot in common with the rest, but where a few fields changedbetween each packet. This might be something as simple as a sequence number, or perhaps as complicatedas the payload. In such cases we can imagine the tree representation of a PDU as having two kinds of nodes.The majority of them will be treated as before, but a few, the transient ones for want of a better term, willbe modifiable. So we can imagine building a partial tree, with holes where the transient nodes will be. Wecan then use this as a template. We fill in the holes and encode the packet. We then fill in the holes withdifferent values and encode the packet again. And we keep doing this until we get bored, or want to builda slightly different template. Now, of course, there is a simple way of doing this. We could wait until theholes have been filled in, and then just perform a normal encoding step on the complete tree. But this wouldobviously be very wasteful of resources, as in some cases almost all the PDU will remain unchanged. If ahole represents a field that will always be filled in using the same representation, with the same size, thenwe could simply build a PDU with gaps, and then when a hole gets filled we would simply fill the gap. Thecost of instantiating the template each time would then be greatly reduced. Of course the situation mightnot be as simple as that. For example, the hole might represent an optional field. Assigning a non-nullfield to this hole might require setting a bit elsewhere in the PDU to record the presence of the field. So, inAgilent Technical Report, No. AGL-2009-1, May 2009. Agilent Restricted


general, instantiating the template may require concatenating bit sequences that were computed at templatecreation time, shifting them, and so on. The challenge would be to analyse which fields were transient ones,and then maximising the work that gets done prior to hole instantiation to minimize the cost of specializingthe template each time. There is plenty of scope for heuristics and other optimizations in this area.

The APE IDE allows the user to attach task-specific properties to fields (and types and modules). Wecould imagine attaching a new property to fields to tag the ‘transient” ones, and then use this informationto guide the construction of the encoders. Implementing such a partial encoding scheme would provide anice complement to the action-directed approach to decoding, and would fulfill one of the original aims ofthe project, namely to generate encoders and decoders from the same specification.

11.6 Flows

The APE has largely focussed on the syntactic aspects of protocol decoding. However, in reality mostpackets cannot be analysed in isolation. Sequences of packets form flows, and state machines at each end ofthe communication pipeline control which packets are sent and resent. To fully analyse a packet sequencean application frequently has to be aware of such flows and state machines. This raises the question ofwhat support, if any, the APE should provide in this area. We could take the view that the work of theAPE stops after syntactically decoding a packet. At the other extreme we could extend the APE to allowthe state machines to be defined, and the results incorporated into the generated decoders. Unfortunatelythe presence of timeouts in such state machines, and the fact that we may have to start monitoring a linkpart-way through a session, makes this task rather challenging to support in an automated fashion. Thereis, however, an intermediate stage that may be worth implementing in the APE. We could extend theconstraint language to indicate which fields in a protocol should be used to represent a “flow”. In some casesthe order of the fields is significant, and in other cases it is not, and so we would need functions to captureboth behaviours. To identify flows that span protocol layers we would introduce a notion of flow nesting.For example, at the IPv4 layer we would define flows using the source and destination addresses. In theTCP layer we would then derive a subflow based on the current IPv4 flow augmented with the source anddestination ports. The intention would be to map flows to flow IDs, and then to allocate flow-specific areasof memory for use by the application code. Whilst the application would still have to manually perform suchtasks as state tracking, at least some of the memory management chores could be offloaded to the APE.Unfortunately at this stage there is no support for flows in the APE and all this is pure speculation.

12. CONCLUSIONS

In this report we have described many of the accomplishments of the APE project. A companion re-port[Kirkham 2009] describes some of the other achievements. The resulting system can be used to supporta variety of decoding tasks. The next step is to gain experience of applying these tools to some real examples,enabling us to evaluate their utility in practice. The APE web site[Mitchell and Kirkham 2009] contains anextensive tutorial to help get people started. Furthermore, as described in Section 11, there is still plenty ofscope for developing the language and tool chain further. We leave those tasks as an exercise for the reader.

REFERENCES

Adobe. 1992. Tiff. http://partners.adobe.com/public/developer/en/tiff/TIFF6.pdf.

Boudreau, T., Tulach, J., and Wielenga, G. 2007. Rich Client Programming: Plugging into the NetBeans Platform.

Prentice Hall.

Chambers, C. 1992. Object-oriented multi-methods in Cecil. In ECOOP’92.

Donnelly, C. and Stallman, R. 2006. Bison: The Yacc-compatible parser generator.http://www.gnu.org/software/bison/manual/pdf/bison.pdf.

Grehan, R. 2006. The database behind the brains. db40 White Paper .

Gudgin, M., Hadley, M., Mendelsohn, N., Moreau, J.-J., Nielsen, H. F., Karmarkar, A., and Lafon, Y. 2007. SOAP

Version 1.2. http://www.w3.org/TR/soap12-part1/.

hdfgroup.org. 2007a. HDF5 packet tables. Web site.

hdfgroup.org. 2007b. Introduction to HDF5. Web site.

IEEE. 2009. IEEE 802.16 (WiMAX) specification. http://wirelessman.org/pubs/80216Rev2.html.



Intel. 2009. The ChipScope Pro tool. http://www.xilinx.com/ise/optional prod/cspro.htm.

Jancura, J. 2008. Schliemann. http://wiki.netbeans.info/wiki/view/Schliemann.

Johnson, S. C. 1979. YACC: Yet another compiler-compiler. In Unix Programmer’s Manual Vol 2b.

Kirkham, T. 2009. Flexible protocol decoding engines for the APE. Tech. rep., Agilent Labs. In preparation.

Klensin, J. 2001. Simple Mail Transfer Protocol. IETF, RFC 2821.

Knuth, D. E. 1990. The genesis of attribute grammars. In Proceedings of the international conference on Attribute grammarsand their applications.

Kohler, E., Morris, R., Chen, B., Jannotti, J., and Kaashoek, M. F. 2000. The click modular router. ACM Transactions

on Computer Systems 18, 3 (August), 263–297.

Lamping, U., Sharpe, R., and Warnicke, E. 2008. Wireshark user’s guide. http://www.wireshark.org/download/docs/user-guide-us.pdf.

Larmouth, J. 1999. ASN.1 Complete. Morgan Kaufmann.

Liang, S. 1999. The Java Native Interface: Programmer’s Guide and Specification. Prentice Hall.

Lindholm, T. and Yellin, F. 1996. The Java Virtual Machine Specification. Addison-Wesley.

Long, Y. 2002. Programming guide for Honda. Tech. rep., Agilent Technologies, Agilent China Software Design Center,

Beijing. July.

McCann, P. and Chandra, S. 2000. PacketTypes: Abstract specification of network protocol messages. In Proceedings of

ACM SIGCOMM.

Mitchell, K. 2002. Semantic ambiguities in the PacketTypes language. http://isay.britain.agilent.com/ape/External Docu-

ments/PT Ambiguities.pdf.

Mitchell, K. and Kirkham, T. 2009. The APE web site. http://isay.britain.agilent.com/ape/index.php.

Niemeyer, P. 2005. Beanshell scripting: Lightweight scripting for java. Web site.

Parr, T. 2006. A functional language for generating structured text. Tech. rep., University of San Francisco. May.

Parr, T. 2007. The Definitive ANTLR Reference. Pragmatic Bookshelf.

Pierce, B., Ed. 2004. Advanced Topics in Types and Programming Languages. The MIT Press.

Pilato, C., Collins-Sussman, B., and Fitzpatrick, B. 2008. Version Control with Subversion. O’Reilly.

Protocol Tools Team. 2002. Protocol Analysis Language - Technical Guide. http://isay.britain.agilent.com/ape/External

Documents/PAL/PAL guide.pdf.

Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., Peterson, J., Sparks, R., Handley, M., and Schooler,E. 2002. SIP: Session Initiation Protocol. IETF, RFC 3261.

Shalit, A. 1996. The Dylan Reference Manual. Apple Press.

Thurston, A. 2009. Ragel state machine compiler. http://www.complang.org/ragel/.