Symbolic Probabilistic Analysis of Off-line Guessing

Bruno Conchinha1, David Basin2, and Carlos Caleiro3

1 Information Security Group, ETH Zürich, Zürich, Switzerland
[email protected], [email protected]
2 SQIG - Instituto de Telecomunicações, Department of Mathematics, IST, TU Lisbon, Portugal
[email protected]

Abstract. We introduce a probabilistic framework for the automated analysis of security protocols. Our framework provides a general method for expressing properties of cryptographic primitives, modeling an attacker who is more powerful than conventional Dolev-Yao attackers. Within our framework, we can model equational properties of cryptographic primitives as well as property statements about their weaknesses, e.g. primitives leaking partial information about messages or the use of weak algorithms for random number generation. Moreover, we can use these properties to find attacks and estimate their success probability. Existing symbolic methods can neither model such properties nor find such attacks. We show that the probability estimates we obtain are negligibly different from those yielded by a generalized random oracle model based on sampling (the random variables associated to) symbolic terms into bitstrings, while respecting the stipulated properties of cryptographic primitives. As case studies, we use a prototype implementation of our framework to model non-trivial properties of RSA encryption and automatically estimate the probability of off-line guessing attacks on the EKE protocol.

Keywords: Probability, Off-line Guessing, Equational Theories, Random Oracle Model

1 Introduction

Cryptographic protocols play an important role in securing distributed computation and it is crucial that they work correctly. Symbolic verification approaches are usually based on the Dolev-Yao model: messages are represented by terms in a term algebra, cryptography is assumed to be perfect, and properties of cryptographic operators are formalized equationally [1]. This strong abstraction eases analysis and numerous successful verification tools rely on it [2, 3]. However, it may not accurately represent an attacker's capabilities. As a consequence, broad classes of attacks that rely on cryptanalysis or weaknesses of cryptographic primitives fall outside the scope of such methods.

Proving security by reasoning directly about bitstrings, as in computational approaches [4–6], yields stronger security guarantees. However, it requires long, error-prone, hand-written proofs to establish the security of given protocols using specific cryptographic primitives.

Much research has been devoted to bridging the gap between these two approaches. Broadly speaking, this line of work aims to automatically obtain strong protocol security guarantees under sufficiently strong assumptions on the security of cryptographic primitives. Good surveys of this work include [7, 8]. We discuss existing approaches and their limitations in greater detail later in this section.

Related work. Much work has been devoted to bridging the gap between symbolic and computational models. There are two main lines of research in this direction: (1) obtaining computational soundness results for symbolic methods, and (2) developing techniques that reason directly with computational models.

The first line of research, developing computational soundness results, was initiated with Abadi and Rogaway's seminal paper [9]. They investigated assumptions under which symbolic security implies computational security, whereby protocols that are secure against a Dolev-Yao attacker are also secure against the much more powerful adversary of the computational world. Such results now exist for more general protocols and stronger attackers [10, 11], observational equivalence [12], length revealing and same-key revealing encryption systems [13], encryption with composed keys [14] and hash functions [15], to name but a few. The limitations of these results include the very strong assumptions on the security of cryptographic primitives, the requirement that messages are tagged so that their structure is known to any observer, and the difficulty of extending the results to new cryptographic primitives, which usually involves re-doing most of the work.

The second line of research aims to obtain, automatically where possible, a sequence of property-preserving transformations between game-theoretic problem formulations, as is done in computational security proofs. The original ideas in this direction [16–18] have been developed into tools like CryptoVerif [19], CertiCrypt [20] and EasyCrypt [21]. Moreover, [22] introduces a logic for reasoning about cryptographic primitives in such models. These tools can prove protocols correct in the computational world and, when successful, provide upper bounds on the probability of an attack.

More recently, [23] proposes yet another approach: a symbolic framework in which it is possible to express security properties of cryptographic primitives and use them to prove computational protocol security. However, failure to obtain a security proof does not necessarily yield a feasible attack, and so far no automated method exists to prove protocol security in this setting.

Our applications in this paper focus on off-line guessing attacks. Off-line guessing attacks have been widely studied in symbolic settings by using static equivalence [24, 25] and closely related notions [26]. In [27] a computational soundness result is given, albeit in a rather restricted scenario. However, off-line guessing attacks remain a real threat to protocol security. Password-cracking software is freely available on the internet, and is remarkably successful [28]. Furthermore, such attacks often rely on weaknesses of cryptographic primitives that cannot be modeled by existing automated methods [29, 30].

Contributions. We present a fundamentally new approach to strengthening the security guarantees provided by automated methods. Our approach is in a sense dual to the current main lines of research that aim to bridge the gap between the symbolic and computational models: rather than assuming strong security properties of cryptographic primitives and using them to prove security, we explicitly describe weaknesses of cryptographic primitives and random number generation algorithms and use them to find attacks.

We propose a probabilistic framework for security protocol analysis in which properties of cryptographic primitives can be specified. Besides equational properties, our framework allows us to express security relevant properties of random number generation algorithms and relations between the input and the output of cryptographic primitives. For instance, it can model a random number generation algorithm that generates bitstrings representing primes of a certain length, a hash function that leaks partial information about the original message, or a cryptosystem whose valid public keys have some redundancy. The specified properties can then be used to find attacks and to estimate their success probability. Such properties cannot be modeled by existing symbolic methods and yet often lead to attacks on real-world implementations.

We model cryptographic functions using a generalized random oracle model. Given a concrete specification of the cryptographic primitives used and their properties, symbolic terms are sampled to bitstrings in a way that ensures that the properties of the specification are always satisfied, but otherwise functions behave as random oracles. Under reasonable assumptions on the specification, we can define such generalized random oracles and prove that they yield a valid probabilistic model. Moreover, we show that probabilities in this model can be effectively computed, and we provide a prototype implementation that calculates these probabilities.

We believe that this model is interesting in its own right. It is a non-trivial generalization of the standard random oracle model for hash functions, and it captures the intuitive idea that cryptographic primitives satisfy stated properties, which can be exploited by an attacker, but otherwise behave ideally.

We illustrate the usefulness of our framework by using it to analyze the security of protocols against off-line guessing attacks. Given the pervasive use of weak human-picked passwords, off-line guessing attacks are a major concern in security protocol analysis and have been the subject of much research, using symbolic [24–26] and computational approaches [31], and relating the two via computational soundness results [27, 32]. We show that our framework can be used to straightforwardly represent non-trivial properties of cryptographic primitives like the redundancy of RSA keys. We will use these properties to model off-line guessing attacks on the EKE protocol [29] and estimate their success probabilities using our implementation. Although these attacks are well-known, their analysis was previously outside the scope of symbolic methods. Further applications of our approach (not described here) include reasoning about differential cryptanalysis or side-channel attacks [33] as well as short-string authentication and distance-bounding protocols.

Outline. In Section 2 we describe our framework's syntax and semantics. In Section 3 we introduce our generalized random oracle model and show that it yields a computable probability measure. In Section 4 we show how our framework can be used to find realistic off-line guessing attacks that rely on non-trivial probabilistic properties of the cryptographic primitives. In Section 5 we draw conclusions and discuss future work. Full proofs of all results are given in Appendix A. Appendix B explains how to compute probabilities in our probabilistic model.

2 Definitions

Symbolic terms. A signature $\Sigma = \bigcup_{n \in \mathbb{N}} \Sigma_n$ is a set of function symbols, where $\Sigma_i$ contains the symbols of arity $i$. Given a set $G$ of generators, we define $T_\Sigma(G)$ as the smallest set such that $G \subseteq T_\Sigma(G)$, and if $f \in \Sigma_n$ and $t_1, \ldots, t_n \in T_\Sigma(G)$, then $f(t_1, \ldots, t_n) \in T_\Sigma(G)$. For simplicity, we write $c$ instead of $c()$ if $c \in \Sigma_0$. Below, unless otherwise stated, we will consider $G = \emptyset$ and write $T_\Sigma$ instead of $T_\Sigma(\emptyset)$.

We define the head of a term $t = f(t_1, \ldots, t_n)$ by $\mathrm{head}(t) = f$. We define the set $\mathrm{sub}(t)$ of subterms of a term $t = f(t_1, \ldots, t_n)$ inductively by $\mathrm{sub}(t) = \{t\} \cup \bigl(\bigcup_{i=1}^{n} \mathrm{sub}(t_i)\bigr)$, as usual, and the set $\mathrm{psub}(t)$ of proper subterms of $t$ by $\mathrm{psub}(t) = \mathrm{sub}(t) \setminus \{t\}$. If $f : A \to B$ and $A' \subseteq A$, we write $f[A']$ for the image $\{f(a) \mid a \in A'\}$ of $A'$ under $f$.
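The definitions of head, sub and psub translate directly into a few lines of Python. The following is only a minimal sketch: the encoding of a term $f(t_1, \ldots, t_n)$ as the tuple `(f, t1, ..., tn)` and the symbol names `enc` and `dec` are assumptions made for illustration, not part of the framework.

```python
# A term f(t1, ..., tn) is modelled as the tuple (f, t1, ..., tn);
# a constant c is the one-element tuple (c,). This encoding is an assumption.

def head(t):
    """Return the head symbol of a term."""
    return t[0]

def sub(t):
    """sub(t) = {t} united with the subterms of every argument of t."""
    result = {t}
    for arg in t[1:]:
        result |= sub(arg)
    return result

def psub(t):
    """Proper subterms: all subterms except t itself."""
    return sub(t) - {t}

# Hypothetical example term: dec(enc(m, k), k)
k = ("k",)
m = ("m",)
t = ("dec", ("enc", m, k), k)
```

Because tuples are hashable, subterm sets can be stored in ordinary Python sets, which keeps the later set operations on terms straightforward.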

Equational theories. Algebraic properties of symbolic terms are captured by equational theories. Given a signature $\Sigma$, an equational theory $\approx$ is a congruence relation on $T_\Sigma$. As usual, we write $t \approx t'$ instead of $(t, t') \in {\approx}$. We will consider an equational theory $\approx_R$ obtained from a subterm convergent rewriting system $R$, as in [34].

Example 1. The signature $\Sigma^{DY}$ is used to represent the cryptographic primitives present in simple Dolev-Yao models containing a hash function, symmetric encryption and decryption, pairing and projections. It is given by $\Sigma^{DY} = \Sigma^{DY}_1 \cup \Sigma^{DY}_2$, where $\Sigma^{DY}_1 = \{h, \pi_1, \pi_2\}$ and $\Sigma^{DY}_2 = \bigl\{ \{|\cdot|\}_{\cdot},\ \{|\cdot|\}^{-1}_{\cdot},\ \langle \cdot, \cdot \rangle \bigr\}$.

Standard equational properties of these primitives are represented by the rewriting system $R^{DY}$ containing the rules $\pi_1(\langle x, y \rangle) \to x$, $\pi_2(\langle x, y \rangle) \to y$, and $\{|\{|x|\}_y|\}^{-1}_y \to x$. It is simple to check that this rewriting system is convergent.
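To make the three rules of Example 1 concrete, the sketch below normalizes terms under them. It is an illustration only: terms are assumed to be encoded as tuples `(f, t1, ..., tn)`, and the symbol names `pi1`, `pi2`, `pair`, `enc` and `dec` are hypothetical stand-ins for $\pi_1$, $\pi_2$, $\langle \cdot, \cdot \rangle$, $\{|\cdot|\}_{\cdot}$ and $\{|\cdot|\}^{-1}_{\cdot}$.

```python
def rewrite_root(t):
    """Apply one rule at the root of t, or return None if no rule matches."""
    f = t[0]
    if f == "pi1" and t[1][0] == "pair":        # pi1(<x, y>) -> x
        return t[1][1]
    if f == "pi2" and t[1][0] == "pair":        # pi2(<x, y>) -> y
        return t[1][2]
    if f == "dec" and t[1][0] == "enc" and t[1][2] == t[2]:
        return t[1][1]                          # dec(enc(x, y), y) -> x
    return None

def normalize(t):
    """Innermost strategy: normalize the arguments, then rewrite at the root."""
    t = (t[0],) + tuple(normalize(arg) for arg in t[1:])
    reduced = rewrite_root(t)
    return t if reduced is None else normalize(reduced)

a, b, k, k2 = ("a",), ("b",), ("k",), ("k2",)
example = ("pi1", ("pair", ("dec", ("enc", a, k), k), b))
stuck = ("dec", ("enc", a, k), k2)   # decryption with a different key symbol
```

Two terms are then equal modulo the rules exactly when `normalize` returns the same tuple for both, which is what convergence of the rewriting system provides. Here `normalize(example)` reduces to the constant `a`, while `stuck` is already in normal form.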

Property statements. Property statements represent properties of function symbols by expressing relations between their inputs and outputs.

Let $\mathcal{T}$ be a set of types. Given a signature $\Sigma$, a property statement is a tuple $(f, T_1, \ldots, T_n, T)$, written $f[T_1, \ldots, T_n] \subseteq T$, where $f \in \Sigma_n$ and $T_1, \ldots, T_n, T \in \mathcal{T}$. If $ps = (f[T_1, \ldots, T_n] \subseteq T)$, we define the head symbol of $ps$ by $\mathrm{head}(ps) = f$, and we set $\mathrm{dom}(ps) = T_1 \times \cdots \times T_n$ and $\mathrm{ran}(ps) = T$.

Given a set $PS$ of property statements and $f \in \Sigma$, we denote by $PS_f$ the set of property statements in $PS$ whose head symbol is $f$. Note that, in general, we may have more than one property statement associated to each function symbol. We write $f[T_1, \ldots, T_n] \subseteq_{PS} T$ instead of $(f[T_1, \ldots, T_n] \subseteq T) \in PS$.

Syntax. The syntax of our setup is defined by a four-tuple $\langle \Sigma, \approx_R, \mathcal{T}, PS \rangle$, where $\Sigma$ is a signature, $\approx_R$ is an equational theory on $T_\Sigma$ defined by a convergent rewriting system $R$, $\mathcal{T}$ is a set of types, and $PS$ is a set of property statements.

We require that the signature contains an infinite number of constants, that is, $\Sigma_0$ is infinite, and that $\Sigma \setminus \Sigma_0$ is finite. Constant symbols represent either cryptographically relevant constants (such as the constant bitstring 0 for XOR) or random data generated by agents or the attacker.

Interpretation functions. Type interpretation functions associate to each type a set of bitstrings, thereby providing a meaning to types. We write $B$ for $\{0, 1\}$. Given a set $\mathcal{T}$ of types, a type interpretation function is a function $\llbracket \cdot \rrbracket : \mathcal{T} \to \mathcal{P}(B^*)$ that associates types to sets of bitstrings such that $\llbracket T \rrbracket$ is finite and non-empty for every $T \in \mathcal{T}$. We extend this interpretation function to products, and define $\llbracket T_1 \times \cdots \times T_n \rrbracket = \llbracket T_1 \rrbracket \times \cdots \times \llbracket T_n \rrbracket$.

A setup specification is a pair $\mathcal{S} = \langle S, \llbracket \cdot \rrbracket \rangle$, where $S = \langle \Sigma, \approx_R, \mathcal{T}, PS \rangle$ is a four-tuple defining the syntax of the setup as in the above paragraph and $\llbracket \cdot \rrbracket$ is an interpretation function, which consistently defines the behavior of all function symbols. Concretely, we require that $PS_f \neq \emptyset$ for all $f \in \Sigma$, and that if $ps_1, ps_2 \in PS_f$ are distinct then $\llbracket \mathrm{dom}(ps_1) \rrbracket \cap \llbracket \mathrm{dom}(ps_2) \rrbracket = \emptyset$. For $c \in \Sigma_0$, these conditions imply that there is a single $T \in \mathcal{T}$ such that $c \subseteq_{PS} T$. We denote this unique $T$ by $\mathrm{type}(c)$.

We assume that the function represented by a function symbol is undefined unless otherwise stated by a property statement concerning that function symbol: that is, we assume that, if $f \in \Sigma_n$ and there is no $ps \in PS_f$ such that $(b_1, \ldots, b_n) \in \llbracket \mathrm{dom}(ps) \rrbracket$, then the function represented by the symbol $f$ is undefined on the input $(b_1, \ldots, b_n)$. Under these conditions, we set the domain of definability of $f$ to be $\mathrm{dom}_{\mathcal{S}}(f) = \bigcup_{ps \in PS_f} \llbracket \mathrm{dom}(ps) \rrbracket$. Note that $\emptyset \subseteq \mathrm{dom}_{\mathcal{S}}(f) \subseteq (B^*)^n$ for all $f \in \Sigma_n$.

Example 2. We specify a simple yet realistic setup extending the typical Dolev-Yao model, by modeling a hash function $h$ that maps any bitstring to a bitstring of length 256, a pairing function that given any pair of bitstrings returns their labeled concatenation, and a symmetric encryption scheme that uses a block cipher together with some reversible padding technique.

We first enrich the signature $\Sigma^{DY}$ with constants $c_i$ for $i \in \mathbb{N}$. The rewriting system $R^{DY}$ remains unchanged. The types we will consider and their interpretations under $\llbracket \cdot \rrbracket$ are as follows:

– $\mathsf{pw}$ represents weak (e.g., human-chosen) passwords. We model these passwords as being encoded by 256-bit bitstrings, but sampled from a relatively small set: thus, $\llbracket \mathsf{pw} \rrbracket \subset B^{256}$ and $|\llbracket \mathsf{pw} \rrbracket| = 2^{24}$;
– $\mathsf{sym\_key}$ represents symmetric keys, with $\llbracket \mathsf{sym\_key} \rrbracket = B^{256}$;
– $\mathsf{text}$ represents one block of plaintext, with $\llbracket \mathsf{text} \rrbracket = B^{256}$;
– $TB_n$ with $\llbracket TB_n \rrbracket = B^n$ for each $n \in \mathbb{N}$;
– $TB_{(n,m)}$ with $\llbracket TB_{(n,m)} \rrbracket = B^{(n,m)} = \bigcup_{i=n}^{m} B^i$ for each $n, m \in \mathbb{N}$; and
– $TB_{n\#m}$ represents the set of bitstrings that are labeled concatenations of two bitstrings of size $n$ and $m$, with $\llbracket TB_{n\#m} \rrbracket = B^{n\#m} \subseteq B^{n+m+\lceil \log(n+m) \rceil}$, for each $n, m \in \mathbb{N}$.

We define the set $PS$ by

$$
\begin{aligned}
PS = \{\ & h[TB_n] \subseteq TB_{256},\ \ \pi_1[TB_{n\#m}] \subseteq TB_n,\ \ \pi_2[TB_{n\#m}] \subseteq TB_m,\ \ \langle TB_n, TB_m \rangle \subseteq TB_{n\#m}, \\
& \{|TB_{(256n+1,\,256(n+1))}|\}_{TB_{256}} \subseteq TB_{256(n+1)}, \\
& \{|TB_{256(n+1)}|\}^{-1}_{TB_{256}} \subseteq TB_{(256n+1,\,256(n+1))} \ \mid\ n, m \in \mathbb{N} \ \}.
\end{aligned}
$$

Note that all functions are modeled as undefined on all arguments that fall outside the domains of these property statements. For example, symmetric encryption of any term is undefined unless the key is a 256-bit bitstring.
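The two property statements for symmetric encryption and decryption in Example 2 can be read as a length calculation. The sketch below works only with bit lengths, not actual bitstrings; the function names are hypothetical, `None` stands for "undefined", and it assumes that $\mathbb{N}$ includes 0, so that plaintexts of 1 to 256 bits fall under $n = 0$.

```python
def enc_output_bits(plaintext_bits, key_bits):
    """Ciphertext length prescribed by the encryption statements, or None if undefined."""
    if key_bits != 256 or plaintext_bits < 1:
        return None
    # the unique n with 256*n + 1 <= plaintext_bits <= 256*(n + 1)
    n = (plaintext_bits - 1) // 256
    return 256 * (n + 1)

def dec_output_range(ciphertext_bits, key_bits):
    """Range (min, max) of plaintext lengths allowed by the decryption statements."""
    if key_bits != 256 or ciphertext_bits < 256 or ciphertext_bits % 256 != 0:
        return None
    n = ciphertext_bits // 256 - 1
    return (256 * n + 1, 256 * (n + 1))
```

For each pair of input lengths at most one property statement applies, which reflects the requirement that the domains of distinct statements for the same symbol be disjoint.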

Example 3. We use our framework to formalize RSA encryption, taking into account properties of the key generation algorithm. An RSA public key is a pair $(n, e)$, where $n = p \cdot q$ and $p, q$ are large primes (typically of around 512 bits), and the exponent $e$ is coprime to $\varphi(n) = (p - 1)(q - 1)$. The private key $d$ is the multiplicative inverse of $e$ modulo $\varphi(n)$.

We extend the setup specification of Example 2. We add to the signature the following five primitives: the unary functions $\mathrm{mod}$, $\mathrm{expn}$, and $\mathrm{inv}$, representing the extraction of the modulus, the exponent, and the exponent's multiplicative inverse, respectively, from a randomly generated RSA public-private key pair; a binary function $\{\cdot\}^{-1}_{\cdot} \in \Sigma^{DY}_2$, representing the RSA decryption function; and a ternary function $\{\cdot\}_{\cdot,\cdot}$, representing RSA encryption.

The only rewriting rule that we must add to model RSA encryption is $\bigl\{ \{m\}_{\mathrm{mod}(k),\mathrm{expn}(k)} \bigr\}^{-1}_{\mathrm{inv}(k)} \to m$, where $m, k$ are variables.

The additional types that we will use to formalize relevant properties of these functions and their interpretations are as follows: $\mathsf{random}$ represents the random values used to generate an RSA public-private key pair, including two 512-bit prime numbers and the 1024-bit exponent, with $\llbracket \mathsf{random} \rrbracket \subseteq B^{2048}$; $\mathsf{prodprime}$ represents the product of two 512-bit prime numbers, so that $\llbracket \mathsf{prodprime} \rrbracket \subseteq B^{1024}$ and $|\llbracket \mathsf{prodprime} \rrbracket| \approx 2^{1008}$ (by the prime number theorem); $\mathsf{odd}$ represents 1024-bit odd numbers, with $\llbracket \mathsf{odd} \rrbracket \subseteq B^{1024}$ and $|\llbracket \mathsf{odd} \rrbracket| = 2^{1023}$.

We add the following property statements:

– $\mathrm{mod}[\mathsf{random}] \subseteq \mathsf{prodprime}$, because the modulus of an RSA public key is the product of two primes.
– $\mathrm{expn}[\mathsf{random}] \subseteq \mathsf{odd}$, because the exponent of an RSA public key is always odd.
– $\mathrm{inv}[\mathsf{random}] \subseteq TB_{1024}$, because an RSA private key is a 1024-bit bitstring. Note that we do not allow extracting moduli, exponents, or inverses from anything other than a valid value for generating an RSA key pair.
– $\{TB_{1024}\}_{\mathsf{prodprime},\mathsf{odd}} \subseteq TB_{1024}$. This property states that encrypting any 1024-bit plaintext with a valid RSA public key yields a 1024-bit bitstring. Note that the encryption is undefined if the plaintext is not a 1024-bit bitstring, the modulus is not the product of two primes, or the exponent is even.
– $\{TB_{1024}\}^{-1}_{TB_{1024}} \subseteq TB_{1024}$. RSA decryption takes a ciphertext and a private key which are both 1024-bit bitstrings, and outputs a 1024-bit plaintext.

For simplicity, we merely require here that the public-key exponent is odd, rather than requiring it to be coprime to $\varphi(n)$.
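A toy instance may help to see what the rewriting rule and the oddness property of Example 3 express. The sketch below uses textbook-sized primes purely for illustration (real keys use primes of about 512 bits); the function names are hypothetical, and the modular inverse via `pow(e, -1, phi)` requires Python 3.8 or later.

```python
from math import gcd

p, q = 61, 53                  # toy primes, far too small for real use
n = p * q                      # modulus, plays the role of mod(k)
phi = (p - 1) * (q - 1)
e = 17                         # public exponent, plays the role of expn(k)
d = pow(e, -1, phi)            # private exponent, plays the role of inv(k)

def encrypt(m, n, e):
    """Textbook RSA encryption: m^e mod n."""
    return pow(m, e, n)

def decrypt(c, d, n):
    """Textbook RSA decryption: c^d mod n."""
    return pow(c, d, n)

m = 65
round_trip = decrypt(encrypt(m, n, e), d, n)

# phi is even, so every exponent coprime to phi is necessarily odd
coprime_exponents_are_odd = all(
    x % 2 == 1 for x in range(2, phi) if gcd(x, phi) == 1
)
```

The round trip returning `m` is the concrete counterpart of the rewriting rule, and the final check illustrates why requiring an odd exponent is a weaker condition than coprimality with $\varphi(n)$.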

2.1 Semantics

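Let us fix a setup specification $\mathcal{S} = \langle \langle \Sigma, \approx_R, \mathcal{T}, PS \rangle, \llbracket \cdot \rrbracket \rangle$.

Term assignments. Term assignments associate a value to each symbolic term. Given $\mathcal{S}$, recall that the domain of definability of a function symbol $f \in \Sigma_n$, $\mathrm{dom}_{\mathcal{S}}(f)$, may be a proper subset of $(B^*)^n$. Therefore, an error/undefined value, represented by $\bot$, is also considered. We will write $B^*_\bot$ for the set $B^* \cup \{\bot\}$. Note that $\bot$ never occurs in the domain of definability of any function; thus, applying a function to an undefined argument always yields an undefined result, as expected.

Let $\Omega$ be the set of all functions $\omega : T_\Sigma \to B^*_\bot$. We say that $\omega \in \Omega$ satisfies the equational theory $\approx_R$, and we write $\omega \models {\approx_R}$, if, whenever $t \approx_R t'$, either $\omega(t) = \omega(t')$, or $\omega(t) = \bot$, or $\omega(t') = \bot$. We say that $\omega$ satisfies a property statement $ps$ (under $\llbracket \cdot \rrbracket$), and write $\omega \models_{\llbracket \cdot \rrbracket} ps$ if, whenever $(\omega(t_1), \ldots, \omega(t_n)) \in \llbracket \mathrm{dom}(ps) \rrbracket$, then $\omega(f(t_1, \ldots, t_n)) \in \llbracket \mathrm{ran}(ps) \rrbracket$, and whenever $(\omega(t_1), \ldots, \omega(t_n)) \notin \llbracket \mathrm{dom}(ps) \rrbracket$ for all $ps \in PS_f$, then $\omega(f(t_1, \ldots, t_n)) = \bot$. We say that $\omega$ satisfies $PS$ (under $\llbracket \cdot \rrbracket$), and write $\omega \models_{\llbracket \cdot \rrbracket} PS$, if $\omega \models_{\llbracket \cdot \rrbracket} ps$ for all $ps \in PS$.

We say that $\omega$ satisfies the setup specification $\mathcal{S}$, and write $\omega \models \mathcal{S}$, when $\omega \models {\approx_R}$ and $\omega \models_{\llbracket \cdot \rrbracket} PS$. We denote by $\Omega_{\mathcal{S}}$ the set of all $\omega \in \Omega$ that satisfy $\mathcal{S}$.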

Example 4. Functions $\omega$ that satisfy our equational theory may be such that $\omega(t) = \bot$ and $\omega(t') \neq \bot$ for terms $t$ and $t'$ such that $t \approx_R t'$. To see why this is allowed, recall from Example 2 that $\{|\cdot|\}^{-1}_{\cdot}$ represents the decryption function of a symmetric encryption scheme in which valid keys always have 256 bits. Let $t, k \in \Sigma_0$, with $\mathrm{type}(t) = \mathsf{text}$, and $t' = \{|\{|t|\}_k|\}^{-1}_k$. We have $t \approx_R t'$. Now, if $\omega$ represents a possible real-world assignment (of terms to bitstrings), we have $\omega(t) \neq \bot$ (since $t$ represents a bitstring freshly sampled from $B^{256}$). Moreover, if $\omega(k)$ is not a 256-bit bitstring, then $\omega(t') = \bot$ since our encryption and decryption functions are only defined for 256-bit keys. Therefore, $\omega(\{|\{|t|\}_k|\}^{-1}_k) = \bot$.

Finitely-generated events. Valid protocol execution traces are finite and, therefore, contain only finitely many terms. We are therefore interested in events that depend on only finitely many terms. For each finite set of terms $K \subseteq T_\Sigma$, let $\Lambda_K$ be the set of functions $\lambda : K \to \mathcal{P}(B^*_\bot)$ and, for each $\lambda \in \Lambda_K$, let

$$\Omega_\lambda = \{\omega \in \Omega \mid \omega(t) \in \lambda(t) \text{ for each } t \in K\}.$$

Let $\Lambda = \bigcup_{K \in \mathcal{P}_{fin}(T_\Sigma)} \Lambda_K$ and $\Omega_\Lambda = \{\Omega_\lambda \mid \lambda \in \Lambda\}$, where $\mathcal{P}_{fin}(X)$ is the set of finite subsets of $X$. Note that $\Omega_\Lambda$ is the set of subsets of $\Omega$ whose specification depends on only the instantiation of finitely many terms. Thus, we want our probability measure to be defined in the $\sigma$-algebra generated by $\Omega_\Lambda$. Let $\mathcal{F}$ be this $\sigma$-algebra; we say that $\mathcal{F}$ is the $\sigma$-algebra of finitely generated events.

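Probabilistic models. Given a setup specification $\mathcal{S}$, we consider probability spaces $(\Omega, \mathcal{F}, \mu)$, where $\Omega$ and $\mathcal{F}$ are as defined above and $\mu : \mathcal{F} \to [0, 1]$ is a probability measure. Note that $\Omega$ and $\mathcal{F}$ are fixed for a given $\mathcal{S}$; it is the probability measure $\mu$ that we are interested in studying.

If $t \in T_\Sigma$, we write $\hat{t} : \Omega \to B^*_\bot$ to denote the random variable on $\Omega$ defined by $\hat{t}(\omega) = \omega(t)$. We adopt standard (abuses of) notation from probability theory. If $C(b_1, \ldots, b_n)$ is a condition whose satisfaction depends on the bitstring values $b_1, \ldots, b_n$, we use the notational convention that

$$P_\mu[C(\hat{t}_1, \ldots, \hat{t}_n)] = \mu\bigl(\{\omega \in \Omega \mid C(\hat{t}_1(\omega), \ldots, \hat{t}_n(\omega))\}\bigr),$$

provided that $\{\omega \in \Omega \mid C(\hat{t}_1(\omega), \ldots, \hat{t}_n(\omega))\} \in \mathcal{F}$. If $A \in \mathcal{F}$, we will also write $P_\mu[A]$ instead of $\mu(A)$. We use standard notation for conditional probability. Namely, if $P_\mu[B] > 0$, then $P_\mu[A \mid B] = P_\mu[A, B] / P_\mu[B]$ is the conditional probability of $A$ given $B$.

We say $\mu$ satisfies the equational theory $\approx_R$ if $\mu(\{\omega \mid \omega \models {\approx_R}\}) = 1$, and we write $\mu \models {\approx_R}$ to denote this fact. Analogously, we define the satisfaction of the set of property statements $PS$ (under $\llbracket \cdot \rrbracket$) by $\mu$, written $\mu \models_{\llbracket \cdot \rrbracket} PS$, by $\mu(\{\omega \mid \omega \models_{\llbracket \cdot \rrbracket} PS\}) = 1$. We say that $\mu$ satisfies, or is a model of, the setup specification $\mathcal{S}$, written $\mu \models \mathcal{S}$, if $\mu \models {\approx_R}$ and $\mu \models_{\llbracket \cdot \rrbracket} PS$. Note that $\mu$ is a model of $\mathcal{S}$ if and only if $\mu(\Omega_{\mathcal{S}}) = 1$.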

3 A Generalized Random Oracle Model

In this section we propose an algorithm for sampling the random variables associated with symbolic terms. Our sampling algorithm interprets functions as random oracles subject to satisfying our setup specification $\mathcal{S} = \langle \langle \Sigma, \approx_R, \mathcal{T}, PS \rangle, \llbracket \cdot \rrbracket \rangle$.

3.1 Tentative term sampling in the ROM

Term sampling. Suppose that $K \subset T_\Sigma$ is a finite set of terms and $P$ is a partition of $K$. We define $\approx_P$ to be the smallest congruence relation on $T_\Sigma$ such that ${\approx_R} \subseteq {\approx_P}$ and $t \approx_P t'$ whenever there is $p \in P$ such that $t, t' \in p$.

The sampling algorithm below builds a function $\psi_{ROM}$ mapping a finite set of terms to $B^*_\bot$. We denote by $P(\psi_{ROM})$ the partition of $\mathrm{dom}(\psi_{ROM})$ given by $P(\psi_{ROM}) = \{\psi_{ROM}^{-1}(b) \mid b \in \mathrm{ran}(\psi_{ROM})\}$. The algorithm is probabilistic: at various steps, it samples a random bitstring from a finite subset of $B^*_\bot$. We assume that this sampling is always done with uniform probability distribution. We also assume fixed some total order $\prec$ on the set of terms such that, if $t \in \mathrm{psub}(t')$, then $t \prec t'$. We say that such an order is subterm-compatible.

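Algorithm 1 (Tentative Term Sampling Algorithm)
Input: a finite set of terms $K \subseteq T_\Sigma$.
Output: a function $\psi_{ROM} : \mathrm{sub}[K] \to B^*_\bot$.

1: $\psi_{ROM} \leftarrow \emptyset$
2: let $t_1, \ldots, t_k$ be such that $t_1 \prec \ldots \prec t_k$ and $\mathrm{sub}[K] = \{t_1, \ldots, t_k\}$
3: for $i$ from 1 to $k$
4:     let $t_i = f(t'_1, \ldots, t'_n)$
5:     if $(\psi_{ROM}(t'_1), \ldots, \psi_{ROM}(t'_n)) \notin \mathrm{dom}_{\mathcal{S}}(f)$
6:         $\psi_{ROM}(t_i) \leftarrow \bot$
7:         continue
8:     let $ps$ be the unique $ps \in PS_f$ s.t. $(\psi_{ROM}(t'_1), \ldots, \psi_{ROM}(t'_n)) \in \llbracket \mathrm{dom}(ps) \rrbracket$
9:     if $\exists t' \in \mathrm{dom}(\psi_{ROM}).\ t_i \approx_{P(\psi_{ROM})} t'$ and $\psi_{ROM}(t') \neq \bot$
10:        $\psi_{ROM}(t_i) \leftarrow \psi_{ROM}(t')$
11:        continue
12:    randomly sample $b$ from $\llbracket \mathrm{ran}(ps) \rrbracket$
13:    $\psi_{ROM}(t_i) \leftarrow b$
14: return $\psi_{ROM}$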

Algorithm 1 samples terms in order (lines 2–3), by interpreting each function symbol as a random oracle with uniform probability distribution (lines 12–13), and respecting the equational theory in case an equal term has already been sampled (lines 9–10), as long as all its argument values (previously sampled) are defined and form a tuple in its domain of definability (lines 5–6).
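The structure of Algorithm 1 can be sketched in Python for a deliberately restricted case: a signature with only constants and a unary hash `h`, and an empty rewriting system. Under these assumptions the test on line 9 reduces to asking whether the same symbol has already been applied to the same argument values, which is a lazily sampled random oracle. The term encoding as tuples, the `const_bits` map from constant names to bit lengths, and the treatment of untyped constants as undefined are all assumptions made for illustration; this is not the authors' prototype.

```python
import random

BOTTOM = None  # stands for the undefined value

def sub_in_order(t, acc):
    """List subterms so that every proper subterm precedes its superterm."""
    for arg in t[1:]:
        sub_in_order(arg, acc)
    if t not in acc:
        acc.append(t)
    return acc

def sample_terms(K, const_bits, rng):
    """Tentative sampling for constants and a 256-bit hash h, with no rewrite rules."""
    psi = {}      # the function being built, from terms to bitstrings or BOTTOM
    oracle = {}   # memo: (symbol, argument values) -> value, realising line 9
    order = []
    for t in K:
        sub_in_order(t, order)            # a subterm-compatible order (lines 2-3)
    for t in order:
        f, args = t[0], t[1:]
        values = tuple(psi[a] for a in args)
        if f == "h":
            defined = len(args) == 1 and values[0] is not BOTTOM
            bits = 256
        else:
            defined = len(args) == 0 and f in const_bits
            bits = const_bits.get(f)
        if not defined:                   # lines 5-7: outside the domain of definability
            psi[t] = BOTTOM
            continue
        key = (f, values)
        if key not in oracle:             # lines 12-13: fresh uniform sample
            oracle[key] = format(rng.getrandbits(bits), "0{}b".format(bits))
        psi[t] = oracle[key]              # lines 9-10 when the key was already present
    return psi
```

Passing a seeded `random.Random` instance makes runs reproducible. A faithful implementation would additionally have to decide the congruence $\approx_{P(\psi_{ROM})}$ for a non-empty rewriting system, which this sketch does not attempt.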

Problems with the tentative term sampling algorithm. We show that Algorithm 1 does not necessarily yield a probability measure over $\mathcal{F}$ as desired.

Given a finite set $K \subseteq T_\Sigma$ and a subterm-compatible order $\prec$, Algorithm 1 is a probabilistic algorithm, and thus outputs a function $\psi : \mathrm{sub}[K] \to B^*_\bot$ with some probability distribution. We would therefore like to define a model $\mu$ of $\mathcal{S}$ by defining $\mu(\Omega_\lambda)$ for each generator $\Omega_\lambda$ of $\mathcal{F}$ as the probability that executing Algorithm 1 on input $\mathrm{dom}(\lambda)$ yields as output a function $\psi_{ROM}$ such that, for each $t \in \mathrm{dom}(\lambda)$, $\psi_{ROM}(t) \in \lambda(t)$.

Unfortunately, the next example shows that this is not well-defined in general. More precisely, we consider two terms, $t$ and $a$, and show that the algorithm samples $t$ and $a$ to the same bitstring with a probability that depends on the input set $K$ and the order relation $\prec$. Thus, letting $\lambda_b = \{t \mapsto \{b\}, a \mapsto \{b\}\}$ for each $b \in B^*_\bot$, we have that the probability of the (measurable) set $\bigcup_{b \in B^*_\bot} \Omega_{\lambda_b}$ depends on the input set $K$ and the order relation $\prec$ considered.

Example 5. Suppose that $a, b, k \in \Sigma_0$ are such that $\mathrm{type}(a) = TB_{1024}$, $\mathrm{type}(b) = TB_{1024}$ and $\mathrm{type}(k) = \mathsf{random}$. Consider executing Algorithm 1 on the set $\{t\}$, with $t = \bigl\{ \{a\}_{\mathrm{mod}(k),\mathrm{expn}(k)} \bigr\}^{-1}_{b}$. Algorithm 1 outputs a function $\psi : \mathrm{sub}(t) \to B^*_\bot$. Let us consider the probability that $\psi(t) = \psi(a)$. It is simple to check that both $\psi(t)$ and $\psi(a)$ are sampled by Algorithm 1 with uniform probability distribution from $B^{1024}$. Therefore, the probability that $\psi(t) = \psi(a)$ is $2^{-1024}$.

Now, consider executing Algorithm 1 on the set $\{t, \mathrm{inv}(k)\}$. If $t \prec \mathrm{inv}(k)$, then the execution of Algorithm 1 will be exactly the same until $\psi(s)$ is sampled for all terms $s \in \mathrm{sub}(t)$, and $\psi(\mathrm{inv}(k))$ is only sampled afterwards. Therefore, $\psi(s)$ is sampled according to the same probability distribution for all $s \in \mathrm{sub}(t)$, and the probability that $\psi(t) = \psi(a)$ is still $2^{-1024}$. However, if $\mathrm{inv}(k) \prec b$, we have a probability of $2^{-1024}$ that $\psi(b) = \psi(\mathrm{inv}(k))$. If $\psi(b) = \psi(\mathrm{inv}(k))$, then we have $\psi(t) = \psi(a)$ with probability 1. Otherwise, $\psi(t)$ and $\psi(a)$ will still be sampled from $B^{1024}$ with uniform probability distribution, and the probability that they are sampled to the same value is again $2^{-1024}$. In this case, we conclude that

$$P[\psi(a) = \psi(t)] = 2^{-1024} \cdot (2 - 2^{-1024}) \neq 2^{-1024}.$$

Thus, the probability that $\psi(t) = \psi(a)$ depends on both the set of terms $K$ input to the algorithm and the order $\prec$.
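The two probabilities of Example 5 can be checked with exact rational arithmetic. Writing $p = 2^{-1024}$ for the chance that two independent uniform samples coincide, the second ordering gives $p \cdot 1 + (1 - p) \cdot p = p(2 - p)$ by the law of total probability. The sketch below parameterizes the bit length so that small cases can be inspected by hand; the function name is hypothetical.

```python
from fractions import Fraction

def collision_probabilities(bits):
    """Return the probability that the two terms coincide under each ordering."""
    p = Fraction(1, 2 ** bits)        # two independent uniform samples coincide
    t_first = p                       # t is sampled before inv(k)
    inv_first = p + (1 - p) * p       # inv(k) is sampled before b
    return t_first, inv_first
```

For instance, with 2-bit values the two orderings give 1/4 and 7/16 respectively, so the dependence on the order is visible even at toy sizes.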

Nevertheless, the following result shows that, given a fixed finite set of terms $K$ and a subterm-compatible order $\prec$, Algorithm 1 does yield a probability distribution on the $\sigma$-algebra $\mathcal{F}_K$ generated by the set $\{\Omega_\lambda \mid \lambda \in \Lambda_K\}$. We remark that $\mathcal{F}_K$ is the $\sigma$-algebra of events that depend only on the instantiation of terms in the set $K$.

Theorem 2. There is a unique probability distribution $\mu_{K,\prec} : \mathcal{F}_{\mathrm{sub}[K]} \to [0, 1]$ such that, for each $\lambda \in \Lambda_K$, $\mu_{K,\prec}(\Omega_\lambda)$ is the probability that executing Algorithm 1 on input $K$ and using the order $\prec$ yields a function $\psi_{ROM}$ such that, for each $t \in K$, $\psi_{ROM}(t) \in \lambda(t)$.

3.2 Revised term sampling in the ROM

To avoid problems like the one illustrated by Example 5 we need two additional hypotheses on the setup specification $\mathcal{S}$. We will explicitly distinguish a set of weak function symbols and consider a revised algorithm that uses this distinction. This revised algorithm is equivalent to Algorithm 1 when all functions are treated as weak. We show that, under these hypotheses, we can define a probability measure from this new sampling algorithm, while also simplifying the calculation of probabilities.

Weak terms. We assume fixed a set $\Sigma_W \subseteq \Sigma$ of weak function symbols. We say that a term $t \in T_\Sigma$ is weak if $\mathrm{head}(t) \in \Sigma_W$, and denote by $T_W$ the set of weak terms. Intuitively, weak function symbols are those that represent functions whose outputs are sampled from "small" sets, and a probabilistic model must therefore take into account the possibility of collisions between them. By contrast, non-weak function symbols are those that represent functions whose outputs are sampled from large enough sets, so that ignoring the possibility of collisions changes our probability estimates only negligibly. Theorem 6, stated below, formalizes this idea.

Example 6. In our running example, we consider the set of weak function symbols $\Sigma_W = \{h\} \cup \{a \in \Sigma_0 \mid a \subseteq_{PS} \mathsf{pw}\}$. That is, a term is weak if it is a hash or if it is a human-chosen password.

Term sampling revisited. If $K$ and $K'$ are sets of terms and $P$ is a partition of $K$, we let $P|_{K'} = \{p \cap K' \mid p \in P\}$. Note that $P|_{K'}$ is a partition of $K \cap K'$. We denote by $W(\psi_{ROM})$ the partition $P(\psi_{ROM})|_{T_W}$.
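The partitions used here are simple to compute for a finite sampled function. The sketch below represents the sampled function as a Python dictionary from terms (encoded as tuples) to values, and the set of weak terms by a predicate; these representations and all function names are assumptions for illustration. Blocks that become empty after restriction are dropped, so that the result is a partition in the usual sense.

```python
def value_partition(psi):
    """P(psi): group the terms in dom(psi) by the value they were sampled to."""
    blocks = {}
    for term, value in psi.items():
        blocks.setdefault(value, set()).add(term)
    return {frozenset(block) for block in blocks.values()}

def restrict(partition, k_prime):
    """P|_{K'}: intersect every block with K', dropping blocks that become empty."""
    return {frozenset(p & k_prime) for p in partition if p & k_prime}

def weak_partition(psi, is_weak):
    """W(psi) = P(psi)|_{T_W}, with T_W given by the predicate is_weak."""
    weak_terms = {t for t in psi if is_weak(t)}
    return restrict(value_partition(psi), weak_terms)

# Hypothetical sampled values: two passwords collide, a nonce n and its hash do not
psi = {("pw1",): "01", ("pw2",): "01", ("n",): "11", ("h", ("n",)): "10"}
is_weak = lambda t: t[0] in {"h", "pw1", "pw2"}
```

In this example the block containing the non-weak constant `n` disappears from `weak_partition(psi, is_weak)`, while the collision between the two passwords is kept.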

Our revised term sampling algorithm, targeted at solving the anomaly described in Example 5, is the same as Algorithm 1 with the exception that we replace the condition $t_i \approx_{P(\psi_{ROM})} t'$ by $t_i \approx_{W(\psi_{ROM})} t'$ in line 9.

Algorithm 3 (Term Sampling Algorithm)
Input: a finite set of terms $K \subseteq T_\Sigma$.
Output: a function $\psi_{ROM} : \mathrm{sub}[K] \to B^*_\bot$.
Pseudocode identical to Algorithm 1, but with line 9 replaced by:

9:     if $\exists t' \in \mathrm{dom}(\psi_{ROM}).\ t_i \approx_{W(\psi_{ROM})} t'$ and $\psi_{ROM}(t') \neq \bot$

This revised algorithm yields a probability distribution on $\mathcal{F}$ provided that the setup specification $\mathcal{S}$ satisfies two reasonable conditions, as follows.

Disjointness. The first condition we require on the specification S is that weakfunction symbols do not occur in the rewriting system R.

Intuitively, this disjointness condition implies that the equality of terms depends only on the equalities between their weak subterms. Therefore, sampling terms in a different order does not affect any equalities, because terms are sampled only after all their subterms have been sampled. This condition excludes cases like the one described in Example 5: because $\mathrm{inv} \notin \Sigma_W$, even if $\psi_{ROM}(b) = \psi_{ROM}(\mathrm{inv}(k))$, we never have $\left(\{a\}_{\mathrm{mod}(k),\mathrm{expn}(k)}\right)^{-1}_{b} \approx_{W(\psi_{ROM})} a$.

The key idea is that equalities between non-weak terms may be disregarded, as the terms are equal only with negligible probability. We remark that ignoring equalities between non-weak terms, besides allowing us to consistently define a probability measure, also simplifies the calculation of probabilities. In Appendix A we present a simple algorithm for deciding $\approx_P$ (that is, given terms $t$ and $t'$, deciding whether $t \approx_P t'$), and thus for performing the test in line 9 of Algorithm 1 and of its revised version.

Compatibility. The second condition we require on our setup is compatibility. Let $K$ be a finite set of terms and $P$ be a partition of $K$. Recall the definition of $\approx_P$ given in Section 3. We say that $P$ is $\approx_R$-closed if, for all $t, t' \in K$, if $t \approx_P t'$, then there is $p \in P$ such that $t, t' \in p$. We are interested in partitions of weak terms. Thus, given a finite set $K$, we denote by $\mathcal{P}^W_R(K)$ the set of $\approx_R$-closed partitions of $\mathrm{sub}[K] \cap T_W$.

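Example 7. Consider the set of terms $K = \left\{ h\left(\{|\{|t|\}_k|\}^{-1}_{k'}\right),\ h(t) \right\}$, with $k \subseteq_{PS} \mathrm{pw}$, $k' \subseteq_{PS} \mathrm{pw}$, and $t \notin T_W$. Then, $\left\{ \{k, k'\},\ \left\{ h\left(\{|\{|t|\}_k|\}^{-1}_{k'}\right) \right\},\ \{h(t)\} \right\}$ is a partition of $\mathrm{sub}[K] \cap T_W$ that is not $\approx_R$-closed.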

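A selection function for $K$ is a function $\iota\colon \mathrm{sub}[K] \to PS \cup \{\bot\}$ such that, for each $t \in \mathrm{sub}[K]$, either $\iota(t) = \bot$ or $\mathrm{head}(\iota(t)) = \mathrm{head}(t)$. Given $\omega \in \Omega$, we say that $\omega$ satisfies $\iota$ if, for all $t = f(t_1, \ldots, t_n) \in \mathrm{sub}[K]$, either $(\omega(t_1), \ldots, \omega(t_n)) \in \llbracket \mathrm{dom}(\iota(t)) \rrbracket$ and $\omega(t) \in \llbracket \mathrm{ran}(\iota(t)) \rrbracket$, or $(\omega(t_1), \ldots, \omega(t_n)) \notin \mathrm{dom}_S(f)$ and $\iota(t) = \omega(t) = \bot$. We denote by $I(K)$ the set of selection functions for $K$, and by $I_S(K) \subseteq I(K)$ the set of selection functions $\iota$ for $K$ such that there is $\omega \in \Omega$ that satisfies $\iota$. In Appendix A we show that, given a finite set of terms $K$, $I_S(K)$ is a finite and computable set.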

A selection function for a finite set $K$ determines which property statement applies to each term in $\mathrm{sub}[K]$. Note that, if $\omega \in \Omega$ is an assignment satisfying $PS$ and $K$ is a finite set of terms, there exists exactly one selection function $\iota \in I(K)$ satisfied by $\omega$: it is the function $\iota$ that associates each term $f(t_1, \ldots, t_n)$ to the unique property statement $ps \in PS_f$ such that $(\omega(t_1), \ldots, \omega(t_n)) \in \llbracket \mathrm{dom}(ps) \rrbracket$, or $\bot$ if no such $ps$ exists.

The compatibility condition is that, if $K$ is a finite set of terms, $t \in \mathrm{sub}[K]$, $P \in \mathcal{P}^W_R(K)$, $\iota \in I_S(K)$, and $\iota(t) \neq \bot$, then there is $t' \in \mathrm{sub}(t)$ such that $t \approx_{P|_{\mathrm{psub}(t)}} t'$ and, whenever $t'' \in \mathrm{sub}[K]$ and $t \approx_{P|_{\mathrm{psub}(t)}} t''$, either $\iota(t'') = \bot$ or $\llbracket \mathrm{ran}(\iota(t')) \rrbracket \subseteq \llbracket \mathrm{ran}(\iota(t'')) \rrbracket$.

Intuitively, the compatibility condition requires the equational theory $\approx_R$ and the property statements in $PS$ to be compatible. It is a basic requirement that should be satisfied by any meaningful setup specification. The following example illustrates this.

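Example 8 (Incompatibility between $\approx_R$ and $PS$). Consider a rewriting system $R$ containing the symmetric decryption rewrite rule $\{|\{|x|\}_y|\}^{-1}_y \to x$ and the property statements $\{|T_{\mathbb{B}^{256}}|\}^{-1}_{T_{\mathbb{B}^{256}}} \subseteq T_{\mathbb{B}^{128}}$, $\{|T_{\mathbb{B}^{256}}|\}_{T_{\mathbb{B}^{256}}} \subseteq T_{\mathbb{B}^{256}}$. Let $t' = \{|\{|t|\}_k|\}^{-1}_k$, where $t, k \in \Sigma_0$ and $\mathrm{type}(t) = \mathrm{type}(k) = T_{\mathbb{B}^{256}}$. In this case, we have $\iota(t) = T_{\mathbb{B}^{256}}$ and $\iota(t') = T_{\mathbb{B}^{128}}$ for all selection functions $\iota \in I_S(\{t, t'\})$. We have $t \approx_R t'$, $\llbracket \mathrm{ran}(\iota(t)) \rrbracket = T_{\mathbb{B}^{256}}$, and $\llbracket \mathrm{ran}(\iota(t')) \rrbracket = T_{\mathbb{B}^{128}}$. Because $\mathbb{B}^{128} \cap \mathbb{B}^{256} = \emptyset$, it follows that there is no $\omega \in \Omega$ that satisfies $\approx_R$ and $PS$.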

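Note, however, that having $\{|T_{\mathbb{B}^{256}}|\}^{-1}_{T_{\mathbb{B}^{256}}} \subseteq T_{\mathbb{B}^{256}}$ instead of $\{|T_{\mathbb{B}^{256}}|\}^{-1}_{T_{\mathbb{B}^{256}}} \subseteq T_{\mathbb{B}^{128}}$, we could have $\mathrm{type}(t) = B$ for any non-empty set $B \subseteq \mathbb{B}^{256}$ without violating our compatibility condition.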

Example 9. With the choice of $\Sigma_W$ described in Example 6, our running example (described in Examples 1, 2, 3, and 4) satisfies the disjointness and compatibility conditions.

Probability measure. We show that, under the disjointness and compatibility hypotheses, the revised sampling algorithm yields a probability measure $\mu_{ROM}$ of S. For each total subterm-compatible order $\prec$, each $\lambda \in \Lambda$, and each finite set of terms $K$ such that $\mathrm{dom}(\lambda) \subseteq K$, let $\mu_{K,\prec}(\lambda)$ be the probability that executing the revised version of Algorithm 1 on input $\mathrm{dom}(\lambda)$ using the order $\prec$ yields a function $\psi_{ROM}\colon \mathrm{sub}[K] \to \mathbb{B}^*_\bot$ such that $\psi_{ROM}(t) \in \lambda(t)$ for all $t \in K$.

Theorem 4. Suppose that the disjointness and compatibility conditions are satisfied for the subterm-compatible orders $\prec$ and $\prec'$. Then, if $\lambda, \lambda' \in \Lambda$ are such that $\varOmega_\lambda = \varOmega_{\lambda'}$, we have $\mu_{\prec}(\lambda) = \mu_{\prec'}(\lambda')$.

In light of Theorem 4, we define the function $\mu_{ROM}\colon \varOmega_\Lambda \to [0, 1]$ for each $\lambda \in \Lambda$ as $\mu_{ROM}(\varOmega_\lambda) = \mu_{\prec}(\lambda)$ for any subterm-compatible order $\prec$.

Theorem 5. There exists a unique extension of $\mu_{ROM}$ to $\mathcal{F}$ that is a probability measure. Using the same symbol $\mu_{ROM}$ to refer to this extension, we have $\mu_{ROM}(\varOmega_S) = 1$. Hence, $\mu_{ROM}$ is a model of S.

We adopt the abuse of notation used in the above theorem and use the same symbol $\mu_{ROM}$ to refer to the unique extension of $\mu_{ROM}$ to $\mathcal{F}$ that is a probability measure. In Appendix B we show how to compute probabilities using the probability measure $\mu_{ROM}$.

3.3 Comparing the two probability measures

We describe the relationship between the probability measures $\mu_{K,\prec}$ described in Theorem 2 and the probability measure $\mu_{ROM}$ described in Theorem 5.

For each $f \in \Sigma$, let $L_f = \min_{ps \in PS_f} |\llbracket \mathrm{ran}(ps) \rrbracket|$ and $L = \min_{f \in \Sigma \setminus \Sigma_W} L_f$.

Note that, if we assume that non-weak terms are always sampled from “large” sets of bitstrings whenever they are defined, then $L$ is large as well. Intuitively, Theorem 6 shows that, if this is the case, the different probability measures we have described coincide except on a set whose probability is “small”. More precisely, the two probability measures coincide except on a set whose probability is a polynomial function of $1/L$.
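To make the role of $L$ concrete, the following sketch computes $L_f$ and $L$ from a table of range sizes and evaluates a bound of the shape $c(K) \cdot |I_S(K)| \cdot (1/L)$. The symbol names, the range sizes, the constant `c`, and the number of selection functions are all hypothetical values chosen for illustration; they are not taken from the running example.

```python
from fractions import Fraction


def smallest_nonweak_range(range_sizes, weak_symbols):
    """Compute L = min over non-weak f of L_f, with L_f = min over ps in PS_f of |ran(ps)|."""
    per_symbol = {
        f: min(sizes)                       # L_f
        for f, sizes in range_sizes.items()
        if f not in weak_symbols
    }
    return min(per_symbol.values())         # L


def gap_bound(c, num_selection_functions, L):
    """Bound of the form c * |I_S(K)| * (1/L), kept as an exact fraction."""
    return Fraction(c * num_selection_functions, L)


# Hypothetical range sizes for the property statements of each function symbol.
range_sizes = {"h": [2**256], "enc": [2**256, 2**128], "pair": [2**512]}
L = smallest_nonweak_range(range_sizes, weak_symbols={"h"})
bound = gap_bound(c=3, num_selection_functions=4, L=L)
```

With these made-up numbers the smallest non-weak range has $2^{128}$ elements, so the bound is far below any probability that matters in the guessing examples.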

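Theorem 6. For any finite set of terms $K$, there exists a set $\varOmega(K)$ such that, for any subterm-compatible order $\prec$:

(1) for any $\lambda \in \Lambda_K$, $\mu_{K,\prec,\eta}(\varOmega_\lambda \cap \varOmega(K)) = \mu_{ROM,\eta}(\varOmega_\lambda \cap \varOmega(K))$;

(2) there exists a constant $c(K)$ such that

\[ \mu_{K,\prec,\eta}(\Omega \setminus \varOmega(K)) = \mu_{ROM,\eta}(\Omega \setminus \varOmega(K)) \leq c(K) \cdot |I_S(K)| \cdot (1/L). \]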

Note that the statement of Theorem 6 is stronger than merely bounding the difference in the probability of sets in $\varOmega_\Lambda$. For example, Theorem 6 implies that the probabilities of two terms being sampled to the same bitstring, as measured by the two different probability measures, also differ by at most $c(K) \cdot |I_S(K)| \cdot (1/L)$.

Asymptotic interpretation. Suppose that, for each $\eta \in \mathbb{N}$, $\llbracket \cdot \rrbracket_\eta$ is a type interpretation function and $S_\eta = \langle \Sigma, \approx_R, \mathcal{T}, PS, \llbracket \cdot \rrbracket_\eta \rangle$ is a setup specification which satisfies the disjointness and compatibility conditions. Assume further that $1/L_\eta$ is negligible as a function of $\eta$, where $L_{f,\eta} = \min_{ps \in PS_f} |\llbracket \mathrm{ran}(ps) \rrbracket_\eta|$ and $L_\eta = \min_{f \in \Sigma \setminus \Sigma_W} L_{f,\eta}$ for each $\eta \in \mathbb{N}$. Note that this condition is equivalent to requiring, for each function symbol $f \in \Sigma \setminus \Sigma_W$ and each $ps \in PS_f$, that $1/|\llbracket \mathrm{ran}(ps) \rrbracket_\eta|$ is negligible as a function of $\eta$. Intuitively, this condition requires that non-weak terms, when defined, are always mapped to bitstrings sampled from large enough sets. Specifically, the sizes of the sets from which outputs of $f$ are sampled should grow faster than any polynomial as a function of the parameter $\eta$.

Let $\mu_{K,\prec,\eta}$ (respectively, $\mu_{ROM,\eta}$) be the probability measure given by Theorem 2 (respectively, Theorem 5) when Algorithm 1 (respectively, the revised version of Algorithm 1) is executed using the interpretation function $\llbracket \cdot \rrbracket_\eta$. Then, the following is a corollary of Theorem 6.

Corollary 1. Let $K$ be a finite set of terms, and suppose that $|I_{S_\eta}(K)|$ grows polynomially as a function of $\eta$. Then there exists a set $\varOmega(K)$ such that, for any subterm-compatible order $\prec$:

(1) for any $\lambda \in \Lambda_K$, $\mu_{K,\prec,\eta}(\varOmega_\lambda \cap \varOmega(K)) = \mu_{ROM,\eta}(\varOmega_\lambda \cap \varOmega(K))$;

(2) $\mu_{K,\prec,\eta}(\Omega \setminus \varOmega(K)) = \mu_{ROM,\eta}(\Omega \setminus \varOmega(K))$, and both quantities are negligible as functions of $\eta$.

3.4 Computing probabilities

In Appendix B we show that the probability measure $\mu_{ROM}$ can be equivalently defined algebraically. This algebraic definition reduces the problem of computing probabilities of the form

\[ P_{\mu_{ROM}}[t_1 \in B_1, \ldots, t_n \in B_n,\ t'_1 = t''_1, \ldots, t'_{n'} = t''_{n'}] \qquad (1) \]

(where $B_1, \ldots, B_n$ are sets of bitstrings) to computing the sizes of intersections of sets in $\{B_1, \ldots, B_n\} \cup \mathcal{T}$. A full specification of the interpretations of types is not necessary.

Our prototype implementation computes probabilities of the form (1) for the cryptographic primitives and respective properties considered in our running example. The user must, however, specify the sizes of intersections of the sets of bitstrings $B_1, \ldots, B_n$ with the specified property types. Let $T = \{t_1, \ldots, t_n, t'_1, t''_1, \ldots, t'_{n'}, t''_{n'}\}$. Because we must consider $\approx_R$-closed partitions of the set $T^W_\Sigma \cap \mathrm{sub}[T]$ of weak subterms of $T$, the complexity of the computation is exponential in the number of such weak subterms. However, for the setup specification considered in our running example, if $T$ contains no subterms of the form $\pi_i(t)$ for some $i \in \{1, 2\}$ and some term $t$ such that $\mathrm{head}(t) \neq \langle \cdot, \cdot \rangle$, the complexity is linear in the number of non-weak subterms of $T$.

4 Off-line guessing

In this section we describe how properties of cryptographic primitives described in our setup specification S can be used to find and estimate the success probability of non-trivial off-line guessing attacks.

Suppose that a protocol should keep a term $s$ secret, and that there is a small set $B \subset \mathbb{B}^*$ of bitstrings such that $s$ represents a bitstring in this set. This may be the case, for instance, if $s$ is computed from a password chosen by a human, or if $s$ is the hash of a secret term. If an attacker can feasibly enumerate all bitstrings $b \in B$, he may use his knowledge to try to rule out the possibility that $s$ represents each bitstring $b$. Ultimately, the attacker’s goal is to learn $s$ by excluding all but one bitstring from the set $B$. In this way an attacker can learn the term $s$ even if he cannot directly deduce it by constructing terms and reasoning equationally. This strategy is called off-line guessing, as the attacker need not interact with agents to verify his guess.
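The elimination strategy described above can be sketched as a simple filter over the candidate set. The sketch below is a toy: SHA-256 stands in for an abstract hash function, the candidate space is a hypothetical set of four-digit PINs, and the attacker is assumed to have observed only the hash of the secret.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Stand-in for an abstract hash function (SHA-256 is an assumption)."""
    return hashlib.sha256(data).digest()


def offline_guess(candidates, observed_hash):
    """Return every candidate that the hash test cannot rule out."""
    return [c for c in candidates if h(c) == observed_hash]


# Hypothetical small password space: all four-digit PINs.
candidates = [f"{i:04d}".encode() for i in range(10_000)]
secret = b"4821"

# The attacker only sees h(secret) and never talks to the server again.
survivors = offline_guess(candidates, h(secret))
```

The attack succeeds outright when exactly one candidate survives. When the test is weaker than a full hash comparison, several wrong candidates may survive, which is why the success probability has to be estimated.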

4.1 Attacker model

We will assume fixed an infinite set $\mathcal{N} \subseteq \Sigma_0$, such that $\Sigma_0 \setminus \mathcal{N}$ is finite. Intuitively, $\mathcal{N}$ contains the symbols representing random data generated by the agents, whereas $\Sigma_0 \setminus \mathcal{N}$ contains the symbols in the signature that represent cryptographically relevant constants (for instance, the bitstring 0 when modeling XOR).

A substitution is a partial function $\sigma\colon V \rightharpoonup T_\Sigma$. As usual, we abuse notation by using the same symbol $\sigma$ for a substitution and its homomorphic extension to $T_\Sigma(X)$, and write $t\sigma$ instead of $\sigma(t)$.

We represent an attacker’s knowledge by a frame [35], which is a pair $(\tilde{n}, \sigma)$, written $\nu\tilde{n}.\sigma$, where $\tilde{n} \subseteq \mathcal{N}$ is a finite set of names and $\sigma\colon V \rightharpoonup T_\Sigma$ is a substitution. Intuitively, names in $\tilde{n}$ represent fresh data randomly generated by honest agents and unknown to the attacker, and terms in the range of $\sigma$ represent messages learned by the attacker, for instance by eavesdropping on the network. Given a frame $\phi = \nu\tilde{n}.\sigma$, we define $T_\phi = T_{\Sigma \setminus \tilde{n}}(\mathrm{dom}(\sigma))$. We say that terms in $T_\phi$ are $\phi$-recipes. Such terms represent the ways that an attacker can build terms using his knowledge. A term $t$ can be constructed from $\phi$ if there is a $\phi$-recipe $\zeta$ such that $\zeta\sigma = t$. The set of terms constructible from $\phi$ is $\sigma[T_\phi]$.
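Frames and recipes can be illustrated with a small term representation. In this sketch, which is only one possible encoding, a variable is a plain string and every other term, including a name or constant, is a tuple whose first component is the head symbol. The function names and the sample frame are hypothetical.

```python
def apply_substitution(term, sigma):
    """Homomorphically replace the variables of `term` by their images under sigma."""
    if isinstance(term, str):               # a variable
        return sigma.get(term, term)
    head, *args = term
    return (head, *(apply_substitution(a, sigma) for a in args))


def is_recipe(term, hidden_names, sigma):
    """Check that `term` uses only variables in dom(sigma) and no hidden name."""
    if isinstance(term, str):
        return term in sigma
    head, *args = term
    return head not in hidden_names and all(
        is_recipe(a, hidden_names, sigma) for a in args
    )


# Frame with hidden name s and substitution {x1 -> h(s)}.
hidden = {"s"}
sigma = {"x1": ("h", ("s",))}

recipe = ("h", "x1")                        # apply h to the observed message
built = apply_substitution(recipe, sigma)   # the term h(h(s))
```

The term `("h", ("s",))` is not itself a recipe, since it mentions the hidden name, but it is constructible because the recipe `"x1"` evaluates to it.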

Suppose that an attacker whose knowledge is represented by a frame $\phi = \nu\tilde{n}.\sigma$ tries to mount an off-line guessing attack on a secret term $s$. We require that the set of bitstrings tried by the attacker is $\llbracket \mathrm{type}(w) \rrbracket$ for some $w \in \mathcal{N}$ that does not occur in either $\tilde{n}$ or $\sigma$, and we model the attacker’s guess by $w$. Letting $x \notin \mathrm{dom}(\sigma)$ be a fresh variable, we consider the frames $\phi_s = \nu\tilde{n}_w.\sigma_s$ and $\phi_w = \nu\tilde{n}_w.\sigma_w$, where $\tilde{n}_w = \tilde{n} \cup \{w\}$, $\sigma_s = \sigma \cup \{x \mapsto s\}$, and $\sigma_w = \sigma \cup \{x \mapsto w\}$. Here, $\phi_s$ represents the attacker’s knowledge using the right guess, while $\phi_w$ represents his knowledge when his guess is wrong.

Guess verifiers. We consider two ways in which an attacker can verify whether his guess $w$ is correct. First, he can use his guess to construct a pair of terms $(t, t')$ that are equal under $\approx_R$ if $w = s$, but different if $w \neq s$. This is the usual definition used in symbolic methods and has been studied using the standard notion of static equivalence [25,27,35]. Second, he can use his guess to construct a term $t$ whose corresponding bitstring satisfies some given property if $w = s$, and not necessarily otherwise.

To reason about the first of these strategies, we use the standard notion of static equivalence. We say that the pairs of terms used by the attacker in such tests are equational verifiers. To define this notion precisely, we use the notion of subterm at position $p$: given a term $t$ and $p \in \mathbb{N}^*$, we denote the subterm of $t$ at position $p$ by $t|_p$, where $t|_\epsilon = t$ and, for $t = f(t_1, \ldots, t_n)$, $t|_{i.p} = t_i|_p$ for $i \in \{1, \ldots, n\}$, where $i.p$ denotes the sequence of integers obtained by prepending $i$ to the sequence $p$.
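The recursive definition of $t|_p$ translates directly into code. The sketch below assumes a hypothetical encoding in which a term $f(t_1, \ldots, t_n)$ is the tuple `(f, t1, ..., tn)`, variables are plain strings, and a position is a tuple of 1-based argument indices, with the empty tuple playing the role of $\epsilon$.

```python
def subterm_at(term, position):
    """Return t|_p, following the argument indices in `position` one by one."""
    current = term
    for i in position:
        # A variable has no arguments; index 0 would be the head symbol.
        if isinstance(current, str) or not 1 <= i < len(current):
            raise IndexError(f"no subterm at position {position}")
        current = current[i]
    return current


# pair(enc(mod(r), s), h(x)) in the tuple encoding.
t = ("pair", ("enc", ("mod", ("r",)), ("s",)), ("h", "x"))

second_arg_of_enc = subterm_at(t, (1, 2))   # the constant s
```

Checking a condition "for no proper position" then amounts to iterating over all non-empty positions of the two terms of a candidate verifier.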

Definition 1. The set $\mathrm{eqv}(\phi, s)$ of equational verifiers of a term $s$ (under $\phi$) is the set of pairs $(t, t')$ such that $t, t' \in T_{\phi_s}$, $t\sigma_s \approx_R t'\sigma_s$, $t\sigma_w \not\approx_R t'\sigma_w$, and there is no $p \in \mathbb{N}^* \setminus \{\epsilon\}$ such that these conditions hold for the pair $(t|_p, t'|_p)$.

To model the second attacking strategy, we will consider a set $\mathcal{TT}$ of test types, which model the attacker’s ability to test whether a bitstring is in a given set. Thus, to model a realistic attacker it is important to choose test types such that, whenever $T \in \mathcal{TT}$, it is computationally feasible to check whether a given bitstring is in $\llbracket T \rrbracket$.

Example 10. We will consider the following test types:

– odd, with $\llbracket \mathrm{odd} \rrbracket$ corresponding to the set of 1024-bit bitstrings that represent an odd number, so that $|\llbracket \mathrm{odd} \rrbracket| = 2^{1023}$;

– nspf, with $\llbracket \mathrm{nspf} \rrbracket$ corresponding to the set of 1024-bit bitstrings representing numbers with no prime factors smaller than $10^6$. We have $|\llbracket \mathrm{nspf} \rrbracket| \approx 2^{1024}/24$.

These test types are used to model off-line guessing attacks in Section 4.2.
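Both tests are cheap to run, which is what makes them usable by an attacker. The sketch below implements them with a sieve and trial division, and also estimates the density of integers with no prime factor below $10^6$ as the product of $(1 - 1/p)$ over those primes. That product is a standard heuristic for the fraction of such integers and comes out at roughly $1/24$. The function names are hypothetical, and the density is computed for integers in general rather than for 1024-bit strings specifically.

```python
def small_primes(limit):
    """Sieve of Eratosthenes: all primes strictly below `limit`."""
    sieve = bytearray([1]) * limit
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p::p] = bytearray(len(range(p * p, limit, p)))
    return [i for i, flag in enumerate(sieve) if flag]


def is_odd(n):
    """Membership test for the 'odd' test type."""
    return n % 2 == 1


def has_no_small_prime_factor(n, primes):
    """Membership test for 'nspf': no prime in `primes` divides n."""
    return all(n % p != 0 for p in primes)


def nspf_density(primes):
    """Heuristic fraction of integers divisible by none of `primes`."""
    density = 1.0
    for p in primes:
        density *= 1.0 - 1.0 / p
    return density


PRIMES = small_primes(10**6)
density = nspf_density(PRIMES)   # close to 1/24
```

A wrong guess therefore passes the nspf test with probability close to $1/24$ and the odd test with probability $1/2$, the two figures used in the examples of Section 4.2.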

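Definition 2. The set $\mathrm{tv}(\phi, s)$ of type verifiers of $s$ (under $\phi$) is the set of pairs $(t, T)$ such that $t \in T_{\phi_s}$, $T \in \mathcal{TT}$, $P_{\mu_{ROM}}[\widehat{t\sigma_s} \in \llbracket T \rrbracket] = 1$, $P_{\mu_{ROM}}[\widehat{t\sigma_w} \in \llbracket T \rrbracket] \neq 1$, and there are no $p \in \mathbb{N}^* \setminus \{\epsilon\}$, $T' \in \mathcal{TT}$ such that these conditions hold for $(t|_p, T')$.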

Our requirements on the sub-positions of verifiers prevent us from having infinite sets of spurious verifiers. For instance, let $h^0(t) = t$ and $h^{n+1}(t) = h(h^n(t))$ for each $n \in \mathbb{N}$, and suppose that $(t, t')$ is an equational verifier. Then, without our requirement on the sub-positions, so would be all pairs of the form $(h^i(t), h^i(t'))$ for $i \in \mathbb{N}$. However, if an attacker tests whether $t\sigma_w$ and $t'\sigma_w$ correspond to the same bitstring, there is no more information to be gained by testing whether the same holds for $h^i(t)\sigma_w$ and $h^i(t')\sigma_w$.

Computing equational verifiers is closely related to deciding static equivalence. Indeed, the algorithm for deciding static equivalence presented in [36] effectively computes equational verifiers as part of the decision procedure. Computing type verifiers is a less direct extension of existing algorithms. Nevertheless, it is simple to check that, for any type verifier $(t, T)$, some rewrite rule must apply to $t\sigma_s$ and not to $t\sigma_w$, since otherwise the normal form of $t\sigma_s$ is simply the normal form of $t\sigma_w$ with all occurrences of $w$ replaced by $s$, and therefore the two terms have the same types. Computing recipes $t$ that have this property (and such that none of their proper subterms have this property) is also part of the ordinary execution of algorithms for static equivalence.

4.2 Off-line guessing examples

We now present several examples of off-line guessing attacks. These examples illustrate that such attacks can result from implementation details that, while often trivial, are outside the scope of traditional symbolic methods. We show how such details can be modeled in our framework and used to estimate the probability of attacks. All probability calculations in this section rely on the setup specification described in our running example and are performed automatically by our implementation.

Example 11 (Attack on a stored password hash). This simple example considers an authentication server that stores password hashes instead of the users’ passwords themselves. Let $s \in \Sigma_0$ be a weak password (i.e., $\mathrm{type}(s) = \mathrm{pw}$). Suppose that an attacker obtains its hash $h(s)$ and wants to use it to guess $s$ off-line. The attacker’s knowledge is represented by $\phi = \nu\tilde{n}.\sigma = \nu\{s\}.\{x_1 \mapsto h(s)\}$.

To analyze off-line guessing in our framework, consider the frames $\phi_s = \nu\{s, w\}.\{x_1 \mapsto h(s), x \mapsto s\}$ and $\phi_w = \nu\{s, w\}.\{x_1 \mapsto h(s), x \mapsto w\}$. Here, the set of type verifiers is empty ($\mathrm{tv}(\phi, s) = \emptyset$), and the set of equational verifiers is $\mathrm{eqv}(\phi, s) = \{(x_1, h(x))\}$.

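Recall that $\llbracket \mathrm{pw} \rrbracket \subseteq \mathbb{B}^{256}$ and $(h[\mathbb{B}^{256}] \subseteq \mathbb{B}^{256}) \in PS$. Thus, we expect that each wrong guess satisfies the equation $h(s) = h(w)$ with probability $2^{-256}$. Since $|\llbracket \mathrm{pw} \rrbracket| \approx 2^{24}$, there are $2^{24} - 1$ wrong guesses to consider. Hence, the expected number of guesses $w$ satisfying $h(w) = h(s)$ is $1 + \frac{2^{24} - 1}{2^{256}}$, and we obtain an estimated probability of success of

\[ \frac{1}{1 + \frac{2^{24} - 1}{2^{256}}} \approx \frac{1}{1 + \frac{1}{2^{232}}}. \]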
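The estimate has the general shape "one over the expected number of guesses that pass the test". The following sketch evaluates it exactly with rational arithmetic. The function name is hypothetical, and the candidate count $2^{24}$ and the false-positive rate $2^{-256}$ are the figures from this example.

```python
from fractions import Fraction


def success_probability(num_candidates, false_positive_rate):
    """1 / (1 + (N - 1) * q): the right guess plus the expected surviving wrong guesses."""
    expected_survivors = 1 + (num_candidates - 1) * false_positive_rate
    return 1 / expected_survivors


p = success_probability(2**24, Fraction(1, 2**256))
failure = 1 - p   # on the order of 2**-232
```

Using `Fraction` avoids the rounding that would make a floating-point result indistinguishable from 1.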

Example 12. The EKE (Encrypted Key Exchange) protocol is designed to allow two parties to exchange authenticated information using a weak symmetric key without allowing off-line guessing attacks. It is known that the redundancy of RSA public keys can be exploited to mount off-line guessing attacks on this protocol [29]. We now show how our methods can be used to estimate the success probability of this off-line guessing attack.

For representing these attacks, it is sufficient to consider the first step of the protocol. Let $A$ and $B$ be agents sharing a weak password $s \in \Sigma_0$, with $\mathrm{type}(s) = \mathrm{pw}$. For the first message, $A$ randomly samples a bitstring from $\llbracket \mathrm{random} \rrbracket$, represented by a term $r \in \Sigma_0$ such that $\mathrm{type}(r) = \mathrm{random}$. Afterwards, she uses it to compute an RSA public key $\langle \mathrm{mod}(r), \mathrm{expn}(r) \rangle$. Then, $A$ (symmetrically) encrypts this public key with the shared password $s$ and sends the encryption to $B$. To keep our analysis simple, we assume that the participants encrypt the modulus and the exponent separately and send them over the network as a pair of encryptions (instead of the encryption of the pair). Thus, this first message is represented by the term $\langle \{|\mathrm{mod}(r)|\}_s, \{|\mathrm{expn}(r)|\}_s \rangle$. See [29] for a full description of the protocol.

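After observing this message in the network, the attacker’s knowledge is described by the frame $\phi = \nu\tilde{n}.\sigma$, where $\sigma = \{x_1 \mapsto \langle \{|\mathrm{mod}(r)|\}_s, \{|\mathrm{expn}(r)|\}_s \rangle\}$ and $\tilde{n} = \{r, s\}$. The relevant frames for the analysis of off-line guessing attacks are $\phi_s = \nu\tilde{n}_w.\sigma_s$ and $\phi_w = \nu\tilde{n}_w.\sigma_w$, where $\tilde{n}_w = \tilde{n} \cup \{w\}$, $\sigma_s = \sigma \cup \{x_2 \mapsto s\}$, and $\sigma_w = \sigma \cup \{x_2 \mapsto w\}$.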

In this case there are no equational verifiers: $\mathrm{eqv}(\phi, s) = \emptyset$. However, while it may be infeasible to check whether the modulus is indeed the product of two large prime factors, an attacker can nevertheless use his guess $w$ to decrypt the pair sent by $A$ and test whether the resulting modulus has small prime factors and whether the exponent $e$ is odd. Thus,

\[ \mathrm{tv}(\phi, s) = \left\{ \left(\{|\pi_1(x_1)|\}^{-1}_{x_2}, \mathrm{nspf}\right),\ \left(\{|\pi_2(x_1)|\}^{-1}_{x_2}, \mathrm{odd}\right) \right\}. \]

Each wrong guess satisfies the two type verifiers with probability

\[ P_{\mu_{ROM}}\left[ \widehat{\{|\pi_1(x_1)|\}^{-1}_{x_2}} \in \llbracket \mathrm{nspf} \rrbracket,\ \widehat{\{|\pi_2(x_1)|\}^{-1}_{x_2}} \in \llbracket \mathrm{odd} \rrbracket \right] \approx \frac{1}{48}. \]

Since there are $2^{24} - 1$ wrong guesses, we estimate the probability of success of this off-line guessing attack as described above to be $\frac{1}{1 + (2^{24} - 1)/48} \approx 2^{-18.5}$.
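The arithmetic behind this estimate can be reproduced in a few lines. The sketch assumes that a wrong guess passes the two tests independently, with rates $1/24$ for nspf and $1/2$ for odd, so that the joint rate is their product. The function name is hypothetical. Evaluating the closed form directly gives an exponent of roughly $-18.4$, the same order of magnitude as the figure quoted above.

```python
import math
from fractions import Fraction


def joint_pass_rate(rates):
    """Probability that a wrong guess passes every test, assuming independence."""
    result = Fraction(1)
    for rate in rates:
        result *= rate
    return result


q = joint_pass_rate([Fraction(1, 24), Fraction(1, 2)])   # nspf, then odd
wrong_guesses = 2**24 - 1
p = 1 / (1 + wrong_guesses * q)
exponent = math.log2(float(p))
```

About one wrong guess in 48 survives, so several hundred thousand candidates remain after a single observed message.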

Example 13. Consider now the same setup as in Example 12, except that only the exponent of the RSA public key is encrypted in the first message. The authors of EKE note [29] that the protocol is still vulnerable to off-line guessing attacks: since the exponent of an RSA key is always odd, one can decrypt each encryption of a public key with each guess. For the right guess, decrypting each encryption will yield an odd exponent. The probability that a wrong guess achieves this decreases exponentially with the number of encryptions available to the attacker.

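To formalize this in our setting, we let $\phi = \nu\tilde{n}.\sigma$ be the frame representing the attacker’s knowledge, where $\sigma = \{x_i \mapsto \langle \mathrm{mod}(r_i), \{|\mathrm{expn}(r_i)|\}_s \rangle \mid i \in \{1, \ldots, n\}\}$ and $\tilde{n} = \{r_1, \ldots, r_n, s\}$. The frames $\phi_s$ and $\phi_w$ used are as expected: $\phi_s = \nu\tilde{n}_w.\sigma_s$ and $\phi_w = \nu\tilde{n}_w.\sigma_w$, where $\tilde{n}_w = \tilde{n} \cup \{w\}$, $\sigma_s = \sigma \cup \{x_{n+1} \mapsto s\}$, and $\sigma_w = \sigma \cup \{x_{n+1} \mapsto w\}$.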

As before, there are no equational verifiers: $\mathrm{eqv}(\phi, s) = \emptyset$. The set of type verifiers is given by $\mathrm{tv}(\phi, s) = \left\{ \left(\{|\pi_2(x_i)|\}^{-1}_{x_{n+1}}, \mathrm{odd}\right) \mid i \in \{1, \ldots, n\} \right\}$. As in Example 12, we obtain

\[ \frac{1}{1 + \frac{2^{24} - 1}{2^n}} = \frac{2^n}{2^n + 2^{24} - 1} \]

as an estimate for the probability of success of this off-line guessing attack.
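The dependence on the number $n$ of observed encryptions can be explored numerically. This sketch evaluates the closed form above and searches for the smallest $n$ that reaches a target success probability. The function names and the thresholds are hypothetical choices made for illustration.

```python
from fractions import Fraction


def success_with_n_encryptions(n, num_candidates=2**24):
    """Closed form 2^n / (2^n + N - 1) for N candidate passwords."""
    return Fraction(2**n, 2**n + num_candidates - 1)


def encryptions_needed(threshold, num_candidates=2**24):
    """Smallest n whose estimated success probability reaches `threshold`."""
    n = 0
    while success_with_n_encryptions(n, num_candidates) < threshold:
        n += 1
    return n


n_half = encryptions_needed(Fraction(1, 2))
n_high = encryptions_needed(Fraction(99, 100))
```

With no observations the estimate reduces to a uniform guess among the $2^{24}$ candidates, and each additional encryption roughly doubles the attacker's odds until the estimate saturates.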

5 Conclusion

We presented a symbolic, automatable probabilistic framework for security protocol analysis. Our framework allows one to express properties of cryptographic primitives beyond standard equational properties, thereby modeling a stronger attacker than in the standard Dolev-Yao model. We illustrated the usefulness of this approach by modeling non-trivial properties of RSA encryption and using them to analyze off-line guessing attacks on the EKE protocol, currently outside the scope of other symbolic methods.

We have proposed a probability distribution based on interpreting functions as random oracles subject to satisfying the properties of cryptographic primitives described in our setup. This is a non-trivial generalization of the random oracle model. By using this probability distribution, we can (automatically) reason about an attack’s success probability. We provide a prototype implementation of our methods, which computes probabilities in our formalization of a Dolev-Yao attacker using RSA asymmetric encryption. Our implementation is available at [37].

More generally, our approach can be used to analyze a broad range of attacks and weaknesses of cryptographic primitives that could not previously be analyzed by symbolic models. These include some forms of cryptanalysis (such as differential cryptanalysis of AES, DES or hash functions, as in [38]) and side-channel attacks [33]. Further candidates are short-string authentication, used in device pairing protocols [39], and distance-bounding protocols relying on rapid-bit exchange, such as [40]. Both are ill-suited for analysis with existing symbolic methods, as their analysis is intrinsically probabilistic.

As future work, we plan to integrate this approach with a symbolic protocol model-checker capable of generating protocol execution traces and the probabilities relevant for deciding whether a trace allows an attack. In the case of off-line guessing, this amounts to computing the sets of equational and type verifiers, a task closely related to that of deciding static equivalence. Since our probabilistic analysis can be performed automatically (as illustrated by our prototype), this allows our analysis to be fully automated. We expect that such an approach will allow us to find numerous new protocol attacks which depend on properties of the cryptographic primitives used.

References

1. V. Cortier, S. Delaune, and P. Lafourcade, “A survey of algebraic properties used in cryptographic protocols,” J. Comput. Secur., vol. 14, pp. 1–43, January 2006.

2. B. Blanchet, “An efficient cryptographic protocol verifier based on Prolog rules,” in Proceedings of the 14th IEEE Workshop on Computer Security Foundations, CSFW ’01, (Washington, DC, USA), pp. 82–96, IEEE Computer Society, 2001.

3. A. Armando, D. Basin, Y. Boichut, Y. Chevalier, L. Compagna, J. Cuéllar, P. Hankes Drielsma, P.-C. Héam, J. Mantovani, S. Mödersheim, D. von Oheimb, M. Rusinowitch, J. Santiago, M. Turuani, L. Viganò, and L. Vigneron, “The AVISPA Tool for the Automated Validation of Internet Security Protocols and Applications,” in Proceedings of the 17th International Conference on Computer Aided Verification (CAV’05) (K. Etessami and S. K. Rajamani, eds.), vol. 3576 of LNCS, Springer, 2005.

4. S. Goldwasser and S. Micali, “Probabilistic encryption,” J. Comput. Syst. Sci., vol. 28, no. 2, pp. 270–299, 1984.

5. M. Bellare and P. Rogaway, “Entity authentication and key distribution,” in CRYPTO (D. R. Stinson, ed.), vol. 773 of Lecture Notes in Computer Science, pp. 232–249, Springer, 1993.

6. B. Warinschi, “A computational analysis of the Needham-Schroeder-(Lowe) protocol,” Journal of Computer Security, vol. 13, no. 3, pp. 565–591, 2005.

7. V. Cortier, S. Kremer, and B. Warinschi, “A survey of symbolic methods in computational analysis of cryptographic systems,” J. Autom. Reasoning, vol. 46, no. 3-4, pp. 225–259, 2011.

8. B. Blanchet, “Security protocol verification: Symbolic and computational models,” in Degano and Guttman [41], pp. 3–29.

9. M. Abadi and P. Rogaway, “Reconciling two views of cryptography (the computational soundness of formal encryption),” J. Cryptology, vol. 20, no. 3, p. 395, 2007.

10. M. Backes, A. Malik, and D. Unruh, “Computational soundness without protocol restrictions,” in ACM Conference on Computer and Communications Security (T. Yu, G. Danezis, and V. D. Gligor, eds.), pp. 699–711, ACM, 2012.

11. M. Backes, B. Pfitzmann, and M. Waidner, “A composable cryptographic library with nested operations,” in ACM Conference on Computer and Communications Security (S. Jajodia, V. Atluri, and T. Jaeger, eds.), pp. 220–230, ACM, 2003.

12. H. Comon-Lundh and V. Cortier, “Computational soundness of observational equivalence,” in ACM Conference on Computer and Communications Security (P. Ning, P. F. Syverson, and S. Jha, eds.), pp. 109–118, ACM, 2008.

13. P. Adão, G. Bana, J. Herzog, and A. Scedrov, “Soundness and completeness of formal encryption: The cases of key cycles and partial information leakage,” Journal of Computer Security, vol. 17, no. 5, pp. 737–797, 2009.

14. P. Laud and R. Corin, “Sound computational interpretation of formal encryption with composed keys,” in ICISC (J. I. Lim and D. H. Lee, eds.), vol. 2971 of Lecture Notes in Computer Science, pp. 55–66, Springer, 2003.

15. V. Cortier, S. Kremer, R. Küsters, and B. Warinschi, “Computationally sound symbolic secrecy in the presence of hash functions,” IACR Cryptology ePrint Archive, vol. 2006, p. 218, 2006.

16. S. Halevi, “A plausible approach to computer-aided cryptographic proofs,” IACR Cryptology ePrint Archive, vol. 2005, p. 181, 2005.

17. M. Backes and P. Laud, “Computationally sound secrecy proofs by mechanized flow analysis,” in ACM Conference on Computer and Communications Security (A. Juels, R. N. Wright, and S. D. C. di Vimercati, eds.), pp. 370–379, ACM, 2006.

18. B. Blanchet and D. Pointcheval, “Automated security proofs with sequences of games,” in CRYPTO (C. Dwork, ed.), vol. 4117 of Lecture Notes in Computer Science, pp. 537–554, Springer, 2006.

19. B. Blanchet, “A computationally sound mechanized prover for security protocols,” IEEE Trans. Dependable Sec. Comput., vol. 5, no. 4, pp. 193–207, 2008.

20. G. Barthe, B. Grégoire, and S. Z. Béguelin, “Formal certification of code-based cryptographic proofs,” in POPL (Z. Shao and B. C. Pierce, eds.), pp. 90–101, ACM, 2009.

21. G. Barthe, J. M. Crespo, B. Grégoire, C. Kunz, and S. Zanella Béguelin, “Computer-aided cryptographic proofs,” in 3rd International Conference on Interactive Theorem Proving, ITP 2012, pp. 12–27, Springer, 2012.

22. G. Barthe, M. Daubignard, B. M. Kapron, and Y. Lakhnech, “Computational indistinguishability logic,” in ACM Conference on Computer and Communications Security (E. Al-Shaer, A. D. Keromytis, and V. Shmatikov, eds.), pp. 375–386, ACM, 2010.

23. G. Bana and H. Comon-Lundh, “Towards unconditional soundness: Computationally complete symbolic attacker,” in Degano and Guttman [41], pp. 189–208.

24. R. Corin, J. Doumen, and S. Etalle, “Analysing password protocol security against off-line dictionary attacks,” Electron. Notes Theor. Comput. Sci., vol. 121, pp. 47–63, February 2005.

25. M. Baudet, “Deciding security of protocols against off-line guessing attacks,” in Proceedings of the 12th ACM Conference on Computer and Communications Security, CCS ’05, (New York, NY, USA), pp. 16–25, ACM, 2005.

26. Z. Li and W. Wang, “Rethinking about guessing attacks,” in ASIACCS (B. S. N. Cheung, L. C. K. Hui, R. S. Sandhu, and D. S. Wong, eds.), pp. 316–325, ACM, 2011.

27. M. Abadi, M. Baudet, and B. Warinschi, “Guessing attacks and the computational soundness of static equivalence,” Journal of Computer Security, pp. 909–968, December 2010.

28. “Sectools.org: Top 125 network security tools,” Jan. 2013.

29. S. M. Bellovin and M. Merritt, “Encrypted Key Exchange: Password-based protocols secure against dictionary attacks,” in IEEE Symposium on Research in Security and Privacy, pp. 72–84, 1992.

30. J. Munilla and A. Peinado, “Off-line password-guessing attack to Peyravian-Jeffries’s remote user authentication protocol,” Computer Communications, vol. 30, no. 1, pp. 52–54, 2006.

31. S. Halevi and H. Krawczyk, “Public-key cryptography and password protocols,” ACM Trans. Inf. Syst. Secur., vol. 2, no. 3, pp. 230–268, 1999.

32. M. Abadi and B. Warinschi, “Password-based encryption analyzed,” in ICALP (L. Caires, G. F. Italiano, L. Monteiro, C. Palamidessi, and M. Yung, eds.), vol. 3580 of Lecture Notes in Computer Science, pp. 664–676, Springer, 2005.

33. B. Köpf and D. A. Basin, “An information-theoretic model for adaptive side-channel attacks,” in ACM Conference on Computer and Communications Security (P. Ning, S. D. C. di Vimercati, and P. F. Syverson, eds.), pp. 286–296, ACM, 2007.

34. M. Abadi and V. Cortier, “Deciding knowledge in security protocols under equational theories,” Theor. Comput. Sci., vol. 367, pp. 2–32, November 2006.

35. M. Abadi and C. Fournet, “Mobile values, new names, and secure communication,” in Proc. of the 28th ACM Symp. on Principles of Programming Languages, POPL ’01, (New York, NY, USA), pp. 104–115, ACM, 2001.

36. B. Conchinha, D. A. Basin, and C. Caleiro, “FAST: An efficient decision procedure for deduction and static equivalence,” in RTA (M. Schmidt-Schauß, ed.), vol. 10 of LIPIcs, pp. 11–20, Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2011.

37. “http://www.infsec.ethz.ch/people/brunoco,” 2013.

38. B. Montalto and C. Caleiro, “Modeling and reasoning about an attacker with cryptanalytical capabilities,” Electr. Notes Theor. Comput. Sci., vol. 253, no. 3, pp. 143–165, 2009.

39. S. Laur and K. Nyberg, “Efficient mutual data authentication using manually authenticated strings,” in CANS (D. Pointcheval, Y. Mu, and K. Chen, eds.), vol. 4301 of Lecture Notes in Computer Science, pp. 90–107, Springer, 2006.

40. J. Munilla and A. Peinado, “Distance bounding protocols for RFID enhanced by using void-challenges and analysis in noisy channels,” Wireless Communications and Mobile Computing, vol. 8, no. 9, pp. 1227–1232, 2008.

41. P. Degano and J. D. Guttman, eds., Principles of Security and Trust - First International Conference, POST 2012, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2012, Tallinn, Estonia, March 24 - April 1, 2012, Proceedings, vol. 7215 of Lecture Notes in Computer Science, Springer, 2012.

Appendix A

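For each finite set of terms $K$, let $\Psi_K$ be the set of functions $\psi\colon K \to \mathbb{B}^*_\bot$. Let $\Psi = \bigcup_{K \in \mathcal{P}_{fin}(T_\Sigma)} \Psi_K$, where $\mathcal{P}_{fin}(X)$ represents the set of finite subsets of $X$. For each $\psi \in \Psi$, let $\varOmega_\psi = \{\omega \in \Omega \mid t \in \mathrm{dom}(\psi) \Rightarrow \omega(t) = \psi(t)\}$. Let $\varOmega_\Psi = \{\varOmega_\psi \mid \psi \in \Psi\}$ and, for each $K$, $\varOmega_{\Psi_K} = \{\varOmega_\psi \mid \psi \in \Psi_K\}$. Furthermore, for each $\lambda \in \Lambda_K$, define $\Psi(\lambda) = \{\psi \in \Psi_K \mid t \in K \Rightarrow \psi(t) \in \lambda(t)\}$. We remark that $\varOmega_\lambda = \bigcup_{\psi \in \Psi(\lambda)} \varOmega_\psi$, and that $\Psi(\lambda)$ is the only subset of $\Psi_K$ such that the above equality holds.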

Proof (Theorem 2). We first note that if ','" " &sub[K] are distinct, then %" *=%"# . Therefore, µK,( is well-defined.

Letting 'B!"= {t 2% B$

' | t " sub[K]}, we have %"B!"

" FK and ! = %"B!".

It is clear that µK,((%"B!") = 1.

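Now, $\Lambda \in F_K$ can be written as $\Lambda = \bigcup_{\phi \in \Phi'} \Lambda_\phi$ for some countable subset $\Phi'$ of $\Phi_K$. Sets of the form $\bigcup_{\phi \in \Phi'} \Lambda_\phi$ for some countable subset $\Phi'$ of $\Phi_K$ are closed under complementation and countable unions. Conversely, for each $\phi \in \Phi_{sub[K]}$, let $\psi_\phi\colon sub[K] \to \mathcal{P}(B^*_\bot)$ be defined by $\psi_\phi(t) = \{\phi(t)\}$ for each $t \in K$. It is clear that $\Lambda_{\psi_\phi} = \Lambda_\phi$, and thus $\Lambda_\phi \in \Lambda_\Psi \subseteq F$ for all $\phi \in \Phi_K$.

We conclude that, for all $\Lambda \subseteq \Omega$, $\Lambda \in F_K$ if and only if there exists a sequence of distinct $\phi_1, \dots, \phi_k \in \Phi_{sub[K]}$ (with $k \in \mathbb{N} \cup \{\infty\}$) such that $\Lambda = \bigcup_{i=1}^{k} \Lambda_{\phi_i}$.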

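Thus, for any sequence $\{\Lambda_i\}_{i \in \mathbb{N}}$ such that $\Lambda_i \in F_K$ for all $i \in \mathbb{N}$ and $\Lambda_i \cap \Lambda_j = \emptyset$ whenever $i \neq j$, there exist $k_i \in \mathbb{N} \cup \{\infty\}$ and $\phi_{i,j} \in \Phi_{sub[K]}$ (with $j \in \{1, \dots, k_i\}$) such that $\Lambda_i = \bigcup_{j=1}^{k_i} \Lambda_{\phi_{i,j}}$ and $\bigcup_{i \in \mathbb{N}} \Lambda_i = \bigcup_{i \in \mathbb{N}} \bigcup_{j=1}^{k_i} \Lambda_{\phi_{i,j}}$. The definition of $\mu_{K,\prec}$ implies that

\[ \mu_{K,\prec}\Big(\bigcup_{i \in \mathbb{N}} \Lambda_i\Big) = \mu_{K,\prec}\Big(\bigcup_{i \in \mathbb{N}} \bigcup_{j=1}^{k_i} \Lambda_{\phi_{i,j}}\Big) = \sum_{i \in \mathbb{N}} \sum_{j=1}^{k_i} \mu_{K,\prec}(\Lambda_{\phi_{i,j}}), \]

and, for each $i \in \mathbb{N}$,

\[ \mu_{K,\prec}(\Lambda_i) = \mu_{K,\prec}\Big(\bigcup_{j=1}^{k_i} \Lambda_{\phi_{i,j}}\Big) = \sum_{j=1}^{k_i} \mu_{K,\prec}(\Lambda_{\phi_{i,j}}). \]

Thus, $\mu_{K,\prec}$ is $\sigma$-additive:

\[ \mu_{K,\prec}\Big(\bigcup_{i \in \mathbb{N}} \Lambda_i\Big) = \sum_{i \in \mathbb{N}} \sum_{j=1}^{k_i} \mu_{K,\prec}(\Lambda_{\phi_{i,j}}) = \sum_{i \in \mathbb{N}} \mu_{K,\prec}(\Lambda_i), \]

concluding the proof. □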

In the following, we assume that $K$ is a finite set of terms, $P \in \mathcal{P}^W_R(K)$, $\prec$ is a subterm-compatible order, and $N_p$ is an infinite set of names (disjoint from $\Sigma$). We say that a $P$-renaming is an injective function $\tau\colon P \to N_p$ mapping $P$-equivalence classes to names in $N_p$. We denote by $P^*$ the partition of $(T^W_\Sigma \cap sub[K]) \cup \tau[P]$ given by $P^* = \{p \cup \{\tau(p)\} \mid p \in P\}$. Moreover, we define $\rho^\tau_P\colon T_\Sigma(N_p) \to T_\Sigma(N_p)$ inductively by $\rho^\tau_P(t) = \tau(p)$ if there is $p \in P^*$ and $t' \in p$ such that $t' \approx_{P^*} t$, and $\rho^\tau_P(f(t_1, \dots, t_n)) = f(\rho(t_1), \dots, \rho(t_n))\downarrow$ otherwise. We note that, for all $t \in T_\Sigma(N_p)$, $\rho^\tau_P(\rho^\tau_P(t)) = \rho^\tau_P(t)$ and $\rho^\tau_P(t) \approx_{P^*} t$. We can drop both $\tau$ and $P$ when they are clear from context.

Lemma 1. For all $t, t' \in T_\Sigma$, all finite sets of terms $K$, all $P \in \mathcal{P}^W_R(K)$, and all $P$-renamings $\tau$, we have $t \approx_P t'$ if and only if $\rho^\tau_P(t) = \rho^\tau_P(t')$.

Proof. Let $\sim$ be the relation on terms given by $t \sim t'$ if and only if $\rho(t) = \rho(t')$. Since $t, t' \in T_\Sigma$, we have $t \approx_P t'$ if and only if $t \approx_{P^*} t'$; therefore, it is sufficient to prove that $\sim\ =\ \approx_{P^*}$. Because $t \approx_{P^*} \rho(t)$, it is clear that $\sim\ \subseteq\ \approx_{P^*}$.

It remains to prove that $\approx_{P^*}\ \subseteq\ \sim$. It is sufficient to show:

(1) $\sim$ is a congruence relation;
(2) $\approx_R\ \subseteq\ \sim$;
(3) if $p \in P$ and $t, t' \in p$, then $t \sim t'$.

(1) It is clear that $\sim$ is reflexive, symmetric and transitive. Suppose that $t_1 \sim t'_1, \dots, t_n \sim t'_n$ and $f \in \Sigma_n$. Let $t = f(t_1, \dots, t_n)$ and $t' = f(t'_1, \dots, t'_n)$. If there is $p \in P$ and $t'' \in p$ such that $t \approx_P t''$, then, because $\sim\ \subseteq\ \approx_{P^*}$, we have $t' \approx_{P^*} t \approx_{P^*} t''$; and because $t, t', t'' \in T_\Sigma$, we have $t' \approx_P t''$. Thus, $\rho(t) = \tau(p) = \rho(t')$, and $t \sim t'$. Otherwise, we have

\[ \rho(t) = f(\rho(t_1), \dots, \rho(t_n))\downarrow\ = f(\rho(t'_1), \dots, \rho(t'_n))\downarrow\ = \rho(t'). \]

Thus, $\sim$ is a congruence relation.

(2) In light of (1), it is sufficient to show that $t \sim t\downarrow$ for all $t$. We note that, for every term $t = f(t_1, \dots, t_n)$, either $\rho(t) = f(\rho(t_1), \dots, \rho(t_n))\downarrow$ or $\rho(t) = \tau(p)$ for some $p \in P$. Therefore, $\rho(t)$ is in normal form for all $t$.

We prove the result by induction on $t$. If $t = c()$ for some $c \in \Sigma_0$, then $t = t\downarrow$, and the result is clear. Otherwise, let $t = f(t_1, \dots, t_n)$, with $n > 0$.

If there is $p \in P$ and $t' \in p$ such that $t \approx_P t'$, we also have $t\downarrow \approx_P t'$. Therefore, $\rho(t) = \rho(t') = \tau(p) = \rho(t\downarrow)$, and thus $t \sim t\downarrow$.

If this is not the case, then $\rho(t) = f(\rho(t_1), \dots, \rho(t_n))\downarrow$. By the induction hypothesis, $\rho(t_1) = \rho(t_1\downarrow), \dots, \rho(t_n) = \rho(t_n\downarrow)$. If $f(t_1\downarrow, \dots, t_n\downarrow)$ is in normal form, then $t\downarrow = f(t_1\downarrow, \dots, t_n\downarrow)$, and

\[ \rho(t) = f(\rho(t_1), \dots, \rho(t_n))\downarrow\ = f(\rho(t_1\downarrow), \dots, \rho(t_n\downarrow))\downarrow\ = \rho(f(t_1\downarrow, \dots, t_n\downarrow)) = \rho(t\downarrow). \]

Suppose then that $f(t_1\downarrow, \dots, t_n\downarrow)$ is not in normal form. Then, there is $(l \to r) \in R$ and a substitution $\sigma$ such that $f(t_1\downarrow, \dots, t_n\downarrow) = l\sigma$. Let $\sigma_\tau\colon \mathrm{dom}(\sigma) \to T_\Sigma(N_p)$ be the substitution such that, for each $x \in \mathrm{dom}(\sigma)$, $x\sigma_\tau$ is obtained from $x\sigma$ by taking each outermost subterm $s$ of $x\sigma$ for which there is $p \in P$ and $t' \in p$ such that $s \approx_P t'$, and replacing $s$ by $\tau(p)$. Each proper subterm of $f(t_1\downarrow, \dots, t_n\downarrow)$ is in normal form, and thus, recalling that weak function symbols do not occur in $R$, we have $s \in T^W_\Sigma$ for all such subterms $s$. Therefore, $x\sigma_\tau$ is obtained from $x\sigma$ by replacing subterms of $x\sigma$ whose head symbols are in $\Sigma^W$ by the corresponding names in $N_p$. By a similar reasoning, for each $i \in \{1, \dots, n\}$, we can obtain $\rho(t_i\downarrow)$ by taking the outermost subterms $s$ of $t_i\downarrow$ for which there is $p \in P$ and $t' \in p$ such that $s \approx_P t'$, replacing them with their corresponding name $\tau(p)$, and taking the normal form of the resulting term. Thus, $\rho(t_i\downarrow) = (t_i\sigma_\tau)\downarrow$. Noting again that such subterms $s$ must have head symbols in $\Sigma^W$, which do not occur in $R$, and using the fact that $\to_R$ is convergent, we conclude that

\[ \rho(t) = f(\rho(t_1\downarrow), \dots, \rho(t_n\downarrow))\downarrow\ = f((t_1\sigma_\tau)\downarrow, \dots, (t_n\sigma_\tau)\downarrow)\downarrow\ = (l\sigma_\tau)\downarrow. \]

Similarly, $(r\sigma_\tau)\downarrow\ = \rho(r\sigma)$. Furthermore, $r\sigma$ is a proper subterm of $l\sigma$, and thus $r\sigma = (r\sigma)\downarrow\ = t\downarrow$. We conclude that

\[ \rho(t) = f(\rho(t_1), \dots, \rho(t_n))\downarrow\ = f(\rho(t_1\downarrow), \dots, \rho(t_n\downarrow))\downarrow\ = (l\sigma_\tau)\downarrow\ = (r\sigma_\tau)\downarrow\ = \rho(r\sigma) = \rho(t\downarrow), \]

concluding the proof.

(3) Whenever $p \in P$ and $t \in p$, we have $\rho(t) = \tau(p)$. Property (3) follows. □

If $K$ is a finite set of terms and $P \in \mathcal{P}^W_R(K)$, a $P$-renaming is an injective function $\tau\colon P \to N^+$.

Algorithm 7 ($P,R$-renaming)
Input: $K$, $P \in \mathcal{P}^W_R(K)$, a subterm-compatible order $\prec$, and a $P$-renaming $\tau$
Output: functions $\tau^+, \tau^*\colon sub[K] \to T_\Sigma(N_p)$

1: $\tau^+, \tau^* \leftarrow \emptyset$
2: let $t_1, \dots, t_k$ be such that $t_1 \prec \dots \prec t_k$ and $sub[K] = \{t_1, \dots, t_k\}$
3: for $i$ from 1 to $k$
4:   let $t_i = f(t'_1, \dots, t'_n)$
5:   $\tau^+(t_i) \leftarrow f(\tau^*(t'_1), \dots, \tau^*(t'_n))\downarrow$
6:   if there is $j \le i$ s.t. $\tau^+(t_j) = \tau^+(t_i)$ and $t_j \in p$ for some $p \in P$
7:     $\tau^*(t_i) \leftarrow \tau(p)$
8:   else $\tau^*(t_i) \leftarrow \tau^+(t_i)$
9: return $\tau^+, \tau^*$
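The loop of Algorithm 7 can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' implementation: terms are encoded as nested tuples `(head, arg1, ..., argn)`, names are plain strings, the order is "by size, then by `repr`" (one possible subterm-compatible order), and `normalize` is a hypothetical stand-in for taking normal forms with respect to $R$ (the identity by default, i.e. an empty rewrite system).

```python
def subterms(t):
    """All subterms of a term encoded as a nested tuple (head, arg1, ..., argn)."""
    out = {t}
    for arg in t[1:]:
        out |= subterms(arg)
    return out


def size(t):
    """Number of function symbols in t; proper subterms are strictly smaller."""
    return 1 + sum(size(a) for a in t[1:])


def p_r_renaming(K, P, tau, normalize=lambda t: t):
    """Sketch of Algorithm 7.

    K: set of terms; P: list of frozensets of terms (equivalence classes);
    tau: dict mapping each class in P to a name (a string);
    normalize: assumed normal-form function (identity = empty rewrite system).
    """
    subs = set()
    for t in K:
        subs |= subterms(t)
    # Line 2: enumerate sub[K] in an order that lists subterms first.
    ordered = sorted(subs, key=lambda t: (size(t), repr(t)))
    tau_plus, tau_star = {}, {}
    for i, t in enumerate(ordered):
        # Line 5: rename the arguments, then normalize.
        tau_plus[t] = normalize((t[0],) + tuple(tau_star[a] for a in t[1:]))
        # Line 6: look for an already processed t_j (j <= i) with the same
        # tau_plus image that belongs to some class p of P.
        name = next(
            (tau[p]
             for tj in ordered[: i + 1] if tau_plus[tj] == tau_plus[t]
             for p in P if tj in p),
            None,
        )
        # Lines 7-8.
        tau_star[t] = name if name is not None else tau_plus[t]
    return tau_plus, tau_star


# Usage: h(a) and h(b) are in the same class, so g(h(a)) and g(h(b)) get the
# same tau_plus image g(n1).
a, b = ("a",), ("b",)
ha, hb = ("h", a), ("h", b)
cls = frozenset({ha, hb})
tau_plus, tau_star = p_r_renaming({("g", ha), ("g", hb)}, [cls], {cls: "n1"})
```

In this example both `g(h(a))` and `g(h(b))` are mapped by `tau_plus` to `("g", "n1")`, which is the behaviour Lemma 2 relies on.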

Lemma 2. Consider the function $\tau^*$ output by Algorithm 7 on input $(K, P, \prec, \tau)$. For all $t \in sub[K]$, we have $\tau^*(t) = \rho^\tau_P(t)$.

Proof. The proof is by induction on $|sub[K]|$. The result is clear if $|sub[K]| = 0$. Suppose then that $|sub[K]| = n + 1$, and that $\tau^*(t_i) = \rho(t_i)$ for all $i \in \{1, \dots, n\}$, where $sub[K] = \{t_1, \dots, t_{n+1}\}$ and $t_1 \prec \dots \prec t_{n+1}$. Let $t_{n+1} = f_{n+1}(t_{n+1,1}, \dots, t_{n+1,k_{n+1}})$.

If there is $j \le n + 1$ and $p \in P$ such that $t_j \in p$ and $\tau^+(t_j) = \tau^+(t_{n+1})$, then $t_j \approx_P t_{n+1}$. This is because, letting $t_j = f_j(t_{j,1}, \dots, t_{j,k_j})$ and using the induction hypothesis, we have

\[ \tau^+(t_j) = f_j(\tau^*(t_{j,1}), \dots, \tau^*(t_{j,k_j}))\downarrow\ \approx_{P^*} t_j, \]

and, similarly,

\[ \tau^+(t_{n+1}) = f_{n+1}(\tau^*(t_{n+1,1}), \dots, \tau^*(t_{n+1,k_{n+1}}))\downarrow\ \approx_{P^*} t_{n+1}. \]

This implies that $t_j \approx_{P^*} t_{n+1}$, and because $t_j, t_{n+1} \in T_\Sigma$, we have $t_j \approx_P t_{n+1}$. Thus, we have $\rho(t_{n+1}) = \tau^*(t_{n+1}) = \tau(p)$.

Suppose then that there is no $p \in P$ and $t \in p$ such that $t_{n+1} \approx_P t$. The reasoning above implies that there is no $j \le n + 1$ and $p \in P$ such that $t_j \in p$ and $\tau^+(t_j) = \tau^+(t_{n+1})$. Thus,

\[ \tau^*(t_{n+1}) = \tau^+(t_{n+1}) = f_{n+1}(\tau^*(t_{n+1,1}), \dots, \tau^*(t_{n+1,k_{n+1}}))\downarrow\ = f_{n+1}(\rho(t_{n+1,1}), \dots, \rho(t_{n+1,k_{n+1}}))\downarrow\ = \rho(t_{n+1}), \]

using the induction hypothesis.

It remains to consider the case that there is $p \in P$ and $t \in p$ such that $t_{n+1} \approx_P t$. In this case, we have $\rho(t_{n+1}) = \tau(p)$; therefore, we have to prove that $\tau^*(t_{n+1}) = \tau(p)$ as well. Because $t \in sub[K]$, we have $t = t_j$ for some $j \le n + 1$. Letting $t_{n+1} = f_{n+1}(t_{n+1,1}, \dots, t_{n+1,k_{n+1}})$, the induction hypothesis yields

\[ \tau^+(t_{n+1}) = f_{n+1}(\tau^*(t_{n+1,1}), \dots, \tau^*(t_{n+1,k_{n+1}}))\downarrow\ = f_{n+1}(\rho(t_{n+1,1}), \dots, \rho(t_{n+1,k_{n+1}}))\downarrow. \]

Letting $t_j = f_j(t_{j,1}, \dots, t_{j,k_j})$ and using the induction hypothesis again, we have

\[ \tau^+(t_j) = f_j(\tau^*(t_{j,1}), \dots, \tau^*(t_{j,k_j}))\downarrow\ = f_j(\rho(t_{j,1}), \dots, \rho(t_{j,k_j}))\downarrow\ = f_j(\rho(t_{j,1}), \dots, \rho(t_{j,k_j})), \]

because $\rho(t)$ is in normal form for all $t$ and $f_j \in \Sigma^W$ does not occur in $R$. Combining the fact that $\tau^+(t) \approx_{P^*} t$ for all $t$ with the two equalities above, it follows that

\[ f_{n+1}(\rho(t_{n+1,1}), \dots, \rho(t_{n+1,k_{n+1}}))\downarrow\ \approx_{P^*} f_j(\rho(t_{j,1}), \dots, \rho(t_{j,k_j})). \]

Furthermore, $\rho(t)$ is always in normal form; therefore, we have either

\[ f_{n+1}(\rho(t_{n+1,1}), \dots, \rho(t_{n+1,k_{n+1}}))\downarrow\ = f_{n+1}(\rho(t_{n+1,1}), \dots, \rho(t_{n+1,k_{n+1}})) \]

or

\[ f_{n+1}(\rho(t_{n+1,1}), \dots, \rho(t_{n+1,k_{n+1}}))\downarrow\ \in sub(f_{n+1}(\rho(t_{n+1,1}), \dots, \rho(t_{n+1,k_{n+1}}))). \]

In the latter case, we note that, whenever $s \in sub(\rho(t))$, we have $\rho(s) = s$. Therefore, we have

\[ f_{n+1}(\rho(t_{n+1,1}), \dots, \rho(t_{n+1,k_{n+1}}))\downarrow\ = \rho(f_{n+1}(\rho(t_{n+1,1}), \dots, \rho(t_{n+1,k_{n+1}}))\downarrow), \]

and because $t_{n+1} \approx_P t_j$ and $t_j \in p$, we have

\[ \tau^+(t_{n+1}) = \rho(f_{n+1}(\rho(t_{n+1,1}), \dots, \rho(t_{n+1,k_{n+1}}))\downarrow) = \tau(p), \]

as desired.

Let us now consider the case that

\[ f_{n+1}(\rho(t_{n+1,1}), \dots, \rho(t_{n+1,k_{n+1}}))\downarrow\ = f_{n+1}(\rho(t_{n+1,1}), \dots, \rho(t_{n+1,k_{n+1}})). \]

If $f_{n+1} \notin \Sigma^W$, then $f_{n+1}(\rho(t_{n+1,1}), \dots, \rho(t_{n+1,k_{n+1}})) \not\approx_{P^*} t_j$, since $head(t_j) \in \Sigma^W$; it follows that $t_{n+1} \not\approx_{P^*} t_j$, a contradiction. Therefore, we must have $f_{n+1} \in \Sigma^W$. In this case, $t_{n+1} \in T^W_\Sigma$, and because $P$ is $\approx_R$-closed, we have $t_{n+1} \in p$ and $\tau^*(t_{n+1}) = \tau(p)$. □

If $K$ is a finite set of terms and $\iota \in I(K)$, we will write $\omega \models \iota$ to denote that $\omega$ satisfies $\iota$. If $\phi \in \Phi_{sub[K]}$ for some finite set of terms $K$, we say that $\phi$ satisfies $\iota$, and write $\phi \models \iota$, if, for all $t = f(t_1, \dots, t_n) \in K$, $(\phi(t_1), \dots, \phi(t_n)) \in [\![\mathrm{dom}(\iota(t))]\!]$ and $\phi(t) \in [\![\mathrm{ran}(\iota(t))]\!]$.

Lemma 3. Let $K$ be a finite set of terms and $\phi \in \Phi_{sub[K]}$. Then, there exists at most one $\iota \in I(K)$ such that $\phi \models \iota$.

Proof. For each $t \in sub[K]$, letting $t = f(t_1, \dots, t_n)$, there is at most one $ps \in PS_f$ such that $(\phi(t_1), \dots, \phi(t_n)) \in [\![\mathrm{dom}(ps)]\!]$. If $\phi \models \iota$, we must have $\iota(t) = ps$ in this case, and $\iota(t) = \bot$ if no such $ps$ exists; thus, there is at most one $\iota$ that is satisfied by $\phi$. □

In light of Lemma 3, if $K = sub[K]$ and $\phi \in \Phi_K$, we will write $\iota(\phi)$ for the only $\iota \in I(K)$ such that $\phi \models \iota$, and set $\iota(\phi) = \bot$ if no such $\iota$ exists. Moreover, if $\phi\colon sub[K] \to B^*_\bot$, we say that $\phi$ satisfies $\approx_{W(\phi)}$ if, whenever $t, t' \in sub[K]$ are such that $t \approx_{W(\phi)} t'$, we have either $\phi(t) = \phi(t')$, $\phi(t) = \bot$, or $\phi(t') = \bot$.

Lemma 4. At any point of any execution of Algorithm 3, there is a finite set $K'$ such that $\mathrm{dom}(\phi_{ROM}) = sub[K']$. Furthermore, $\phi_{ROM}$ satisfies $\approx_{W(\phi_{ROM})}$ and exactly one selection function $\iota \in I_S(K')$.

Proof. Consider an execution of Algorithm 3 for input $K$ and using a subterm-compatible order $\prec$. Let $t_1, \dots, t_n$ be such that $sub[K] = \{t_1, \dots, t_n\}$ and $t_1 \prec \dots \prec t_n$. Algorithm 3 samples $\phi_{ROM}(t_1), \dots, \phi_{ROM}(t_n)$, in this order. We prove the result by induction on the execution of Algorithm 3. For all $i \in \{0, \dots, n\}$, let $K_i = \{t_1, \dots, t_i\}$ and let $\phi_{ROM,i}$ be the function $\phi_{ROM}$ used in the algorithm after the $i$-th execution of the cycle in lines 3-13, so that $\mathrm{dom}(\phi_{ROM,i}) = K_i$.

In the beginning of the execution, we have $K_0 = \mathrm{dom}(\phi_{ROM}) = \emptyset$ and $I_S(K_0) = \{\emptyset\}$. Thus, the result is clear. Suppose then that the result holds for $j \in \{1, \dots, i\}$, and consider the $(i+1)$-th execution of the cycle.

To prove that $\phi_{ROM,i+1}$ satisfies $\approx_{W(\phi_{ROM,i+1})}$, let $k, k'$ be indexes such that $t_k \approx_{W(\phi_{ROM,i+1})} t_{k'}$. Note that, if $i + 1 \in \{k, k'\}$, then lines 5 and 9 of the algorithm description imply the result. There are two cases: either (1) $t_k \approx_{W(\phi_{ROM,i})} t_{k'}$, or (2) not. In case (2), Lemmas 1 and 2 imply that $i + 1 \in \{k, k'\}$, by noting that $\tau^*(t_k)$ and $\tau^*(t_{k'})$ do not depend on $t_{i+1}$. Therefore, the result holds. Otherwise, either $i + 1 \in \{k, k'\}$, and the result holds as before, or $i + 1 \notin \{k, k'\}$, and the result is implied by the induction hypothesis (since $\phi_{ROM,i+1}(t_m) = \phi_{ROM,i}(t_m)$ whenever $1 \le m \le i$).

Furthermore, we have $psub(t_{i+1}) \subseteq K_i$ (because $\prec$ is a subterm-compatible order), and the induction hypothesis yields

\[ sub[K_{i+1}] = sub[K_i] \cup \{t_{i+1}\} = K_i \cup \{t_{i+1}\} = K_{i+1}. \]

It remains to prove that $\phi_{ROM,i+1}$ satisfies exactly one selection function. Lemma 3 implies that, for any $j$, there is at most one $\iota \in I_S(K_j)$ such that $\phi_{ROM,j}$ satisfies $\iota$. Thus, it is sufficient to show that $\phi_{ROM,i+1}$ satisfies some $\iota \in I_S(K_{i+1})$. By the induction hypothesis, there exists $\iota \in I_S(K_i)$ such that $\phi_{ROM,i}$ satisfies $\iota$. Let $t_{i+1} = f_{i+1}(t'_{i+1,1}, \dots, t'_{i+1,k})$. If $(\phi_{ROM,i}(t'_{i+1,1}), \dots, \phi_{ROM,i}(t'_{i+1,k})) \notin \mathrm{dom}_S(f_{i+1})$, then $\phi_{ROM,i+1}(t_{i+1})$ will be sampled to $\bot$. Then, letting $\iota' = \iota \cup \{t_{i+1} \mapsto \bot\}$, it is clear that $\phi_{ROM,i+1}$ satisfies $\iota' \in I_S(K_{i+1})$.

Otherwise, there is $ps \in PS_{f_{i+1}}$ such that

\[ (\phi_{ROM,i}(t'_{i+1,1}), \dots, \phi_{ROM,i}(t'_{i+1,k})) \in [\![\mathrm{dom}(ps)]\!]. \]

Let $\iota' = \iota \cup \{t_{i+1} \mapsto ps\}$; we have $\iota' \in I_S(K_{i+1})$, and it is sufficient to show that $\phi_{ROM,i+1}$ satisfies $\iota'$. Let $t' = f'(t'_1, \dots, t'_{k'})$ be the minimal (with respect to $\prec$) term in $K_{i+1}$ such that $t' \approx_{P(\phi_{ROM,i})} t$ and

\[ (\phi_{ROM,i}(t'_1), \dots, \phi_{ROM,i}(t'_{k'})) \in \mathrm{dom}_S(f') \]

(note that we may have $t' = t_{i+1}$). Then, we have $\phi_{ROM,i+1}(t_{i+1}) = \phi_{ROM,i}(t')$, and $\phi_{ROM,i}(t')$ is sampled from $[\![\mathrm{ran}(\iota(t'))]\!]$. The compatibility condition implies that, for all $\iota'' \in I_S(K_{i+1})$ and all $t'' \in K_{i+1}$, $[\![\mathrm{ran}(\iota''(t'))]\!] \subseteq [\![\mathrm{ran}(\iota''(t''))]\!]$. It follows that $[\![\mathrm{ran}(\iota'(t'))]\!] \subseteq [\![\mathrm{ran}(\iota'(t_{i+1}))]\!]$; therefore,

\[ \phi_{ROM,i+1}(t_{i+1}) = \phi_{ROM,i}(t') \in [\![\mathrm{ran}(\iota'(t'))]\!] \subseteq [\![\mathrm{ran}(\iota'(t_{i+1}))]\!], \]

and thus $\phi_{ROM,i+1}$ satisfies $\iota'$. □

In the following, if $t \in sub[K]$ and $\prec$ is a subterm-compatible order, we denote by $X^t_{K,\prec}$ the random variable representing $\phi_{ROM}(t)$, where $\phi_{ROM}$ is the output of Algorithm 3 when executed on input $K$ using the order $\prec$.

Lemma 5. Let $K$ be a finite set of terms, $\prec$ be a subterm-compatible order, and $\phi\colon sub[K] \to B^*_\bot$. Suppose that $\phi$ satisfies $\approx_{W(\phi)}$ and there is $\iota \in I_S(K)$ such that $\phi$ satisfies $\iota$. Let $t' = f(t'_1, \dots, t'_n)$ be a term such that $t' \notin K$, $psub(t') \subseteq sub[K]$, and $t \prec t'$ for all $t \in K$. Let $b \in B^*_\bot$, $K' = K \cup \{t'\}$ and $\phi' = \phi \cup \{t' \mapsto b\}$. Define

\[ P[K, \phi', t'] = P[X^{t'}_{K',\prec} = b \mid X^t_{K',\prec} = \phi(t) \text{ for all } t \in sub[K]]. \]

Consider the function $\tau^+$ output by Algorithm 7 for the input $(K', W(\phi'), \prec, \tau)$ for some $W(\phi')$-renaming $\tau$.

If there is no $ps \in PS$ such that $head(ps) = head(t')$ and $(\phi'(t'_1), \dots, \phi'(t'_n)) \in \mathrm{dom}(ps)$, then $P[K, \phi', t'] = 1$ if $b = \bot$ and $P[K, \phi', t'] = 0$ otherwise. Otherwise, letting $ps$ be the only property statement satisfying the condition above, we have:

– if there is $t \in sub[K]$ such that either (1) $\tau^+(t) = \tau^+(t')$ or (2) $\tau^+(t') \in N^+$ and $\tau^*(t) = \tau^+(t')$, then:

  • if $\phi(t) = b$, then $P[K, \phi', t'] = 1$;
  • if $\phi(t) \neq b$, then $P[K, \phi', t'] = 0$;

– if such a $t$ does not exist, then $P[K, \phi', t'] = 1 / |[\![\mathrm{ran}(ps)]\!]|$ if $b \in [\![\mathrm{ran}(ps)]\!]$ and $P[K, \phi', t'] = 0$ otherwise.

Proof. Consider an execution of Algorithm 3 on inputK${t"} using the order /.Then, P [K,)", t"] corresponds to the probability that such an execution outputsa function )ROM = )" given that, before the last step of the execution (when)ROM(t") is sampled), we have )ROM = ) (corresponding to the sampling of theterms in sub[K]). It is clear from the definition of the algorithm that, if thereis no ps " PS such that head(ps) = head(t") and ()"(t"1), . . . ,)

"(t"n)) " dom(ps),then )ROM(t") is sampled to ., and therefore P [K,)", t"] = 1 if b = . andP [K,)", t"] = 0 otherwise.

If there is ps " PS such that the above condition is satisfied and thereis t " sub[K] such that either (1) 1+(t) = 1+(t") or (2) 1+(t") " N+ and1$(t) = 1+(t"), we have t &W ($) t

". Therefore, )ROM(t") is sampled as )ROM(t") =)ROM(t), and the result follows.

Now we prove that, for any t, if t &W ($) t", then either (1) or (2) holds. If1+(t") " TW

! , then there exists a subterm s " sub[K] of t" such that s " TW! ,

t" &W ($) s, and 1$(s) = 1$(t"), and (1) holds. On the other hand, if 1+(t") /" TW! ,

then 1$(t") = 1+(t") (since we have head(t"8) = head(t) whenever t &P t" andt " TW

! ). By Lemma 1, if t &W ($) t", we have 1$(t) = 1$(t"). It follows that1$(t) /" TW

! , and thus 1+(t) = 1$(t) = 1$(t") = 1+(t"). Therefore, (2) holds.We conclude that if neither (1) nor (2) hold for t, then t *&W ($) t", and if

neither condition hold for any t " sub[K], then there is no t " dom()) suchthat t &W ($) t". In this case, )ROM(t") is sampled with uniform probabilitydistribution from !ran(ps)", according to the algorithm description, and it followsthat P [K,)", t"] = 1/ |!ran(ps)"| if b " !ran(ps)" and P [K,)", t"] = 0 otherwise,concluding the proof of the lemma. 67

In the following, if $\iota$ is any selection function, we adopt the convention that $\iota(\bot) = \bot$, $\mathrm{ran}(\bot) = \bot$, and $[\![\bot]\!] = \{\bot\}$. Thus, $[\![\mathrm{ran}(\iota(\bot))]\!] = \{\bot\}$.

Corollary 2. Let $K$ be a finite set of terms, $\prec$ be a subterm-compatible order, and $\phi\colon sub[K] \to B^*_\bot$ be a function satisfying $\approx_{W(\phi)}$ and some $\iota \in I_S(K)$. Consider the function $\tau^+$ output by Algorithm 7 for the input $(K, W(\phi), \prec, \tau)$ for some $W(\phi)$-renaming $\tau$. Let $\tau_K\colon \tau^+[sub[K]] \to sub[K]$ be such that, for each $t^+ \in \tau^+[sub[K]]$, $\tau_K(t^+)$ is the least $t \in sub[K]$ such that $\tau^+(t) = t^+$ and $\iota(t) \neq \bot$ if such a $t$ exists, and $\bot$ otherwise. Let $\phi_{ROM}$ be the function output by Algorithm 3 on input $K$ and using the subterm-compatible order $\prec$. Then,

\[ P[\phi_{ROM} = \phi] = \prod_{t^+ \in \tau^+[sub[K]] \setminus N^+} \frac{1}{|[\![\mathrm{ran}(\iota(\tau_K(t^+)))]\!]|}. \]
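The product in Corollary 2 can be evaluated mechanically once the images under the renaming and the sizes of the sampling ranges are known. The sketch below is an illustration only: the dictionaries `tau_plus` and `range_size`, the set `names`, and the string labels for terms are all hypothetical encodings chosen for this example, with `None` standing for the case where the selection function yields the error value.

```python
from fractions import Fraction


def assignment_probability(tau_plus, range_size, names):
    """Product of 1/|range| over the distinct tau_plus images that are not names.

    tau_plus:   dict term -> image, listed in sampling order (assumed input);
    range_size: dict term -> size of the range it is sampled from, or None
                when the selection function gives the error value;
    names:      set of images that are names and must be skipped.
    """
    prob = Fraction(1)
    seen = set()
    for term, image in tau_plus.items():
        if image in names or image in seen or range_size[term] is None:
            continue
        # First (least) term with this image and a non-error selection.
        seen.add(image)
        prob *= Fraction(1, range_size[term])
    return prob


# Usage: h(a) and h(b) collide, so g(h(a)) and g(h(b)) share the image g(n1)
# and contribute a single factor.
tau_plus = {
    "a": "a", "b": "b",
    "h(a)": "h(a)", "h(b)": "h(b)",
    "g(h(a))": "g(n1)", "g(h(b))": "g(n1)",
}
range_size = {"a": 16, "b": 16, "h(a)": 4, "h(b)": 4,
              "g(h(a))": 16, "g(h(b))": 16}
p = assignment_probability(tau_plus, range_size, names={"n1"})
```

Images for which every term has the error selection contribute a factor of 1, matching the convention that the interpretation of the error range is a singleton.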

Proof. Let t1, . . . , tn be such that dom()) = {t1, . . . , tn} and t1 / . . . / tn. Wehave

P [)ROM = )]= P [)ROM(t1) = )(t1), . . . ,)ROM(tn) = )(tn)]

=6n

j=1 P [&tjK,(

= )(tj) | &t1K,(

= )(t1), . . . , *tj#1K,(

= )(tj#1)](2)

By Lemma 5, if ) does not satisfy &R or there is no * " IS(K) such that )satisfies *, then P [)ROM = )] = 0 (because one of the factors in the product is0). Therefore, it is su"cient to consider the case that ) satisfies &R and thereis * " IS(K) such that ) satisfies *. Then, for each j " {1, . . . , n}, either (1)there is k < j such that either (a) 1+(tj) = 1+(tk) or (b) 1+(tj) " N+ and1$(tk) = 1+(tj), or (2) there is no such k. In case (1), )ROM(tj) is sampled as)ROM(tj), and

P [&tjK

= )(tj) | &t1K

= )ROM(t1), . . . , *tj#1K

= )ROM(tj#1)] = 1.

In case (2), we have

P [&tjK

= )(tj) | &t1K

= )ROM(t1), . . . , *tj#1K

= )ROM(tj#1)] =1%%#ran(*(ti((,j)))

$%% .

The result then follows by combining equation (2) and Lemma 5. □

Lemma 6. Let $K$ be a finite set of terms, $\prec, \prec'$ be subterm-compatible orders, and $\phi\colon sub[K] \to B^*_\bot$. Suppose that $\phi$ satisfies $\approx_{W(\phi)}$ and there is $\iota \in I_S(K)$ such that $\phi$ satisfies $\iota$. Consider the function $\tau^+$ output by Algorithm 7 on the inputs $K, W(\phi)$. Define the functions $\tau_K, \tau'_K\colon \tau^+[sub[K]] \to T_\Sigma(N^+)$ such that, for each $t^+ \in \tau^+[sub[K]]$, $\tau_K(t^+)$ (respectively, $\tau'_K(t^+)$) is the least term $t$ with respect to $\prec$ (respectively, $\prec'$) such that $\tau^+(t) = t^+$ and $\iota(t) \neq \bot$ if such a $t$ exists, and $\bot$ otherwise. Then, for all $t^+ \in \tau^+[sub[K]] \setminus N^+$,

\[ [\![\mathrm{ran}(\iota(\tau_K(t^+)))]\!] = [\![\mathrm{ran}(\iota(\tau'_K(t^+)))]\!]. \]

Proof. Let t+ " 1+[sub[K]] \ N+. If *(t) = . for all t such that 1+(t) = t+, theresult holds since

#ran(*(1K(t+)))

$=

%ran(*(1K

#(t+)))

&= {.} .

Suppose then that this is not the case. By the compatibility condition, thereare tK , tK

# " sub(t+) such that tK &P |psub(t+)t+, tK

# &P |psub(t+)t+, and, for all

t" such that t" &P |psub(t+)t+, we have

#ran(*(tK))

$! !ran(*(t"))" and

%ran(*(tK

#))

&! !ran(*(t"))" .

Because tK &P |psub(t+)1K(t+) and t+ /" N+, we have 1+(tK) = t+, and thus

tK = 1K(t+) (since /,/" are subterm-compatible orders). Analogously, we havetK

#= 1K

#(t+). We conclude that

%ran(*(1K

#(t+)))

&!

#ran(*(1K(t+)))

$

and #ran(*(1K(t+)))

$!

%ran(*(1K

#(t+)))

&,

and the result follows. □

Proof (Theorem 4). Let K = dom('), K " = dom('"). Let t1, . . . , tn be such thatsub[K +K "] = {t1, . . . , tn}. Because %" = %"# , we must have '(t) = '"(t) for allt " K +K ", '(t) = B$

' for all t " K \K ", and '"(t) = B$' for all t " K " \K.

For simplicity, we will write &tK (respectively &tK#) for the random variable

&tK,( (respectively &tK#,(#). The probability that Algorithm 3 on input K and /

outputs a function )ROM such that )ROM(t) " '(t) for all t " K depends onlyon the values of )ROM(t") for t" " sub[K +K "], and the same statement is validfor executing Algorithm 3 on input K " and /". Therefore, it is su"cient to provethat, whenever b1, . . . , bn " B$

',

P [&t1K

= b1, . . . , &tnK

= bn] = P [&t1K#

= b1, . . . , &tnK#

= bn].

Let 1K : 1+[sub[K]] % sub[K] be such that, for each t+ " 1+[sub[K]], 1K(t+)is the least t " sub[K] such that 1+(t) = t+, and define 1K

#analogously.

Corollary 2 implies that

P [&t1K

= b1, . . . , &tnK

= bn] =5

t+!)+[sub[K]]\N+

1

|!ran(*(1K(t+)))"|

and

P [&t1K#

= b1, . . . , &tnK#

= bn] =5

t+!)+[sub[K]]\N+

1

|!ran(*(1K#(t+)))"| .

Lemma 6 then implies the result. □

Lemma 7. Let $K$ be a finite set of terms and $\prec$ be a subterm-compatible order. For each $t \in sub[K]$, there exists a finite set $supp_{PS}(t)$ such that, if $\phi_{ROM}$ is a possible output of Algorithm 3 on input $K$ and using the order $\prec$, then $\phi_{ROM}(t) \in supp_{PS}(t)$.

Proof. Let $t_1, \dots, t_n$ be such that $sub[K] = \{t_1, \dots, t_n\}$ and $t_1 \prec \dots \prec t_n$. We prove the result for all $t_i$ such that $i \in \{1, \dots, n\}$ by induction on $i$. For $i = 1$, we have $t_1 \in \Sigma_0$ and, at the point of the execution of the algorithm in which $\phi_{ROM}(t_1)$ is sampled, we have $\mathrm{dom}(\phi_{ROM}) = \emptyset$. Thus, $t_1$ is either sampled as $\bot$ or sampled from $[\![\mathrm{ran}(ps)]\!]$, which is finite by our definition of property statement and interpretation function. Now, suppose that terms $t_1, \dots, t_k$ have been sampled, so that $\mathrm{dom}(\phi_{ROM}) = \{t_1, \dots, t_k\}$, and consider the point of the execution of the algorithm in which $\phi_{ROM}(t_{k+1})$ is sampled. Let $f \in \Sigma_{n'}$ and $t'_1, \dots, t'_{n'}$ be such that $t_{k+1} = f(t'_1, \dots, t'_{n'})$. By the induction hypothesis, for each $i \in \{1, \dots, k\}$, there exists a finite set $supp_{PS}(t_i)$ such that, for any execution of the algorithm, $\phi_{ROM}(t_i) \in supp_{PS}(t_i)$. We have

\[ (\phi_{ROM}(t'_1), \dots, \phi_{ROM}(t'_{n'})) \in supp_{PS}(t'_1) \times \dots \times supp_{PS}(t'_{n'}); \]

therefore, there exists a finite set $PS_{k+1} \subseteq PS_f$ of property statements such that, for each possible $\phi_{ROM}$, there is exactly one $ps \in PS_{k+1}$ such that $(\phi_{ROM}(t'_1), \dots, \phi_{ROM}(t'_{n'})) \in [\![\mathrm{dom}(ps)]\!]$. Then, $t_{k+1}$ is either sampled to $\bot$, or to $\phi_{ROM}(t_i)$ for some $i \in \{1, \dots, k\}$, or to some element of $\bigcup_{ps \in PS_{k+1}} [\![\mathrm{ran}(ps)]\!]$. This is a union of finitely many finite sets; thus, we can choose $supp_{PS}(t_{k+1})$ to be this set. □

Lemma 8. The set $\Lambda_\Psi$ is a semi-ring of sets.

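Proof. If $t_1$ is any term and $\psi = \{t_1 \mapsto \emptyset\}$, then $\Lambda_\psi = \emptyset$. Thus, $\emptyset \in \Lambda_\Psi$. Let $\psi = \{t_i \mapsto B_i \mid i \in \{1, \dots, n\}\}$ and $\psi' = \{t'_i \mapsto B'_i \mid i \in \{1, \dots, n'\}\}$. Let

\[ \mathrm{dom}(\psi) \cup \mathrm{dom}(\psi') = \{t''_1, \dots, t''_{n''}\}. \]

There are sets $C_1, \dots, C_{n''}, C'_1, \dots, C'_{n''}$ such that, choosing

\[ \psi_* = \{t''_i \mapsto C_i \mid i \in \{1, \dots, n''\}\} \quad\text{and}\quad \psi'_* = \{t''_i \mapsto C'_i \mid i \in \{1, \dots, n''\}\}, \]

we have $\Lambda_\psi = \Lambda_{\psi_*}$ and $\Lambda_{\psi'} = \Lambda_{\psi'_*}$. These sets can be obtained by simply choosing $C_i = \psi(t''_i)$ if $t''_i \in \mathrm{dom}(\psi)$ and $C_i = B^*_\bot$ otherwise and, analogously, $C'_i = \psi'(t''_i)$ if $t''_i \in \mathrm{dom}(\psi')$ and $C'_i = B^*_\bot$ otherwise.

We have that

\[ \Lambda_\psi \cap \Lambda_{\psi'} = \Lambda_{\psi_*} \cap \Lambda_{\psi'_*} = \Lambda_{\psi''}, \]

where $\psi'' = \{t''_i \mapsto C_i \cap C'_i \mid i \in \{1, \dots, n''\}\}$. Thus, $\Lambda_\Psi$ is closed for intersections.

For each $i \in \{1, \dots, n''\}$, let $C^0_i = C_i \cap C'_i$ and $C^1_i = C_i \setminus C'_i$, and consider the set $\Psi(\psi, \psi')$ of functions $\psi''\colon \{t''_1, \dots, t''_{n''}\} \to \mathcal{P}(B^*_\bot)$ such that, for each $i \in \{1, \dots, n''\}$, $\psi''(t''_i)$ is either $C^0_i$ or $C^1_i$. Let $\psi''_0$ be the element of $\Psi(\psi, \psi')$ such that, for each $i \in \{1, \dots, n''\}$, $\psi''_0(t''_i) = C^0_i$. We have $\Lambda_{\psi''} \in \Lambda_\Psi$ for all $\psi'' \in \Psi(\psi, \psi')$, $\Lambda_\psi = \bigcup_{\psi'' \in \Psi(\psi, \psi')} \Lambda_{\psi''}$, and $\Lambda_\psi \cap \Lambda_{\psi'} = \Lambda_{\psi''_0}$.

We conclude that

\[ \Lambda_\psi \setminus \Lambda_{\psi'} = \bigcup_{\psi'' \in \Psi(\psi, \psi') \setminus \{\psi''_0\}} \Lambda_{\psi''} \]

is a finite, disjoint union of elements in $\Lambda_\Psi$. Thus, $\Lambda_\Psi$ is a semi-ring of sets. □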
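The decomposition used in this proof can be checked on small instances. The following Python sketch is an illustration under simplifying assumptions: a finite set `universe` stands in for the set of bitstrings extended with the error value, a cylinder set is encoded as a dictionary from terms to sets of allowed values, and the function and variable names are hypothetical. It builds the choices between $C^0_i$ and $C^1_i$ and drops the all-$C^0$ choice, which is the intersection.

```python
from itertools import product


def cylinder_difference(psi, psi2, universe):
    """Write (cylinder of psi) minus (cylinder of psi2) as disjoint cylinders."""
    terms = sorted(set(psi) | set(psi2))
    choices = []
    for t in terms:
        c = frozenset(psi.get(t, universe))    # C_i (unconstrained if absent)
        c2 = frozenset(psi2.get(t, universe))  # C'_i
        choices.append((c & c2, c - c2))       # (C^0_i, C^1_i)
    parts = []
    for pick in product((0, 1), repeat=len(terms)):
        if not any(pick):
            continue  # all-C^0 choice: this is the intersection, not the difference
        cyl = {t: choices[i][bit] for i, (t, bit) in enumerate(zip(terms, pick))}
        if all(cyl.values()):  # skip cylinders with an empty component
            parts.append(cyl)
    return parts


def member(omega, cyl):
    """True if the assignment omega lies in the cylinder cyl."""
    return all(omega[t] in allowed for t, allowed in cyl.items())


# Usage on a three-element universe.
U = frozenset({"0", "1", "x"})
psi = {"t1": {"0", "1"}, "t2": {"0"}}
psi2 = {"t1": {"0"}, "t3": {"1"}}
parts = cylinder_difference(psi, psi2, U)
```

Because the sets $C^0_i$ and $C^1_i$ are disjoint for each $i$, any two distinct choices differ in some coordinate and therefore give disjoint cylinders, which is why no assignment is counted twice.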

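Lemma 9. Suppose that $\mu(\emptyset) = 0$, $\mu(\Omega) = 1$ and, whenever $\psi_1, \dots, \psi_n \in \Psi$ are such that $\Lambda_{\psi_1}, \dots, \Lambda_{\psi_n}$ are disjoint sets such that $\bigcup_{i=1}^{n} \Lambda_{\psi_i} = \Lambda_\psi$ for some $\psi \in \Psi$, we have

\[ \mu(\Lambda_\psi) = \sum_{i=1}^{n} \mu(\Lambda_{\psi_i}). \]

Then, there is a unique extension of $\mu$ to $F$ that is a probability measure.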

Proof. By Lemma 8 and Caratheodory’s extension theorem, it is su"cient toshow that, if 'i " & for each i " N are such that %"i + %"j = # wheneveri *= j and there is ' " & such that

!i!N %"i = %", then there are finitely many

indexes i1, . . . , in " N such that

%" =n.

j=1

%"ij.

For each i " N, let us consider the number ki and, for each j " {1, . . . , ki},the terms tji and the sets of bitstrings Bj

i such that

'i =#tji 2% Bj

i | j " {1, . . . , ki}$.

For each i " N and each j " {1, . . . , ki}, let 'PSi =

#tji 2% Bj

i + suppPS(tji )$,

where suppPS is as in Lemma 7. Analogously, let k " N, t1, . . . , tk " T! ,and B1, . . . , Bk be such that ' =

'tj 2% Bj | j " {1, . . . , k}

(, and define 'PS ='

tj 2% Bj + suppPS(tj)(.

Lemma 7 implies that Bji +suppPS(t

ji ) is finite for all i and all j, and (together

with the definition of µ),µ(%"i) = µ(%

"PSi)

andµ(%") = µ(%"PS).

Thus, we may assume without loss of generality that all the sets Bji are such

thatBj

i ! suppPS(tji ).

For each t " T! , consider the topological space suppPS(t) where all subsetsare open. This space is finite and, therefore, it is trivially compact. Now, considerthe topological space

F = {$ : T! % B$' | $(t) " suppPS(t) for all t} .

F is the cartesian product of the topological spaces associated to each term t. Theopen sets in this topological space with the product topology are precisely thesets %" for functions ' " %# such that, for each t " dom('), '(t) ! suppPS(t).By Tychono!’s theorem, F with the product topology is also a compact space.

Because the open sets of F form a semi-ring (by an argument entirely analo-gous to the one we used in the proof of Lemma 8), we know that F \ r is a finiteunion of open sets, and thus is also open. We conclude that r is closed. BecauseF is compact, it follows that r is also compact. Since {%"i | i " N} is a opencover of %", there must be a finite sub-cover — that is, there must be indexesi1, . . . , im such that

%" =m.

k=1

%"ik.

And because the%"i are disjoint, we conclude that%"i = # for all i /" {i1, . . . , im}.The result then follows from

µ(%") = µ(.

i!N%"i) = µ(

m.

k=1

%"ik) =

m4

k=1

µ(%"jk) =

4

i!Nµ(%"j ).

□

Proof (Theorem 5). It is trivial to check that µ(#) = 0 and µ(!) = 1. By Lemma9, it is su"cient to prove that, whenever %1, . . . ,%n " %# are pairwise disjointand there is ' " & such that %" =

!ni=1 %i, then µ(%") =

7ni=1 µ(%"i).

For each i " {1, . . . , n}, we have %"i =!

$!(("i)%$. Whenever i *= j,

we have %"i + %"j = #, and thus also %$i *= %$j whenever )i " 0('i) and)j " 0('j). It follows that

n.

i=1

.

$!(("i)

%$ =.

$!((")

%$.

Now, if '" is such that %"# =!

$!(("#) %$, it is clear that the function )ROM

output by Algorithm 3 on input dom('") is such that )ROM(t) " '"(t) for allt " K if and only if )ROM " 0('"). Therefore, it follows that, for all ',

µ(%") =4

$!((")

µ(%$).

We obtainµ(%") = µ(

.

$!((")

%$) =4

$!((")

µ(%$)

=n4

i=1

0

14

$!(("i)

µ(%$)

2

3 =n4

i=1

µ(%"i),

which proves that µ is a probability measure. It remains to prove µ(%PS,!·",)R) =

1. Let f " ! and tn+1 = f(t1, . . . , tn). By Lemma 4, any execution of Algorithm3 on input {tn+1} yields a function )ROM which satisfies some * " IS(sub(tn+1)).Thus, if ()ROM(t1), . . . ,)ROM(tn)) /" domS(f), then )ROM(tn+1) = .; otherwise,there is some ps " PSf such that ()ROM(t1), . . . ,)ROM(tn)) " !dom(ps)", inwhich case we must have )ROM(tn+1) " !ran(ps)".

If f " !n, t1, . . . , tn are terms, and ps " PSf , let us write %f,t1,...,tn,ps

for the set of $ " ! such that ($(t1), . . . ,$(tn)) " !dom(ps)" and $(tn+1) /"!ran(ps)". Analogously, let us write %f,t1,...,tn,' for the set of $ " ! such that($(t1), . . . ,$(tn)) /" domS(f) and $(tn+1) *= .. If $ *|= PS, we must have$ " %f,t1,...,tn,ps $%f,t1,...,tn,' for some f " !n, some terms t1, . . . , tn and someps " PSf , and we have seen above that

µ(%f,t1,...,tn,ps) = µ(%f,t1,...,tn,') = 0

for all such f, t1, . . . , tn, ps. Since there are only countably many possible choicesfor f, t1, . . . , tn, ps, it follows that

µ8'

$ | $ |=!·" PS(9

= 1. (3)

On the other hand, suppose that t1, t2 " T! are terms such that t1 &R t2,and assume without loss of generality that t1 / t2. Any execution of Algorithm 3on input {t1, t2} will sample )ROM(t2) as )ROM(t1) on line 9. Letting )(b1, b2) ={t1 2% b1, t2 2% b2} whenever b1, b2 " B$

', we have that

{$ " ! | $(t1) *= $(t2)} =:

b1!B!"

:

b2!B!"\{b1}

%$(b1,b2)

is a set in F , andµ($ " ! | $(t1) *= $(t2)) = 0.

Since there are only countably many choices for t1 and t2, it follows that

µ ({$ " ! | $ |=&R}) = 1. (4)

Combining equations (3) and (4), we conclude that $\mu(\Lambda_S) = 1$, as desired. □

Let $K$ be a set of terms and $\phi \in \Phi_K$. As in Algorithm 3, we will denote by $W(\phi)$ the partition on $sub[\mathrm{dom}(\phi)] \cap T^W$ induced by $\phi$: thus, $W(\phi) = P(\phi)|_{T^W}$, where $P(\phi) = \{\phi^{-1}(b) \mid b \in \mathrm{ran}(\phi)\}$. We say that $\phi$ is a colliding instantiation of $K$ if there exist terms $t_1, t_2 \in sub[K]$ such that $t_1$ is strong (i.e., $t_1 \notin T^W$), $t_1 \not\approx_{W(\phi)} t_2$, $(\iota(\phi))(t_1) \neq \bot$, and $\phi(t_1) = \phi(t_2)$. We write $\Phi^{col}_K$ for the set of $\phi \in \Phi_K$ that are colliding instantiations of $K$, and

\[ \Lambda_{col}(K) = \bigcup_{\phi \in \Phi^{col}_K} \Lambda_\phi. \]

Theorem 8. If K is a finite set of terms and / is a subterm-compatible order,then

(1) for any ' " &K , µK,(% (%" +%col) = µROM,%(%" +%col);

(2) there exists a constant c(K) such that

µK,(% (! \%(K)) = µROM,%(! \%(K)) 3 c(K) · |IS(K)| · (1/L).

Proof. Let t1, . . . , tn be such that sub[K] = {t1, . . . , tn} and t1 / . . . / tn.Let &tK,( be the random variable representing the output of Algorithm 1 whenexecuted on input K and using the order /, and &t be the random variablerepresenting the output of Algorithm 3 on input K. It is su"cient to prove theproperty for the sets %$ for all ) " 0K . For all ) " 0K , we have

µK,(ROM,%(%$) =

6ni=1 P [&ti

K,(= )(ti) | j < i 4 &tj

K,(= )(tj)] (5)

andµROM,%(%$) =

6ni=1 P [&ti = )(ti) | j < i 4 &tj = )(tj)]. (6)

If ) is a colliding instantiation of K, then %$ +%col(K) = #, and the resultis true since

µK,*% (#) = µROM,%(#) = 0.

Suppose then that ) is not a colliding instantiation of K. Let i " {1, . . . , n},and suppose that )ROM(tj) has been sampled to bj for all j < i. Because )is not a colliding instantiation of K, we have W () | {t1, . . . , ti#1}) = P () |{t1, . . . , ti#1}). Therefore, the steps executed by the two algorithms when sam-pling )ROM(ti) are exactly the same. It follows that

P [&tiK,(

= )(ti) | j < i 4 &tjK,(

= )(tj)] = P [&ti = )(ti) | j < i 4 &tj = )(tj)]

for all i " {1, . . . , n}. Thus, equations (5) and (6) imply (1).To prove (2), for each * " IS(K), each j, k " {1, . . . , n} such that j *= k and

tk " T! \ TW! , and each b " ran(*(tj)), let 0 +

j,k,b be the set of ) such that )

satisfies *, tk " T! \ TW! , ran(*(tk)) *= ., ran(*(tj)) *= ., and )(tj) = )(tk).

We note that, for all ) " 0 colK , there are *, j, k, b such that ) " 0 +

j,k,b, andran(*(tk)) : L.

Letting 0j,k be the set of ) " 0 colK such that )(tj) = )(tk), we have

P [0j,k] 34

+!IS(K)

4

b!ran(+(tj))

4

$!($j,k,b

-n5

i=1

1

|ran(*(ti))|

/

Now, we note that, for each * " IS(K), we have

%%0 +j,k,b

%% 35

i=1,i +=j,k

|ran(*(ti))| .

Therefore, the previous equation implies

P [0j,k 34

+!IS(K)

0

;1|ran(*(tj))| ·

0

;1n5

i=1i +=j,k

|ran(*(ti))|

2

<3 ·-

n5

i=1

1

|ran(*(ti))|

/2

<3 3 1

L.

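It follows that

\[ P[\Phi^{col}_K] \le \sum_{\iota \in I_S(K)} \left( \sum_{j=1}^{n} \left( \sum_{\substack{k=1 \\ k \neq j}}^{n} P[\Phi_{j,k}] \right) \right) \le |K|^2 \sum_{\iota \in I_S(K)} \frac{1}{L}, \]

concluding the proof. □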
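To get a feeling for the size of this bound, the sketch below evaluates $|K|^2 \cdot |I_S(K)| \cdot (1/L)$ and compares it with the exact collision probability of a plain birthday experiment. The comparison is an assumption made for illustration only (independent uniform samples from a range of size $L$), not the sampling model of Algorithm 3, and the function names are hypothetical.

```python
from fractions import Fraction


def collision_bound(num_terms, num_selections, L):
    """Upper bound |K|^2 * |I_S(K)| / L on the collision probability, capped at 1."""
    return min(Fraction(1), Fraction(num_terms ** 2 * num_selections, L))


def exact_birthday(n, L):
    """Exact probability that n independent uniform samples from L values collide."""
    p_distinct = Fraction(1)
    for i in range(n):
        p_distinct *= Fraction(L - i, L)
    return 1 - p_distinct


# Usage: five terms, one selection function, 16-bit ranges.
bound = collision_bound(5, 1, 2 ** 16)
exact = exact_birthday(5, 2 ** 16)
```

The bound is deliberately coarse: it counts ordered pairs of terms, so it exceeds the birthday probability by roughly a factor of two for small inputs, but it still vanishes as $L$ grows.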

Proof (Theorem 6). Follows from Theorem 8, by taking $\Lambda(K) = \Lambda_{col}(K)$. □

Appendix B: Computing Our Probability Distribution

5.1 Definitions and results

We define additional types error and empty, and extend $[\![\cdot]\!]$ such that $[\![error]\!] = \{\bot\}$ and $[\![empty]\!] = \emptyset$. Furthermore, when $X \subseteq B^*$, we may use a type $T_X \notin T$, and further extend $[\![\cdot]\!]$ such that $[\![T_X]\!] = X$. Note that we still refer to $[\![\cdot]\!]$ as an interpretation function. Let $K$ be a finite set of terms and $\iota \in I(K)$ be a selection function for $K$. We define the supremum $\iota$-support function $supp_\iota\colon K \to T \cup \{error\}$ by $supp_\iota(t) = error$ if $\iota(t) = \bot$ and $supp_\iota(t) = \mathrm{ran}(\iota(t))$ otherwise.

&+P and &$

P -equivalence classes. If t " sub[K], we define

[t]$P ='(1$)#1(t) | t " 1$[sub[K]] \ N+

(

to be the 1$-equivalence class of t. Analogously,

[t]+P ='(1+)#1(t) | t " 1+[sub[K]]

(

is the 1$-equivalence class of t. Similarly, if K is a set of terms, we denoteby [K]$P (respectively, [K]+P ) the set of 1$-equivalence classes (respectively, 1+-equivalence classes) of terms in K. We define &+

P (respectively, &$P ) as the

equivalence relation such that t &+P t" (respectively, t &+

P ) if 1+(t) = 1+(t")(respectively, t &$

P t"). Since it is clear that &P!&$P , for each &$

P -equivalenceclass C, there is a &+

P -equivalence class C " such that, for all t " C, [t]$P = C ";for each C " [K]+P , we denote by [C]$P this unique class C ".

Let / be some subterm-compatible order. For each ' " &K and each C "[sub[K]]+P , we define supp+(C) to be ran(*(t)), with t being least t with respectto / such that t " C and *(t) *= .. In light of Lemma 6, supp+ is well-definedand, for each C " [sub[K]]+P and each t " C, either supp+(t) = . or supp+(C) !supp+(t). Note also that, if there is a " !0 such that a " C, then supp+(C) *=error.

Infimum support. Given a finite set of terms K, ' " &K and * " I(K), we definePWR,'(*) as the set of partitions P of sub[K]+TW for which there is p' " P such

that, for all t " sub[K] + TW , t " p' if and only if *(t) = .. If P " PWR,'(*), we

define the infimum (), *, P )-support function for each t " sub[K] as follows: Foreach t " sub[K], supp",+,P (t) is the smallest set such that:

– supp+(t) ! supp",+,P (t);– if t " K, then T"(t) " supp",+,P (t);– if t &P t", T " supp",+,P (t

"), and error /" {supp+(t), supp+(t")}, then T "supp",+,P (t).

For each class C " [K]$P there is a (finite) set T such that, for each t " C,we have supp",+,P (t) = T or supp+(t) = error. We define supp",+,P (C) for eachclass C " [K]$P as (1) empty if there is t " C such that supp+(t) = error andthere is T " supp",+,P (t) such that . /" !T ", and (2) supp",+,P (C) = T (where

T is as above) otherwise. Note that supp",+,P (C) = empty if and only if there ist " C such that supp+(t) = error and there is T " supp",+,P (t) such that eitherT " T or T = TB for some B ! B$ which does not contain .. Furthermore,supp",+,P (C) = error if and only if supp",+,P (t) = error for all t " C +K.

Function sets. Given a finite set of terms K, ' " &K , * " I(K), and P "PWR,'(K), let us denote by WC(K) the set [sub[K] + TW ]+P . We define the fol-

lowing sets:

– 0D(', *, P ) is the set of functions ): [sub[K]]+P % B$' such that, for all C "

[sub[K]]+P , )(C) " !supp+(C)";– 0U (', *, P ) is the set of functions ): [sub[K]]$P % B$

' such that, for all C "[sub[K]]$P , )(C) " supp",+,P (C) and, if C,C " " WC(sub[K]) are distinct,then )(C) *= )(C ").

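If $\lambda \in \Lambda_K$, we will also write $\Lambda_D(\lambda, \iota, P)$ and $\Lambda_U(\lambda, \iota, P)$ to denote the sets $\Lambda_D(\psi(\lambda), \iota, P)$ and $\Lambda_U(\psi(\lambda), \iota, P)$, where $\psi(\lambda) \in \Psi_K$ is defined by $\psi(\lambda)(t) = \{\lambda(t)\}$ for all $t \in K$.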

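Definition 3. We define the function $\mu_C : \Psi \to [0, 1]$ for each $\psi \in \Psi$ by

$$\mu_C(\psi) = \sum_{\iota \in I(K)} \left( \sum_{P \in \mathcal{P}^W_{R,\psi}(K)} \frac{|\Lambda_U(\psi, \iota, P)|}{|\Lambda_D(\psi, \iota, P)|} \right),$$

where $K = \mathrm{dom}(\psi)$.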
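To make the shape of this double sum concrete, the following Python sketch evaluates a toy analogue of it. It is an illustration under strong simplifying assumptions that are not part of the framework above: all terms are independent names sampled uniformly from a small finite domain, there are no property statements or rewriting rules (so only one selection function contributes), every partition of the names is admissible, and the helper names `set_partitions`, `count_injective` and `mu_toy` are hypothetical. For each partition, the numerator counts assignments that give one admissible value per block and distinct values to distinct blocks; the denominator counts all unconstrained assignments.

```python
from fractions import Fraction
from itertools import product


def set_partitions(items):
    """Yield every partition of a list as a list of blocks (lists)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in set_partitions(rest):
        # Put `first` into each existing block in turn ...
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        # ... or into a new block of its own.
        yield [[first]] + part


def count_injective(block_supports):
    """Count choices of one value per block with distinct values across blocks."""
    return sum(
        1 for vals in product(*block_supports) if len(set(vals)) == len(vals)
    )


def mu_toy(psi, domain):
    """Toy analogue of the sum: psi maps each name to its set of allowed values.

    Assumes every psi[name] is a subset of `domain`.
    """
    names = sorted(psi)
    denom = len(domain) ** len(names)  # all unconstrained assignments
    total = Fraction(0)
    for part in set_partitions(names):
        # Names in one block must share a value, so intersect their constraints.
        block_supports = [
            sorted(set.intersection(*(set(psi[n]) for n in block)))
            for block in part
        ]
        total += Fraction(count_injective(block_supports), denom)
    return total


# Two one-bit names: a must be "0", b is unconstrained.
example = mu_toy({"a": {"0"}, "b": {"0", "1"}}, {"0", "1"})
```

In this toy setting each satisfying assignment is counted exactly once, under the partition that records which names received equal values, so the result coincides with the probability that independent uniform samples satisfy the constraints.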

Theorem 9. If $K$ is a finite set of terms and $\psi, \psi' \in \Psi_K$ are such that $\Omega_\psi = \Omega_{\psi'}$, then $\mu_C(\psi) \in [0, 1]$ and $\mu_C(\psi) = \mu_C(\psi')$.

In light of Lemma 9, we use the symbol $\mu_C$ for the function $\mu_C : \Omega_\Psi \to [0, 1]$ defined, for each $\psi \in \Psi$, by $\mu_C(\Omega_\psi) = \mu_C(\psi)$.

Theorem 10. There exists a unique extension of $\mu_C$ to $\mathcal{F}$ that is a probability measure. Abusing notation and using the symbol $\mu_C$ to refer to this extension, we have $\mu_C = \mu_{\mathrm{ROM}}$.

5.2 Proofs

Lemma 10. For all valid setups $S$ and all finite sets of terms $K$, $I_S(K)$ is finite.

Proof. Let $t_1, \dots, t_n$ be such that $\mathrm{sub}[K] = \{t_1, \dots, t_n\}$ and $t_1 \prec \dots \prec t_n$. For $i \in \{0, \dots, n\}$, let $K_i = \{t_1, \dots, t_i\}$. We prove the result for each $K_i$ by induction on $i$. We have $I_S(K_0) = \emptyset$, and thus the result holds.

Now let $i \in \{1, \dots, n\}$, and suppose that the result holds for all $j \in \{1, \dots, i-1\}$. For each $t \in K_{i-1}$, let

$$\mathrm{supp}_{i-1}(t) = \bigcup_{\iota \in I_S(K_{i-1})} [\![\mathrm{ran}(\iota(t))]\!].$$

Suppose that $\iota_i \in I_S(K_i)$. For all $\omega \in \Omega$ such that $\omega \models \iota_i$, we also have $\omega \models \iota_i|_{K_{i-1}}$. Thus, $\iota_i|_{K_{i-1}} \in I_S(K_{i-1})$ and, letting $t_i = f_i(t'_1, \dots, t'_k)$, we have $\omega(t'_j) \in \mathrm{supp}_{i-1}(t'_j)$ for all $j \in \{1, \dots, k\}$. The induction hypothesis implies that $\mathrm{supp}_{i-1}(t'_j)$ is finite for all $j$. Therefore, the set $\mathrm{supp}_{i-1}(t'_1) \times \dots \times \mathrm{supp}_{i-1}(t'_k)$ is finite and, if $\omega \models \iota_i$, then

$$(\omega(t'_1), \dots, \omega(t'_k)) \in \mathrm{supp}_{i-1}(t'_1) \times \dots \times \mathrm{supp}_{i-1}(t'_k).$$

Since property statements are assumed to be disjoint, it follows that there exists a finite set $P \subseteq PS_{f_i}$ such that, whenever $\omega \models \iota_i$,

$$(\omega(t'_1), \dots, \omega(t'_k)) \in [\![\mathrm{dom}(ps)]\!]$$

for some $ps \in P$. It follows that

$$I_S(K_i) \subseteq I_S(K_{i-1}) \times (P \cup \{\bot\}),$$

and the induction hypothesis implies that $I_S(K_i)$ is finite. □
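The counting step at the heart of this argument is that a Cartesian product of finitely many finite sets is finite, with size equal to the product of the sizes. A minimal Python sketch of that step follows; the function name `argument_tuples` and the example supports are hypothetical and only stand in for the supports of the arguments of a term.

```python
from itertools import product


def argument_tuples(supports):
    """Enumerate the Cartesian product of finitely many finite supports."""
    return list(product(*supports))


# Hypothetical finite supports for the two arguments of a binary function symbol.
supp_1 = ["00", "01"]
supp_2 = ["0", "1", "bot"]
tuples = argument_tuples([supp_1, supp_2])

# The product is finite and has exactly |supp_1| * |supp_2| elements.
size = len(tuples)
```

Every tuple of argument values that can arise lies in this finite list, which is what bounds the number of property statements that can apply.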

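Corollary 3. There exists a function $\mathrm{supp}_{PS} : T_\Sigma \to \mathcal{P}(B^*_\bot)$ such that $\mathrm{supp}_{PS}(t)$ is finite for all $t \in T_\Sigma$ and, whenever $K$ is a finite set of terms and $\psi \in \Psi_K$, we have $\mu_C(\Omega_\psi) = \mu_C(\Omega_{\psi_{PS}})$, where $\psi_{PS} \in \Psi_K$ is given by $\psi_{PS}(t) = \psi(t) \cap \mathrm{supp}_{PS}(t)$ for all $t \in K$.

Proof. For each term $t$, let

$$\mathrm{supp}_{PS}(t) = \bigcup_{\iota \in I(\mathrm{sub}(t))} [\![\mathrm{ran}(\iota(t))]\!].$$

Lemma 10 implies that there are finitely many $\iota \in I_S(K)$; therefore, $\mathrm{supp}_{PS}(t)$ is finite.

It is simple to check that, if $\iota \in I_S(K)$, then $\iota|_{\mathrm{sub}(t)} \in I_S(\mathrm{sub}(t))$. Thus, if $\iota \in I_S(K)$, then $[\![\mathrm{ran}(\iota(t))]\!] \subseteq \mathrm{supp}_{PS}(t)$. For all $\iota \in I_S(K)$ and all $P \in \mathcal{P}^W_R(K)$, we have

$$[\![\mathrm{supp}_{\psi,\iota,P}(t)]\!] \subseteq [\![\mathrm{supp}_\iota(t)]\!] = [\![\mathrm{ran}(\iota(t))]\!] \subseteq \mathrm{supp}_{PS}(t),$$

and thus

$$[\![\mathrm{supp}_{\psi_{PS},\iota,P}(t)]\!] = [\![\mathrm{supp}_{\psi,\iota,P}(t)]\!] \cap \mathrm{supp}_{PS}(t) = [\![\mathrm{supp}_{\psi,\iota,P}(t)]\!].$$

It follows that $\Lambda_U(\psi, \iota, P) = \Lambda_U(\psi_{PS}, \iota, P)$ for all $\iota \in I_S(K)$ and all $P \in \mathcal{P}^W_R(K)$. Thus, we have $\mu_C(\Omega_\psi) = \mu_C(\Omega_{\psi_{PS}})$, as desired. □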

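Let $\psi \in \Psi$ and $K = \mathrm{dom}(\psi)$. For each $t \in K$, let

$$\mathrm{supp}_K(t) = \bigcup_{\iota \in I_S(K)} [\![\mathrm{ran}(\iota(t))]\!].$$

Recall the notion of $\Lambda(\psi)$. We now introduce the related notion $\Lambda_S(\psi)$, defined for each $\psi \in \Psi$ such that $\mathrm{dom}(\psi) = K$ as the set of all $\lambda : K \to B^*_\bot$ such that, for all $t \in K$, $\lambda(t) \in \psi(t) \cap \mathrm{supp}_K(t)$. Note that $\mathrm{supp}_K(t)$ is finite (by Lemma 10), and thus so is $\Lambda_S(\psi)$.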

Lemma 11. Let ' " &K , ) " 0S('), * " IS(K) and P " PWR (K). If * =

*()) *= ., P = W ()), and there exists )u " 0U (', *()), P ())) such that, foreach t " sub[K]\N $, )(t) = . if supp+(t) = error and )(t) = )u([t]$P ) otherwise,then %%0U (), *, P )

%% = 1.

Otherwise, %%0U (), *, P )%% = 0.

Proof. We prove the following five propositions:

(1) if *()) = . or * *= *()), then%%0U (), *, P )

%% = 0;(2) if P *= W ()), then

%%0U (), *, P )%% = 0;

(3) if there is )u " 0U (), *, P ), then )u " 0U (', *, P ) and, for each t " sub[K],)(t) = . if supp+(t) = error and )(t) = )u([t]$P ) otherwise;

(4) there is at most one )u " 0U (), *, P );(5) if there is )u " 0U (', *, P ) such that, for each t " sub[K], )(t) = . if

supp+(t) = error and )(t) = )u([t]$P ) otherwise, then )u " 0U (), *, P ).

Properties (4) and (5) show that, under the conditions of the lemma, $|\Lambda_U(\lambda, \iota, P)| = 1$. Properties (1), (2) and (3) show that $|\Lambda_U(\lambda, \iota, P)| = 0$ otherwise.

(1) If * *= *()) or *()) = ., then either (1) there is t " sub[K] such that)(t) /" !ran(*(t))" or (2) there is t = f(t1, . . . , tn) " sub[K] such that, letting*(t) = (f [T1, . . . , Tn] ! T ), we have )(ti) /" !Ti" for some i " {1, . . . , n}. In case(1), we have {)(t)} !

#supp$,+,P ([t]

$P )

$, and !ran(*(t))" !

#supp$,+,P ([t]

$P )

$.

We conclude that%%#supp$,+,P (t)

$%% = 0, and thus%%0U (), *, P )

%% = 0. Analogously,

in case (2), we have {)(ti)} !#supp$,+,P ([ti]

$P )

$, and !Ti" !

#supp$,+,P ([ti]

$P )

$,

and we conclude that%%#supp$,+,P (ti)

$%% = 0, so that%%0U (), *, P )

%% = 0.

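(2) If $P \neq W(\lambda)$, we have $P|_{\lambda^{-1}(B^*)} \neq P(\lambda)|_{\lambda^{-1}(B^*)}$, and there are $t, t' \in \mathrm{sub}[K] \cap T_W \cap \lambda^{-1}(B^*)$ such that either (1) $t \approx_P t'$ and $\lambda(t) \neq \lambda(t')$ or (2) $t \not\approx_P t'$ and $\lambda(t) = \lambda(t')$. Obviously, it is sufficient to prove that $\Lambda_U(\lambda, \iota, P) = \emptyset$.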

In case (1), we have#supp$,+,P

$([t]$P ) ! )(t) and

#supp$,+,P

$([t"]$P ) ! )(t");

since [t]$P = [t"]$P , it follows that#supp$,+,P ([t

"]$P )$= #, and thus 0U (), *, P ) =

#.In case (2), we have

#supp$,+,P

$([t]$P ) ! )(t) and

#supp$,+,P

$([t"]$P ) !

)(t"). Therefore, if ): [sub[K]]$P % B$', we must have )([t]$P ) = )([t"]$P ). Since

[t]$P , [t"]$P " WC(K) and [t]$P *= [t"]$P , we again conclude that 0U (), *, P ).

(3) For each t " sub[K], we have supp$,+,P ! supp",+,P ; therefore, 0U (), *, P ) !

0U (', *, P ).

– We have#supp$,+,P (t)

$! !supp+(t)"; thus,

supp+(t) = error 4#supp$,+,P

$! {.} .

We also have#supp$,+,P (t)

$! {)(t)}; thus,

)(t) *= . 4#supp$,+,P (t)

$= #.

In this case there can be no function in 0U (), *, P ), contradicting the hy-pothesis.

– If supp+(t) *= error, then#supp$,+,P ([t]

$P )

$! {)(t)}. Since )u " 0U (), *, P ),

we must have )u([t]$P ) = )t.

(4) For each C " [sub[K]]$P , either there is t " K +C such that supp+(t) *= erroror not. If there is no such t, then we have

#supp$,+,P (C)

$= {.}. Otherwise,#

supp$,+,P ([t]$P )

$! {)(t)}. Thus, for all C " [sub[K]]$P ,

#supp$,+,P (C)

$is a set

with a single element. Since )u(C) "#supp$,+,P (C)

$for all )u " 0U (), *, P )

and all C " [sub[K]]$P , it follows that 0U (), *, P ) has at most one element.

if there is )u " 0U (', *, P ) such that, for each t " sub[K], )(t) = . ifsupp+(t) = error and )(t) = )u([t]$P ) otherwise, then )u " 0U (), *, P ).

(5) It is sufficient to prove that, for each $t \in \mathrm{sub}[K]$, $\lambda_u([t]^*_P) \in [\![\mathrm{supp}_{\lambda,\iota,P}([t]^*_P)]\!]$.

Let t " sub[K] and C = [t]$P . If there exists t" " C such that supp+(t") *= error,

then#supp$,+,P (C)

$!

#supp",+,P (C)

$+{)(t")}; because )u(C) = )(t"), we have

)(t") "#supp",+,P (C)

$. In this case, for all t" " C, we have )u(C) = {)(t")} and

{)(t")} =#supp",+,P (C)

$. If no such t exists, then we have

#supp",+,P (C)

$=#

supp",+,P (C)$= {.}, and since )u " 0U (', *, P ), we have )u(C) = . and

)u(C) "#supp$,+,P (C)

$. 67

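Lemma 12. Let $\psi \in \Psi_K$. We have

$$\mu_C(\psi) = \sum_{\lambda \in \Lambda(\psi)} \mu_C(\lambda).$$

Proof. We have to show that

$$\sum_{\lambda \in \Lambda(\psi)} \mu_C(\lambda) = \sum_{\iota \in I_S(K)} \left( \sum_{P \in \mathcal{P}^W_R(K)} \frac{|\Lambda_U(\psi, \iota, P)|}{|\Lambda_D(\psi, \iota, P)|} \right). \tag{7}$$

We have

$$\mu_C(\lambda) = \sum_{\iota \in I_S(K)} \left( \sum_{P \in \mathcal{P}^W_R(K)} \frac{|\Lambda_U(\lambda, \iota, P)|}{|\Lambda_D(\lambda, \iota, P)|} \right).$$

Noting that $\Lambda_D(\psi, \iota, P) = \Lambda_D(\lambda, \iota, P)$, we conclude

$$\sum_{\lambda \in \Lambda(\psi)} \mu_C(\lambda) = \sum_{\iota \in I(K)} \left( \sum_{P \in \mathcal{P}^W_R(K)} \left( \sum_{\lambda \in \Lambda(\psi)} \frac{|\Lambda_U(\lambda, \iota, P)|}{|\Lambda_D(\lambda, \iota, P)|} \right) \right). \tag{8}$$

By Lemma 11, for each $\lambda \in \Lambda(\psi)$, each $\iota \in I(K)$, and each $P \in \mathcal{P}^W_R(K)$, $|\Lambda_U(\lambda, \iota, P)| = 1$ if $\iota = \iota(\lambda)$, $P = P(\lambda)$, and there exists $\lambda_u \in \Lambda_U(\lambda, \iota, P)$ such that, for each $t \in K$, $\lambda(t) = \bot$ if $\mathrm{supp}_\iota(t) = \mathrm{error}$ and $\lambda(t) = \lambda_u([t]^*_P)$ otherwise. Otherwise, $|\Lambda_U(\lambda, \iota, P)| = 0$. It is simple to check that, for each $\lambda_u \in \Lambda_U(\psi, \iota, P)$, there is one and only one $\lambda \in \Lambda(\psi)$ such that this condition is satisfied. Thus, for each $\iota \in I(K)$ and each $P \in \mathcal{P}^W_R(K)$, there are $|\Lambda_U(\psi, \iota, P)|$ functions $\lambda \in \Lambda(\psi)$ such that $|\Lambda_U(\lambda, \iota, P)| = 1$, and $|\Lambda_U(\lambda', \iota, P)| = 0$ for all other $\lambda' \in \Lambda(\psi)$. We obtain

$$\sum_{\lambda \in \Lambda(\psi)} |\Lambda_U(\lambda, \iota, P)| = |\Lambda_U(\psi, \iota, P)|.$$

Combining this equality and (8), we obtain (7), concluding the proof. □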
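The identity just proved says that the weight of a constraint is the sum of the weights of the individual assignments it admits. The Python sketch below checks the analogous identity in a deliberately simplified setting that is an assumption of this illustration, not the measure defined above: terms are independent and each is sampled uniformly from a finite domain. The function names `mu_uniform`, `point_assignments` and `additive_sum` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def mu_uniform(psi, domain):
    """Probability that independent uniform samples land in psi[t] for every t."""
    p = Fraction(1)
    for allowed in psi.values():
        p *= Fraction(len(set(allowed) & set(domain)), len(domain))
    return p


def point_assignments(psi):
    """Yield every assignment that picks one allowed value for each term."""
    terms = sorted(psi)
    for vals in product(*(sorted(psi[t]) for t in terms)):
        yield dict(zip(terms, vals))


def additive_sum(psi, domain):
    """Sum the weights of the single-assignment constraints admitted by psi."""
    return sum(
        (
            mu_uniform({t: {v} for t, v in assignment.items()}, domain)
            for assignment in point_assignments(psi)
        ),
        Fraction(0),
    )


domain = {"0", "1"}
psi = {"a": {"0"}, "b": {"0", "1"}}
lhs = mu_uniform(psi, domain)    # weight of the constraint as a whole
rhs = additive_sum(psi, domain)  # sum over the assignments it admits
```

Exact rational arithmetic via `Fraction` is used so that the two sides can be compared for equality rather than up to rounding.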

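Lemma 13. Let $\psi_i \in \Psi$ for each $i \in \{1, \dots, n\}$. Suppose that the sets $\Omega_{\psi_1}, \dots, \Omega_{\psi_n}$ are pairwise disjoint, and that $\psi \in \Psi$ is such that $\Omega_\psi = \bigcup_{i=1}^{n} \Omega_{\psi_i}$. Then,

$$\mu_C(\Omega_\psi) = \sum_{i=1}^{n} \mu_C(\Omega_{\psi_i}).$$

Proof. Let $K = \bigcup_{i=1}^{n} \mathrm{dom}(\psi_i)$. For each $i$, there is $\psi'_i \in \Psi_K$ such that $\Omega_{\psi'_i} = \Omega_{\psi_i}$. Furthermore, there is $\psi' \in \Psi_K$ such that $\Omega_{\psi'} = \Omega_\psi$. Now we note that

$$\Lambda(\psi') = \dot{\bigcup_{i=1}^{n}} \Lambda(\psi'_i),$$

and the result follows from Lemmas 9 and 12:

$$\mu_C(\Omega_\psi) = \mu_C(\Omega_{\psi'}) = \sum_{\lambda \in \Lambda(\psi')} \mu_C(\Omega_\lambda) = \sum_{i=1}^{n} \left( \sum_{\lambda \in \Lambda(\psi'_i)} \mu_C(\Omega_\lambda) \right) = \sum_{i=1}^{n} \mu_C(\Omega_{\psi'_i}) = \sum_{i=1}^{n} \mu_C(\Omega_{\psi_i}).$$

□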

Lemma 14. Suppose that $\mu_C(\emptyset) = 0$, $\mu_C(\Omega) = 1$ and, whenever $\psi_1, \dots, \psi_n \in \Psi$ are such that $\Omega_{\psi_1}, \dots, \Omega_{\psi_n}$ are disjoint sets with $\bigcup_{i=1}^{n} \Omega_{\psi_i} = \Omega_\psi$ for some $\psi \in \Psi$, we have

$$\mu_C(\Omega_\psi) = \sum_{i=1}^{n} \mu_C(\Omega_{\psi_i}).$$

Then, there is a unique extension of $\mu_C$ to $\mathcal{F}$ that is a probability measure.

Proof. The proof is entirely similar to the proof of Lemma 9. The only difference is that instead of considering $\mathrm{supp}_{PS}$ as in Lemma 7, we consider $\mathrm{supp}_{PS}$ as in Corollary 3. □

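Proof (Theorem 10). Let $\psi = \emptyset$. We have $\Omega = \Omega_\psi$, and it is trivial to check that $\mu_C(\psi) = 1$. Moreover, if $\psi = \{t \mapsto \emptyset\}$ for some term $t$, we have $\emptyset = \Omega_\psi$, and $\mu_C(\psi) = 0$. Therefore, to conclude that $\mu_C$ is a probability measure it suffices to combine Lemmas 14 and 13.

Now, $\mu_C$ and $\mu_{\mathrm{ROM}}$ are probability distributions (by Theorems 10 and 5), and thus the set $\Omega_\Lambda = \{\Omega_\lambda \mid \lambda \in \Lambda\}$ is a generator of $\mathcal{F}$. Therefore, proving that $\mu_C(\Omega_\lambda) = \mu_{\mathrm{ROM}}(\Omega_\lambda)$ for all $\lambda \in \Lambda$ is sufficient to prove that $\mu_C = \mu_{\mathrm{ROM}}$. Let $K = \mathrm{sub}[\mathrm{dom}(\lambda)]$, and let $\lambda' = \lambda \cup \{t \mapsto B^*_\bot \mid t \in K \setminus \mathrm{dom}(\lambda)\}$. We have $\Omega_\lambda = \Omega_{\lambda'}$, and thus $\mu_C(\Omega_\lambda) = \mu_C(\Omega_{\lambda'})$ and $\mu_{\mathrm{ROM}}(\Omega_\lambda) = \mu_{\mathrm{ROM}}(\Omega_{\lambda'})$, by Theorems 9 and 4. Thus, we can assume without loss of generality that $\mathrm{dom}(\lambda) = \mathrm{sub}[\mathrm{dom}(\lambda)] = K$.

Lemma 4 and Lemma 11 imply that, if $\lambda$ does not satisfy $P$ or there is no $\iota \in I_S(K)$ such that $\lambda \models \iota$, then $\mu_{\mathrm{ROM}}(\lambda) = \mu_C(\lambda) = 0$ (note that all summands in the sum that defines $\mu_C$ are 0), and the result is proved.

On the other hand, if this is not the case, we have

$$\mu_C(\Omega_\lambda) = \frac{|\Lambda_U(\lambda, \iota, P)|}{|\Lambda_D(\lambda, \iota, P)|}$$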

and, by Corollary 2,

µROM(%$) =5

t+!)+[sub[K]]\N+

1

|!ran(*(1K(t+)))"| ,

where 1K : 1+[sub[K]] % sub[K] is the function such that, for each t+ " 1+[sub[K]],1K(t+) is the least t " sub[K] with respect to / such that 1+(1K(t+)) = t+ and*(t) *= . if such a t exists, and . otherwise.

By Lemma 11, we have%%0U (), *, P )

%% = 1. On the other hand, the definitionsof

%%0D(), *, P )%% and supp+ imply that

%%0D(), *, P )%% =

5

C![sub[K]]+P

|!supp+(C)"| =5

t+!)+[sub[K]]\N+

%%#ran(*(1K(t+)))$%% .

Combining the above equations, we conclude that $\mu_C(\Omega_\lambda) = \mu_{\mathrm{ROM}}(\Omega_\lambda)$, proving the result. □
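The final step rests on a simple arithmetic fact: when exactly one admissible function remains and the denominator is a product of support sizes, the resulting ratio equals the product of the reciprocals of those sizes. The Python sketch below checks this fact on made-up numbers; the function names and the list of support sizes are hypothetical and do not come from the framework itself.

```python
from fractions import Fraction
from math import prod


def ratio_from_counts(support_sizes):
    """One admissible function divided by the product of the support sizes."""
    return Fraction(1, prod(support_sizes))


def product_of_reciprocals(support_sizes):
    """Product of 1/size over all supports, as in a per-term uniform sampling."""
    p = Fraction(1)
    for n in support_sizes:
        p *= Fraction(1, n)
    return p


# Hypothetical support sizes for three equivalence classes of terms.
sizes = [4, 8, 2]
counting_side = ratio_from_counts(sizes)
sampling_side = product_of_reciprocals(sizes)
```

Both values are exact rationals, so they can be compared directly.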