


Algorithms for Atomicity

Azadeh Farzan

Carnegie Mellon University

[email protected]

P. Madhusudan

Univ. of Illinois at Urbana-Champaign

[email protected]

Abstract

We study the algorithmics of checking atomicity in concurrent programs. Unearthing fundamental results behind scheduling algorithms in database concurrency control, we first build space-efficient monitoring algorithms for checking atomicity. Second, by interpreting the monitoring algorithm as a deterministic automaton, we solve several key model checking problems for checking atomicity of finite-state concurrent models. We establish two main results: that checking atomicity for concurrent regular Boolean programs is PSPACE-complete, and that checking concurrent recursive programs is decidable. The results are based on summarizing programs using profiles of runs, and checking whether the profiles witness non-serializability.

1. Introduction

Correct concurrent programs are way harder to write than sequential ones. Correct sequential programs are already hard to write, maintain, and test, though the tremendous effort over the last 20 years in finding bugs in programs has yielded some tractable approaches and tools to assure correctness. The advent of multi-core technologies and the increasing use of threads and communicating modules in software design has brought all concurrency issues to the forefront. Consequently, one of the most important problems in software analysis is to understand the concurrency idioms used in practice, and leverage that understanding to build testing and verification tools.

A large class of concurrent programs, especially application programs, are loosely coupled and sequentially reasoned. Programmers are much better at thinking about sequential programs, and they often write modules and methods to be first correct when run sequentially, without interruptions from other threads. These sequential blocks of code are then encased within some simple concurrency-control mechanisms, such as obtaining locks before accessing global variables, to mutually exclude threads while they access global data, hence preventing non-trivial interactions between threads that may lead to unexpected errors.


The above view motivates a remarkable generic specification for concurrent programs called atomicity. Consider a concurrent program partitioned into certain blocks of code, which correspond to sequential logical units, like methods in a class. A run r of such a program is said to be serializable if we can find another run r′ that is equivalent to r and where r′ is a serial execution of the blocks of code in r. In other words, every interleaved run is equivalent to a run where code blocks are not interrupted. A program is said to be atomic if all runs are serializable.

Atomicity is clearly a very desirable property. Intuitively, assuming a concurrent program has been reasoned to be sequentially correct, atomicity assures that every concurrent run is equivalent to a serial run, and hence correct. Hence the mechanism of interleaving the threads in an execution never leads to new errors in an atomic program. Proving atomicity of a program hence reduces the complexity of the program, as it effectively removes the concurrency from it. Atomic programs can be tested serially, subject to sequential reasoning, and model-checked using well-studied algorithms and tools for sequential programs.

The notion of serializability was introduced in database concurrency control. When transactions concurrently access a database, they expect a serial uninterrupted execution of their code in order to ensure correctness and consistency (for example, a transaction could debit one bank account and credit another without interruptions). Database schedulers must run transactions in parallel (for efficiency and deadlock prevention), and must therefore install a mechanism that ensures that the interleaved run executed on the database is serializable, i.e. that the updates to the database are consistent with some serial execution of the transactions.

The definition of serializability crucially depends on what one defines to be the equivalence between runs. The only tractable way of viewing this notion is to abstract from the actual data read and written by the threads, and assume that the transactions compute (and write) some uninterpreted function of the values they have read. In this setting, the most general definition of equivalence will demand that the views every transaction sees along the runs are the same. It turns out that this notion, view serializability, is too hard to check (NP-hard even for a single run) and is not well-behaved (for example, view-serializability is not monotonic, i.e. it is possible that a schedule is view-serializable but a projection of it is not!).

The most well-studied and useful notion of serializability is conflict serializability. Intuitively, we declare events to be dependent if they cannot be commuted: so, two accesses to the same variable are dependent if one of them is a write. Two runs r and r′ are said to be equivalent if we can obtain one from the other by commuting only independent events in the run. A run is conflict-serializable if it has an equivalent serial run. We study only conflict-serializability in this paper, and will henceforth refer to it as simply serializability, and to the corresponding notion for programs as atomicity.

1 2007/7/17

The objective of this paper is to study the algorithmics of checking serializability. We study two main problems: (a) monitoring runs of concurrent code for serializability violations, and (b) model checking finite-state models of programs for atomicity.

Conflict-serializable schedulers work by essentially implementing a monitor for detecting non-serializable schedules; these schedules necessitate aborting certain transactions (with appropriate roll-backs). However, analyzing concurrent programs is different from studying database schedulers in many ways. First, the control mechanism for ensuring serializability in database systems lies with the scheduler, and not the transactions, while in concurrent programs, the code at individual threads must employ control mechanisms (such as locks) to ensure serializability. Second, scheduling transactions is a much more complex problem, as it must schedule actions and handle the complexities of rollback, aborts and deadlock prevention. In contrast, monitoring is passive and demands only the detection of non-serializable runs. Third, in software, threads represent sequential executions of transactions, while databases treat all transactions as concurrent.

The first result of this paper is an adaptation of scheduling algorithms in databases to build space-efficient monitors for serializability. Serializability checkers maintain a conflict graph, which is a graph depicting the precedence order imposed by the run on the transactions. A run is serializable iff this graph remains acyclic; however, this graph can grow very large, as it stores the information of all transactions that occur in the run. It turns out that designing monitors that use only a bounded amount of space (more precisely, space related only to the number of active threads and entities accessed, and independent of the length of the run) is a subtle problem. A tempting idea is to remove completed transactions from the conflict-graph, replacing them with transitive edges that summarize their effect. In fact, in a paper by Alur, McMillan and Peled [2], automata-based algorithms were designed for checking serializability that overlooked this subtlety and deleted transactions, resulting in an erroneous algorithm.

It turns out that the question of when completed transactions can be removed is well-studied in database theory [11, 15]. Unearthing techniques from designs of database schedulers, we obtain bounded-space algorithms for monitoring runs for serializability. The effort here on our part was in understanding the scheduling mechanisms and adapting them to software programs, with special regard to handling threads, and removing the complexity of handling aborts and roll-backs. The scheduling algorithms we use are essentially contained in four works: two textbooks on database theory [15, 4], a paper on the combinatorics of removing completed transactions and related deletion policies on the conflict-graph [11], and a volume devoted to concurrency control by Casanova [5].

Turning to model checking, observe that when there are a finite number of threads and entities, the monitoring algorithm designed above uses a bounded amount of space, and hence can be interpreted as a deterministic finite automaton. This opens up automata-theoretic techniques for model checking programs. Intuitively, any concurrent program that can be modeled using a finite automaton is effectively checkable for atomicity violations. We prove that for concurrent Boolean programs (where each thread runs a program with a regular control structure and where all variables are interpreted over finite domains), the model checking problem for atomicity is PSPACE-complete. Note that a similar claim appears in [2], but is incorrect due to the grave error we mentioned.

The above mechanism does not, however, lead to checking concurrent programs where each thread runs a recursive control (i.e. models recursive calls). In order to understand serializability checking better, we embark on a study of the exact complexity of model checking various restricted program models: straight-line programs, branching programs generating single transactions, and branching programs generating iterative transactions.

We show that straight-line programs are checkable in polynomial time; the proof involves establishing a normal form for witnesses of non-serializability of runs. This setting is also motivated by testing: given a test for a concurrent program, each thread can be seen to execute a single line of code, and our result shows that checking whether all interleavings of it are serializable can be done efficiently.

Moving on to branching programs, we establish that checking non-atomicity is NP-complete, using a new witness for non-serializability that picks at most two events from each thread. Every non-serializable run essentially contains at most two events in each thread (which we call a profile) that lead to a conflicting cycle.

The above study leads to our second main result: the problem of checking programs that execute recursive control in each thread admits a decidable model checking problem for atomicity. We find this result surprising, as checking most global properties of recursive concurrent programs is undecidable (and serializability is a global property). Our decision procedure works by extracting the profiles locally from each recursive program, and combining the profiles to check for serializability violations.

In summary, the paper presents for the first time the algorithmics of checking serializability in threaded software. While many atomicity checkers for programs have been proposed, all of them resort to some approximation of serializability, either using the notion of reductions by Lipton [14] or other heuristic approximations. The main contribution of the paper is to expose the fundamental ideas of serializability existing in the database literature, and to exploit them to build space-efficient atomicity checkers, obtaining crisp algorithms and complexity results that expose the essential elements of checking atomicity.

Related Work

Atomicity is a new notion of correctness for concurrent programs. The more classical notion is race-checking: a race is a pair of accesses to the same variable by different threads, where one of the accesses is a write [19, 20, 21]. Data-races also signal unintended synchronizations of code, and are routinely used to find errors in concurrent programs (see [20, 21] for testing and runtime checking for data-races and the work in [22, 23] for static data-race analysis). It is well known that data-races are often benign, and lack of races is not a property that one expects in all correct programs. It has been suggested [9, 8, 17, 16] that atomicity notions based on serializability are more appropriate and yield fewer false positives; some practical tools built for serializability actually demonstrate this [18]. Most work in software verification for atomicity errors is based on the Lipton-transactional framework. Lipton transactions are sufficient (but not necessary) thread-local conditions that ensure serializability [14]. Flanagan and Qadeer developed a type system for atomicity [9] based on Lipton transactions. However, type inference for the type system is NP-complete, and hence the type system may require manual annotation of the program.

Runtime analysis is less complete than static analysis and cannot prove a program atomic; it is however more precise, as only valid runs of the program are examined, and it gives fewer false alarms. Wang and Stoller study run-time algorithms for various notions of serializability; the block-based algorithm of [16] checks view serializability of runs, which is expensive, as checking view-serializability of even a single run is an NP-complete problem. Reduction-based algorithms for checking atomicity [8, 9, 16] classify events based on commutativity and apply Lipton's reduction theorem [14]. As discussed in [17], reduction-based algorithms check for a notion of atomicity that is strictly stronger than conflict serializability, and therefore may report some programs that are conflict serializable as non-atomic.

In [17] two variants of serializability, namely conflict and view, are discussed, and algorithms for dynamic detection of violations of conflict serializability and view serializability are presented. These algorithms are off-line algorithms, i.e. they are applied to the recorded execution of the program after it has finished. Our monitoring algorithm is, in contrast, an online algorithm. Also, the work in [17] does not give precise complexity bounds, and often resorts to simpler approximations of serializability: while checking view serializability for a single run is NP-complete, checking it for a program of the above type will also be NP-complete.

A variant of the dynamic two-phase locking algorithm [15] for detection of serializability violations is used in the atomicity tool developed in \cite{1065013}. As discussed in [15], the set of runs that are detected as atomic by a two-phase locking algorithm is a strict subset of the set of conflict-serializable runs.

Model checking has also been used to check atomicity using Lipton's transactions [8, 12].

2. Modeling Runs of Concurrent Programs

A program consists of a set of threads that run concurrently. Each thread sequentially runs a series of transactions. A transaction is a sequence of actions; each action can be reading or writing to a (global) variable.

We assume an infinite but countable set of thread identifiers T = {T1, T2, . . .}. We also assume a countable set of entity names (or just entities) X = {x1, x2, . . .} that the threads can access. Each thread can perform actions from a set of actions A defined as: A = {read(x), write(x) | x ∈ X}.

Let us define the alphabet of events of a thread T ∈ T to be ΣT = {T:a | a ∈ A} ∪ {T:B, T:C} ∪ {BegT, EndT}. The events T:read(x) and T:write(x) correspond to thread T reading and writing entity x, T:B and T:C correspond to boundaries that begin and end transactional blocks of code in thread T, while BegT and EndT denote the creation and termination of the thread T itself. Let Σ = ⋃T∈T ΣT denote the set of all events.

For any alphabet A and w ∈ A∗, let w[i] denote the i'th element of w, and w[i, j] denote the substring from position i to position j in w. For w ∈ A∗ and B ⊆ A, let w|B denote the word w projected to the letters in B. For a word σ ∈ Σ∗, let σ|T be a shorthand notation for σ|ΣT, which denotes the actions that thread T partakes in.

The following defines the notion of behaviors of a concurrent program, which we call a schedule (inspired by the database connection). A schedule is simply an execution of the program in which thread executions are interleaved in an arbitrary fashion.

Definition 1 A schedule is a word σ ∈ Σ∗ such that for each T ∈ T , σ|T is a prefix of

BegT · [(T:B) · ({T:a | a ∈ A})∗ · (T:C)]∗ · EndT.

In other words, the actions of thread T start with BegT and end with EndT, and the actions within are divided into a sequence of transactions, where each transaction begins with T:B, is followed by a sequence of reads and writes, and ends with T:C. Let Sched denote the set of all schedules.
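As an illustrative sketch (the tuple encoding of events below is our own, not the paper's), the shape required by Definition 1 can be checked with a small per-thread state machine; since σ|T need only be a *prefix* of the regular expression, any prefix of a well-formed word is also accepted:

```python
# Events: ('Beg', T), ('End', T), (T, 'B'), (T, 'C'),
# or (T, 'read'/'write', x). This encoding is an assumption of the sketch.

def thread_of(e):
    return e[1] if e[0] in ('Beg', 'End') else e[0]

def is_schedule(word):
    # Per-thread states: 'new' -> 'outside' <-> 'inside' -> ... -> 'ended'
    state = {}
    for e in word:
        t = thread_of(e)
        s = state.get(t, 'new')
        if e[0] == 'Beg':
            if s != 'new': return False          # Beg must come first
            state[t] = 'outside'
        elif e[0] == 'End':
            if s != 'outside': return False      # End only between transactions
            state[t] = 'ended'
        elif e[1] == 'B':
            if s != 'outside': return False      # open a transaction block
            state[t] = 'inside'
        elif e[1] == 'C':
            if s != 'inside': return False       # close the transaction block
            state[t] = 'outside'
        else:                                    # read/write actions
            if s != 'inside': return False
    return True
```

Because the machine never requires a thread to reach 'ended', every prefix of a schedule is itself a schedule, matching the prefix-closure in the definition.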

A transaction tr of thread T is a word of the form T:B w T:C, where w ∈ {T:a | a ∈ A}∗. Let TranT denote the set of all transactions of thread T, and let Tran denote the set of all transactions.

When we refer to two particular events σ[i] and σ[j] in σ, we say they belong to the same transaction if they belong to the same transaction block: i.e. if there is some T such that σ[i] = T:a and σ[j] = T:a′, where a, a′ ∈ A, and there is no i′, i < i′ < j, such that σ[i′] = T:C. We will refer to the transaction blocks freely and associate (arbitrary) names with them, using notations such as tr, tr1, tr′, etc.

Example 2 The following is an example of a schedule:

BegT1 T1:B T1:read(x) T1:write(y) BegT2 T2:B T2:read(z) T1:write(z) T2:write(y) T1:C T1:B T2:C T1:write(x) EndT2 T1:C EndT1

The subwords

T1:B T1:read(x) T1:write(y) T1:write(z) T1:C

and

T1:B T1:write(x) T1:C

and

T2:B T2:read(z) T2:write(y) T2:C

are transactions in the schedule.

Defining atomicity

We now define atomicity as the notion of conflict serializability inthe database literature.

The dependency relation D is a symmetric relation defined over the events in Σ, and captures the dependency between (a) two events accessing the same entity, where one of them is a write, and (b) any two events of the same thread, i.e.,

D = {(T1:a1, T2:a2) | T1 = T2 and a1, a2 ∈ A ∪ {B, C}, or
        ∃x ∈ X such that
            (a1 = read(x) and a2 = write(x)) or
            (a1 = write(x) and a2 = read(x)) or
            (a1 = write(x) and a2 = write(x))}
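Restricted to the T:a events, D can be written as a simple predicate; this is a sketch, and the tuple encoding of events, (T,'B'), (T,'C'), (T,'read',x) and (T,'write',x), is our own assumption:

```python
def dependent(e1, e2):
    # Events encoded as (T,'B'), (T,'C'), (T,'read',x) or (T,'write',x).
    if e1[0] == e2[0]:
        return True                       # same thread: always dependent
    if len(e1) == 3 and len(e2) == 3 and e1[2] == e2[2]:
        return 'write' in (e1[1], e2[1])  # same entity, at least one write
    return False
```

The predicate is symmetric by construction, mirroring the symmetry of D.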



Definition 3 (Equivalence of schedules) The equivalence of schedules is defined as the smallest equivalence relation ∼ ⊆ Sched × Sched such that the following condition holds:

• if σ = ρabρ′ and σ′ = ρbaρ′ are in Sched with (a, b) ∉ D, then σ ∼ σ′.

It is easy to see that the above notion is well-defined. Two schedules are considered equivalent if we can derive one schedule from the other by iteratively swapping consecutive independent actions in the schedule. It is clear (given that two actions accessing an entity are independent only if both are reads) that equivalent schedules produce the same valuation of all entities. Formally, assuming each thread has its view limited to the entities it has read, we can show that no matter what the domain of the entities is, what functions the individual threads may compute, and how many local variables a thread may have, the final values of both local variables and global entities remain unchanged when executing two equivalent schedules. This notion is called view serializability, which treats threads in programs as executing uninterpreted functions of what they view; conflict serializability (or atomicity) implies view serializability [15].

An alternative way to define equivalence is to take the condition inDefinition 3 above as a relation and take schedule equivalence tobe the reflexive and transitive closure of that relation. Yet anotherequivalent definition is to define σ equivalent to σ′ provided, forevery two dependent letters a, b ∈ Σ, σ|{a,b} = σ′|{a,b}.
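The last characterization (agreement of the projections onto every pair of dependent letters) suggests a direct, if quadratic, equivalence check. A minimal sketch, parameterized by a dependency predicate; the event encoding and the helper name `dep` are our own assumptions:

```python
def equivalent(s1, s2, dep):
    """Test sigma ~ sigma' by comparing, for every pair of dependent
    letters (a, b), the projections s|{a,b} of the two schedules."""
    # The schedules must use the same multiset of letters.
    if sorted(map(repr, s1)) != sorted(map(repr, s2)):
        return False
    letters = set(s1)
    for a in letters:
        for b in letters:
            if dep(a, b):
                p1 = [e for e in s1 if e in (a, b)]   # s1 projected to {a,b}
                p2 = [e for e in s2 if e in (a, b)]   # s2 projected to {a,b}
                if p1 != p2:
                    return False
    return True
```

Swapping two adjacent independent events preserves every such projection, so the check accepts exactly the commutation-closure described in Definition 3 (on finite schedules).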

The ideal way to view a schedule σ is actually as a Mazurkiewicz trace [6], where each individual letter in σ is an event, and there is a partial order (called the causality order) that orders σ[i] before σ[j] if and only if i < j and σ[i] D σ[j]. When viewed as such a partial order, σ′ is equivalent to σ iff σ′ is a linearization of the trace defined by σ.

We call a schedule σ serial if all the transactions in it occur atomically: formally, for every i, if σ[i] = T:a where T ∈ T and a ∈ A, then there is some j < i such that σ[j] = T:B and every j′ with j < j′ < i is such that σ[j′] ∈ ΣT. In other words, the schedule is made up of a sequence of complete transactions from different threads, interleaved at boundaries only (the final transaction of a thread may be incomplete, but even then the actions in each incomplete transaction must occur atomically).
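The serial condition can be verified in a single pass; a minimal sketch (the event encoding is our own assumption, and the thread-creation events Beg/End are omitted for brevity since they may interleave freely):

```python
def is_serial(schedule):
    """Check that no transaction is interrupted by another thread's events.
    Events: (T,'B'), (T,'C'), (T,'read',x), (T,'write',x)."""
    open_thread = None            # thread whose transaction is in progress
    for e in schedule:
        t = e[0]
        if e[1] == 'B':
            if open_thread is not None:
                return False      # a transaction opened inside another
            open_thread = t
        elif e[1] == 'C':
            if open_thread != t:
                return False
            open_thread = None
        else:
            if open_thread != t:
                return False      # action outside its own open transaction
    return True
```

A trailing incomplete transaction (an open block at the end of the word) is accepted, as in the definition.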

Definition 4 A schedule is serializable if it has an equivalent serial schedule. That is, σ is a serializable schedule if there is a serial schedule σ′ such that σ ∼ σ′.

Example 5 The schedule

BegT1 T1:B T1:read(x) T1:write(y) BegT2 T2:B T2:read(z) T1:write(z) T2:write(x) T1:C T2:C EndT2 EndT1

is not serializable. The pair of dependent events (T1:read(x), T2:write(x)) indicates that T2 has to be executed after T1 in a serial run, while the pair of conflicting events (T2:read(z), T1:write(z)) imposes the opposite order. Therefore, no equivalent serial run can exist. The following schedule, however,

BegT1 T1:B T1:read(x) T1:write(y) BegT2 T2:B T2:read(z) T1:write(z) T2:read(x) T1:C T2:C EndT2 EndT1

is serializable, since an equivalent serial run can execute T2 followed by T1.

The conflict-graph characterization

A simple characterization of atomic (or conflict-serializable) schedules uses the notion of a conflict graph, and is a classic characterization from the database literature (this notion is so common that many papers in database theory define conflict serializability using it).

In order for a schedule to be atomic, we must be able to find an equivalent serial schedule. The order in which events occur poses constraints on how the transactions can be scheduled in the serial schedule. For example, if a transaction tr of thread T reads x and is followed by a transaction tr′ of thread T′ that writes x, then we must schedule the (entire) transaction tr before the (entire) transaction tr′. The ordering of events in a schedule can hence be seen to impose ordering constraints among transactions, and the schedule is serializable if we can find a linear order that respects these constraints. The conflict graph [10, 15, 4] is a graph made up of transactions as nodes, with edges capturing the ordering constraints imposed by a schedule. A schedule is serializable iff its conflict graph represents a partial order (i.e. is acyclic).

Formally, for any schedule σ, let us give names to the transactions in σ, say tr1, . . . , trn. The conflict-graph of σ is the graph CG(σ) = (V, E) where V = {tr1, . . . , trn} and E contains an edge from tr to tr′ iff there is some event a in transaction tr and some event a′ in transaction tr′ such that (1) a occurs before a′ in σ, and (2) a D a′.

Note that transactions of the same thread are always ordered in the order they occur (since all actions of a thread are dependent on each other).

Lemma 6 A schedule σ is atomic iff the conflict graph associatedwith σ is acyclic.

The above lemma essentially follows from [10] (see also [4, 15]). The classic model is, however, a database model where transactions are independent; we require an extension of this lemma to our model, where there are additional constraints imposed by the fact that some transactions are executed by the same thread. It actually turns out that the above characterization holds for transactions where the underlying alphabet has any arbitrary dependence relation, and hence holds in our threaded model, since we have ensured that any two events of a thread are dependent.

If the conflict graph is acyclic, then it can be viewed as a partial order, and it is clear that any linearization of the partial order will define a serial schedule equivalent to σ. If the conflict-graph has a cycle, then it is easy to show that the cyclic dependency rules out serialization.



The above characterization yields a simple algorithm for serializability that simply constructs the conflict-graph and checks for cycles. The algorithm can process the schedule event by event, creating the graph, and finally call a cycle-detection routine. Hence:

Proposition 7 The problem of checking whether a single schedule σ is atomic is decidable in polynomial time.
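The algorithm behind Proposition 7 can be sketched directly. The following illustrative Python (the event encoding is our own assumption; Beg/End events are omitted) builds the conflict graph of a schedule and tests acyclicity with a depth-first search; the pairwise edge construction and the DFS are both polynomial in the length of the schedule:

```python
from collections import defaultdict

def conflict_graph_acyclic(schedule):
    """Build the conflict graph of a schedule and test acyclicity.
    Events: (T,'B'), (T,'C'), (T,'read',x), (T,'write',x)."""
    # Assign each access event to a transaction id (T, k).
    txn_count = defaultdict(int)
    labeled = []                            # (txn_id, op, entity)
    for e in schedule:
        t = e[0]
        if e[1] == 'B':
            txn_count[t] += 1
        elif len(e) == 3:
            labeled.append(((t, txn_count[t]), e[1], e[2]))
    # Edge tr -> tr' if an event of tr precedes a dependent event of tr'.
    edges = defaultdict(set)
    for i, (tr1, op1, x1) in enumerate(labeled):
        for tr2, op2, x2 in labeled[i + 1:]:
            if tr1 == tr2:
                continue
            same_thread = tr1[0] == tr2[0]
            conflict = x1 == x2 and 'write' in (op1, op2)
            if same_thread or conflict:
                edges[tr1].add(tr2)
    # Cycle detection by depth-first search (gray = on current path).
    WHITE, GRAY, BLACK = 0, 1, 2
    color = defaultdict(int)
    def dfs(u):
        color[u] = GRAY
        for v in edges[u]:
            if color[v] == GRAY or (color[v] == WHITE and dfs(v)):
                return True                 # back edge: cycle found
        color[u] = BLACK
        return False
    nodes = {tr for tr, _, _ in labeled}
    return not any(color[u] == WHITE and dfs(u) for u in nodes)
```

On the two schedules of Example 5, this check reports the first (with T2:write(x)) as cyclic, hence non-serializable, and the second (with T2:read(x)) as acyclic.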

3. Monitoring Serializability

The goal of this section is to build a space-efficient monitoring algorithm for checking serializability violations. At any point of time, after reading a run r, the algorithm we build will take space at most polynomial in the number of threads active after r and the number of entities accessed in r; it will have no other dependence on the length of r itself.

Since the conflict graph contains one node for every transaction that has ever happened in the system, it can grow arbitrarily large, and does not yield the monitoring algorithm we seek.

Let us look at an example to see how to reduce the conflict graph. Consider the following non-serializable schedule:

T1:B T1:read(x) T2:B T2:write(x) T2:read(z) T3:B T3:write(z) T2:C T1:write(x)

The figure on the right demonstrates the conflict graph (over the nodes T1, T2 and T3) for the above schedule without the last event T1:write(x).

Now, in this graph, since the transaction/thread T2 has finished, it is tempting to remove the node T2, and summarize its effect by replacing it with an edge from T1 to T3. However, this is a serious error: for example, if we deleted T2, then the next event T1:write(x) does not cause a cycle. The reason is that, though T2 is a completed transaction, it may still participate in later cycles, as outgoing edges from T2 can always be introduced even after T2 has completed.

The only other work we know of that considers precise algorithms and complexity for checking serializability of concurrent programs is the work by Alur, McMillan and Peled [2]. In this work, the above grave error is present, and the algorithms establishing upper bounds for checking serializability in that paper are wrong.

In fact, the problem of when completed transactions can be deleted is a well-studied problem in database control, and there have been several approaches to finding deletion policies [11, 4, 5]. Intuitively, we need to keep the minimal amount of information that is necessary and sufficient to detect non-serializability in the future.

In particular, in the conflict graph, there are certain completed transaction nodes that can be deleted without loss of any essential information. Such deletions are called safe deletions [11]. A necessary and sufficient condition to safely delete a transaction node t is [11]:

(C): For every active predecessor a of t, and every entity x accessed in t, there exists a successor t′ of a which accesses x at least as strongly as t,

where a write(x) is a stronger access than a read(x).

When a safe deletion is performed, all the predecessors of the deleted node are connected to all its successors to maintain the connectivity of the graph. By performing safe deletions on a conflict graph, one obtains a reduced conflict graph.

Remark 8 Condition (C) is necessary and sufficient for the safe deletion of transactions. At any given point one may have several choices, which may lead to reduced graphs of different sizes; determining the best choice is an intractable problem. We note, however, that if the number of active transactions is bounded, then any irreducible graph (a graph from which no transaction can be removed) also has bounded size. To see why, associate with every completed transaction t in the graph the set of pairs (t, x) that witness the fact that t does not satisfy condition (C), and therefore cannot be safely deleted. It is easy to see that no two completed transactions in the graph have a common witness. Therefore, the number of nodes in the graph cannot be more than m·|X|, where m is the number of active transactions.

In the next section, we present a way to summarize the essential information of the conflict graph using just a graph of nodes formed by active transactions, annotated with sets that summarize the effect of completed transactions.

3.1 Summarized Conflict Graph for Serializability

The following notion of summarized conflict graphs is adapted from a conflict-serializable scheduling algorithm in [5]. Intuitively, we keep the conflict graph restricted to edges between the active transactions only (paths between active transactions are summarized as edges), and also keep with every active transaction/thread T a set C which denotes the set of events that occurred in transactions scheduled later than the current transaction of T.

Definition 9 The summarized conflict graph of a schedule σ is a graph Gσ = (V, E) with two labeling functions S, C : V −→ 2^Σ ∪ 2^A, where

• V contains a node vi for each active thread¹ Ti,

• E contains an edge from v to v′ (respectively associated with threads T and T′) if and only if there exists (T:a, T′:b) ∈ (S(v) ∪ C(v)) × S(v′) such that (T:a, T′:b) ∈ DΣ and T:a ≺σ T′:b², and

• the sets S and C are respectively called the schedule and conflict sets,

where T:a ≺σ T′:b if and only if there exist i, j such that σ[i] = T:a, σ[j] = T′:b, and i < j.

The following set of rules shows how the summarized conflict graph can be constructed dynamically as the schedule σ progresses. The dynamic algorithm keeps updating the graph based on these rules until a cycle is created in the graph, at which point the algorithm stops and reports a serializability violation. The algorithm maintains a set AT of the currently active threads.

¹ An active thread T is a thread for which the beginT action has appeared in the schedule, and endT has not yet appeared.
² Note that by definition, one can have pairs of the form (a, T:b), (T:a, b), and (a, b) as well, but for brevity we do not go over all cases.

5 2007/7/17

• (Rule 1): If the next action in σ is Ti:B, then create a new node vi and set S(vi) = ∅ and C(vi) = ∅.

• (Rule 2): If the next action in σ is Ti:C, then remove vi by connecting all its (immediate) predecessors to all its (immediate) successors. Also, for every (immediate) predecessor vk of vi, set C(vk) = C(vk) ∪ S(vi) ∪ C(vi).

• (Rule 3): If the next action in σ is Ti:a, then set S(vi) = S(vi) ∪ {Ti:a}. For all vk, if there is an action b ∈ S(vk) ∪ C(vk) such that (Tk:b, Ti:a) ∈ DΣ, add an edge (vk, vi) to E.

• (Rule 4): If the next action in σ is beginTi, then set AT = AT ∪ {Ti}.

• (Rule 5): If the next action in σ is endTi, then set AT = AT − {Ti}, and replace every Ti:a in every conflict set C by a.

Note that in the summarized conflict graph, the schedule and conflict sets keep track of the witnesses that we discussed in Remark 8.
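As an illustration, the rules can be sketched in Python as follows. This is our own simplification, not the paper's pseudocode: it uses a single dependence relation (same entity, at least one write) and omits the thread begin/end bookkeeping of Rules 4 and 5:

```python
class SummarizedConflictGraph:
    """Sketch of Rules 1-3. S holds each active node's own events;
    C holds events of committed transactions merged into their
    predecessors (Rule 2)."""

    def __init__(self):
        self.S, self.C = {}, {}     # thread -> set of (op, entity)
        self.edges = set()          # directed edges between threads

    def begin(self, t):             # Rule 1: Ti:B
        self.S[t], self.C[t] = set(), set()

    def commit(self, t):            # Rule 2: Ti:C
        preds = {u for (u, v) in self.edges if v == t and u != t}
        succs = {v for (u, v) in self.edges if u == t and v != t}
        for u in preds:
            self.C[u] |= self.S[t] | self.C[t]
            self.edges |= {(u, v) for v in succs}
        self.edges = {(u, v) for (u, v) in self.edges if t not in (u, v)}
        del self.S[t], self.C[t]

    def action(self, t, op, x):     # Rule 3: Ti:a
        for u in self.S:
            pool = self.C[u] if u == t else self.S[u] | self.C[u]
            if any(y == x and "write" in (p, op) for (p, y) in pool):
                self.edges.add((u, t))  # a conflict with t's own C set
                                        # yields a self-loop, i.e. a cycle
        self.S[t].add((op, x))

    def cyclic(self):
        succ = {u: {v for (a, v) in self.edges if a == u} for u in self.S}
        def dfs(n, path, done):
            if n in path:
                return True
            if n in done:
                return False
            done.add(n)
            return any(dfs(m, path | {n}, done) for m in succ.get(n, ()))
        done = set()
        return any(dfs(u, set(), done) for u in self.S)

g = SummarizedConflictGraph()
g.begin("T1"); g.action("T1", "read", "x")
g.begin("T2"); g.action("T2", "write", "x"); g.action("T2", "read", "z")
g.begin("T3"); g.action("T3", "write", "z")
g.commit("T2")
print(g.cyclic())                  # False: still serializable
g.action("T1", "write", "x")
print(g.cyclic())                  # True: the schedule is non-serializable
```

On the earlier example schedule, committing T2 folds its events {write(x), read(z)} into C(T1), so the later T1:write(x) conflicts with T1's own conflict set and is reported as a cycle, exactly the case that naive node deletion misses.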

Example 10 Figure 1 demonstrates the summarized conflict graph for the schedule σ′ as it progresses:

σ′ = T:B T:read(x)
σ′ = T:B T:read(x) T′:B T′:write(x) T′:read(z)
σ′ = T:B T:read(x) T′:B T′:write(x) T′:read(z) T′:C
σ′ = T:B T:read(x) T′:B T′:write(x) T′:read(z) T′:C T′:B T′:read(y) T:write(y)

[Figure 1. Summarized Conflict Graph: nodes T and T′ with their schedule and conflict sets at each stage, e.g. S(T) = {read(x)} and S(T′) = {write(x), read(z)} with empty conflict sets at first; after T′ commits, C(T) = {write(x), read(z)}; in the last stage T′ has S = {read(y)}.]

The summarized conflict graph here is a version of the one presented in [5, 4], adapted to handle threads, since threads are not considered in the database framework. To understand why the summarized graph can be used to check for serializability in the presence of threads, recall that such a graph works for any dependence relation, and not just the specific one usually used. We exploit this fact to our advantage by making all the actions belonging to the same thread dependent on each other. This way, there are always edges from the active transactions which have the actions of previous transactions of a thread in their conflict sets, to the most recent transaction of each thread. Also, when a thread terminates, an event T:a can be renamed to simply a, as the only way a later event can depend on T:a is to be dependent on a. This reduces the space requirements and keeps them dependent on the set of active transactions only.

Let CGσ be the conflict graph [15] of σ. The following lemma demonstrates the relation between the summarized conflict graph and the conflict graph:

Lemma 11 Consider a schedule σ, where ti is the active transaction of thread Ti and tj is the active transaction of thread Tj. There is an edge in Gσ from vi (the node corresponding to Ti) to vj (the node corresponding to Tj) if and only if in CGσ there is a path from ui (the node corresponding to ti in CGσ) to uj (the node corresponding to tj in CGσ).

Proof. See the Appendix for the proof. □

Theorem 12 A schedule σ is serializable if and only if the summarized conflict graph Gσ is acyclic.

Proof. If the schedule is not serializable, then by Lemma 6, CGσ contains a cycle. Such a cycle, by definition, always involves at least one active transaction node. Therefore, by Lemma 11, Gσ also has a cycle.

If Gσ has a cycle, then by Lemma 11, CGσ has a cycle, which by Lemma 6 implies that the schedule is not serializable. □

If the number of active threads at some point is k, then the number of nodes in the summarized conflict graph is bounded by k. If the number of entities accessed by the run is x, then for each node vi, the size of the sets S(vi) and C(vi) is bounded by x, and therefore:

Proposition 13 The size of the summarized conflict graph after a run r is bounded by k·x, where k is the number of active threads in r and x is the number of entities accessed by it.

4. Model Checking Atomicity for Concurrent Skeletons

In this section, we present model-checking algorithms for checking atomicity of finite-state concurrent programs with a fixed number of threads.

We first embark on a study of various restricted programs where the global variables do not encode any data, in order to discover the essentials of detecting non-serializability in programs.

We begin with straight-line programs and show that they are polynomial-time checkable for atomicity. This is achieved by reducing the problem to the existence of a particular kind of schedule, called a normal run.

Then we consider branching programs, and establish that atomicity checking is NP-complete, using a new notion of witness from each thread called a profile. A profile sums up the effective information of how a run of a thread contributes to proving non-serializability. We then extend the notion of profiles to handle branching programs that iteratively generate transactions, and show that atomicity checking remains NP-complete for them.

The notion of extended profiles leads to the first main result of this section: concurrent programs (without global data) where each thread is recursive have a decidable model-checking problem for atomicity. We show this by compiling every recursive program into a regular program, extracting the extended profiles of the recursive programs using automata theory. The fact that a global property such as serializability is decidable for concurrent recursive programs is indeed surprising.


4.1 Concurrent Programs

For the results in this section, we fix a finite set of threads T = {T1, . . . , Tn} and a finite set of entities X = {x1, . . . , xm}. We also refer to schedules as runs.

We consider three frameworks based on the structure of the code in the threads. In each case we specify the type of code that threads can execute by specifying the language of events L(T) that each individual thread T can execute.

• Straight-line programs: Each thread is a single transaction, i.e. the language of each thread T is a single transaction: L(T) = {tr}, where tr ∈ TranT.

• Branching programs: Each thread is a language of single transactions, i.e. L(T) ⊆ TranT.

• Branching programs with consecutive transactions: This is the most general model, where we allow each thread to execute any number of transactions consecutively. More precisely, L(T) is a subset of TranT* (sequences of transactions).

Models of programs will be represented using finite-state transition systems. For each thread Ti, the set of transactions generated by Ti is modeled by a transition system Ai = (Qi, q_in^i, →i), where Qi is a finite set of states, q_in^i ∈ Qi is the initial state, and →i ⊆ Qi × ΣTi × Qi is the transition relation, a set of directed labeled edges between states. The language of Ai, denoted Li, is the set of all words u ∈ ΣTi* on which there is a path from q_in^i labeled u. Note that any program with local variables spanning finite domains can be modeled as a transition system of the above kind.

In all the models, if Li is the language of each thread Ti, then the language of runs generated by the system is R = {σ | σ is a run and ∀i: σ|Ti ∈ Li}. Given a program, the problem we tackle is to check whether every run σ in R is serializable.

We will consider three variants of the above model: (1) regular: there is no data and no recursion, (2) recursive: there is still no data, but we allow recursion, and (3) Boolean: we assume Boolean global variables.

In the following sections we look at the problem of atomicity of a program. A program is called atomic if and only if all the runs of the program are serializable. We look at the complexity of checking non-atomicity of a program for the different variants defined above.

4.2 Regular Straight-line Programs

Consider a set of threads running a single transaction each. Every thread Ti runs a transaction tri = Ti:B wi Ti:C. From the earlier results on conflict graphs, it is easy to see that checking whether a run σ is serializable can be done in polynomial time. Note, however, that there can be exponentially many runs corresponding to interleavings of the transactions tri.

Let us first show that if there is a non-serializable run, then there is a non-serializable run of a particular form. A run σ is said to be normal if there is some set of threads Tj1, . . . , Tjm and a thread Ti such that σ = Ti:B · ui · trj1 · trj2 · · · trjm · vi · Ti:C, where wi = ui · vi. In other words, a run is normal if it executes part of a transaction in one thread, executes transactions in some other threads serially and completely, and then finishes the incomplete thread.

Example 14 Figure 2 shows a program consisting of three threads, each running one transaction. There are two equivalent runs of the program, R1 and R2, where R1 is normal while R2 is not.

T1: write(x) read(z)    T2: write(y) read(x)    T3: write(z) read(y)

R1 : write(x) write(y) read(x) write(z) read(y) read(z)
R2 : write(x) write(y) write(z) read(x) read(y) read(z)

Figure 2. Normal Run.

The following is an observation whose variants will prove useful throughout this section:

Lemma 15 If a straight-line program with no locks has a non-serializable run, then it has a non-serializable normal run.

Proof. Assume σ is a non-serializable run. Consider the conflict graph associated with σ, which must have a cycle. Let the cycle involve transactions tr0, . . . , trn in that order (renaming them to match the indices). In the construction of the conflict graph for σ, assume without loss of generality that the first edge in the cycle to be thrown in was from tr0 to tr1, and let tr0[k] be an event in tr0 that caused the edge. Let |tr0| = m. Note that there must be another event tr0[k′], k < k′ < m, that causes the incoming edge from trn to tr0. Now, it is easy to see that the schedule tr0[1, k] · tr1 · · · trn · tr0[k+1, m] is a normal schedule that is non-serializable. □

For a straight-line program, note that the size of the program is the sum of the lengths of the transactions in each thread. We now construct a polynomial-time algorithm to check serializability of straight-line programs without locks. The algorithm constructs an undirected graph based on the program.

For a program P in the straight-line model with no locks, define the graph GP = (V, E, S) as follows:

• V = {v1,1, . . . , vn,m} has a node vi,j for the jth action of thread Ti, and S(vi,j) returns that action.

• E contains the set of edges {(vi,j, vi,k) | 1 ≤ j, k ≤ m}, i.e. there is an edge between every two nodes corresponding to actions of the same transaction. Let us call these blue edges.

• For each action ak in Ti and al in Tj (i ≠ j) such that (ak, al) ∈ D, we add an edge (vi,k, vj,l) to E. Let us call these red edges.

We can show that P has a non-serializable run if and only if GP has a cycle that contains at least one red edge and at least one blue edge.


Intuitively, such a cycle corresponds to a normal non-serializable run: we start with the thread containing a blue edge, execute it until we reach a node that has both a blue edge and a red edge of the cycle incident on it, then execute the threads corresponding to the red edges completely, and finally finish the thread we started with. Figure 3 revisits the program of Example 14, showing the blue and red edges. The normal run R1 corresponds to the blue edge (write(x), read(z)).

T1: write(x) read(z)    T2: write(y) read(x)    T3: write(z) read(y)

R1 : write(x) write(y) read(x) write(z) read(y) read(z)

Figure 3. Graph GP.

Finding a cycle with at least one blue edge and at least one red edge in GP can be done efficiently. For instance, a naive algorithm that runs in polynomial time in the size of GP goes as follows:

1. Pick a blue edge (vi,j , vi,k).

2. Remove (vi,j, vi,k) and every other blue edge in the same transaction (i.e. (vi,p, vi,q) where 1 ≤ p, q ≤ m) from the graph.

3. Check whether there is a path (of any kind) between the twonodes vi,j and vi,k.

4. If yes, report a cycle, and if not, go to (1).
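The construction of GP and the naive cycle check can be sketched in Python as follows (our own illustration; the blue and red edge sets and the breadth-first search are direct encodings of the steps above):

```python
from collections import deque
from itertools import combinations

# Example 14's program: thread -> list of (action, entity).
program = {
    "T1": [("write", "x"), ("read", "z")],
    "T2": [("write", "y"), ("read", "x")],
    "T3": [("write", "z"), ("read", "y")],
}

def non_serializable(program):
    nodes = [(t, j) for t, w in program.items() for j in range(len(w))]
    # Blue edges: every pair of actions within the same transaction.
    blue = {frozenset({(t, j), (t, k)})
            for t, w in program.items()
            for j in range(len(w)) for k in range(j + 1, len(w))}
    # Red edges: dependent actions (same entity, one a write) across threads.
    red = set()
    for (t1, j), (t2, k) in combinations(nodes, 2):
        a, x = program[t1][j]
        b, y = program[t2][k]
        if t1 != t2 and x == y and "write" in (a, b):
            red.add(frozenset({(t1, j), (t2, k)}))
    # Step 1-4: pick a blue edge, drop that thread's blue edges,
    # and look for a path between its endpoints.
    for e in blue:
        (t, j), (t_, k) = tuple(e)
        keep = {f for f in blue if not all(n[0] == t for n in f)} | red
        adj = {n: [] for n in nodes}
        for f in keep:
            u, v = tuple(f)
            adj[u].append(v)
            adj[v].append(u)
        seen, q = {(t, j)}, deque([(t, j)])
        while q:
            u = q.popleft()
            if u == (t_, k):
                return True          # path found: cycle with a blue + red edge
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    q.append(v)
    return False

print(non_serializable(program))     # True for Example 14's program
```

The removed blue edge supplies the blue edge of the cycle, and any connecting path must use a red edge to leave the thread, so a found path witnesses exactly the kind of cycle the claim requires.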

Theorem 16 The problem of checking non-serializability of straight-line programs without locks is solvable in polynomial time.

Remark 17 A corollary of the above result (namely, of the fact that the graph in the above algorithm is undirected) is that the relative order of events in each transaction is immaterial in determining atomicity of the program (note that this is not true for particular runs; the sequencing of events in a particular run is crucial in determining whether it is serializable). More precisely, consider a straight-line program P with no locks where each thread Ti executes the transaction Ti:B wi Ti:C. Consider another program P′ with the same number of threads, where each Ti executes the transaction Ti:B w′i Ti:C and each w′i is a permutation of the events of wi. Then P has a non-serializable run iff P′ has a non-serializable run.

4.3 Regular Branching Programs

Let us now consider regular branching programs, where each thread Ti is represented by a transition system Ai. Note that each thread could produce an infinite number of different transactions, though it produces only one transaction in any run.

We now show that checking non-serializability of a program in the branching model with no locks is NP-complete. The upper bound is based on the following observation.

If a system has a non-serializable run, then there is a normal non-serializable run as well (see Lemma 15). Assume that σ is a normal non-serializable run that creates a cycle in the conflict graph through the threads T1, T2, . . . , Tm, T1, in that order. Assume that σ starts executing T1, interrupts it to execute T2 through Tm completely and serially, and then finishes T1.

The crucial observation is that there are really only two events in each thread that contribute to evidencing the cycle in the conflict graph, and hence to evidencing non-serializability. Intuitively, for each transaction tri of thread Ti in the cycle, we pick two events ini and outi that cause the incoming edge from Ti−1 to Ti and the outgoing edge from Ti to Ti+1. This observation is captured formally below:

Lemma 18 Let P be a straight-line program where each thread Ti executes the transaction Ti:B wi Ti:C. Then P is non-serializable iff there exist words ui, each of at most two letters and each a subsequence of wi, such that the straight-line program in which each thread Ti executes Ti:B ui Ti:C is non-serializable.

The above lemma is very important, as it says that no matter how long or complex a transaction is, the occurrences of single events and pairs of events in the transaction are all that matter in determining non-serializability. Combined with Remark 17, it follows that even for a pair of events occurring in a transaction, their relative order does not matter. This motivates the following definition of profiles.

Definition 19 (Profiles) For any transaction T:B w T:C of a thread T, a profile of the transaction is a word p, where p ∈ ΣT*, |p| ≤ 2, and either p or the reverse of p is a subsequence of w.

Example 20 Consider the transaction T:B read(x) write(y) write(x) T:C. The strings write(y) write(x), write(x) read(x), and read(x) are profiles of this transaction, while read(x) read(x) is not.
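The profile check of Definition 19 is just a pair of subsequence tests; a minimal Python sketch (the helper names are ours):

```python
def is_subsequence(p, w):
    """True iff p occurs in w in order (not necessarily contiguously)."""
    it = iter(w)
    return all(a in it for a in p)   # 'in' consumes the iterator as it scans

def is_profile(p, w):
    """Definition 19: |p| <= 2 and p or its reverse is a subsequence of w."""
    return len(p) <= 2 and (is_subsequence(p, w) or is_subsequence(p[::-1], w))

w = ["read(x)", "write(y)", "write(x)"]          # Example 20's transaction body
print(is_profile(["write(y)", "write(x)"], w))   # True
print(is_profile(["write(x)", "read(x)"], w))    # True (via the reverse)
print(is_profile(["read(x)", "read(x)"], w))     # False
```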

Checking whether a branching program has a non-serializable run can be done in NP. The idea is to guess a profile for each thread, say pi for thread Ti, and then check (a) whether the profiles, treated as individual transactions, correspond to a straight-line program that has some non-serializable schedule, and (b) whether each profile pi is the profile of some transaction that can be generated by the transition system of thread Ti. Clearly the profiles constitute a polynomial-size guess, and checking (a), by Theorem 16, can be done in polynomial time. As for (b), it is easy to check whether a transition system generates a word with subsequence p or p^r (the reverse of p). Correctness of this procedure follows from Lemma 18. We can in fact show:

Theorem 21 The problem of checking whether a branching program without locks is non-serializable is NP-complete.

Proof. Membership in NP follows from the nondeterministic algorithm above. We show NP-hardness by reducing the problem of finding a Hamiltonian cycle in a graph to the problem of checking whether a branching program has a non-serializable run.

Consider a graph G = (V, E); the goal is to check whether G contains a Hamiltonian cycle. For graph G = (V, E) with V = {v1, . . . , vn}, the program PG associated with G is defined as follows:


• PG consists of a set of n threads {T1, . . . , Tn}, each running a single transaction in parallel, one per node vi.

• Define the set of entities to be X = {xi,j | 1 ≤ i, j ≤ n}.

• Each thread Ti associated with node vi (2 ≤ i ≤ n) executes the following piece of code:

  ⋃_{1 ≤ k ≤ n} ⋃_{(vi,vj) ∈ E} {write(xk,i); write(xk+1,j)}

where indices are taken modulo n (i.e. n + 1 stands for 1). There is a special case for thread T1, which executes:

  ⋃_{1 ≤ k ≤ n} ⋃_{(v1,vj) ∈ E} {write(xk+1,j); write(xk,1)}

in which the order of the two writes is reversed. Note that all these pairs of writes are in conflict with each other (as the branching model suggests), and the thread Ti will end up executing only one of them.

We can show that G contains a Hamiltonian cycle if and only if PG has a non-serializable run.

4.4 Regular Programs with Consecutive Transactions

The case where each thread executes consecutive transactions (and branches) is not much more difficult than the case of branching programs. The challenge here is that in a non-serializable schedule, there may be an unbounded number of transactions executed by some thread, as threads can execute an arbitrarily large number of them.

Consider a non-serializable schedule and the conflict graph generated by it, which has a cycle. The crucial point to note is that since all events of a thread are dependent, there is really no need for the cycle to return to a thread T once it has left it. More precisely, consider a non-serializable schedule that generates a cycle of transactions, say tr1, tr2, . . . , trm, tr1. Assume that along this cycle there are three transactions tri, trj and trk, where i ≠ 1, i < j < k, and tri and trk belong to the same thread while trj belongs to a different thread. Then it is easy to see that we can "short-circuit" the cycle from tri to trk, since the actions in them are dependent, and hence the conflict graph has an edge between them.

This motivates the following notion of an extended profile:

Definition 22 An extended profile for a sequence of transactions T:B w1 T:C T:B w2 T:C . . . T:B wn T:C of a thread T is a word π of one of the following forms:

• π = T:a, where T:a is one of the letters in w1 . . . wn, or

• π = T:a T:b, where T:a T:b is a subsequence of wi for some wi (note that the actions cannot come from different transactions), or

• π = T:a T:C T:B T:b, where π is a subsequence of the sequence of transactions (in other words, T:a comes from one transaction and T:b comes from a strictly later transaction).

For an extended profile π, let the transaction sequence corresponding to π be Tran(π) = T:B π T:C. Note that if π = T:a T:C T:B T:b, then Tran(π) is a pair of transactions.

Example 23 Consider the thread T:B read(x) write(y) T:C T:B read(z) write(w) T:C. The strings write(y), read(z) write(w), and read(x) T:C T:B write(w) are extended profiles of this thread.
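For a short, explicitly given transaction sequence, the extended profiles of Definition 22 can simply be enumerated; a Python sketch (our own helper, using the strings "C" and "B" as stand-ins for the markers T:C and T:B):

```python
def extended_profiles(transactions):
    """Enumerate Definition 22's extended profiles for one thread's
    sequence of transactions (each transaction is a list of actions)."""
    singles = {(a,) for w in transactions for a in w}
    same_txn, cross_txn = set(), set()
    for i, w in enumerate(transactions):
        for j, a in enumerate(w):
            # pairs within one transaction, in order
            same_txn.update((a, b) for b in w[j + 1:])
            # pairs spanning a commit/begin: a from txn i, b strictly later
            cross_txn.update((a, "C", "B", b)
                             for w2 in transactions[i + 1:] for b in w2)
    return singles | same_txn | cross_txn

txns = [["read(x)", "write(y)"], ["read(z)", "write(w)"]]  # Example 23's thread
ps = extended_profiles(txns)
print(("write(y)",) in ps,
      ("read(z)", "write(w)") in ps,
      ("read(x)", "C", "B", "write(w)") in ps)
```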

We can then show that one extended profile of some sequence of transactions generated by each thread is sufficient to build runs witnessing non-serializability. The idea is similar to the branching-program case, except that if a thread has an extended profile e T:C T:B f (e and f occur in different transactions), then we create a thread which executes a transaction with only the event e in it, followed by a transaction with only the event f in it. Note that this notion of extended profile lets us short-circuit the arbitrarily large number of consecutive transactions that may occur between events e and f in the non-serializable run; without it, the size of the witness for non-serializability could be arbitrarily large, which is undesirable.

Theorem 24 The problem of checking whether there is some non-serializable run of a branching program with consecutive transactions, but without locks, is NP-complete.

Proof. The lower bound follows from Theorem 21: this is a more general case of branching programs, and therefore solving it is at least as hard. For membership in NP, the following argument holds: one can guess an extended profile πi for each thread. Two checks are necessary to verify this as a witness for non-serializability: (a) whether each πi is actually an extended profile of thread Ti, and (b) whether the program in which each thread Ti is replaced by Tran(πi) is non-serializable. Check (a) can be done in time polynomial in the size of each thread. For (b), the cases where Tran(πi) is a single transaction are left as they are. For the cases where Tran(πi) consists of two transactions, one can create a new thread T′i, make the first transaction the only transaction of Ti and the second the only transaction of T′i. One then has at most 2n threads. This is now an instance of straight-line programs without locks (with 2n threads), for which Theorem 16 gives a polynomial-time solution. □

4.5 Handling Recursion

In this section, we discuss the effect of the presence of recursion in the code on the serializability checking problem. We again consider the most general setting, i.e. branching programs without locks and with recursion that iteratively generate consecutive transactions. We show that checking serializability for such programs is decidable, and that it can be done in the same time complexity as the case without recursion. The result holds under the reasonable assumption that recursion does not extend beyond the boundaries of transactions; i.e. a recursive call has to return within the same transaction in which it was called.

To incorporate recursion into the framework we already have, we must go beyond regular languages to model thread executions; one needs a context-free language to do this. We assume that each thread is given as a PDA (pushdown automaton) which accepts all the possible runs of the thread.


The basis for this result comes from the extended profiles of threads. The witness for non-serializability need only contain the extended profile of each thread; the rest of the actions in the thread are irrelevant to the existence and validity of this non-serializability witness. Therefore, one can intuitively replace each thread by its extended profiles and obtain the same serializability results.

Extracting the extended profiles from non-recursive threads is a rather straightforward task; this does not hold for recursive threads. To handle them, we show that for any PDA P, one can efficiently construct an NFA N such that the sets of extended profiles of P and N are the same. Therefore, we can replace the PDA model (the recursive code) of a thread by an NFA (non-recursive code), effectively removing recursion while keeping the serializability status the same.

Lemma 25 For a PDA P , the set of extended profiles generated byP can be computed in polynomial time.

Proof. Let us explain how extended profiles are derived from PDAs. Let A = (Q, Σ, Γ, δ, q0, F) be a PDA in which Q is the set of states, Σ the input alphabet, Γ the stack alphabet, δ the transition function, q0 ∈ Q the starting state, and F ⊆ Q the set of final states. The transition function δ maps a triple (q, a, γ) ∈ Q × Σ × Γ to a set of pairs (q′, s), where q′ ∈ Q and s ∈ Γ*. A transition from (q, a, γ) to (q′, s) corresponds to being in state q, observing symbol a on the input and γ on the top of the stack, and moving to state q′ while popping γ from the stack and pushing the string s ∈ Γ* onto the stack.

From A, construct a PDA B by adding an ε-transition for every transition in A, i.e. whenever (q′, s) ∈ δ(q, a, γ), we extend δ(q, ε, γ) to include (q′, s) as well. Clearly, we have

L(B) = {w | ∃u ∈ L(A) such that w is a subsequence of u}.
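For intuition, the subsequence-closure construction is easy to sketch when A is a finite automaton rather than a PDA: every labeled transition may also be taken silently. The Python sketch below (our own, with a hypothetical transition-list encoding and no stack) decides membership in L(B) directly by a product search over (state, position-in-word):

```python
from collections import deque

def accepts_subsequence(transitions, start, finals, word):
    """Does the subsequence closure B of a finite automaton A accept `word`?
    B duplicates every labeled move of A as an epsilon-move, so
    L(B) = {w | w is a subsequence of some u in L(A)}."""
    seen, q = {(start, 0)}, deque([(start, 0)])
    while q:
        state, i = q.popleft()
        if i == len(word) and state in finals:
            return True
        for (s, a, t) in transitions:
            if s != state:
                continue
            nxt = [(t, i)]                      # take the move as epsilon
            if i < len(word) and a == word[i]:
                nxt.append((t, i + 1))          # or consume the next letter
            for c in nxt:
                if c not in seen:
                    seen.add(c)
                    q.append(c)
    return False

# A accepts the single word: read(x) write(y) write(x)
A = [(0, "read(x)", 1), (1, "write(y)", 2), (2, "write(x)", 3)]
print(accepts_subsequence(A, 0, {3}, ["write(y)", "write(x)"]))  # True
print(accepts_subsequence(A, 0, {3}, ["write(x)", "write(y)"]))  # False
```

The PDA case in the proof follows the same idea, with the stack carried along; there the reachability analysis requires the symbolic techniques cited below.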

Now, let P = {w | w is an extended profile} be the set of words over Σ that correspond to all possible extended profiles, i.e.

P = {T:a | T:a is an action} ∪ {T:a T:b | T:a and T:b are actions} ∪ {T:a T:C T:B T:b | T:a and T:b are actions}.

P is regular, and in fact accepted by a DFA, which we call C. We construct C such that its states are the prefixes of extended profiles; for each extended profile π, we have a final state qπ in C.

It is clear that L(B) ∩ L(C) is the set of extended profiles of all transactions generated by A. The problem is that the intersection of a context-free language and a regular one is in general context-free (not regular), and therefore there is no way of directly constructing a DFA that recognizes the language L(B) ∩ L(C), although we know it is regular.

Instead, we construct an NFA accepting all reachable configurations of the product of B and C. From this, we enumerate the exact final states that are reachable: if a configuration with profile π is reachable, then π belongs to L(B) ∩ L(C). Results from [7] support the fact that the above can be done in polynomial time using symbolic techniques, since the NFA accepting all reachable configurations can be constructed in polynomial time. □

A nice feature of this approach is that if we have a Boolean program as input, Moped [1] can be used to represent the symbolic states using BDDs and compute the set of extended profiles.

Remark 26 Regardless of the type of a program (recursive or not), one can always replace the language of each thread by the set of its extended profiles and maintain the same serializability results. If we consider the other variants of recursive programs, namely straight-line and branching programs, one can observe that using extended profiles, we can perform the check for the recursive program in the same time complexity as the non-recursive case: the set of extended profiles of a recursive straight-line program is a straight-line program, and the set of extended profiles of a recursive branching program is a branching program.

5. Atomicity Checking for Boolean Programs

In this section, we present the third main result of the paper: the problem of checking atomicity of concurrent Boolean programs is solvable, and its exact complexity is PSPACE.

We consider programs with Boolean variables, and here only the most general problem, i.e. branching programs with global Boolean variables that iteratively generate transactions.

A Boolean program is over a finite set of global variables X and a finite set of local Boolean variables V. A Boolean program is modeled as a guarded language. Each rule is of the form g(~u) → (a, s), where g is a Boolean formula over V, a is an action from {read(x, b), write(x, b) | b = true, false}, and s is a set of assignments of the form u := b, where b = true or false, modeling multiple assignments of Boolean values to local variables.

The semantics of a program at a thread is the natural one, and is defined by a transition system in which the exact values read/written are projected away. The transition system is of the form (Q, q_in, −→), where each state q ∈ Q is an evaluation eval of the global and local variables, and each transition is of the form q −a→ q′, where a ∈ {read(x), write(x)}. The rule g(~u) → (read(x, b), ~v = ~b′) is enabled at the state eval if and only if eval(x) = b and g(eval(~u)) = true, and the new state eval′ after the transition is eval[~v/~b′], where the local variables in ~v are substituted with their assigned values. Similarly, the rule g(~u) → (write(x, b), ~v = ~b′) applies to the state eval if and only if g(eval(~u)) = true, and the new state eval′ after the transition is eval[~v/~b′, x/b].

σ = a1 a2 . . . am . . . is a run of this transition system if there are transitions q_in −a1→ q1 −a2→ q2 −a3→ · · · −am→ qm −a_{m+1}→ · · · in the system. Therefore, a run of the system is over the same alphabet Σ as in previous sections. The size of such a transition system is the sum of the sizes of all its rules. The size of a program that consists of finitely many threads of the above form running concurrently is the sum of the sizes of the individual transition systems.

This model is particularly interesting in an abstraction framework where data domains are abstracted, say using predicate abstraction [3]. Note that programs with data can simulate locks.

Programs with data are, in general, much harder to check. Given the transition system for even one particular thread, say A, the reachability problem of checking whether A can reach a particular state is in fact PSPACE-hard.

10 2007/7/17


Lemma 27 Given a transition system A = (Q, q_in, −→) that has global boolean variables, the problem of checking whether a state q ∈ Q is reachable by a run from q_in is PSPACE-hard.

Proof. The proof is by a reduction from the membership problem for linear-space Turing machines. Intuitively, given an input word w of length n to the Turing machine, we have, for each pair (i, a), where i ≤ n is a cell number and a is a tape symbol, one boolean variable b(i,a). The value of b(i,a) is 1 if cell i contains a, and 0 if it does not. Hence a tape configuration of the machine can be captured using O(n) variables. By similarly encoding the state and the position of the head, we can build a system that simulates the Turing machine by assigning 0 or 1 to the boolean variables, and that reaches the target state q iff the machine halts on w. □

For the upper bound, we will show that programs with boolean variables can be checked for serializability violations in PSPACE.

The monitoring algorithm as a deterministic automaton

The algorithm for checking serializability of programs with data relies on the monitoring algorithm for serializability developed in the previous section. Since the number of threads is finite (say t of them) and the number of entities is finite (say e of them), the monitoring algorithm takes only a bounded amount of space (in fact, space O(t² + t·e)). This stems from the fact that the space required is for storing the graph of active transactions and, for each transaction, a set of read events and write events on entities. Hence we have the following:

Lemma 28 There is a finite automaton Ser over the alphabet Σ, with state space at most exponential in t² + t·e, such that for any run σ, σ ∈ L(Ser) iff σ is non-serializable.

Note that the automaton Ser does not check whether its input is indeed a run (as this would require keeping track of data). However, given a valid run σ, it accepts σ iff σ is non-serializable. The automaton is an immediate implementation of the algorithm, where each state corresponds to a graph over the active transactions together with the event sets associated to each of them, and the transitions reflect the way the algorithm transforms the graph and the sets. Notice that Ser is in fact a deterministic automaton. Also note that Ser can be constructed effectively on-the-fly: given a state and a letter, we can compute the successor state of the automaton on that letter in time polynomial in t and e.
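To make the O(t² + t·e) state bound concrete, one can picture a state of Ser as below. This is a back-of-the-envelope sketch with field names of our own; the particular conflict recorded is an invented example.

```python
# Sketch of one monitor state for t threads and e entities; field names
# are illustrative. At most one active transaction per thread.
t, e = 4, 3

state = {
    "edges": set(),                          # ordering edges among active transactions
    "reads": {i: set() for i in range(t)},   # entities read by each active transaction
    "writes": {i: set() for i in range(t)},  # entities written by each active transaction
}

# Example: thread 1's write to entity 2 conflicts with thread 0's read of
# entity 2, inducing an ordering edge from transaction 1 to transaction 0.
state["writes"][1].add(2)
state["reads"][0].add(2)
state["edges"].add((1, 0))

# Space: t*t bits for the edge relation plus 2*t*e bits for the event sets,
# i.e. O(t^2 + t*e), matching the bound in the text.
bits = t * t + 2 * t * e
```

Since there are at most 2^(t² + 2te) such states, the state space of Ser is exponential in t² + t·e, as Lemma 28 states.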

We can now decide whether a program with data is serializable using automata theory. First, we model the entire system as a product of its individual components: we build an automaton B whose state space is the product of the state spaces of the programs at each thread, together with a vector of boolean values recording the values of the entities.

We then construct the automaton Ser. It is straightforward to see that the system is atomic iff L(B) ∩ L(Ser) is empty. This is a graph-reachability problem on the product space of B and Ser, and since the product transitions can be generated efficiently on-the-fly, and since the state space is bounded by an exponential in t and e, the problem is solvable in NPSPACE, and hence in PSPACE:

Theorem 29 The problem of checking whether a boolean program is serializable is PSPACE-complete. PSPACE-hardness holds even for boolean programs generating single transactions.
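The emptiness check behind Theorem 29 can be sketched as an on-the-fly product search. The functions `succ_b`, `succ_s`, and `accepting`, and the toy instance below, are assumptions of ours standing in for the actual constructions of B and Ser; the sketch uses breadth-first search for clarity, whereas the PSPACE bound relies on exploring the exponential product space nondeterministically (Savitch's theorem then gives NPSPACE = PSPACE).

```python
from collections import deque

# On-the-fly check of L(B) ∩ L(Ser) = ∅: explore product states without
# building either automaton up front.
def product_nonempty(init_b, init_s, alphabet, succ_b, succ_s, accepting):
    """Return True iff some reachable product state has Ser accepting,
    i.e. iff the system has a non-serializable run."""
    seen = {(init_b, init_s)}
    work = deque(seen)
    while work:
        qb, qs = work.popleft()
        if accepting(qs):
            return True
        for a in alphabet:
            for qb2 in succ_b(qb, a):   # the system B may be nondeterministic
                qs2 = succ_s(qs, a)     # Ser is deterministic: one successor
                if (qb2, qs2) not in seen:
                    seen.add((qb2, qs2))
                    work.append((qb2, qs2))
    return False

# Toy instance: B emits one 'a' then 'b's forever; the deterministic
# monitor accepts after seeing two 'b's.
found = product_nonempty(
    0, 0, "ab",
    lambda q, a: {1} if (q, a) in {(0, "a"), (1, "b")} else set(),
    lambda s, a: min(s + 1, 2) if a == "b" else s,
    lambda s: s == 2)
```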

Remark 30 (Locks) The lower bound for reachability in Lemma 27 holds even if the program manipulates only locks. In fact, global boolean variables can encode locks, and hence Theorem 29 also applies to programs with locks.

6. Conclusion and Future Work

We have established fundamental algorithms for monitoring concurrent programs for atomicity violations and for model checking finite-state concurrent programs for atomicity. Through a graded analysis of increasingly complex models, we have shown the importance of the profiles of the runs that programs generate, and used them to obtain the first algorithms for model checking atomicity.

The monitoring algorithm opens up the possibility of building an accurate and efficient tool for checking atomicity violations. The summarized graph has two sets for each transaction: the scheduled set S and the conflict set C. The former is bounded by the length of the active transaction and is likely to be small, and the latter summarizes the events of completed transactions that occur strictly after the current active transaction. In any reasonable schedule, the number of completed transactions that must occur after active transactions is bound to be small, and hence we believe that the monitoring algorithm will take little space in practice. The problem, however, lies in maintaining the graph; changing the graph after each event is likely to be too expensive. A monitoring tool implementing the algorithm with heuristics to process the graph at appropriate intervals, and the accuracy such a tool gives in practice, are worth investigating. We refer the reader to [18], where such a monitoring tool for 2-phase locking, a sufficient condition for atomicity, has been developed.

Turning to model checking, the results in this paper suggest that model checking concurrent programs for atomicity may be tractable. Note, however, that the algorithms require exploring the entire state space, which is likely to be infeasible because of state-space explosion. We believe that compositional methods for verifying atomicity need to be studied, and the notion of profiles set forth in this paper may be a starting point in this investigation.

Finally, we have not adequately explored concurrency-control mechanisms such as locks. Extending the results here to settings with locks is a natural open problem. It would also be interesting to see whether recursive programs with nested locks admit decidable atomicity checking [13].

References

[1] Moped - a model-checker for pushdown systems. http://www.fmi.uni-stuttgart.de/szs/tools/moped/.

[2] Rajeev Alur, Kenneth L. McMillan, and Doron Peled. Model-checking of correctness conditions for concurrent objects. Information and Computation, 160(1-2):167–188, 2000.

[3] Thomas Ball and Sriram K. Rajamani. The SLAM project: debugging system software via static analysis. In POPL, pages 1–3, 2002.

[4] Philip A. Bernstein and Nathan Goodman. Concurrency control in distributed database systems. ACM Comput. Surv., 13(2):185–221, 1981.

[5] Marco Antonio Casanova. The Concurrency Control Problem for Database Systems. Springer-Verlag New York, Inc., Secaucus, NJ, USA, 1981.

[6] Volker Diekert and Grzegorz Rozenberg. The Book of Traces. World Scientific Publishing Co., Inc., River Edge, NJ, USA, 1995.

[7] Javier Esparza, David Hansel, Peter Rossmanith, and Stefan Schwoon. Efficient algorithms for model checking pushdown systems. In Computer Aided Verification, pages 232–247, 2000.

[8] Cormac Flanagan and Stephen N. Freund. Atomizer: a dynamic atomicity checker for multithreaded programs. In POPL '04: Proceedings of the 31st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 256–267, New York, NY, USA, 2004. ACM Press.

[9] Cormac Flanagan and Shaz Qadeer. A type and effect system for atomicity. In PLDI '03: Proceedings of the ACM SIGPLAN 2003 Conference on Programming Language Design and Implementation, pages 338–349, New York, NY, USA, 2003. ACM Press.

[10] M. P. Fle and G. Roucairol. On serializability of iterated transactions. In PODC '82: Proceedings of the First ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing, pages 194–200, New York, NY, USA, 1982. ACM Press.

[11] Thanasis Hadzilacos and Mihalis Yannakakis. Deleting completed transactions. In PODS '86: Proceedings of the Fifth ACM SIGACT-SIGMOD Symposium on Principles of Database Systems, pages 43–46, New York, NY, USA, 1986. ACM Press.

[12] John Hatcliff, Robby, and Matthew B. Dwyer. Verifying atomicity specifications for concurrent object-oriented software using model-checking. In VMCAI, pages 175–190, 2004.

[13] Vineet Kahlon, Franjo Ivancic, and Aarti Gupta. Reasoning about threads communicating via locks. In CAV, pages 505–518, 2005.

[14] Richard J. Lipton. Reduction: a method of proving properties of parallel programs. Commun. ACM, 18(12):717–721, 1975.

[15] Christos Papadimitriou. The Theory of Database Concurrency Control. Computer Science Press, Inc., New York, NY, USA, 1986.

[16] Liqiang Wang and Scott D. Stoller. Accurate and efficient runtime detection of atomicity errors in concurrent programs. In Proc. ACM SIGPLAN 2006 Symposium on Principles and Practice of Parallel Programming (PPoPP), pages 137–146. ACM Press, March 2006.

[17] Liqiang Wang and Scott D. Stoller. Runtime analysis of atomicity for multi-threaded programs. IEEE Transactions on Software Engineering, 32:93–110, February 2006.

[18] Min Xu, Rastislav Bodík, and Mark D. Hill. A serializability violation detector for shared-memory server programs. SIGPLAN Not., 40(6):1–14, 2005.

[19] R. H. B. Netzer and B. P. Miller. What are race conditions? Some issues and formalizations. LOPLAS, 1(1):74–88, 1992.

[20] S. Savage, M. Burrows, G. Nelson, P. Sobalvarro, and T. E. Anderson. Eraser: a dynamic data race detector for multi-threaded programs. In SOSP, pages 27–37, 1997.

[21] D. R. Engler and K. Ashcraft. RacerX: effective, static detection of race conditions and deadlocks. In SOSP, pages 237–252, 2003.

[22] M. Naik, A. Aiken, and J. Whaley. Effective static race detection for Java. In M. I. Schwartzbach and T. Ball, editors, PLDI, pages 308–319. ACM, 2006.

[23] M. Naik and A. Aiken. Conditional must not aliasing for static race detection. In M. Hofmann and M. Felleisen, editors, POPL, pages 327–338. ACM, 2007.

Appendix

Proof of Lemma 11

The proof is by induction on the length of σ. The base case of an empty σ is trivial. Assuming the claim holds for a schedule σ, we show that it holds for σ Tk:a.

There are three possibilities for a:

• a is a B action: This case does not change the structure of either graph, so whatever was true for σ remains true for σ Tk:B.

• a is a C action: This completes the latest transaction of Tk. The corresponding node vk is removed, and Gσ changes according to Rule 2. CGσ does not change much, other than vk now becoming a completed node. The only new edges introduced in Gσ are the transitive edges added as a result of the deletion of vk. For every newly added edge (vi, vj), we have in Gσ the edges (vi, vk) and (vk, vj), which by the induction hypothesis give us the paths ui . . . uk and uk . . . uj in CGσ, and therefore the path ui . . . uj in CGσ.

• a is an action other than B or C: If a creates a new edge (vi, vk) in Gσ, it can be for one of two reasons:

There is an action b, dependent on a, in S(vi), in which case there is an edge (ui, uk) in CGσ as well.

There is an action b, dependent on a, in γ(vi). It is easy to see (by observing Rule 2) that this means there was once a completed transaction, a successor of vi, which contained b, and that is why b ended up in γ(vi). Therefore, in CGσ there is an edge from the node u corresponding to that completed transaction to uk, and also, by the induction hypothesis, u is a successor of ui (consider the prefix of σ in which u is just about to complete). Therefore, there is a path from ui to uk.

The reasoning in the other direction is very similar.
