Deadlock Detection

Nov 26, 2012 - CS 8803 FPL

Part I

• Static Deadlock Detection

• Reference:

Effective Static Deadlock Detection [ICSE’09]

• An unintended condition in a shared-memory, multi-threaded program in which:
  – a set of threads blocks forever
  – because each thread in the set waits to acquire a lock being held by another thread in the set
• This work: ignore other causes (e.g., wait/notify)
• Example

// Thread t1
sync (l1) {
  sync (l2) {
    …
  }
}

// Thread t2
sync (l2) {
  sync (l1) {
    …
  }
}

What is a Deadlock?
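The two-thread example above can be reproduced and observed on a running JVM. The following is a minimal sketch, not part of the original slides: the class and method names are hypothetical, the latch is an added device that forces both threads to hold their first lock before trying the second, and detection relies on the standard `ThreadMXBean.findMonitorDeadlockedThreads()` call. The threads are daemons so the JVM can still exit.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

class LockOrderDeadlockDemo {
    static final Object l1 = new Object();
    static final Object l2 = new Object();

    // Holds `first`, waits until the other thread holds its own first lock,
    // then tries to take `second`. With opposite orders this blocks forever.
    static void acquireInOrder(Object first, Object second, CountDownLatch bothHoldFirst) {
        synchronized (first) {
            bothHoldFirst.countDown();
            try {
                bothHoldFirst.await();
            } catch (InterruptedException e) {
                return;
            }
            synchronized (second) {
                // not reached once both threads hold their first lock
            }
        }
    }

    // Starts t1 (l1 then l2) and t2 (l2 then l1) as daemon threads.
    static void startDeadlockingThreads() {
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        Thread t1 = new Thread(() -> acquireInOrder(l1, l2, bothHoldFirst), "t1");
        Thread t2 = new Thread(() -> acquireInOrder(l2, l1, bothHoldFirst), "t2");
        t1.setDaemon(true);
        t2.setDaemon(true);
        t1.start();
        t2.start();
    }

    // Polls the JVM's monitor deadlock detector until it reports a cycle
    // or the timeout expires; returns the number of threads in the cycle.
    static int countDeadlockedThreads(long timeoutMillis) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            long[] ids = mx.findMonitorDeadlockedThreads();
            if (ids != null) {
                return ids.length;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return 0;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        startDeadlockingThreads();
        System.out.println("threads in deadlock: " + countDeadlockedThreads(5000));
    }
}
```

Without the latch the deadlock would only appear on some schedules, which is the non-determinism the talk goes on to discuss.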

• Today’s concurrent programs are rife with deadlocks
  – 6,500/198,000 (~ 3%) of bug reports in Sun’s bug database at http://bugs.sun.com are deadlocks
• Deadlocks are difficult to detect
  – Usually triggered non-deterministically, on specific thread schedules
  – Fail-stop behavior not guaranteed (some threads may be deadlocked while others continue to run)
• Fixing other concurrency bugs like races can introduce new deadlocks
  – Our past experience with reporting races: developers often ask for a deadlock checker

Motivation

• Based on finding cycles in program’s dynamic or static lock order graph
• Dynamic approaches
  – Inherently unsound
  – Inapplicable to open programs
  – Ineffective without sufficient test input data
• Static approaches
  – Type systems (e.g., Boyapati-Lee-Rinard OOPSLA’02)
      • Annotation burden often significant
  – Model checking (e.g., SPIN)
      • Does not currently scale beyond few KLOC
  – Dataflow analysis (e.g., Engler & Ashcraft SOSP’03; Williams-Thies-Ernst ECOOP’05)
      • Scalable but highly imprecise

Previous Work
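The lock order graph idea shared by these approaches can be sketched as follows. This is an illustrative sketch only: the class name, the string lock labels, and the `addEdge`/`hasCycle` methods are hypothetical and do not come from any of the cited tools. An edge a -> b records that some thread acquired b while holding a; a cycle marks a potential deadlock.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class LockOrderGraph {
    // edges.get(a) contains b if some thread acquired b while holding a
    private final Map<String, Set<String>> edges = new HashMap<>();

    void addEdge(String held, String acquired) {
        edges.computeIfAbsent(held, k -> new HashSet<>()).add(acquired);
        edges.computeIfAbsent(acquired, k -> new HashSet<>());
    }

    // Depth-first search; a node revisited while still on the current path closes a cycle.
    boolean hasCycle() {
        Set<String> done = new HashSet<>();
        Set<String> onPath = new HashSet<>();
        for (String start : edges.keySet()) {
            if (dfs(start, done, onPath)) {
                return true;
            }
        }
        return false;
    }

    private boolean dfs(String node, Set<String> done, Set<String> onPath) {
        if (onPath.contains(node)) {
            return true;
        }
        if (done.contains(node)) {
            return false;
        }
        onPath.add(node);
        for (String next : edges.get(node)) {
            if (dfs(next, done, onPath)) {
                return true;
            }
        }
        onPath.remove(node);
        done.add(node);
        return false;
    }
}
```

A cycle here is only a candidate: the graph alone does not say whether the two orders come from different threads or whether the locks are really shared, which is what the conditions in this talk check.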

• Deadlock freedom is a complex property
  – can t1,t2 denote different threads?
  – can l1,l4 denote same lock?
  – can t1 acquire locks l1->l2?
  – some more …

l = abstract lock acq.
t = abstract thread

Challenges to Static Deadlock Detection

• Existing static deadlock checkers cannot check all conditions simultaneously and effectively

• But each condition can be checked separately and effectively using existing static analyses

Our Rationale

• Consider all candidate deadlocks in closed program
• Check each of six necessary conditions for each candidate to be a deadlock
• Report candidates that satisfy all six conditions
• Note: Finds only deadlocks involving 2 threads/locks
  – Deadlocks involving > 2 threads/locks rare in practice

• ...

Our Approach

• may-reach(t1,l1,l2)?
• may-alias(l1,l4)?

     class LogManager {
       static LogManager manager = new LogManager();
155:   Hashtable loggers = new Hashtable();
280:   sync boolean addLogger(Logger l) {
         String name = l.getName();
         if (!loggers.put(name, l))
           return false;
         // ensure l’s parents are instantiated
         for (...) {
           String pname = ...;
314:       Logger.getLogger(pname);
         }
         return true;
       }
420:   sync Logger getLogger(String name) {
         return (Logger) loggers.get(name);
       }
     }

     class Logger {
226:   static sync Logger getLogger(String name) {
         LogManager lm = LogManager.manager;
228:     Logger l = lm.getLogger(name);
         if (l == null) {
           l = new Logger(...);
231:       lm.addLogger(l);
         }
         return l;
       }
     }

     class Harness {
       static void main(String[] args) {
 11:     new Thread() { void run() {
 13:       Logger.getLogger(...);
         }}.start();
 16:     new Thread() { void run() {
 18:       LogManager.manager.addLogger(...);
         }}.start();
       }
     }

Example: jdk1.4 java.util.logging

*** Stack trace of thread <Harness.java:11>:
LogManager.addLogger (LogManager.java:280)
  - this allocated at <LogManager.java:155>
  - waiting to lock {<LogManager.java:155>}
Logger.getLogger (Logger.java:231)
  - holds lock {<Logger.java:0>}
Harness$1.run (Harness.java:13)

*** Stack trace of thread <Harness.java:16>:
Logger.getLogger (Logger.java:226)
  - waiting to lock {<Logger.java:0>}
LogManager.addLogger (LogManager.java:314)
  - this allocated at <LogManager.java:155>
  - holds lock {<LogManager.java:155>}
Harness$2.run (Harness.java:18)

Example Deadlock Report

• Six necessary conditions identified experimentally
• Checked using four incomplete but sound whole-program static analyses

Conditions 1-4 (relatively language independent; incomplete but sound checks):
1. Reachable: call-graph analysis
2. Aliasing: may-alias analysis
3. Escaping: thread-escape analysis
4. Parallel: may-happen-in-parallel analysis

Conditions 5-6 (widely-used Java locking idioms; incomplete and unsound checks):
5. Non-reentrant
6. Non-guarded
  - sound checks would need must-alias analysis

Our Approach
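The overall shape of the approach, enumerate candidates and keep those that pass every necessary condition, can be sketched as a filter. This is a hypothetical sketch, not the tool's implementation: `Candidate`, `report`, and the use of strings for abstract threads and lock acquisition sites are assumptions made for illustration, and each condition is left as a pluggable predicate that a real checker would back with one of the analyses listed above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class CandidateFilter {
    // One candidate deadlock: thread t1 acquires l1 then l2, thread t2 acquires l3 then l4.
    static final class Candidate {
        final String t1, l1, l2, t2, l3, l4;

        Candidate(String t1, String l1, String l2, String t2, String l3, String l4) {
            this.t1 = t1;
            this.l1 = l1;
            this.l2 = l2;
            this.t2 = t2;
            this.l3 = l3;
            this.l4 = l4;
        }
    }

    // Reports only the candidates for which every necessary condition holds.
    static List<Candidate> report(List<Candidate> candidates,
                                  List<Predicate<Candidate>> necessaryConditions) {
        List<Candidate> reported = new ArrayList<>();
        for (Candidate c : candidates) {
            boolean allHold = true;
            for (Predicate<Candidate> condition : necessaryConditions) {
                if (!condition.test(c)) {
                    allHold = false;
                    break;
                }
            }
            if (allHold) {
                reported.add(c);
            }
        }
        return reported;
    }
}
```

Because each condition is necessary, dropping a candidate that fails any single predicate never hides a deadlock as long as that predicate is sound; precision comes from combining several independent checks.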

• Property: In some execution:
  – can a thread abstracted by t1 reach l1
  – and after acquiring lock at l1, proceed to reach l2 while holding that lock?
  – and similarly for t2, l3, l4
• Solution: Use call-graph analysis
  – k-object-sensitive [Milanova-Rountev-Ryder ISSTA’03]

Condition 1: Reachable

• Property: In some execution:
  – can a lock acquired at l1 be the same as a lock acquired at l4?
  – and similarly for l2, l3
• Solution: Use may-alias analysis
  – k-object-sensitive [Milanova-Rountev-Ryder ISSTA’03]

Condition 2: Aliasing
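One common way to answer a may-alias query is to compare points-to sets: two lock expressions may denote the same object if their sets of abstract allocation sites overlap. The sketch below assumes that representation; the class and method names are hypothetical, and the site labels in the usage note are taken from the example deadlock report.

```java
import java.util.Collections;
import java.util.Set;

class MayAlias {
    // Each set holds the abstract allocation sites a lock expression may point to.
    static boolean mayAlias(Set<String> pointsToA, Set<String> pointsToB) {
        return !Collections.disjoint(pointsToA, pointsToB);
    }

    // Condition 2 for a candidate: l1 may alias l4, and l2 may alias l3.
    static boolean aliasingHolds(Set<String> l1, Set<String> l2, Set<String> l3, Set<String> l4) {
        return mayAlias(l1, l4) && mayAlias(l2, l3);
    }
}
```

For the logging example, the lock taken at Logger.java:226 points to `<Logger.java:0>` in both threads and the lock taken at LogManager.java:280 points to `<LogManager.java:155>`, so `aliasingHolds` returns true for that candidate.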

• Property: In some execution:
  – can a lock acquired at l1 be thread-shared?
  – and similarly for each of l2, l3, l4
• Solution: Use thread-escape analysis

Condition 3: Escaping

• Property: In some execution:
  – can different threads abstracted by t1 and t2
  – simultaneously reach l2 and l4?
• Solution: Use may-happen-in-parallel analysis
  – Does not model full happens-before relation
  – Models only thread fork construct
  – Other conditions model other constructs

Condition 4: Parallel

Benchmark      LOC       Classes   Methods   Syncs   Time
moldyn         31,917    63        238       12      4m48s
montecarlo     157,098   509       3447      190     7m53s
raytracer      32,576    73        287       16      4m51s
tsp            154,288   495       3335      189     7m48s
sor            32,247    57        208       5       4m48s
hedc           160,071   530       3552      204     21m15s
weblech        184,098   656       4620      238     32m02s
jspider        159,494   557       3595      205     15m34s
jigsaw         154,584   497       3346      184     15m23s
ftp            180,904   642       4383      252     35m55s
dbcp           168,018   536       3602      227     16m04s
cache4j        34,603    72        218       7       4m43s
logging        167,923   563       3852      258     9m01s
collections    38,961    124       712       55      5m42s

Benchmarks

Benchmark      Deadlocks   Deadlocks   Lock type pairs   Lock type pairs
               (0-cfa)     (k-obj.)    (total)           (real)
moldyn         0           0           0                 0
montecarlo     0           0           0                 0
raytracer      0           0           0                 0
tsp            0           0           0                 0
sor            0           0           0                 0
hedc           7,552       2,358       22                19
weblech        4,969       794         22                19
jspider        725         4           1                 0
jigsaw         23          18          3                 3
ftp            16,259      3,020       33                24
dbcp           320         16          4                 3
cache4j        0           0           0                 0
logging        4,134       4,134       98                94
collections    598         598         16                16

Experimental Results

Individual Analysis Contributions

• Novel approach to static deadlock detection for Java
  – Checks six necessary conditions for a deadlock
  – Uses four off-the-shelf static analyses
• Neither sound nor complete, but effective in practice
  – Applied to suite of 14 multi-threaded Java programs comprising over 1.5 MLOC
  – Found all known deadlocks as well as previously unknown ones, with few false alarms

Conclusion

Part II

• Dynamic Deadlock Detection

• Reference:

An Effective Dynamic Analysis Technique for Detecting Generalized Deadlocks [FSE’10]

Motivation

• Most previous deadlock detection work has focused on resource deadlocks

• Example

// Thread T1          // Thread T2
sync(L1) {            sync(L2) {
  sync(L2) {            sync(L1) {
    …                     …
  }                     }
}                     }

Motivation

• Other kinds of deadlocks, e.g. communication deadlocks, are equally notorious

• Example

// Thread T1          // Thread T2
if (!b) {             b = true;
  sync(L) {           sync(L) {
    L.wait();           L.notify();
  }                   }
}

Events: T1: if (!b), wait L; T2: b = true, notify L

b is initially false
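The schedule that hangs is: T1 tests b and sees false, T2 then sets b and calls notify while nobody is waiting, and T1 finally calls wait with no notifier left. The sketch below replays that schedule on purpose. It is an illustration only: the class name, the helper methods, and the two latches that pin down the schedule are assumptions added here, and T1 is a daemon thread so the JVM can exit.

```java
import java.util.concurrent.CountDownLatch;

class LostNotifyDemo {
    static final Object L = new Object();
    static volatile boolean b = false;

    static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Forces the bad schedule and returns true if T1 is still alive
    // (stuck in wait) after T2 has finished and joinMillis have passed.
    static boolean t1StuckAfterBadSchedule(long joinMillis) {
        b = false;
        CountDownLatch t1Tested = new CountDownLatch(1);
        CountDownLatch t2Done = new CountDownLatch(1);
        Thread t1 = new Thread(() -> {
            if (!b) {
                t1Tested.countDown();
                awaitQuietly(t2Done);          // let T2 run to completion first
                synchronized (L) {
                    try {
                        L.wait();              // the notify has already happened
                    } catch (InterruptedException e) {
                        // ignored: demo thread simply ends
                    }
                }
            }
        }, "T1");
        Thread t2 = new Thread(() -> {
            awaitQuietly(t1Tested);            // run only after T1 has read b
            b = true;
            synchronized (L) {
                L.notify();
            }
            t2Done.countDown();
        }, "T2");
        t1.setDaemon(true);
        t1.start();
        t2.start();
        try {
            t2.join();
            t1.join(joinMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return t1.isAlive();
    }
}
```

A spurious wakeup could in principle release T1, so the result is a strong indication rather than a proof that T1 is blocked.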

• Build a dynamic analysis based tool that:
  – detects communication deadlocks
  – scales to large programs
  – has low false positive rate

Our Initial Effort

• Take cue from existing dynamic analyses for other concurrency errors
• Existing dynamic analyses check for violation of a programming idiom
  – Races:
      • every shared variable is consistently protected by a lock
  – Resource deadlocks:
      • no cycle in lock ordering graph
  – Atomicity violations:
      • atomic blocks should have the pattern (R+B)*N(L+B)*

Our Initial Effort

• What programming idiom should we check for communication deadlocks?

Our Initial Effort

• Recommended usage of condition variables

// Thread T1                // Thread T2
sync (L) {                  sync (L) {
  while (!cond)               cond = true;
    L.wait();                 L.notifyAll();
  assert (cond == true);    }
}

An Example

• Recommended usage of condition variables

// Thread T1                  // Thread T2
sync (L) {                    sync (L) {
  while (list.isEmpty())        list.add(...);
    L.wait();                   L.notifyAll();
  … = list.remove();          }
}
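The recommended idiom translates directly into Java monitors. The sketch below is one possible rendering under stated assumptions: the class name `GuardedList`, the `take`/`put` method names, the `String` element type, and the `demo` helper are all hypothetical. The condition is re-tested in a while loop, and every access to the list happens while holding L.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class GuardedList {
    private final Object L = new Object();
    private final Deque<String> list = new ArrayDeque<>();

    // Thread T1's role: wait in a loop until the list is non-empty, then remove.
    String take() throws InterruptedException {
        synchronized (L) {
            while (list.isEmpty()) {
                L.wait();
            }
            return list.remove();
        }
    }

    // Thread T2's role: change the condition and notify under the same lock.
    void put(String item) {
        synchronized (L) {
            list.add(item);
            L.notifyAll();
        }
    }

    // Runs one consumer and one producer and returns what the consumer took.
    static String demo() {
        GuardedList q = new GuardedList();
        String[] result = new String[1];
        Thread consumer = new Thread(() -> {
            try {
                result[0] = q.take();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        q.put("job");
        try {
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result[0];
    }
}
```

The demo terminates whichever thread runs first: if `put` happens before `take`, the while test fails immediately and the consumer never waits.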

Violation of Idiom as Deadlock

• Example

// Thread T1          // Thread T2
if (!b)               b = true;
  sync (L)            sync (L)
    L.wait();           L.notifyAll();

Must use while, not if
Accesses to b must be inside sync

Satisfaction of Idiom as Deadlock

• Example

// Thread T1          // Thread T2
sync (L1)             sync (L1)
  sync (L2)             sync (L2)
    while (!b)            L2.notifyAll();
      L2.wait();

No violation of idiom, but still deadlocks!

=> Recommended usage pattern (or idiom) based checking does not work
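The reason this program hangs is that `L2.wait()` releases only L2: T1 keeps holding L1 while it waits, so T2 can never enter `sync (L1)` to reach the notify. The sketch below makes that visible by inspecting thread states. It is an illustration under assumptions: the class and helper names are hypothetical, the flag is a one-element array so the lambdas can read it, and both threads are daemons so the JVM can exit.

```java
class NestedMonitorDemo {
    // Polls until thread t reaches the wanted state or the timeout expires.
    static boolean awaitState(Thread t, Thread.State wanted, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            if (t.getState() == wanted) {
                return true;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return t.getState() == wanted;
    }

    // Returns true if T1 ends up WAITING on L2 while T2 ends up BLOCKED on L1.
    static boolean bothStuck(long timeoutMillis) {
        final Object L1 = new Object();
        final Object L2 = new Object();
        final boolean[] b = {false};           // never set, as in the slide
        Thread t1 = new Thread(() -> {
            synchronized (L1) {
                synchronized (L2) {
                    while (!b[0]) {
                        try {
                            L2.wait();         // releases L2 but not L1
                        } catch (InterruptedException e) {
                            return;
                        }
                    }
                }
            }
        }, "T1");
        Thread t2 = new Thread(() -> {
            synchronized (L1) {                // blocks: T1 still owns L1
                synchronized (L2) {
                    L2.notifyAll();
                }
            }
        }, "T2");
        t1.setDaemon(true);
        t2.setDaemon(true);
        t1.start();
        if (!awaitState(t1, Thread.State.WAITING, timeoutMillis)) {
            return false;
        }
        t2.start();
        return awaitState(t2, Thread.State.BLOCKED, timeoutMillis)
                && t1.getState() == Thread.State.WAITING;
    }
}
```

Starting T2 only after T1 is waiting fixes the schedule for the demo; if T2 ran first, it would finish and T1 would instead wait for a notify that has already been sent.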

Revisiting Existing Analyses

• Relax the dependencies between relevant events from different threads
  – verify all possible event orderings for errors
  – use data structures to check idioms (vector clocks, lock-graphs etc.) to implicitly verify all event orderings

Revisiting Existing Analyses

• Programming idiom-based checking does not work for communication deadlocks

• Nevertheless, we can explicitly verify all orderings of relevant events for deadlocks

Trace Program

// Thread T1          // Thread T2
if (!b) {             b = true;
  sync (L) {          sync (L) {
    L.wait ();          L.notify ();
  }                   }
}
b is initially false

Observed events:
T1: lock L, wait L, unlock L
T2: lock L, notify L, unlock L

Trace Program

Thread T1 {           Thread T2 {
  lock L;               lock L;
  wait L;               notify L;
  unlock L;             unlock L;
}                     }

Trace Program

Thread T1 {           Thread T2 {
  lock L;               lock L;
  wait L;        ||     notify L;
  unlock L;             unlock L;
}                     }

Trace Program

• Built out of only a subset of events
  – usually much smaller than the original program
• Throws away a lot of dependencies between threads
  – could give false positives
  – but increases coverage
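Explicitly verifying all orderings of a trace program can be sketched as a small state-space search. The code below is a simplified illustration and not CHECKMATE's implementation: it assumes a single lock L, ignores reentrancy, and models `wait` as "release L, block until notified, then reacquire L". The class name, the `Op` enum, and `canDeadlock` are hypothetical. A deadlock is a reachable state in which some thread has not finished and no thread can take a step.

```java
import java.util.List;

class TraceProgramExplorer {
    enum Op { LOCK, WAIT, NOTIFY, UNLOCK }

    private static final int RUNNING = 0;   // executing its event list
    private static final int WAITING = 1;   // inside wait, not yet notified
    private static final int NOTIFIED = 2;  // notified, must reacquire L

    // True if some interleaving gets stuck before all threads finish.
    static boolean canDeadlock(List<List<Op>> threads) {
        int n = threads.size();
        return explore(threads, new int[n], new int[n], -1);
    }

    // pc[i]: index of thread i's next event; owner: holder of L, or -1.
    private static boolean explore(List<List<Op>> ts, int[] pc, int[] mode, int owner) {
        boolean allDone = true;
        boolean anyEnabled = false;
        for (int i = 0; i < ts.size(); i++) {
            if (pc[i] == ts.get(i).size()) continue;      // thread i finished
            allDone = false;
            if (mode[i] == WAITING) continue;             // needs a notify first
            Op op = ts.get(i).get(pc[i]);
            boolean needsLock = mode[i] == NOTIFIED || op == Op.LOCK;
            if (needsLock && owner != -1) continue;       // blocked on L
            anyEnabled = true;
            int[] pc2 = pc.clone();
            int[] mode2 = mode.clone();
            int owner2 = owner;
            if (mode[i] == NOTIFIED) {                    // return from wait
                mode2[i] = RUNNING;
                owner2 = i;
                pc2[i]++;
            } else if (op == Op.LOCK) {
                owner2 = i;
                pc2[i]++;
            } else if (op == Op.UNLOCK) {
                owner2 = -1;
                pc2[i]++;
            } else if (op == Op.WAIT) {                   // release L and block
                owner2 = -1;
                mode2[i] = WAITING;
            } else {                                      // NOTIFY: wake one waiter
                pc2[i]++;
                boolean wokeSomeone = false;
                for (int j = 0; j < ts.size(); j++) {
                    if (mode[j] != WAITING) continue;
                    wokeSomeone = true;
                    int[] mode3 = mode2.clone();
                    mode3[j] = NOTIFIED;
                    if (explore(ts, pc2, mode3, owner2)) return true;
                }
                if (wokeSomeone) continue;                // otherwise the notify is lost
            }
            if (explore(ts, pc2, mode2, owner2)) return true;
        }
        return !allDone && !anyEnabled;
    }
}
```

For T1 = lock, wait, unlock and T2 = lock, notify, unlock, the search finds the ordering in which T2 runs first and T1 then waits forever. Because the trace program has dropped the test on b, such a report may be a false positive for the original program.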

// Thread T1          // Thread T2
if (!b) {             b = true;
  sync (L) {          sync (L) {
    L.wait ();          L.notify ();
  }                   }
}
b is initially false

Observed events:
T1: if (!b), lock L, wait L, unlock L
T2: b = true, lock L, notify L, unlock L

Trace Program: Add Dependencies

Thread T1 {           Thread T2 {
  if (!b) {             b = true;
    lock L;             lock L;
    wait L;             notify L;
    unlock L;           unlock L;
  }                   }
}

Trace Program: Add Dependencies

Trace Program: Add Predictivity

• Use static analysis to add to the predictive power of the trace program

// Thread T1          // Thread T2
@ !b => L.wait()
if (!b) {             b = true;
  sync (L) {          sync (L) {
    L.wait ();          L.notify ();
  }                   }
}
b is initially false

Thread T1 {
  if (!b) {
    lock L;
    wait L;
    unlock L;

• Effective for concurrency errors that cannot be detected using a programming idiom
  – communication deadlocks, deadlocks due to exceptions, …

// Thread T1          // Thread T2
while (!b) {          try {
  sync (L) {            foo();          // can throw an exception
    L.wait();           b = true;
  }                     sync (L) {
}                         L.notify();
                        }
                      } catch (Exception e) {…}

b is initially false

Trace Program: Other Errors

• Implemented for deadlock detection
  – both communication and resource deadlocks
• Built a prototype tool for Java called CHECKMATE
• Applied to several Java libraries and applications
  – log4j, pool, felix, lucene, jgroups, jruby, ...
• Found both previously known and unknown deadlocks (17 in total)

Implementation and Evaluation

Conclusion

• CHECKMATE is a novel dynamic analysis for finding deadlocks
  – both resource and communication deadlocks
• Effective on several real-world Java benchmarks
• Trace program based approach is generic
  – can be applied to other errors, e.g. deadlocks because of exceptions

Did Not Cover Today …

• Deadlock Detection in Message-Passing Programs
  – must model many variants of message sends/receives
• Dynamic Deadlock Avoidance
  – unique to deadlock errors (cannot, e.g., “avoid” buffer overruns)
  – see Dimmunix OSDI’08 paper (http://dimmunix.epfl.ch/)
• Dynamic Deadlock Detection by Controlling Thread Schedules
  – CHESS (http://research.microsoft.com/en-us/projects/chess/)
  – CalFuzzer (http://srl.cs.berkeley.edu/~ksen/calfuzzer/)
• Type-based Deadlock Detection
  – statically check lock-order graph (see OOPSLA’02 paper)
