1
Instrumentation and Sampling Strategies for Cooperative Concurrency Bug Isolation
Guoliang Jin, Aditya Thakur, Ben Liblit, Shan Lu
University of Wisconsin–Madison
2
Cooperative Concurrency Bug Isolation
• Concurrency bugs are synchronization mistakes in multi-threaded programs.
• Several types:
  – Atomicity violation
  – Data race
  – Deadlock, etc.
[Figure: example interleavings of read(x) and write(x) between thread 1 and thread 2, each marked with a correct or failing outcome.]
3
Concurrency bugs are common in the field
• Developers are poor at parallel programming
• Interleaving testing is inefficient
• Applications are shipped to users with concurrency bugs
4
Concurrency bugs lead to failures in the field
• Disasters in the past
  – Therac-25, the 2003 Northeast blackout
• More threats in the multi-core era
5
Failure diagnosis is critical
6
Concurrency Bug Failure Example
Concurrency Bug from Apache HTTP Server
7
Concurrency Bug Failure Example
Concurrency Bug from Apache HTTP Server
Correct run: the two executions of log_writer() do not interleave.

thread 1                               thread 2
log_writer() {
  …
  memcpy(&buf[idx], s, strlen(s));
  …
  temp = idx;
  idx = temp + strlen(s);
  …
  return SUCCESS;
}
                                       log_writer() {
                                         …
                                         memcpy(&buf[idx], s, strlen(s));
                                         …
                                         temp = idx;
                                         idx = temp + strlen(s);
                                         …
                                         return SUCCESS;
                                       }
8
Concurrency Bug Failure Example
Concurrency Bug from Apache HTTP Server
Failure run: thread 2 copies into buf and updates idx between thread 1's memcpy and thread 1's update of idx.

thread 1                               thread 2
log_writer() {
  …
  memcpy(&buf[idx], s, strlen(s));
                                       log_writer() {
                                         …
                                         memcpy(&buf[idx], s, strlen(s));
                                         …
                                         temp = idx;
                                         idx = temp + strlen(s);
                                         …
                                         return SUCCESS;
                                       }
  …
  temp = idx;
  idx = temp + strlen(s);
  …
  return SUCCESS;
}
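The two interleavings of log_writer() can be replayed deterministically in a single thread. The sketch below is only an illustration of the slide's example, not the Apache code: the helper names (step_copy, step_advance, replay_correct, replay_failure) and the buffer size are hypothetical, and the failure order assumes thread 2 runs entirely between thread 1's memcpy and thread 1's update of idx.

```c
#include <string.h>

#define BUF_SIZE 64  /* hypothetical size, large enough for the demo */

/* Shared log state, as on the slide: a buffer and the next free index. */
static char buf[BUF_SIZE];
static size_t idx;

/* First half of log_writer(): copy the message at the current index. */
static void step_copy(const char *s) {
    memcpy(&buf[idx], s, strlen(s));
}

/* Second half of log_writer(): advance the index past the message. */
static void step_advance(const char *s) {
    size_t temp = idx;
    idx = temp + strlen(s);
}

static void reset_log(void) {
    memset(buf, 0, sizeof buf);
    idx = 0;
}

/* Correct run: each writer finishes both halves before the other starts. */
static void replay_correct(const char *s1, const char *s2) {
    reset_log();
    step_copy(s1); step_advance(s1);   /* thread 1 */
    step_copy(s2); step_advance(s2);   /* thread 2 */
}

/* Failure run: thread 2 runs between the two halves of thread 1. */
static void replay_failure(const char *s1, const char *s2) {
    reset_log();
    step_copy(s1);                     /* thread 1 copies at idx */
    step_copy(s2); step_advance(s2);   /* thread 2 overwrites the same slot */
    step_advance(s1);                  /* thread 1 advances idx afterwards */
}
```

With the messages "AAA" and "BB", the correct order leaves "AAABB" in the buffer, while the failure order leaves "BBA" followed by unwritten bytes even though idx reaches 5 in both cases.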
9
Diagnosing Concurrency Bug Failures is Challenging
• The failure is non-deterministic and rare
  – Programmers have trouble reproducing the failure
• The root cause involves more than one thread
10
Existing work and its limitations
• Failure replay
  – High runtime overhead
  – Developers need to manually locate faults
• Run-time bug detection
  – (mostly) High runtime overhead
  – Not guided by the failure
    • Many false positives
How to achieve low-overhead and accurate failure diagnosis?
11
Our work: CCI
• Goal: diagnosing production-run concurrency bug failures
• Major components:
  – predicate instrumentor
  – sampler
  – statistical debugging
[Figure: CCI workflow. Program source goes through the compiler, which adds predicates and the sampler; runs report predicate counts and a correct/failure outcome; statistical debugging turns these into predictors.]
• Predictors: true in most failure runs, false in most correct runs.
12
CCI Overview
• Three different types of predicates.
• Each predicate has its supporting sampling strategy.
• Same statistical debugging as in CBI.
• Experiments show CCI is effective in diagnosing concurrency failures.
[Figure: capability vs. overhead plot positioning FunRe, Havoc, and Prev.]
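The slides say only that CCI reuses CBI's statistical debugging and that a good predictor is true in most failure runs and false in most correct runs. As a rough sketch of how such a ranking can be computed, the code below uses the Failure, Context, and Increase scores from the CBI literature. The struct layout and function names are hypothetical, and this is not presented as CCI's actual implementation.

```c
/* Per-predicate run counts gathered from many sampled runs. */
typedef struct {
    unsigned true_in_failed;      /* failed runs where the predicate was observed true */
    unsigned true_in_passed;      /* correct runs where it was observed true */
    unsigned observed_in_failed;  /* failed runs where its site was sampled at all */
    unsigned observed_in_passed;  /* correct runs where its site was sampled at all */
} predicate_counts;

/* Fraction of runs that failed; 0 when there are no runs to count. */
static double failed_fraction(unsigned failed, unsigned passed) {
    unsigned total = failed + passed;
    return total == 0 ? 0.0 : (double)failed / (double)total;
}

/* Failure(P): how often a run fails when P is observed true. */
static double failure_score(const predicate_counts *c) {
    return failed_fraction(c->true_in_failed, c->true_in_passed);
}

/* Context(P): how often a run fails when P's site is merely reached. */
static double context_score(const predicate_counts *c) {
    return failed_fraction(c->observed_in_failed, c->observed_in_passed);
}

/* Increase(P): how much P being true raises the chance of failure. */
static double increase_score(const predicate_counts *c) {
    return failure_score(c) - context_score(c);
}
```

A predicate with a large positive Increase is one whose truth separates failure runs from correct runs, which matches the predictor description on the slide.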
13
Outline
• Motivation
• CCI Overview
• CCI Predicates and Sampling Strategies
  – CCI-Prev and its sampling strategy
  – CCI-Havoc and its sampling strategy
  – CCI-FunRe and its sampling strategy
• Evaluation
• Conclusion
14
CCI-Prev Intuition
[Figure: atomicity violation and data race examples, each shown as a correct and a failing interleaving of read(x) and write(x) between thread 1 and thread 2.]
Just record which thread made the previous access.
15
CCI-Prev Predicate
It tracks whether two successive accesses to a shared memory location were by two distinct threads or by the same thread.
[Figure: capability vs. overhead plot, showing Prev.]
16
CCI-Prev Predicate on the Correct Run
Concurrency Bug from Apache HTTP Server
Same log_writer() code, correct run. The instrumented access I is the read of idx in "temp = idx;", executed once by each thread.

Before either thread executes I:
Predicate    Correct runs   Failure runs
remote_I     0              0
local_I      0              0

After the first execution of I:
Predicate    Correct runs   Failure runs
remote_I     0              0
local_I      1              0

After the second execution of I:
Predicate    Correct runs   Failure runs
remote_I     0              0
local_I      2              0
17
CCI-Prev Predicate on the Failure Run
Concurrency Bug from Apache HTTP Server
Same log_writer() code, failure run. The instrumented access I is the read of idx in "temp = idx;".

Before either thread executes I (counts carried over from the correct run):
Predicate    Correct runs   Failure runs
remote_I     0              0
local_I      2              0

After thread 2 executes I (its previous access to idx was its own memcpy):
Predicate    Correct runs   Failure runs
remote_I     0              0
local_I      2              1

After thread 1 executes I (the previous access to idx was by thread 2):
Predicate    Correct runs   Failure runs
remote_I     0              1
local_I      2              1
18
CCI-Prev Predicate Instrumentation
Concurrency Bug from Apache HTTP Server
Instrumented access I in thread 1 (failure run):

  lock(glock);
  remote = test_and_insert(&idx, curTid);
  record(I, remote);
  temp = idx;
  unlock(glock);
  idx = temp + strlen(s);

test_and_insert() uses a global hash table that maps an address to the thread that last accessed it.

Before thread 1 executes I:
address    ThreadID
…          …
&idx       2
…          …

After thread 1 executes I:
address    ThreadID
…          …
&idx       1
…          …

Predicate counts before thread 1 executes I:
Predicate    Correct runs   Failure runs
remote_I     0              0
local_I      2              1

Predicate counts after thread 1 executes I:
Predicate    Correct runs   Failure runs
remote_I     0              1
local_I      2              1
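The instrumentation above can be approximated with a small last-accessor table guarded by one global lock. This is a minimal sketch under stated assumptions, not the tool's code: the fixed-size direct-mapped table, the -1 "no previous access" result, and the prev_observe wrapper are all hypothetical. It also keeps only the bookkeeping inside the lock, whereas on the slide the instrumented access itself sits between lock(glock) and unlock(glock).

```c
#include <pthread.h>
#include <stdint.h>

#define TABLE_SIZE 256  /* hypothetical; collisions simply overwrite the slot */

/* One slot: an address and the ID of the thread that last accessed it. */
typedef struct { const void *addr; int tid; } prev_entry;

static prev_entry prev_table[TABLE_SIZE];
static pthread_mutex_t glock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long remote_count, local_count;

/* Returns 1 if the previous access to addr was by another thread,
 * 0 if it was by cur_tid, and -1 if no previous access is recorded.
 * Always records cur_tid as the latest accessor. Caller holds glock. */
static int test_and_insert(const void *addr, int cur_tid) {
    prev_entry *e = &prev_table[((uintptr_t)addr >> 3) % TABLE_SIZE];
    int result = -1;
    if (e->addr == addr)
        result = (e->tid != cur_tid);
    e->addr = addr;
    e->tid = cur_tid;
    return result;
}

/* Bookkeeping for one instrumented access: update the table and count
 * the remote or local predicate, all under the global lock. */
static void prev_observe(const void *addr, int cur_tid) {
    pthread_mutex_lock(&glock);
    int remote = test_and_insert(addr, cur_tid);
    if (remote == 1)
        remote_count++;
    else if (remote == 0)
        local_count++;
    pthread_mutex_unlock(&glock);
}
```

The single global lock is what makes this predicate comparatively expensive: every instrumented access from every thread serializes on it.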
19
CCI-Prev Sampling Strategy
Same log_writer() example, with the instrumented access I at "temp = idx;".
Does traditional sampling work? NO. A Prev predicate relates two successive accesses that may come from different threads, so the sampling has to be:
• Thread-coordinated
• Bursty
20
Outline
• Motivation
• CCI Overview
• CCI Predicates and Sampling Strategies
  – CCI-Prev and its sampling strategy
  – CCI-Havoc and its sampling strategy
  – CCI-FunRe and its sampling strategy
• Evaluation
• Conclusion
21
CCI-Havoc Intuition
Same log_writer() failure interleaving, with the instrumented access I at "temp = idx;".
Just record what value was observed during the last access.
22
CCI-Havoc Predicate
It tracks whether the value of a given shared location changes between two consecutive accesses by one thread.
Only uses thread-local information.
[Figure: capability vs. overhead plot, showing Prev and Havoc.]
23
CCI-Havoc Predicate on the Correct Run
Concurrency Bug from Apache HTTP Server
Same log_writer() code, correct run. The instrumented access I is the read of idx in "temp = idx;", executed once by each thread.

Before either thread executes I:
Predicate      Correct runs   Failure runs
unchanged_I    0              0
changed_I      0              0

After the first execution of I:
Predicate      Correct runs   Failure runs
unchanged_I    1              0
changed_I      0              0

After the second execution of I:
Predicate      Correct runs   Failure runs
unchanged_I    2              0
changed_I      0              0
24
CCI-Havoc Predicate on the Failure Run
Concurrency Bug from Apache HTTP Server
Same log_writer() code, failure run. The instrumented access I is the read of idx in "temp = idx;".

Before either thread executes I (counts carried over from the correct run):
Predicate      Correct runs   Failure runs
unchanged_I    2              0
changed_I      0              0

After thread 2 executes I (idx is unchanged since thread 2's memcpy):
Predicate      Correct runs   Failure runs
unchanged_I    2              1
changed_I      0              0

After thread 1 executes I (thread 2 changed idx since thread 1's memcpy):
Predicate      Correct runs   Failure runs
unchanged_I    2              1
changed_I      0              1
25
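CCI-Havoc Predicate Instrumentation
Concurrency Bug from Apache HTTP Server
Instrumented access I in thread 1 (failure run):

  temp = idx;
  changed = test(&idx, temp);
  record(I, changed);
  insert(&idx, temp);
  idx = temp + strlen(s);

test() and insert() use a hash table that is private to thread 1 and maps an address to the value that thread 1 last observed there.

Hash table for thread 1 before it executes I (value recorded at its memcpy):
address    value
…          …
&idx       idx
…          …

Hash table for thread 1 after it executes I (thread 2 has added len2 in between):
address    value
…          …
&idx       idx+len2
…          …

Predicate counts before thread 1 executes I:
Predicate      Correct runs   Failure runs
unchanged_I    2              1
changed_I      0              0

Predicate counts after thread 1 executes I:
Predicate      Correct runs   Failure runs
unchanged_I    2              1
changed_I      0              1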
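A per-thread last-value table of this kind can be sketched with C11 thread-local storage, so no lock is needed around the table. The names havoc_insert and havoc_test, the table size, and the choice to skip addresses with no recorded value are assumptions made for illustration only.

```c
#include <stdatomic.h>
#include <stdint.h>

#define TABLE_SIZE 256  /* hypothetical; collisions simply overwrite the slot */

/* One slot: an address and the value this thread last saw there. */
typedef struct { const int *addr; int value; } havoc_entry;

/* Each thread gets its own table, so accesses need no lock. */
static _Thread_local havoc_entry havoc_table[TABLE_SIZE];

/* Predicate counters are shared, so they are updated atomically. */
static atomic_ulong changed_count, unchanged_count;

static havoc_entry *slot_for(const int *addr) {
    return &havoc_table[((uintptr_t)addr >> 2) % TABLE_SIZE];
}

/* Remember the value this thread just observed at addr. */
static void havoc_insert(const int *addr, int value) {
    havoc_entry *e = slot_for(addr);
    e->addr = addr;
    e->value = value;
}

/* Compare the current value with the one this thread saw last time
 * and count the changed or unchanged predicate accordingly. */
static void havoc_test(const int *addr, int current) {
    havoc_entry *e = slot_for(addr);
    if (e->addr != addr)
        return;  /* no earlier access by this thread is recorded */
    if (e->value != current)
        atomic_fetch_add(&changed_count, 1);
    else
        atomic_fetch_add(&unchanged_count, 1);
}
```

Because only the calling thread's own table is consulted, a change can only be attributed to "someone else wrote in between", not to a specific thread, which is the capability traded away for lower overhead.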
26
CCI-Havoc Sampling Strategy
Same log_writer() example.
• Bursty
• Thread-independent
27
Outline
• Motivation
• CCI Overview
• CCI Predicates and Sampling Strategies
  – CCI-Prev and its sampling strategy
  – CCI-Havoc and its sampling strategy
  – CCI-FunRe and its sampling strategy
• Evaluation
• Conclusion
28
CCI-FunRe Predicate
It tracks whether the execution of one function overlaps with the execution of the same function from a different thread.
[Figure: capability vs. overhead plot, showing Prev, Havoc, and FunRe.]
29
CCI-FunRe Predicate Example
Correct run: the two executions of log_writer() do not overlap.

thread 1                       thread 2
log_writer() {
  …
  return SUCCESS;
}
                               log_writer() {
                                 …
                                 return SUCCESS;
                               }

Failure run: one thread enters log_writer() while the other is still inside it.

thread 1                       thread 2
log_writer() {
  …
                               log_writer() {
                                 …
                                 return SUCCESS;
                               }
  return SUCCESS;
}

Predicate               Correct runs   Failure runs
NonReent_log_writer     2              1
Reent_log_writer        0              1
30
CCI-FunRe Predicate Instrumentation
Both threads run the same instrumented function (failure run):

  log_writer() {
    oldCount = atomic_inc(Count);
    record("log_writer", oldCount);
    …
    atomic_dec(Count);
    return SUCCESS;
  }

Count is the log_writer entry of a per-function counter table (FuncName, Counter).

Before either thread enters (Counter = 0):
Predicate               Correct runs   Failure runs
NonReent_log_writer     2              0
Reent_log_writer        0              0

After the first thread enters (oldCount = 0, Counter = 1):
Predicate               Correct runs   Failure runs
NonReent_log_writer     2              1
Reent_log_writer        0              0

After the second thread enters (oldCount = 1, Counter = 2):
Predicate               Correct runs   Failure runs
NonReent_log_writer     2              1
Reent_log_writer        0              1

After both threads leave, Counter is back to 0.
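The entry and exit accounting can be sketched with C11 atomics. The slide's atomic_inc and atomic_dec are modeled here with atomic_fetch_add and atomic_fetch_sub, which return the old value. The function names funre_enter and funre_exit and the two global predicate counters are hypothetical.

```c
#include <stdatomic.h>

/* Number of threads currently inside log_writer(). */
static atomic_int log_writer_active;

/* Predicate counters for the function. */
static atomic_ulong reent_count, nonreent_count;

/* Call on function entry: bump the active counter and record whether
 * another execution of the same function was already in progress. */
static void funre_enter(atomic_int *active) {
    int old_count = atomic_fetch_add(active, 1);  /* returns the old value */
    if (old_count > 0)
        atomic_fetch_add(&reent_count, 1);
    else
        atomic_fetch_add(&nonreent_count, 1);
}

/* Call on every function exit path. */
static void funre_exit(atomic_int *active) {
    atomic_fetch_sub(active, 1);
}
```

An instrumented log_writer() would call funre_enter(&log_writer_active) first and funre_exit(&log_writer_active) before each return. Note that this simple counter cannot tell a second thread from a recursive call by the same thread; the sketch makes no attempt to separate the two.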
31
CCI-FunRe Sampling Strategy
Function execution accounting is not suitable for sampling, so this part is unconditional.

  log_writer() {
    oldCount = atomic_inc(Count);      (unconditional)
    record("log_writer", oldCount);
    …
    atomic_dec(Count);                 (unconditional)
    return SUCCESS;
  }

[Figure: the log_writer entry of the FuncName/Counter table, shown with Counter = 0.]
32
CCI-FunRe Sampling Strategy
• Function execution accounting:
  – unconditional
• FunRe predicate recording:
  – thread-independent
  – non-bursty
33
Outline
• Motivation
• CCI Overview
• CCI Predicates and Sampling Strategies
  – CCI-Prev and its sampling strategy
  – CCI-Havoc and its sampling strategy
  – CCI-FunRe and its sampling strategy
• Evaluation
• Conclusion
34
Experimental Evaluation
• Implementation
  – Static instrumentor based on the CBI framework
• Real-world concurrency bug failures from:
  – Apache HTTP server, Cherokee
  – Mozilla-JS, PBZIP2
  – SPLASH-2: FFT, LU
• Parameters used
  – Roughly 1/100 sampling rate
35
Failure Diagnosis Evaluation
• Methodology
  – Using concurrency bug failures that occurred in the real world
  – Each application runs 3000 times on a multi-core machine
    • Random sleeps are added to get some failure runs
  – Sampling is enabled
  – Statistical debugging then returns a list of predictors
    • Which predictor in the list can diagnose the failure?
36
Failure Diagnosis Results (with sampling)
Program CCI-Prev CCI-Havoc CCI-FunRe
Apache-1 top1 top1 top1
Apache-2 top1 top1
Cherokee top2
FFT top1
LU top1
Mozilla-JS-1 top2 top1
Mozilla-JS-2 top1 top1 top1
Mozilla-JS-3 top2 top1 top1
PBZIP2 top1 top1
[Figure: capability axis ordering FunRe, Havoc, and Prev.]
37
Runtime Overhead
              Prev                     Havoc                    FunRe
Program       No Sampling  Sampling    No Sampling  Sampling    No Sampling  Sampling
Apache-1      62.6%        1.9%        27.4%        2.8%        1.1%         1.8%
Apache-2      8.4%         0.5%        4.2%         0.4%        0.2%         0.2%
Cherokee      19.1%        0.3%        2.1%         0.0%        0.3%         0.4%
FFT           169%         24.0%       33.5%        5.5%        72.8%        30.0%
LU            57857%       949%        1693%        8.9%        1682%        926%
Mozilla-JS    11311%       606%        7587%        356%        123%         97.0%
PBZIP2        0.2%         0.2%        0.2%         0.2%        0.3%         0.2%
[Figure: overhead axis ordering FunRe, Havoc, and Prev.]
38
Conclusion
• CCI is capable of, and suitable for, diagnosing many production-run concurrency bug failures.
• Future predicates can leverage our effective sampling strategies.
• Experiments confirm the design tradeoff.
[Figure: capability vs. overhead plot positioning Prev, Havoc, and FunRe.]
39
Questions about CCI?
[Figure: capability vs. overhead plot positioning Prev, Havoc, and FunRe.]
41
CBI on Concurrency Bug Failures
Concurrency Bug from Apache HTTP Server
Same log_writer() failure interleaving on the shared variable idx.
CBI does not work!
To diagnose production-run concurrency bug failures, interleaving-related events should be tracked!
42
CCI-Prev Predicate Instrumentation with Sampling

  if (gsample) {
    lock(glock);
    changed = test_and_insert(&cnt, curTid);
    record(I, changed);
    temp = cnt;
    unlock(glock);
  } else {
    temp = cnt;
    [[ gsample = true; iset = curTid; lLength = gLength = 0; ]]?
  }
43
CCI-Prev Predicate Instrumentation with Sampling

  if (gsample) {
    lock(glock);
    changed = test_and_insert(&cnt, curTid, &stale);
    record(stale ? P1 : P2, changed);
    temp = cnt;
    unlock(glock);
    lLength++;
    gLength++;
    if ((iset == curTid && lLength > lMAX) || gLength > gMAX) {
      clear();
      iset = unusedTid;
      gsample = false;
    }
  } else {
    temp = cnt;
    [[ gsample = true; iset = curTid; lLength = gLength = 0; ]]?
  }
44
CCI-Havoc Predicate Instrumentation with Sampling

  if (sample) {
    changed = test(&cnt, cnt, &stale);
    record(stale ? P1 : P2, changed);
    temp = cnt;
    insert(&cnt, cnt);
    length++;
    if (length > lMAX) { clear(); sample = false; }
  } else {
    temp = cnt;
    [[ sample = true; length = 0; ]]?
  }

No global lock used!
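The bursty, thread-independent control flow can be isolated from the predicate bookkeeping. In the sketch below, the per-thread flag and length follow the slide, while everything else is an assumption: the fixed countdown stands in for whatever random decision the [[ … ]]? notation denotes, and L_MAX, PERIOD, and the function names are hypothetical.

```c
#include <stdbool.h>

#define L_MAX 3      /* hypothetical burst length limit (lMAX on the slide) */
#define PERIOD 100   /* hypothetical stand-in for a roughly 1/100 sampling rate */

/* All sampling state is per thread, so no lock is needed. */
static _Thread_local bool sample;
static _Thread_local int length;
static _Thread_local int countdown = PERIOD;

/* Decide whether to open a new burst. A real sampler would draw this
 * randomly; a fixed countdown keeps the sketch deterministic. */
static bool start_burst(void) {
    if (--countdown > 0)
        return false;
    countdown = PERIOD;
    return true;
}

/* Call at every instrumented access. Returns true when this access falls
 * inside a burst, i.e. when test(), record(), and insert() would run. */
static bool havoc_sampled_access(void) {
    if (!sample) {
        if (start_burst()) {
            sample = true;
            length = 0;
        }
        return false;
    }
    length++;
    if (length > L_MAX)
        sample = false;  /* burst over; the slide also clears the table here */
    return true;
}
```

With these constants a thread samples four consecutive accesses out of every burst it opens, so consecutive accesses by the same thread are observed together, which is what the Havoc predicate needs.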
45
Failure Diagnosis Results (with sampling)
Program CBI CCI-Prev CCI-Havoc CCI-FunRe
Apache-1 top1 top1 top1
Apache-2 top1 top1
Cherokee top2
FFT top1
LU top1
Mozilla-JS-1 top2 top1
Mozilla-JS-2 top1 top1 top1
Mozilla-JS-3 top2 top1 top1
PBZIP2 top1 top1
[Figure: capability axis ordering FunRe, Havoc, and Prev.]
46
Failure diagnosis is critical