real world concurrency bug characteristics

27
Learning from Mistakes – Real World Concurrency Bug Characteristics Yuanyuan(YY) Zhou Associate Professor University of Illinois, Urbana-Champaign ASPLOS 2008

Upload: others

Post on 29-Apr-2022

7 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Real World Concurrency Bug Characteristics

Learning from Mistakes –Real World Concurrency Bug Characteristics

Yuanyuan(YY) ZhouAssociate ProfessorUniversity of Illinois, Urbana-Champaign

ASPLOS 2008

Page 2: Real World Concurrency Bug Characteristics

Paper contributions

Concurrency bug detection What types of bugs are un-solved? How bugs are diagnosed in practice?

Concurrent program testing What are the manifestation conditions for concurrency bugs?

Concurrent program language design How many mistakes can be avoided What other support do we need?

All can benefit from a closer look at real world concurrency bugs

Page 3: Real World Concurrency Bug Characteristics

105 real-world concurrency bugs from 4 large open source programs Study from 4 dimensions

Bug patterns Manifestation condition Diagnosing strategy Fixing methods

Study methodology

Page 4: Real World Concurrency Bug Characteristics

Bug report

Page 5: Real World Concurrency Bug Characteristics

8 years10 years7 years6 yearsBug DB history

640.32LOC (M line)

C++C++Mainly CC++/CLanguage

GUIClientServerServerSoftware Type

OpenOfficeMozillaApacheMySQL

Applications

Page 6: Real World Concurrency Bug Characteristics

Limitations No scientific computing applications No JAVA programs Never-enough bug samples

2

6

OpenOffice

31

74

Total

1649Deadlock

411314Non-deadlock

MozillaApacheMySQL

Bugs analyzed

Page 7: Real World Concurrency Bug Characteristics

Classified based on root causes Categories

Atomicity violation The desired atomicity of certain

code region is violated Order violation

The desired order between two (sets of) accesses is flipped

Others

X

X

Thread 1 Thread 2

Thread 1 Thread 2

Pattern

Bug patterns

Page 8: Real World Concurrency Bug Characteristics

Bug patterns: atomicity bug

Page 9: Real World Concurrency Bug Characteristics

Bug patterns: order violation bug

Page 10: Real World Concurrency Bug Characteristics

Bug patterns: underestimating #threads

Page 11: Real World Concurrency Bug Characteristics

Bug patterns: order violation bug

Page 12: Real World Concurrency Bug Characteristics

Bug patterns: order violation bug

Page 13: Real World Concurrency Bug Characteristics

We should focus on atomicity violation and order violation

Bug detection tools for order violation bugs are desired

*There are 3-bug overlap between Atomicity and Order

Implications

Atomicity Order Other0

10

20

30

40

50

OpenOfficeMozillaApacheMySQL

Bug pattern presence in analyzed softwares

Page 14: Real World Concurrency Bug Characteristics

Bug manifestation condition A specific execution order among a smallest set of memory accesses

The bug is guaranteed to manifest, as long as the condition is satisfied

How many threads are involved? How many variables are involved? How many accesses are involved?

Manifestation

Bug manifestation study

Page 15: Real World Concurrency Bug Characteristics

Single variables are more common The widely-used simplification is reasonable

Multi-variable concurrency bugs are non-negligible Techniques to detect multi-variable concurrency bugs are needed

1 Variable > 1 Variables0

20

40

60OpenOffice Mozilla Apache MySQL

49 (66%)

25 (34%)

Implications

#bugs

Bug manifestation: how many variables?

Page 16: Real World Concurrency Bug Characteristics

Bug patterns: multi-variable

Page 17: Real World Concurrency Bug Characteristics

1 acc. 2 acc. 3 acc. 4 acc. >4 acc.0

5

10

15

20

25

30

35

OpenOffice Mozilla Apache MySQL

1 acc. 2 acc. 3 acc. 4 acc. >4 acc.0

5

10

15

20

25

OpenOffice Mozilla Apache MySQL

Concurrent program testing can focus on small groups of accesses The testing target shrinks from exponential to polynomial

Non-deadlock bugs Deadlock bugs

7 (9%)1 (3%)

# of Bugs

Double lock

Bug manifestation: how many accesses?

Page 18: Real World Concurrency Bug Characteristics

101 out of 105 (96%) bugs involve at most two threads Most bugs can be reliably disclosed if we check all possible interleaving between each pair of threads

Few bugs cannot Example: Intensive resource competition among many threads causes unexpected delay

Bug manifestation: how many threads?

Page 19: Real World Concurrency Bug Characteristics

Bug patterns: underestimating #threads

Page 20: Real World Concurrency Bug Characteristics

Adding/changing locks 20 (27%) Condition check 19 (26%) Data-structure change 19 (26%) Code switch 10 (13%) Other 6 ( 8%)

No silver bullet for fixing concurrency bugs. Lock usage information is not enough to fix bugs.

ImplicationsImplications

20 (27%)

Bug fix strategies: non-deadlock bugs

Page 21: Real World Concurrency Bug Characteristics

Bug fixes: lock vs. condition variable

Page 22: Real World Concurrency Bug Characteristics

Bug fixes: two condition variables

Page 23: Real World Concurrency Bug Characteristics

Adding/changing locks 20 (27%) Condition check 19 (26%) Data-structure change 19 (26%) Code switch 10 (13%) Other 6 ( 8%)

No silver bullet for fixing concurrency bugs. Lock usage information is not enough to fix bugs.

ImplicationsImplications

20 (27%)

Bug fix strategies: non-deadlock bugs

Page 24: Real World Concurrency Bug Characteristics

Bug fixes: code switch

Page 25: Real World Concurrency Bug Characteristics

Give up resource acquisition 19 (61%) Change resource acquisition order 7 (23%) Split the resource to smaller ones 1 ( 3%) Others 4 (13%)

We need to pay attention to the correctness of ``fixed’’ deadlock bugs

Might introduce non-deadlock bugsMight introduce

non-deadlock bugs

Fix

Implications

Bug fix strategies: deadlock bugs

Page 26: Real World Concurrency Bug Characteristics

Impact of concurrency bugs ~ 70% leads to program crash or hang

Reproducing bugs are critical to diagnosis Many examples

Programmers lack diagnosis tools No automated diagnosis tools mentioned Most are diagnosed via code review Reproduce bugs are extremely hard and directly determines the diagnosing time

60% 1st-time patches still contain concurrency bugs (old or new) Usefulness and concerns of transactional memory

Summary

Page 27: Real World Concurrency Bug Characteristics

Seriousness of concurrency bugs