muvi: automatically inferring multi-variable access correlations and detecting related semantic and...

30
MUVI: Automatically Inferring Multi-Variable Access Correlations and Detecting Related Semantic and Concurrency Bugs Shan Lu ([email protected] ) Shan Lu, Soyeon Park, Chongfeng Hu, Xiao Ma, Weihang Jiang, Zhenmin Li, Raluca A. Popa, and Yuanyuan Zhou University of Illinois http://opera.cs.uiuc.edu

Upload: leanna-dunning

Post on 01-Apr-2015

216 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: MUVI: Automatically Inferring Multi-Variable Access Correlations and Detecting Related Semantic and Concurrency Bugs Shan Lu (shanlu@cs.uiuc.edu)shanlu@cs.uiuc.edu

MUVI: Automatically InferringMulti-Variable Access Correlations andDetecting Related Semantic and Concurrency Bugs

Shan Lu ([email protected])

Shan Lu, Soyeon Park, Chongfeng Hu, Xiao Ma, Weihang Jiang, Zhenmin Li, Raluca A. Popa, and Yuanyuan Zhou

University of Illinoishttp://opera.cs.uiuc.edu

Page 2: MUVI: Automatically Inferring Multi-Variable Access Correlations and Detecting Related Semantic and Concurrency Bugs Shan Lu (shanlu@cs.uiuc.edu)shanlu@cs.uiuc.edu

Bugs are bad!

Software bugs are costly! Account for 40% of system failures [Marcus2000] Cost US economy $59.5 billion annually [NIST]

Techniques to improve program correctness are desired

Page 3: MUVI: Automatically Inferring Multi-Variable Access Correlations and Detecting Related Semantic and Concurrency Bugs Shan Lu (shanlu@cs.uiuc.edu)shanlu@cs.uiuc.edu

Software bug categories

Memory bugs Improper memory accesses and usage A lot of study and effective detection tools

Semantic bugs Violation to the design requirements or programmer intentions Biggest part (~80%*) of software bugs No silver bullet

Concurrency bugs Wrong synchronization in concurrent execution Increasingly important with the pervading concurrent program

trend Hard to detect

* Have Things Changed Now? -- An Empirical Study of Bug Characteristics in Modern Open Source Software [ACID’06]

Page 4: MUVI: Automatically Inferring Multi-Variable Access Correlations and Detecting Related Semantic and Concurrency Bugs Shan Lu (shanlu@cs.uiuc.edu)shanlu@cs.uiuc.edu

An important type of semantic information

Software programs contain many variables

Variables are NOT isolated Semantic bond exists among variables

Correct programs consistently access correlated variables

xy

z

s

t

u

v w

Variable Access Correlation

Page 5: MUVI: Automatically Inferring Multi-Variable Access Correlations and Detecting Related Semantic and Concurrency Bugs Shan Lu (shanlu@cs.uiuc.edu)shanlu@cs.uiuc.edu

Variable correlation in programs

Semantic correlation widely exists among variables

struct fb_var_screeninfo

{ …

int red_msb;

int blue_msb;

int green_msb;

int transp_msb;

}Linux

Different aspects

struct net_device_stats

{

long rv_packets

long rv_bytes;

}Linux

Different representatio

n

struct st_test_file *

cur_file;

struct st_test_file *

file_stack;

MySQL

Implementation-demand

Class THD

{

char* db;

int db_length;

} MySQL

Constraint specification

M Y BD

4

Page 6: MUVI: Automatically Inferring Multi-Variable Access Correlations and Detecting Related Semantic and Concurrency Bugs Shan Lu (shanlu@cs.uiuc.edu)shanlu@cs.uiuc.edu

write ( ) write ( )

Variable access correlation ( constraint )

Maintaining correlation usually needs consistent access

db db_length

red/…/transp red/…/transp

A1 ( x ) A2 ( y )access

readwrite

accessreadwrite

rv_packets rv_bytes

file_stack cur_file

write ( ) access* ( )

write ( ) write ( )

access ( ) access ( )

Variable access correlation

*access: read or write

Page 7: MUVI: Automatically Inferring Multi-Variable Access Correlations and Detecting Related Semantic and Concurrency Bugs Shan Lu (shanlu@cs.uiuc.edu)shanlu@cs.uiuc.edu

Violating the correlations leads to bugs

Programmers may forget to access correlated variables

A type of semantic bugs not handled by previous tools

Correlated variables

struct fb_var_screeninfo{ … int red_msb; int blue_msb; int green_msb; int transp_msb;}

Mostly consistent access

--- correct

Inconsistent access

--- BUG!

int imsttfb_check_var ( … ){

...var->red_msb = 0;var->green_msb = 0;var->blue_msb = 0;var->transp_msb = 0;…

}

int neofb_check_var (...){

... var->red_msb=0; var->green_msb=0; var->blue_msb=0; /* forget transp_msb!!*/

...}

Confirmed by Linux developers

Inconsistent update bugs

More examples of inconsistent update bugs

are in our paper.

Page 8: MUVI: Automatically Inferring Multi-Variable Access Correlations and Detecting Related Semantic and Concurrency Bugs Shan Lu (shanlu@cs.uiuc.edu)shanlu@cs.uiuc.edu

Programmers may forget to synchronize concurrent accesses to correlated variables

This is NOT a traditional data race bug Bug occurs even if accesses to each single variable are well sy

nchronized

js_FlushPropertyCache ( … ) {

memset ( cachetable, 0, SIZE);

cacheempty = TRUE;

}

js_PropertyCacheFill ( … ) {

cachetable[indx] = obj;

cacheempty = FALSE;

}

Violating the correlations leads to bugs (ii)

Multi-variable concurrency bugs

struct JSCache {

JSEntry table[SIZE];

bool empty;

}

Thread 1 Thread 2

lock ( T )

unlock ( T )

unlock ( T )lock ( E )

unlock ( E )

Mozilla

lock ( T )

unlock ( E )

lock ( E )BUG

Page 9: MUVI: Automatically Inferring Multi-Variable Access Correlations and Detecting Related Semantic and Concurrency Bugs Shan Lu (shanlu@cs.uiuc.edu)shanlu@cs.uiuc.edu

Our contribution

A technique to automatically infer variable access correlation

Bug detection based on variable access correlation Inconsistent-update semantic bugs Multi-variable concurrency bugs

Disclose correlations and new bugs from real-world applications (Linux-device_driver, Mozilla, MySQL, Httpd)

> 6000 variable correlations 39 new inconsistent-update semantic bugs 4 new multi-variable concurrency bugs from Mozilla

Page 10: MUVI: Automatically Inferring Multi-Variable Access Correlations and Detecting Related Semantic and Concurrency Bugs Shan Lu (shanlu@cs.uiuc.edu)shanlu@cs.uiuc.edu

Outline

Motivation What is variable access correlation

MUVI variable access correlation inference MUVI bug detection

Inconsistent-update semantic bug detection Multi-variable concurrency bug detection

Evaluation Conclusions

Page 11: MUVI: Automatically Inferring Multi-Variable Access Correlations and Detecting Related Semantic and Concurrency Bugs Shan Lu (shanlu@cs.uiuc.edu)shanlu@cs.uiuc.edu

Access correlatio

n

Basic idea of correlation inference Our target:

Our inference method:

Assumption: mature program, mostly correct x and y appear together in many times x and y seldom appear separately

Statistically infer access correlation based on variable access pattern in source code

access correlationA1 ( x ) A2 ( y )

How to judge ``together’’?

Our metric: static code distance within a function scope Our paper talks about other potential metricsHow to do this efficiently?

Page 12: MUVI: Automatically Inferring Multi-Variable Access Correlations and Detecting Related Semantic and Concurrency Bugs Shan Lu (shanlu@cs.uiuc.edu)shanlu@cs.uiuc.edu

Frequent itemset mining

A common data mining technique

Itemset: a set of items ( no order ) E.g. (v, w, x, y, z)

Sub-itemset: E.g. (w, y)

Itemset database Goal: find frequent sub-itemsets

in an itemset database Support: number of appearances

E.g. support of (w, y) is 3 Frequent: support > threshold

(v, x, m, n)

(v, w, y, t )

(v, w, y, z, s )

( v, w, x, y, z )

Page 13: MUVI: Automatically Inferring Multi-Variable Access Correlations and Detecting Related Semantic and Concurrency Bugs Shan Lu (shanlu@cs.uiuc.edu)shanlu@cs.uiuc.edu

Flowchart of variable correlation inference

Source files

Mining

Frequent variable sets

Itemset Database

Pre-processing

Variable access correlation

Post-processing

How?

How?

Page 14: MUVI: Automatically Inferring Multi-Variable Access Correlations and Detecting Related Semantic and Concurrency Bugs Shan Lu (shanlu@cs.uiuc.edu)shanlu@cs.uiuc.edu

MUVI Inference algorithm (pre-process)

ProgramSource Code Itemset

Database

?

What is an item? A variable

What is an itemset? A function

What to put into an itemset? Accessed variables Access type

(read/write)

Page 15: MUVI: Automatically Inferring Multi-Variable Access Correlations and Detecting Related Semantic and Concurrency Bugs Shan Lu (shanlu@cs.uiuc.edu)shanlu@cs.uiuc.edu

MUVI Inference algorithm (pre-process)

Input: program Output: an itemset database Flow-insensitive, inter-procedural analysis

Consider Global variables and structure-typed variables Also consider variables accessed in callee functions

………

{read, z}f3

{write, S::y}f2

{read, x}f1

Databaseint x;f1 ( ) {

read x;

}

f2 ( ) {

S t; write t.y;}

int z;f3 ( ) { read z; f1 ( ); f2 ( );}

{read, x}

{write, S::y}

f1 f2

f3

Page 16: MUVI: Automatically Inferring Multi-Variable Access Correlations and Detecting Related Semantic and Concurrency Bugs Shan Lu (shanlu@cs.uiuc.edu)shanlu@cs.uiuc.edu

MUVI Inference algorithm (post-process)

Input: frequent variable sets (x, y), which appear together in many functions

Pruning What if x and y appear separately many times?

Prune out low confidence (conditional probability) pairs What if x is too popular, e.g. stderr, stdout?

Categorize based on access type write (x) write (y)? Or write (x) read (y)? etc.

Output: variable correlation A1 ( x ) A2 ( y )

Page 17: MUVI: Automatically Inferring Multi-Variable Access Correlations and Detecting Related Semantic and Concurrency Bugs Shan Lu (shanlu@cs.uiuc.edu)shanlu@cs.uiuc.edu

Outline

Motivation MUVI variable access correlation inference MUVI bug detection

Inconsistent-update semantic bug detection Multi-variable concurrency bug detection

Evaluation Conclusions

Page 18: MUVI: Automatically Inferring Multi-Variable Access Correlations and Detecting Related Semantic and Concurrency Bugs Shan Lu (shanlu@cs.uiuc.edu)shanlu@cs.uiuc.edu

Inconsistent-update bug detection

Step 1: get all write(x)acc(y) correlations Step 2: get all violations to above

correlations Step 3: prune out unlikely bugs

Code analysis to check caller and callee functions

write (fb_var_screeninfo::blue_msb) access (fb_var_screeninfo::transp_msb)

#support = 11 #violation = 1 (function neofb_check_var)

inconsistent-update bug

int neofb_check_var (...){ ... var->red_msb=0; var->green_msb=0; var->blue_msb=0; /* forget transp_msb!!*/ ...}

Page 19: MUVI: Automatically Inferring Multi-Variable Access Correlations and Detecting Related Semantic and Concurrency Bugs Shan Lu (shanlu@cs.uiuc.edu)shanlu@cs.uiuc.edu

Thread 1 Thread 2

cacheà table[indx] = obj;

cacheà empty = FALSE;

memset (cacheà table,0,SIZE) ;

cacheà empty = TRUE;A2 ( x )

Thread 1 Thread 2

A1 ( x )

Lock-Set

L (A2) ∩ L (A1) = Ф ?

Multi-variable concurrency bug detection-- MUVI Lock-set algorithm

Original algorithm Look for common locks among conflicting accesses to each

shared variable MV Lock-Set algorithm

Look for common locks among conflicting accesses to each shared variable and their correlated accesses

L (A2) ∩ L (A1) ∩ L (A3) = Ф ?

Lock-Set MVA3 ( y )

Lock ( T )

Unlock ( T )

Lock ( E )

Unlock ( E )

Lock ( T )

Unlock ( T )

Lock ( E )

Unlock ( E )

Page 20: MUVI: Automatically Inferring Multi-Variable Access Correlations and Detecting Related Semantic and Concurrency Bugs Shan Lu (shanlu@cs.uiuc.edu)shanlu@cs.uiuc.edu

Multi-variable concurrency bug detection-- Other MUVI extension algorithm

MUVI happens-before algorithm Check the happens-before relation among

conflicting accesses to each single variable Check the happens-before relation among

conflicting accesses to each single variable and correlated accesses

Other extension Extending hybrid race detection Extending atomicity violation bug detection

Page 21: MUVI: Automatically Inferring Multi-Variable Access Correlations and Detecting Related Semantic and Concurrency Bugs Shan Lu (shanlu@cs.uiuc.edu)shanlu@cs.uiuc.edu

Outline

Motivation MUVI variable access correlation inference MUVI bug detection

Inconsistent-update semantic bug detection Multi-variable concurrency bug detection

Evaluation Conclusions

Page 22: MUVI: Automatically Inferring Multi-Variable Access Correlations and Detecting Related Semantic and Concurrency Bugs Shan Lu (shanlu@cs.uiuc.edu)shanlu@cs.uiuc.edu

Methodology

For variable correlation and inconsistent-update bug detection:

Linux (device driver) Mozilla MySQL PostgreSQL

For multi-variable concurrency bug detection:

Five existing real bugs from Mozilla and MySQL

All latest versions

Find four new multi-variable concurrency bugs during the detection process

Page 23: MUVI: Automatically Inferring Multi-Variable Access Correlations and Detecting Related Semantic and Concurrency Bugs Shan Lu (shanlu@cs.uiuc.edu)shanlu@cs.uiuc.edu

Results on correlation inference

App. #Access-Correlati

on

#Involved

Variables

%False Positive

s

Analysis Time

Mozilla 1431 1380 16% 157m

MySQL 726 703 13% 19m

Linux 3353 3038 19% 175mPostgre-SQL 939 833 15% 98m

Macro, inline functionscoincidence

Page 24: MUVI: Automatically Inferring Multi-Variable Access Correlations and Detecting Related Semantic and Concurrency Bugs Shan Lu (shanlu@cs.uiuc.edu)shanlu@cs.uiuc.edu

Inconsistent-update bug detection results

App. # of MUVI

bug report

# of new bugs found

# of bad programmi

ng

# of false

positives

Linux 40 22 (12) 5 13

Mozilla 30 7 (0) 8 15

MySQL 20 9 (5) 3 8Postgre-SQL 10 1 (0) 4 5Semantic

exceptionsWrong correlationsNo future read access

Page 25: MUVI: Automatically Inferring Multi-Variable Access Correlations and Detecting Related Semantic and Concurrency Bugs Shan Lu (shanlu@cs.uiuc.edu)shanlu@cs.uiuc.edu

MV-Happens-Before has similar results

Multi-variable concurrency bug detection results

Bug MV-Lockset

Detect Bug? False Positive

Moz-js1 Y 1

Moz-js2 Y 2

Moz-imap Y 0

MySQL-log Y 3

MySQL-blog N 0

Variables are conditionally correlatedThe correlation is missed by MUVI

Page 26: MUVI: Automatically Inferring Multi-Variable Access Correlations and Detecting Related Semantic and Concurrency Bugs Shan Lu (shanlu@cs.uiuc.edu)shanlu@cs.uiuc.edu

Multi-variable concurrency bug detection results

4 new multi-variable concurrency bugs detected!

Thread 1 Thread 2js_NewString( … ){ // allocate a new string JS_ATOMIC_INCREMENT (&(rt->totalStrings));

PR_Lock(rtLock); rt->lengthSum += length; PR_Unlock(rtLock);} Mozilla jsstr.h Mozilla jsstr.c

printJSStringStats ( ... ) {

count = rt à totalStrings; mean = rt à lengthSum / count; printf ( …… );

}

struct JSRuntime { int totalStrings; /* # of allocated strings*/ double lengthSum; /* Total length of allocated strings */ }

Mozilla jscntxt.h

Wrong result!

Page 27: MUVI: Automatically Inferring Multi-Variable Access Correlations and Detecting Related Semantic and Concurrency Bugs Shan Lu (shanlu@cs.uiuc.edu)shanlu@cs.uiuc.edu

Conclusion

Variable access correlations can be inferred

Variable access correlation is important Help detect two types of bugs Other usage

Provide specifications to ease programming Provide hints for assigning locks or TMs

E.g. AtomicSet, AutoLocker, Colorama

Page 28: MUVI: Automatically Inferring Multi-Variable Access Correlations and Detecting Related Semantic and Concurrency Bugs Shan Lu (shanlu@cs.uiuc.edu)shanlu@cs.uiuc.edu

Related works

Program specification inference [ErnstICSE00], [EnglerSOSP01], [KremenekOSDI06],

[LiblitPLDI03], [WhaleyISSTA02], [YangICSE06], etc. Code pattern mining

[LiOSDI04], [LiFSE05], [LivshitsFSE05], etc. Concurrency bug detection

[ChoiPLDI02], [EnglerSOSP03], [FlanaganPOPL04], [SavageTOCS97], [Praun01], [XuPLDI05], [YuSOSP05], etc.

Techniques for easing concurrent programming [Harris03], [HerlihyISCA93], [McCloskeyPOPL06],

[Rajwar02], [Hammond04], [Moore6], [Rossbach07], etc.

Page 29: MUVI: Automatically Inferring Multi-Variable Access Correlations and Detecting Related Semantic and Concurrency Bugs Shan Lu (shanlu@cs.uiuc.edu)shanlu@cs.uiuc.edu

Acknowledgement

Prof. Stefan Savage (shepherd) Anonymous reviewers Prof. Liviu Iftode GOOGLE student travel grant NSF, DOE, Intel research grants

Page 30: MUVI: Automatically Inferring Multi-Variable Access Correlations and Detecting Related Semantic and Concurrency Bugs Shan Lu (shanlu@cs.uiuc.edu)shanlu@cs.uiuc.edu

Thanks!

http://opera.cs.uiuc.edu