msr2012 - explaining software defects using topic models

Explaining Software Defects Using Topic Models Tse-Hsun (Peter) Chen, Stephen W. Thomas, Meiyappan Nagappan, Ahmed E. Hassan


Post on 23-Jun-2015


Page 1: MSR2012 - Explaining Software Defects Using Topic Models

Explaining Software Defects Using Topic Models

Tse-Hsun (Peter) Chen, Stephen W. Thomas, Meiyappan Nagappan, Ahmed E. Hassan

Page 5: MSR2012 - Explaining Software Defects Using Topic Models

int readFile(String filePath) {
    fp = readFile(filePath)
    if fp == NULL
        return -1
    else
        return fp
}

int manageMemory(int index) {
    if mem[index] is not NULL {
        freeInd = findFreeMemoryLoc()
        goto(freeInd)
    }
}

More Risky Concern

Can we use concerns to study defects?

Page 6: MSR2012 - Explaining Software Defects Using Topic Models

Capturing Concerns Using Topic Models

manage memory index mem free ind find free memory loc

read file file path fp file path fp

Topic Models (LDA)

Topic 1: read, file, path, fp, file, index, ind

Topic 2: manage, memory, mem, free, find, loc
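
The term lists on this slide look like the result of splitting source identifiers into words before running the topic model. A minimal Python sketch of such a preprocessing step, assuming camelCase and underscore splitting followed by lowercasing. The helper names `split_identifier` and `terms_of` are hypothetical, and the study's actual preprocessing may differ.

```python
import re

def split_identifier(name):
    """Split a camelCase or snake_case identifier into lowercase terms."""
    terms = []
    for part in name.split("_"):
        # Insert a space at each lowercase/digit-to-uppercase boundary.
        spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", part)
        terms.extend(t.lower() for t in spaced.split())
    return terms

def terms_of(identifiers):
    """Flatten a list of identifiers into one bag of terms."""
    return [t for ident in identifiers for t in split_identifier(ident)]

if __name__ == "__main__":
    # Identifiers taken from the two pseudocode functions on the earlier slide.
    print(terms_of(["readFile", "filePath", "fp"]))
    print(terms_of(["manageMemory", "index", "mem", "freeInd", "findFreeMemoryLoc"]))
```

The resulting bags of terms would then be fed to an LDA implementation, which is not shown here.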

Page 7: MSR2012 - Explaining Software Defects Using Topic Models

How defect prone are topics?

Can topics explain software defects?

Page 8: MSR2012 - Explaining Software Defects Using Topic Models

Case Studies

3 versions of each system

0.4 - 8.8 MLOC

2.8 - 17 K files

1,300 - 6,500 post-release defects

Page 10: MSR2012 - Explaining Software Defects Using Topic Models

If some topics are more defect-prone than others...

We can allocate MORE testing resources to these topics!

Page 12: MSR2012 - Explaining Software Defects Using Topic Models

Measuring Topic Defect-proneness

[Figure: files F1-F3 and topics T1-T4]
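
One way to turn file-topic memberships into a per-topic score is to weight each file's post-release defect density by its membership in the topic. The sketch below assumes that weighting, since the slides do not spell out the formula. The function name `topic_defect_density`, the matrix `theta`, and all numbers are illustrative, not data from the study.

```python
def topic_defect_density(theta, defects, loc):
    """Return one score per topic.

    theta[f][t] is the membership of file f in topic t,
    defects[f] is the post-release defect count of file f,
    loc[f] is the size of file f in lines of code.
    """
    n_topics = len(theta[0])
    scores = [0.0] * n_topics
    for f, memberships in enumerate(theta):
        density = defects[f] / loc[f]  # defects per line for this file
        for t, weight in enumerate(memberships):
            scores[t] += weight * density
    return scores

if __name__ == "__main__":
    # Hypothetical memberships of files F1..F3 in topics T1..T4.
    theta = [
        [0.5, 0.5, 0.0, 0.0],
        [0.0, 0.25, 0.75, 0.0],
        [0.0, 0.0, 0.5, 0.5],
    ]
    defects = [2, 8, 0]
    loc = [100, 200, 400]
    for t, score in enumerate(topic_defect_density(theta, defects, loc), start=1):
        print(f"T{t}: {score:.4f}")
```

Sorting topics by this score gives a ranking from most to least defect-prone under the assumed weighting.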

Page 22: MSR2012 - Explaining Software Defects Using Topic Models

What is the Relationship Between Defects and Topics?

[Figure: topics T3, T2, T1, T4]

Page 27: MSR2012 - Explaining Software Defects Using Topic Models

Few Topics are Defect-prone

Lower color, Jface, Comparison check

Task, Eclipse, Eclipse Mylyn, Task ui, Core, Repository

Page 28: MSR2012 - Explaining Software Defects Using Topic Models

How defect prone are topics?

Few Topics are Defect-prone!

Can topics explain software defects?

Page 35: MSR2012 - Explaining Software Defects Using Topic Models

Explaining Defects

Static: Lines of Code

Historical: Pre-release Defects, Code Churn

Topics: Topic Metrics

Page 36: MSR2012 - Explaining Software Defects Using Topic Models

Using Topics to Explain Defects

[Figure: files F1-F3 and topics T1-T4]
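
A simple file-level topic metric is the number of topics a file belongs to. The sketch below counts topics whose membership exceeds a cutoff. The cutoff value of 0.1, the function name `number_of_topics`, and the membership values are assumptions for illustration; the slides only note later that thresholds are a parameter choice.

```python
def number_of_topics(memberships, threshold=0.1):
    """Count the topics whose membership in one file exceeds the threshold."""
    return sum(1 for m in memberships if m > threshold)

if __name__ == "__main__":
    # Hypothetical memberships of files F1..F3 in topics T1..T4.
    theta = {
        "F1": [0.5, 0.5, 0.0, 0.0],
        "F2": [0.05, 0.25, 0.7, 0.0],
        "F3": [0.0, 0.0, 0.5, 0.5],
    }
    for name, row in theta.items():
        print(name, number_of_topics(row))
```

The per-file count can then be used as an explanatory variable alongside the static and historical metrics.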

Page 45: MSR2012 - Explaining Software Defects Using Topic Models

Explainability of Metrics

Static: Deviance Explained (D1) and AIC1

Static + Topics: D2 and AIC2

Improvement in Explainability = D2 – D1 and AIC2 – AIC1
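
The comparison between the two models can be worked through numerically. A minimal sketch, assuming the usual definitions: deviance explained is one minus the ratio of residual deviance to null deviance, and AIC is 2k minus twice the log-likelihood, where k is the number of fitted parameters. All deviance and log-likelihood values below are invented for illustration. Since lower AIC is better, a negative AIC2 – AIC1 indicates an improvement.

```python
def deviance_explained(null_deviance, residual_deviance):
    """Fraction of the null deviance that the model accounts for."""
    return 1.0 - residual_deviance / null_deviance

def aic(log_likelihood, n_params):
    """Akaike information criterion; lower is better."""
    return 2 * n_params - 2 * log_likelihood

if __name__ == "__main__":
    null_dev = 1000.0
    # Base model: static metrics only (hypothetical fit).
    d1 = deviance_explained(null_dev, 800.0)
    aic1 = aic(log_likelihood=-400.0, n_params=2)
    # Extended model: static plus topic metrics (hypothetical fit).
    d2 = deviance_explained(null_dev, 700.0)
    aic2 = aic(log_likelihood=-350.0, n_params=4)
    print("D2 - D1 =", d2 - d1)
    print("AIC2 - AIC1 =", aic2 - aic1)
```

The extra parameters of the extended model are penalized by the 2k term, so AIC only improves when the gain in fit outweighs the added variables.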

Page 46: MSR2012 - Explaining Software Defects Using Topic Models

More Topics, More Defects in File

[Figure: % Avg. Imp. in D2]

Page 47: MSR2012 - Explaining Software Defects Using Topic Models

Topic Membership Metrics: Few Topics are Defect-prone

[Figure: files F1-F3 and topics T1-T4]

Page 50: MSR2012 - Explaining Software Defects Using Topic Models

Dealing with Large # of Metrics

Topic membership metrics may have as many as 500 variables!

Solution: Use PCA to reduce the number of metrics
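
To illustrate the idea of collapsing many correlated metrics into fewer variables, here is a small standard-library sketch that extracts only the first principal component by power iteration on the covariance matrix. It is a simplification: a real analysis would keep several components and would normally use a numerical library. The function name `first_principal_component` and the sample data are hypothetical.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (direction, scores) for the first principal component.

    rows is a list of equal-length metric vectors, one per file.
    Power iteration can fail if the start vector happens to be
    orthogonal to the leading direction; this sketch ignores that case.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered metrics.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # no variance in the data
        v = [x / norm for x in w]
    # Project each centered row onto the component direction.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores

if __name__ == "__main__":
    # Three files, three metrics; the second metric is twice the first.
    rows = [[1.0, 2.0, 0.0], [2.0, 4.0, 0.0], [3.0, 6.0, 0.0]]
    direction, scores = first_principal_component(rows)
    print(direction)
    print(scores)
```

In this toy data the first two metrics are perfectly correlated, so a single component carries all of the variance and three columns reduce to one.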

Page 51: MSR2012 - Explaining Software Defects Using Topic Models

Topic Membership Metrics Explain Defects Even More

[Figure: % Avg. Imp. in AIC]

Page 52: MSR2012 - Explaining Software Defects Using Topic Models

How defect prone are topics?

Few Topics are Defect-prone!

Can topics explain software defects?

YES!

Page 57: MSR2012 - Explaining Software Defects Using Topic Models

Limitations

1. Parameter Choices

• Number of topics

• Thresholds

2. Used Baseline Metrics (Static, Historical)

3. Studied Three Subject Systems

Page 58: MSR2012 - Explaining Software Defects Using Topic Models

Summary
