Analyzing Networks of Issue Reports

Markus Borg, Dietmar Pfahl, and Per Runeson

Upload: markus-borg

Post on 15-Apr-2017

Page 1: Analyzing networks of issue reports

Analyzing Networks of Issue Reports

Markus Borg, Dietmar Pfahl, Per Runeson

Page 2: Analyzing networks of issue reports

Markus Borg
• Third-year PhD student
• MSc in CS and engineering
• Software developer (2007-2010)
• Empirical research group

Dietmar Pfahl, University of Tartu, Estonia

Per Runeson, Lund University, Sweden

Page 3: Analyzing networks of issue reports

Agenda

• Background and context
  – Information management
  – Safety-critical development
  – Impact analysis
• Goal and method of this study
• Results
• Future work

Page 4: Analyzing networks of issue reports

Background and Context

Page 5: Analyzing networks of issue reports

Information management

• Large projects, much information

Page 6: Analyzing networks of issue reports

Challenges

• A state of information overload
  – Engineers cannot process all information
  – Causes stress
  – Obstructs decision making
• Poor findability
  – More effort to navigate the information landscape

Page 7: Analyzing networks of issue reports

Intensified in safety development

• Safety standards mandate documentation

(Figure: application domains: railroad, nuclear, process industry, machinery, automotive)

Page 8: Analyzing networks of issue reports

Mandated documents in ISO 26262

Page 9: Analyzing networks of issue reports

Work task: Impact Analysis (IA)

• Required by IEC 61508 before changes to production code
• Studied an industrial case
  – Documented
  – Reviewed during safety audits

(Figure labels: Requirements, Tests)

Page 10: Analyzing networks of issue reports

Work task: Impact Analysis (2)

• Formal template
• Impact on code and non-code specified as traceability links
• Manual work

(Figure label: IMS)

Page 11: Analyzing networks of issue reports

Supporting the impact analysis?

(Figure labels: Work task, ?, Reqs. DB, Code Repo, Test DB)

Page 12: Analyzing networks of issue reports

Reuse knowledge from previous IAs

Page 13: Analyzing networks of issue reports

Information in networks, so what?

• Search in hyperlinked structures is well researched
  – Also applied in software engineering (Karabatis et al., 2009)

PageRank: Page et al. (1999)

HITS algorithm: Kleinberg (1999)

Social network analysis

Page 14: Analyzing networks of issue reports

Networks of issue reports

• What type of networks can we find in issue databases?

Page 15: Analyzing networks of issue reports

Method

Page 16: Analyzing networks of issue reports

Issue databases under study

• Safety IMS (2000-2012)
  – Industrial control system
  – Mandated by strict processes
  – Issues submitted by engineers
• Android IMS (2007-2012)
  – OS for handheld devices
  – Open source software
  – Issues submitted to public database

Page 17: Analyzing networks of issue reports

Link mining in the issue databases

• Safety IMS
  – "Related cases" field in database
• Android IMS
  – No separate field for linking issues
  – Communication using comments (100,000+)
  – Developers refer to other issues, stored as HTML hyperlinks
• Extracted using regular expressions
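
The regular-expression extraction mentioned above could look roughly like the following sketch. The link shape (an anchor whose href contains "detail?id=<number>"), the function name `extract_links`, and the sample comment are assumptions made for illustration; the slides do not show the actual patterns used in the study.

```python
import re

# Assumed link shape: an anchor whose href contains "detail?id=<number>".
# The real tracker markup may differ; adjust the pattern to the data at hand.
ISSUE_LINK = re.compile(r'<a\s[^>]*href="[^"]*detail\?id=(\d+)[^"]*"', re.IGNORECASE)


def extract_links(issue_id, comment_html):
    """Return sorted (source, target) pairs for issue hyperlinks in one comment."""
    targets = {int(m) for m in ISSUE_LINK.findall(comment_html)}
    targets.discard(issue_id)  # ignore self-references
    return sorted((issue_id, t) for t in targets)


# Hypothetical comment text on issue 42.
comment = (
    'Looks like a duplicate of <a href="/p/android/issues/detail?id=1234">issue 1234</a>, '
    'see also <a href="detail?id=99">issue 99</a>.'
)
links = extract_links(42, comment)
```

Collecting the pairs from all comments gives the edge list of the issue network. Deduplicating targets per comment is a design choice here, so that a link pasted twice counts once.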

Page 18: Analyzing networks of issue reports

Results

Page 19: Analyzing networks of issue reports

Extracted network - Overview

• Safety IMS
  – 26,120 issue reports
  – 18,000 links
  – 15,000 components
  – 13,000 isolated issue reports

Page 20: Analyzing networks of issue reports

Extracted network – Close-up

Page 21: Analyzing networks of issue reports

Extracted network - Overview

• Android IMS
  – 20,176 issue reports
  – 3,500 links
  – 18,000 components
  – 17,000 isolated issue reports
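
Overview figures like these (components, isolated reports) can be derived from an edge list with a union-find pass. This is a minimal sketch on toy data, not the tooling used in the study; it assumes links are treated as undirected and that an isolated report counts as a single-node component.

```python
from collections import Counter


def component_stats(nodes, links):
    """Count connected components and isolated nodes, treating links as undirected."""
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b  # merge the two components

    sizes = Counter(find(n) for n in nodes)
    isolated = sum(1 for size in sizes.values() if size == 1)
    return {"components": len(sizes), "isolated": isolated}


# Toy data: issues 1-3 form a chain, 4-5 a pair, 6 and 7 have no links.
stats = component_stats(range(1, 8), [(1, 2), (2, 3), (4, 5)])
```

Every link endpoint must appear in `nodes`; links that point at unknown issue numbers (for example, links with the wrong target) would need to be filtered out first.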

Page 22: Analyzing networks of issue reports

Example of sub-network

Bug star: one central issue report pointing at several others.

Caused by duplicates.

Page 23: Analyzing networks of issue reports

Example of sub-network

Dense ring: most issue reports are connected.

Caused by copy-paste comments.

Page 24: Analyzing networks of issue reports

Extracted networks

What do developers signal by creating HTML hyperlinks?

Page 25: Analyzing networks of issue reports

Link semantics

• Indicate relationships with different certainty
  – Related issue report (possibly, probably, definitely)
  – Duplicate issue report (possibly, probably, definitely)
  – Cloned issue report
• Misc. links
  – Raising awareness of issue reports
  – Release planning
  – Links with the wrong target
• Links appear to carry meaning

Page 26: Analyzing networks of issue reports

More recent results

Page 27: Analyzing networks of issue reports

Contents of IA reports in the Safety IMS

(Figure labels: Code, HW description, Misc. documents, Test case, User manual)

Page 28: Analyzing networks of issue reports

Mining IA reports in the Safety IMS

• ~5,000 impact analysis reports

Node types
• Issue reports
• Requirements
• Test specifications
• Hardware descriptions

Link types
• Related issue
• Specified by
• Verified by
• Needs update
• Impacted HW
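
One way to hold such a typed network in memory is sketched below. The `Link` class, the artifact identifiers, and the link directions are hypothetical and chosen only to illustrate the node and link types listed above.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    source: str
    target: str
    kind: str  # e.g. "related issue", "specified by", "verified by"


# Hypothetical artifact identifiers, for illustration only.
node_types = {
    "ISSUE-1": "issue report",
    "ISSUE-2": "issue report",
    "REQ-7": "requirement",
    "TEST-3": "test specification",
}
links = [
    Link("ISSUE-1", "ISSUE-2", "related issue"),
    Link("ISSUE-1", "REQ-7", "specified by"),
    Link("ISSUE-1", "TEST-3", "verified by"),
    Link("ISSUE-2", "REQ-7", "needs update"),
]

# Tallies of the kind reported for an extracted network.
links_per_kind = Counter(link.kind for link in links)
nodes_per_type = Counter(node_types.values())
```

Keeping the link type on each edge lets later searches follow or ignore specific kinds of relations.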

Page 29: Analyzing networks of issue reports

Extracted semantic network

• 27,958 nodes
  – ~26,000 issue reports
  – ~3,000 other artifacts
• 28,230 links
  – ~18,000 related issue
  – ~4,000 specified by
  – ~2,300 verified by

Page 30: Analyzing networks of issue reports

Extracted semantic network – Circle layout

Page 31: Analyzing networks of issue reports

Future work

Page 32: Analyzing networks of issue reports

How can the networks be exploited?

Page 33: Analyzing networks of issue reports

Neighbourhood search

Application 1: Search for connected artifacts
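
A neighbourhood search of this kind can be sketched as a breadth-first traversal with a hop limit. The function name, the `max_hops` parameter, and the toy network are assumptions for illustration; the slides do not specify how the search would be bounded.

```python
from collections import deque


def neighbourhood(adjacency, start, max_hops):
    """Return {artifact: hop distance} for everything within max_hops of start."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in adjacency.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    del distance[start]  # the starting artifact is not its own neighbour
    return distance


# Hypothetical undirected network stored as adjacency sets.
adjacency = {
    "ISSUE-1": {"ISSUE-2", "REQ-7"},
    "ISSUE-2": {"ISSUE-1", "TEST-3"},
    "REQ-7": {"ISSUE-1"},
    "TEST-3": {"ISSUE-2", "ISSUE-9"},
    "ISSUE-9": {"TEST-3"},
}
nearby = neighbourhood(adjacency, "ISSUE-1", max_hops=2)
```

Returning the hop distance alongside each artifact makes it possible to prefer close artifacts over distant ones when results are presented.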

Page 34: Analyzing networks of issue reports

Centrality measures

Application 2: Identification of key artifacts (ranking)
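
As one example of a centrality measure, the sketch below ranks artifacts with a plain power-iteration PageRank (Page et al., 1999). The choice of PageRank, the damping factor, the iteration count, and the toy data are assumptions; the slides do not say which measure would be used.

```python
def pagerank(out_links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict mapping node -> list of link targets."""
    nodes = set(out_links) | {t for targets in out_links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if not out_links.get(node))
        new_rank = {node: (1.0 - damping) / n + damping * dangling / n for node in nodes}
        for node, targets in out_links.items():
            for target in targets:
                new_rank[target] += damping * rank[node] / len(targets)
        rank = new_rank
    return rank


# Hypothetical network: three reports all link to the same report "HUB".
out_links = {"A": ["HUB"], "B": ["HUB"], "C": ["HUB"], "HUB": []}
rank = pagerank(out_links)
ranking = sorted(rank, key=rank.get, reverse=True)
```

Simpler measures such as in-degree would give the same top artifact on this toy data; the measures differ once chains of links are involved.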

Page 35: Analyzing networks of issue reports

Goal: Impact Recommender

1. Identify similar issues
2. Identify neighbours
3. Rank candidates

(Figure labels: Far away, Textual sim., High cent.)
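
The three steps could be combined along the lines of the following sketch. Everything specific here is an assumption: Jaccard token overlap stands in for whatever textual similarity is used, node degree stands in for centrality, the score is simply their product, and the issue texts and artifact names are invented.

```python
def jaccard(a, b):
    """Token-set overlap, used here as a stand-in for any textual similarity."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def recommend(new_text, issue_texts, adjacency, top_similar=2):
    # Step 1: find the previous issues most similar to the new one.
    by_similarity = sorted(
        issue_texts, key=lambda i: jaccard(new_text, issue_texts[i]), reverse=True
    )
    scores = {}
    for issue in by_similarity[:top_similar]:
        sim = jaccard(new_text, issue_texts[issue])
        # Step 2: collect the artifacts linked to those issues.
        for artifact in adjacency.get(issue, ()):
            if artifact in issue_texts:
                continue  # recommend non-issue artifacts only (an assumption)
            # Step 3: score by similarity of the starting issue, weighted by degree.
            degree = len(adjacency.get(artifact, ()))
            scores[artifact] = scores.get(artifact, 0.0) + sim * degree
    return sorted(scores, key=lambda a: (-scores[a], a))


# Hypothetical issue texts and links.
issue_texts = {
    "ISSUE-1": "pump controller crashes on restart",
    "ISSUE-2": "controller crashes after firmware update",
    "ISSUE-3": "typo in user manual",
}
adjacency = {
    "ISSUE-1": ["REQ-7", "TEST-3"],
    "ISSUE-2": ["REQ-7"],
    "ISSUE-3": ["DOC-1"],
    "REQ-7": ["ISSUE-1", "ISSUE-2"],
    "TEST-3": ["ISSUE-1"],
    "DOC-1": ["ISSUE-3"],
}
candidates = recommend("controller crashes on restart", issue_texts, adjacency)
```

In this toy run, artifacts reachable from several similar issues accumulate score, so a requirement touched by two similar issues ranks above a test linked to only one.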

Page 36: Analyzing networks of issue reports

Summary

• Link mining in IMSs can discover complex issue networks
  – The process-heavy IMS contains more links
  – Links among issue reports, created in comments by Android developers, typically signal relations
• Networks of issue reports can be extended by other artifacts
• Networked information enables better navigation
  – Broaden search (following links)
  – Sharpen search (better ranking)

Page 37: Analyzing networks of issue reports

Thanks!

?