mining software data to help developers: challenges … · {janvik, lyndonh, murphy}@cs.ubc.ca...
TRANSCRIPT
![Page 1: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/1.jpg)
MINING SOFTWARE DATA TO HELP DEVELOPERS:
CHALLENGES AND PERSPECTIVES
Massimiliano Di PentaUniversity of Sannio, Italy
[email protected], @maxdipenta
![Page 2: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/2.jpg)
FAQ when people met me for the first time at a
conference
UNIVERSITY OF... WHAT?
![Page 3: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/3.jpg)
UNIVERSITY OF... WHAT?
![Page 4: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/4.jpg)
UNIVERSITY OF... WHAT?
![Page 5: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/5.jpg)
UNIVERSITY OF... WHAT?
![Page 6: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/6.jpg)
UNIVERSITY OF... WHAT?
![Page 7: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/7.jpg)
UNIVERSITY OF... WHAT?
![Page 8: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/8.jpg)
![Page 9: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/9.jpg)
![Page 10: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/10.jpg)
![Page 11: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/11.jpg)
MAIN RESEARCH INTERESTS
Software analytics
Empirical software engineering
Software testing
Software evolution
Continuous Integration
![Page 12: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/12.jpg)
TALK OUTLINEIntroduction to mining software repositories
Recommender systems
Challenges
Key ingredients, summary, takeaways
![Page 13: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/13.jpg)
INTRODUCTION TO MINING SOFTWARE
REPOSITORIES
![Page 14: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/14.jpg)
data extraction
Emails
MINING SOFTWARE REPOSITORIES
Versioning
Issue Reports
Software History
change propagation
evolution visualization
change patterns
software complexity
fault prediction
effort estimation
Knowledge inference Classification
Association rules Clustering
![Page 15: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/15.jpg)
HISTORICAL ANALYSIS
Static analysis
Dynamic analysis
![Page 16: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/16.jpg)
HISTORICAL ANALYSIS
Static analysis
Dynamic analysis
Product analysis
![Page 17: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/17.jpg)
HISTORICAL ANALYSIS
Static analysis
Dynamic analysis
Historical analysis
Product analysis
Analyze trails left by developers during their maintenance activities
![Page 18: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/18.jpg)
HISTORICAL ANALYSIS
Static analysis
Dynamic analysis
Historical analysis
Product analysis
Analyze trails left by developers during their maintenance activities
Process analysis
(learning from history)
![Page 19: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/19.jpg)
HISTORICAL ANALYSISStatic and dynamic analysis do not capture information such as:
How does an artifact change during the time?
When was it changed?
Who changed it?
Why was it changed?
What artifacts changed together?
![Page 20: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/20.jpg)
OK, SO WHAT?
![Page 21: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/21.jpg)
RECOMMENDER SYSTEMS FOR SOFTWARE ENGINEERS
![Page 22: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/22.jpg)
MORE SERIOUSLY…
“A software application that provide informationitems estimated to be valuable for a software engineering task in a given context”
Martin P. Robillard, Robert J. Walker, Thomas Zimmermann:Recommendation Systems for Software Engineering. IEEE Software 27(4): 80-86 (2010)
![Page 23: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/23.jpg)
EXAMPLES
![Page 24: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/24.jpg)
CHANGE IMPACT ANALYSIS
Thomas Zimmermann, Peter Weißgerber, Stephan Diehl, Andreas Zeller : Mining Version Histories to Guide Software Changes. ICSE 2004: 563-572
Mining Version Histories to Guide Software Changes
Thomas [email protected]
Peter Weiß[email protected]
Stephan [email protected]
Andreas [email protected]
Saarland University, Saarbrucken, Germany
Abstract
We apply data mining to version histories in order toguide programmers along related changes: “Programmerswho changed these functions also changed. . . ”. Given aset of existing changes, such rules (a) suggest and predictlikely further changes, (b) show up item coupling that is in-detectable by program analysis, and (c) prevent errors dueto incomplete changes. After an initial change, our ROSEprototype can correctly predict 26% of further files to bechanged—and 15% of the precise functions or variables.The topmost three suggestions contain a correct locationwith a likelihood of 64%.
1. Introduction
Shopping for a book at Amazon.com, you may have comeacross a section that reads “Customers who bought thisbook also bought. . . ”, listing other books that were typi-cally included in the same purchase. Such information isgathered by data mining— the automated extraction of hid-den predictive information from large data sets. In this pa-per, we apply data mining to version histories: “Program-mers who changed these functions also changed. . . ”. Justlike the Amazon.com feature helps the customer browsingalong related items, our ROSE tool guides the programmeralong related changes, with the following aims:
Suggest and predict likely changes. Suppose you are aprogrammer and just made a change. What else doyou have to change? Figure 1 on the following pageshows our ROSE tool as a plug-in for the ECLIPSEprogramming environment. The programmer is ex-tending ECLIPSE itself with a new preference, and hasadded an element to the fKeys[] array. ROSE now sug-gests to consider further changes, as inferred from theECLIPSE version history. First come the locations withthe highest confidence—that is, the likelihood that fur-ther changes be applied to the given location.
Prevent errors due to incomplete changes. In Figure 1,the top location has a confidence of 1.0: In the past,
each time some programmer extended the fKeys[] ar-ray, she also extended the function that sets the pref-erence default values. If the programmer now wantedto commit her changes without altering the suggestedlocation, ROSE would issue a warning.
Detect coupling indetectable by program analysis. AsROSE operates uniquely on the version history, it isable to detect coupling between items that cannot bedetected by program analysis—including couplingbetween items that are not even programs. In Figure 1,position 3 on the list is an ECLIPSE HTML documen-tation file with a confidence of 0.75—suggesting thatafter adding the new preference, the documentationshould be updated, too.
ROSE is not the first tool to leverage version histories. Inearlier work (Section 7), researchers have used history datato understand programs and their evolution [3], to detectevolutionary coupling between files [8] or classes [4], or tosupport navigation in the source code [6]. In contrast to thisstate of the art, the present work
• uses full-fledged data mining techniques to obtain as-sociation rules from version histories,
• detects coupling between fine-grained program enti-ties such as functions or variables (rather than, say,classes), thus increasing precision and integrating withprogram analysis,
• thoroughly evaluates the ability to predict future ormissing changes, thus evaluating the actual usefulnessof our techniques.
The remainder of this paper is organized as follows. Sec-tion 2 shows how to gather changes and their effects; Sec-tion 3 applies this to CVS. Section 4 describes the basicapproaches to mining these data, followed by examples inSection 5. In Section 6, we evaluate ROSE’s ability to pre-dict future changes, based on earlier history: How often canROSE suggest further changes, and, if so, how precise is it?Section 7 discusses related work and Section 8 closes withconclusion and consequences.
Proceedings of the 26th International Conference on Software Engineering (ICSE’04)
0270-5257/04 $20.00 © 2004 IEEE
![Page 25: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/25.jpg)
CHANGE IMPACT ANALYSIS
![Page 26: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/26.jpg)
CHANGE IMPACT ANALYSIS
![Page 27: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/27.jpg)
CHANGE IMPACT ANALYSIS
![Page 28: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/28.jpg)
HOW DOES IT WORK?
![Page 29: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/29.jpg)
HOW DOES IT WORK?
![Page 30: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/30.jpg)
DEFECT PREDICTION
![Page 31: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/31.jpg)
DEFECT PREDICTION
![Page 32: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/32.jpg)
DEFECT PREDICTION
![Page 33: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/33.jpg)
DEFECT PREDICTION
![Page 34: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/34.jpg)
DEFECT PREDICTION
![Page 35: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/35.jpg)
MORE DURING THE TUTORIAL….
![Page 36: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/36.jpg)
BUG TRIAGINGWho Should Fix This Bug?
John Anvik, Lyndon Hiew and Gail C. MurphyDepartment of Computer Science
University of British Columbia{janvik, lyndonh, murphy}@cs.ubc.ca
ABSTRACTOpen source development projects typically support an openbug repository to which both developers and users can re-port bugs. The reports that appear in this repository mustbe triaged to determine if the report is one which requiresattention and if it is, which developer will be assigned theresponsibility of resolving the report. Large open source de-velopments are burdened by the rate at which new bug re-ports appear in the bug repository. In this paper, we presenta semi-automated approach intended to ease one part of thisprocess, the assignment of reports to a developer. Our ap-proach applies a machine learning algorithm to the open bugrepository to learn the kinds of reports each developer re-solves. When a new report arrives, the classifier producedby the machine learning technique suggests a small numberof developers suitable to resolve the report. With this ap-proach, we have reached precision levels of 57% and 64% onthe Eclipse and Firefox development projects respectively.We have also applied our approach to the gcc open source de-velopment with less positive results. We describe the condi-tions under which the approach is applicable and also reporton the lessons we learned about applying machine learningto repositories used in open source development.
Categories and Subject Descriptors: D.2 [Software]:Software Engineering
General Terms: Management.
Keywords: Problem tracking, issue tracking, bug reportassignment, bug triage, machine learning
1. INTRODUCTIONMost open source software developments incorporate an
open bug repository that allows both developers and users topost problems encountered with the software, suggest possi-ble enhancements, and comment upon existing bug reports.One potential advantage of an open bug repository is that itmay allow more bugs to be identified and solved, improvingthe quality of the software produced [12].
Permission to make digital or hard copies of all or part of this work forpersonal or classroom use is granted without fee provided that copies arenot made or distributed for profit or commercial advantage and that copiesbear this notice and the full citation on the first page. To copy otherwise, torepublish, to post on servers or to redistribute to lists, requires prior specificpermission and/or a fee.ICSE’06,May 20–28, 2006, Shanghai, China.Copyright 2006 ACM 1-59593-085-X/06/0005 ...$5.00.
However, this potential advantage also comes with a sig-nificant cost. Each bug that is reported must be triagedto determine if it describes a meaningful new problem orenhancement, and if it does, it must be assigned to an ap-propriate developer for further handling [13]. Consider thecase of the Eclipse open source project1 over a four monthperiod (January 1, 2005 to April 30, 2005) when 3426 re-ports were filed, averaging 29 reports per day. Assumingthat a triager takes approximately five minutes to read andhandle each report, two person-hours per day is being spenton this activity. If all of these reports led to improvementsin the code, this might be an acceptable cost to the project.However, since many of the reports are duplicates of exist-ing reports or are not valid reports, much of this work doesnot improve the product. For instance, of the 3426 reportsfor Eclipse, 1190 (36%) were marked either as invalid, a du-plicate, a bug that could not be replicated, or one that willnot be fixed.
As a means of reducing the time spent triaging, we presentan approach for semi-automating one part of the process, theassignment of a developer to a newly received report. Ourapproach uses a machine learning algorithm to recommendto a triager a set of developers who may be appropriatefor resolving the bug. This information can help the triageprocess in two ways: it may allow a triager to process abug more quickly, and it may allow triagers with less overallknowledge of the system to perform bug assignments morecorrectly. Our approach requires a project to have had anopen bug repository for some period of time from which thepatterns of who solves what kinds of bugs can be learned.Our approach also requires the specification of heuristics tointerpret how a project uses the bug repository. We believethat neither of these requirements are arduous for the largeprojects we are targeting with this approach. Using our ap-proach we have been able to correctly suggest appropriatedevelopers to whom to assign a bug with a precision between57% and 64% for the Eclipse and Firefox2 bug repositories,which we used to develop the approach. We have also ap-plied our approach to the gcc repository, but the resultswere not as encouraging, hovering around 6% precision. Webelieve this is in part due to a prolific bug-fixing developerwho skews the learning process.
The paper makes two contributions:
1Eclipse provides an extensible development environment,including a Java IDE, and can be found at www.eclipse.org(verified 31/08/05).2Firefox provides a web browser and can be found at www.
mozilla.org/products/firefox/ (verified 07/09/05).
361
John Anvik, Lyndon Hiew, Gail C. Murphy: Who should fix this bug? ICSE 2006: 361-370
![Page 37: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/37.jpg)
BUG TRIAGING
![Page 38: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/38.jpg)
BUG TRIAGING
![Page 39: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/39.jpg)
MAIN IDEA
Assign a bug to available developers who previously fixed similar bugs
![Page 40: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/40.jpg)
AUTOMATED GENERATION OF RELEASE NOTES
1
ARENA: An Approach for the AutomatedGeneration of Release Notes
Laura Moreno, Member, IEEE, Gabriele Bavota, Member, IEEE, Massimiliano Di Penta, Member, IEEE,Rocco Oliveto, Member, IEEE, Andrian Marcus, Member, IEEE, Gerardo Canfora
Abstract—Release notes document corrections, enhancements, and, in general, changes that were implemented in a new release ofa software project. They are usually created manually and may include hundreds of different items, such as descriptions of newfeatures, bug fixes, structural changes, new or deprecated APIs, and changes to software licenses. Thus, producing them can be atime-consuming and daunting task. This paper describes ARENA (Automatic RElease Notes generAtor), an approach for theautomatic generation of release notes. ARENA extracts changes from the source code, summarizes them, and integrates them withinformation from versioning systems and issue trackers. ARENA was designed based on the manual analysis of 990 existing releasenotes. In order to evaluate the quality of the release notes automatically generated by ARENA, we performed four empirical studiesinvolving a total of 56 participants (48 professional developers and 8 students). The obtained results indicate that the generatedrelease notes are very good approximations of the ones manually produced by developers and often include important information thatis missing in the manually created release notes.
Index Terms—Release notes, Software documentation, Software evolution
F
1 INTRODUCTION
RELEASE notes summarize the main changes that oc-curred in a software system since its previous release,
such as, the addition of new features, bug fixes, changes tolicenses under which the project is released, and, especiallywhen the software is a library used by others, relevantchanges at code level. Different stakeholders might benefitfrom release notes. Developers, for example, use them tolearn what has changed in the code, which features havebeen implemented and which features are left for futurereleases, which bugs have been fixed and which ones arestill open, whether or not the latest release introduced newlegal constraints, etc. Similarly, integrators, who are using alibrary in their code, use the library release notes to decidewhether or not such a library should be upgraded to a newrelease. Finally, end users read the release notes to decidewhether it would be worthwhile to install a new release ofa software (e.g., a mobile app).
In general, release notes are produced manually. Ac-cording to a survey we conducted among open-source andprofessional developers (see Section 4.2), creating a releasenote by hand is a difficult and effort-prone activity that cantake as much as eight hours. Some issue trackers partiallysupport this task by generating simplified release notes
• L. Moreno and A. Marcus are with the University of Texas at Dallas,Richardson, TX, USA.E-mail: lmorenoc, [email protected]
• G. Bavota is with the Free University of Bozen-Bolzano, Bolzano, Italy.E-mail: [email protected]
• M. Di Penta and G. Canfora are with the University of Sannio, Benevento,Italy.E-mail: dipenta, [email protected]
• R. Oliveto is with the University of Molise, Pesche (IS), Italy.E-mail: [email protected]
Manuscript received April 19, 2005; revised September 17, 2014.
(e.g., the Atlassian OnDemand release note generator1), yet suchnotes are limited to list closed issues that developers havemanually associated with a release.
This paper proposes ARENA (Automatic RElease NotesgenerAtor), an automated approach for the generation ofrelease notes. ARENA identifies changes occurred in thecommits performed between two releases of a softwareproject, such as, structural changes to the code, upgradesof external libraries used by the project, and changes inthe licenses. Then, ARENA summarizes the code changesthrough an approach derived from code summarization[28], [38]. These changes are linked to information thatARENA extracts from commit notes and issue trackers,which is used to describe fixed bugs, new features, and openbugs related to the previous release. Finally, the release noteis organized into categories and presented as a hierarchicaland interactive HTML document, where details on eachitem can be expanded or collapsed, as needed.
Currently, there is no industry-wide standard on the con-tent and format of release notes, hence software companieshave adopted independent release note guidelines based ontheir own information and technical needs. For this reason,ARENA has been designed based on the manual analysisof 990 project release notes identifying what elements theytypically contain. It is also important to point out that therelease notes generated by ARENA include information thatis useful mainly to developers and integrators, rather than toend-users.
From an engineering standpoint, ARENA leverages ex-isting approaches for code summarization and for linkingcode changes to issues; yet, ARENA is novel and uniquefor two reasons: (i) it generates summaries and descriptions
1. http://tinyurl.com/atlassian-jira-rn. All URLs last verified on06/10/2015.
Laura Moreno, Gabriele Bavota, Massimiliano Di Penta, Rocco Oliveto, Andrian Marcus, Gerardo Canfora:Automatic generation of release notes. SIGSOFT FSE 2014: 484-495Laura Moreno, Gabriele Bavota, Massimiliano Di Penta, Rocco Oliveto, Andrian Marcus, Gerardo Canfora:ARENA: An Approach for the Automated Generation of Release Notes. IEEE Transactions on Software Engineering
![Page 41: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/41.jpg)
ARENA (AUTOMATIC RELEASE NOTE GENERATOR)
Change Extractor
Issue Tracker
Versioning System
Select bundles for releases rk-1 and rk
bundle rk-1 bundle rk
Code Change Summarizer
Commits-Issues Linker
Issue Extractor
Information Aggregator
HTML
ReleaseNote
LegendInformation FlowDependency
fine-grained structural changes linked to commits
changes to libraries
changes to documentation
changes to licenses
summarized structural changes linked to commits
summarized structural changes linked to issues
![Page 42: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/42.jpg)
EXAMPLE OF GENERATED RELEASE NOTE
![Page 43: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/43.jpg)
RECOMMENDING CODE EXAMPLES
How Can I Use This Method?Laura Moreno∗, Gabriele Bavota†, Massimiliano Di Penta‡, Rocco Oliveto§ and Andrian Marcus∗
∗The University of Texas at Dallas, USA; †Free University of Bozen-Bolzano, Italy;‡University of Sannio, Italy; §University of Molise, Italy
Abstract—Code examples are small source code fragmentswhose purpose is to illustrate how a programming language con-struct, an API, or a specific function/method works. Since codeexamples are not always available in the software documentation,researchers have proposed techniques to automatically extractthem from existing software or to mine them from developerdiscussions. In this paper we propose MUSE (Method USageExamples), an approach for mining and ranking actual codeexamples that show how to use a specific method. MUSE combinesstatic slicing (to simplify examples) with clone detection (to groupsimilar examples), and uses heuristics to select and rank thebest examples in terms of reusability, understandability, andpopularity. MUSE has been empirically evaluated using examplesmined from six libraries, by performing three studies involving atotal of 140 developers to: (i) evaluate the selection and rankingheuristics, (ii) provide their perception on the usefulness of theselected examples, and (iii) perform specific programming tasksusing the MUSE examples. The results indicate that MUSE selectsand ranks examples close to how humans do, most of the codeexamples (82%) are perceived as useful, and they actually helpwhen performing programming tasks.
I. INTRODUCTION
Developers frequently need to use methods they are not fa-miliar with or they do not remember how to use. To learn aboutsuch methods, developers usually resort to reference manuals(e.g., Javadoc), Questions and Answers (Q&A) forums (e.g.,Stack Overflow), or other information sources. Often, theseresources provide little more than generic explanations ofthe method usage syntax, or focus on the method’s technicaldetails. In such cases, developers could benefit from short codefragments, i.e., code examples, presenting actual, practical usesof the method. The goal of our work is to automaticallygenerate such code examples, given a specific method.
Our work adds to and complements existing approaches thathave been developed to obtain code examples of one sort oranother. Some of them focus on matching a query onto anexisting code base with the aim of identifying generic codeexamples [1], [2], [3], [4]. Other approaches try to matchuser queries or developer’s code context onto discussions(containing examples) in Q&A forums [5]. Recently, a fewapproaches have focused on providing abstract code examplesfor a given API, either as-a-whole [6] or focusing on specificmethods [7], [8], [9]. Abstract code examples are fragments ofcode showing usage patterns of a code element (e.g., an API orone of its methods). Even when there are several ways of usinga code element, an abstract example presents a synthesizedand generic use of a code element. Conversely, concretecode examples are working fragments of code showing actual,existing usage scenarios of a code element.
Our conjecture is that providing developers with several
concrete method usages would augment abstract code exam-ples and result in better understanding of the method usage.For this reason, we focus on the still open problem of miningrelevant concrete code examples for a given method. Specif-ically, we aim at answering the following question: “Given aspecific method needed to perform a task, what are the neces-sary steps to use it?” For instance, once a developer has un-derstood the purpose of an API and has gained an idea of whatthe various methods do (e.g., through a reference manual), shewants to know what are the typical invocation scenarios fora given method, say copyInputStreamToFile. To thisaim, she needs to find one or more examples that have thenecessary steps to invoke this method, such as, invoking othermethods of the API or manipulating the method’s parameters.Such a method usage example (see Fig. 1) shows that in orderto use the desired method (line 12) two arguments are required(e.g., zip.getInputStream(entry) and file). Theinline comments (lines 8-11) provide information about eachargument: the former is an InputStream instance whosecontent will be copied, and the latter is a non-directory Fileinstance where the content will be written to. The commentsalso inform that neither of them should be null. In addition, theusage example shows how all elements involved in the invoca-tion are obtained (lines 1-7). For example, the InputStreamargument is obtained from a ZipFile object instantiated inline 3 and given a ZipEntry object instantiated in line 6.
We propose MUSE (Method USage Examples), an ap-proach that automatically finds, extracts, and documents con-crete usage examples of a given method (like the one inFig. 1). MUSE employs state-of-the-art static analysis tech-niques, which ensure its precision, usability, and set it apartfrom existing work. It parses existing applications to collectmethod usages and computes a static backward slice for eachusage statement to detect the sequence of relevant steps toinvoke the method and prune out code that is less relevant.Since different applications might use the same method in asimilar way, MUSE identifies similar examples through clonedetection. The detected clones are used as a measure of
01 File source;02 File target;03 ZipFile zip=new ZipFile(source);04 Enumeration<? extends ZipEntry> entries=zip.entries();05 while(entries.hasMoreElements()) {06 ZipEntry entry=entries.nextElement();07 File file=new File(target,entry.getName());08 //zip.getInputStream(entry)->the InputStream to copy bytes from, 09 //must not be null10 //file->the non-directory File to write bytes to (possibly 11 //overwriting), must not be null12 FileUtils.copyInputStreamToFile(zip.getInputStream(entry),file);13 }
Fig. 1. Example generated by MUSE.
2015 IEEE/ACM 37th IEEE International Conference on Software Engineering
978-1-4799-1934-5/15 $31.00 © 2015 IEEE
DOI 10.1109/ICSE.2015.98
880
2015 IEEE/ACM 37th IEEE International Conference on Software Engineering
978-1-4799-1934-5/15 $31.00 © 2015 IEEE
DOI 10.1109/ICSE.2015.98
880
2015 IEEE/ACM 37th IEEE International Conference on Software Engineering
978-1-4799-1934-5/15 $31.00 © 2015 IEEE
DOI 10.1109/ICSE.2015.98
880
2015 IEEE/ACM 37th IEEE International Conference on Software Engineering
978-1-4799-1934-5/15 $31.00 © 2015 IEEE
DOI 10.1109/ICSE.2015.98
880
Laura Moreno, Gabriele Bavota, Massimiliano Di Penta, Rocco Oliveto, Andrian Marcus: How Can I Use This Method? ICSE (1) 2015: 880-890
![Page 44: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/44.jpg)
ZipFile::getInputStream(…)
![Page 45: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/45.jpg)
EXAMPLE
01 File source;02 File target;03 ZipFile zip=new ZipFile(source);04 Enumeration<? extends ZipEntry> entries=zip.entries();05 while(entries.hasMoreElements()) {06 ZipEntry entry=entries.nextElement();07 File file=new File(target,entry.getName());08 //zip.getInputStream(entry)->the InputStream to copy bytes from, 09 //must not be null10 //file->the non-directory File to write bytes to (possibly 11 //overwriting), must not be null12 FileUtils.copyInputStreamToFile(zip.getInputStream(entry),file);13 }
![Page 46: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/46.jpg)
MUSE (METHOD USAGE EXAMPLES)
Clients Downloader
Provide the project to document and its
clients
projectsource codeand javadoc
Examples Extractor
HTMLproject
javadoc with examples
LegendInformation FlowDependency
Clients
Examples Evaluator
JDeodorantslicer
Simianclone detector
ReadabilityBuse et al.
Examples Injector
Examples
RankedExamples
client projects list
![Page 47: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/47.jpg)
RECOMMENDING RELEVANT STACKOVERFLOW DISCUSSIONS
Mining StackOverflow to Turn the IDE into aSelf-Confident Programming Prompter
Luca Ponzanelli1, Gabriele Bavota2, Massimiliano Di Penta2, Rocco Oliveto3, Michele Lanza1
1: REVEAL @ Faculty of Informatics – University of Lugano, Switzerland2: University of Sannio, Benevento, Italy 3: University of Molise, Pesche (IS), Italy
ABSTRACTDevelopers often require knowledge beyond the one they possess,which often boils down to consulting sources of information likeApplication Programming Interfaces (API) documentation, forums,Q&A websites, etc. Knowing what to search for and how is non-trivial, and developers spend time and energy to formulate theirproblems as queries and to peruse and process the results.
We propose a novel approach that, given a context in the IDE,automatically retrieves pertinent discussions from Stack Overflow,evaluates their relevance, and, if a given confidence threshold issurpassed, notifies the developer about the available help. Wehave implemented our approach in Prompter, an Eclipse plug-in.Prompter has been evaluated through two studies. The first wasaimed at evaluating the devised ranking model, while the secondwas conducted to evaluate the usefulness of Prompter.
Categories and Subject DescriptorsD.2.6 [Software Engineering]: Programming Environments—In-teractive environments
General TermsDocumentation, Experimentation
KeywordsRecommender Systems, Developers Support, Empirical Studies
1. INTRODUCTIONThe myth of the lonely programmer is still lingering, in stark
contrast with reality: Software development, also due to the everincreasing complexity of modern systems, is tackled by collaboratingteams of people. A helping hand is often required, either by teammates [18], through pair programming sessions [6], or the perusalof vast amounts of knowledge available on the Internet [41].
While asking team mates is the preferred means to obtain help[20], their availability may fall short. In this case, developers resortto electronically available information. This comes with a number of
Permission to make digital or hard copies of all or part of this work forpersonal or classroom use is granted without fee provided that copies arenot made or distributed for profit or commercial advantage and that copiesbear this notice and the full citation on the first page. To copy otherwise, torepublish, to post on servers or to redistribute to lists, requires prior specificpermission and/or a fee.MSR ’14, May 31 – June 1, 2014, Hyderabad, IndiaCopyright 2014 ACM 978-1-4503-2863-0/14/05 ...$15.00.
problems, the main one being the absence of automation: Every timedevelopers need to look for information, they interrupt their workflow, leave the IDE, and use a Web browser to perform and refinesearches, and assess the results. Finally, they transfer the obtainedknowledge to the problem context in the IDE. The information isretrieved from di↵erent sources, such as forums, mailing lists [2],blogs, Q&A websites, bug trackers [1], etc. A prominent example isStack Overflow, popular among developers as a venue for sharingprogramming knowledge. Stack Overflow is vast: In 2010 it alreadyhad 300k users, and millions of questions, answers, and comments[23]. This makes finding the right piece of information cumbersomeand challenging.
Recommender systems [33] represent a possible solution to thisproblem. A recommender system gathers and analyzes data, iden-tifies useful artifacts, and suggests them to the developer. Seminaltools, such as eRose [43], Hipikat [9] and DeepIntellisense [14],suggest project artifacts in the IDE aiming at providing developerswith additional information on specific parts of the system. Theycome however with a caveat: the developer must proactively invokethem, and, once invoked, they continuously display information,which may defeat their purpose, as they augment the complexity ofwhat is displayed in the IDE. Ideally, a recommender system shouldbehave like a prompter in a theatre: Ready to provide suggestionswhenever the actor needs them, and ready to autonomously givesuggestions if it feels something is going wrong.
The interaction between the theatre prompter and the actor issimilar to the interaction between two developers doing pair pro-gramming, working side by side to write code. These developershave di↵erent roles, i.e., the driver, who is in charge of writingcode, and the observer, who observes the work of the driver [42],tries to understand the context, and, if she has enough confidence,interrupts the driver by giving suggestions. In addition, the drivercan consult the observer whenever she needs it, making the observerthe programming prompter of the programming actor.
This interaction is what we propose in Prompter, a tool that auto-matically retrieves and recommends, with push notifications, rele-vant Stack Overflow discussions to the developer. Prompter makesthe IDE a programming prompter that silently observes and analyzesthe code context in the IDE, automatically searches for Stack Over-flow discussions on the Web, evaluates their relevance by taking intoconsideration code aspects (e.g., code clones, type matching), con-ceptual aspects (e.g., textual similarity), and Stack Overflow commu-nity aspects (e.g., user reputation) to decide, given a certain amountof self-confidence (encoded in a threshold the user can changethrough a slider, to make the recommender quiet or talkative) whento suggest discussions. We have evaluated Prompter through twostudies, one aimed at evaluating its ranking model, the other aimedat evaluating its usefulness to developers.
1
Permission to make digital or hard copies of all or part of this work for personal orclassroom use is granted without fee provided that copies are not made or distributedfor profit or commercial advantage and that copies bear this notice and the full citationon the first page. Copyrights for components of this work owned by others than ACMmust be honored. Abstracting with credit is permitted. To copy otherwise, or republish,to post on servers or to redistribute to lists, requires prior specific permission and/or afee. Request permissions from [email protected].
MSR’14, May 31 – June 1, 2014, Hyderabad, IndiaCopyright 2014 ACM 978-1-4503-2863-0/14/05...$15.00http://dx.doi.org/10.1145/2597073.2597077
102
Luca Ponzanelli, Gabriele Bavota, Massimiliano Di Penta, Rocco Oliveto, Michele Lanza:Prompter - Turning the IDE into a self-confident programming assistant. Empirical Software Engineering 21(5): 2190-2231 (2016)Luca Ponzanelli, Gabriele Bavota, Massimiliano Di Penta, Rocco Oliveto, Michele Lanza:Mining StackOverflow to turn the IDE into a self-confident programming prompter. MSR 2014: 102-111
![Page 48: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/48.jpg)
MOTIVATIONS
![Page 49: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/49.jpg)
MOTIVATIONS
![Page 50: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/50.jpg)
MOTIVATIONS
![Page 51: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/51.jpg)
MOTIVATIONS
![Page 52: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/52.jpg)
MOTIVATIONS
![Page 53: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/53.jpg)
TOOL (PROMPTER)
1
2
![Page 54: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/54.jpg)
APPROACH
Search Service
EclipsePrompter
Query Generation Service
Search Engines
Bing
Blekko
Stack Overflow API Service
RankingModel
Search EngineProxy
Code Context
1 32
CodeContext
Query &Triggering
Info
Query &Code Context
4 Query
5 Results
6Discussion IDs
7 Documents
8Ranked Results
![Page 55: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/55.jpg)
THESE SOUND LIKE VERY PROMISING DIRECTIONS…
![Page 56: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/56.jpg)
…BUT TRAPS ARE AROUND THE CORNER!
![Page 57: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/57.jpg)
CHALLENGE I
![Page 58: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/58.jpg)
BACK TO THE DEFINITION…
“A software application that provides informationitems estimated to be valuable for a software engineering task in a given context”
Martin P. Robillard, Robert J. Walker, Thomas Zimmermann:Recommendation Systems for Software Engineering. IEEE Software 27(4): 80-86 (2010)
![Page 59: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/59.jpg)
BACK TO THE DEFINITION…
“A software application that provides informationitems estimated to be valuable for a software engineering task in a given context”
Martin P. Robillard, Robert J. Walker, Thomas Zimmermann:Recommendation Systems for Software Engineering. IEEE Software 27(4): 80-86 (2010)
![Page 60: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/60.jpg)
CAN WE FULLY RELY ON SOFTWARE REPOSITORY DATA?
![Page 61: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/61.jpg)
CHALLENGE I:NOISY AND
INCOMPLETE DATA
![Page 62: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/62.jpg)
THE PROBLEM
Many pieces of information filled by humans → imprecise and incomplete
![Page 63: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/63.jpg)
MISSING LINKS
nmbd_incomingdgrams.c: Fix bug with Syntax 5.1 servers reported by SGI where they do host announcements to LOCAL_MASTER_BROWSER_NAME<00 rather than WORKGROUP<1d
Quieten level 0 debug when probing for modules. We shouldn't display so loud an error when a smb_probe_module() fails. Also tidy up debugs a bit. Bug 375.
Adrian Bachmann, Christian Bird, Foyzur Rahman, Premkumar T. Devanbu, Abraham Bernstein: The missing links: bugs and bug-fix commits. SIGSOFT FSE 2010: 97-106
![Page 64: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/64.jpg)
SECRET LIFE OF BUGSSoftware repositories do not capture everything of a software project
Not all discussions, not all decisions, and after all also not all changes
Jorge Aranda, Gina Venolia: The secret life of bugs: Going past the errors and omissions in software repositories. ICSE 2009: 298-308
![Page 65: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/65.jpg)
INCORRECT CLASSIFICATION
• Issue tracking systems contain various kinds of changes
• Classified using inadequate fields, or just poorly and subjectively classified
![Page 66: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/66.jpg)
RESULTS OF A MANUAL CLASSIFICATION
We manually classified 1,800 randomly selected bugs from Mozilla, Eclipse, JBoss
0
150
300
450
600
Mozilla Eclipse JBoss
156
24
121
99
382
209
345194270
Bugs Non bugsOthers
Giuliano Antoniol, Kamel Ayari, Massimiliano Di Penta, Foutse Khomh, Yann-Gaël Guéhéneuc: Is it a bug or an enhancement?: a text-based approach to classify change requests. CASCON 2008: 23
![Page 67: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/67.jpg)
It’s not a Bug, it’s a Feature:How Misclassification Impacts Bug Prediction
Kim HerzigSaarland University
Saarbrucken, [email protected]
Sascha JustSaarland University
Saarbrucken, [email protected]
Andreas ZellerSaarland University
Saarbrucken, [email protected]
Abstract—In a manual examination of more than 7,000 issuereports from the bug databases of five open-source projects,we found 33.8% of all bug reports to be misclassified—thatis, rather than referring to a code fix, they resulted in a newfeature, an update to documentation, or an internal refactoring.This misclassification introduces bias in bug prediction models,confusing bugs and features: On average, 39% of files markedas defective actually never had a bug. We estimate the impact ofthis misclassification on earlier studies and recommend manualdata validation for future studies.
Index Terms—mining software repositories; bug reports; dataquality; noise; bias
I. INTRODUCTION
In empirical software engineering, it has become common-place to mine data from change and bug databases to detectwhere bugs have occurred in the past, or to predict where theywill occur in the future. The accuracy of such measurementsand predictions depends on the quality of the data. Therefore,mining software archives must take appropriate steps to assuredata quality.
A general challenge in mining is to separate bugs fromnon-bugs. In a bug database, the majority of issue reportsare classified as bugs—that is, requests for corrective codemaintenance. However, an issue report may refer to “perfectiveand adaptive maintenance, refactoring, discussions, requestsfor help, and so on” [1]—that is, activities that are unrelatedto errors in the code, and would therefore be classified in anon-bug category. If one wants to mine code history to locateor predict error prone code regions, one would therefore onlyconsider issue reports classified as bugs. Such filtering needsnothing more than a simple database query.
However, all this assumes that the category of the issuereport is accurate. In 2008, Antoniol et al. [1] raised theproblem of misclassified issue reports—that is, reports clas-sified as bugs, but actually referring to non-bug issues. Ifsuch mix-ups (which mostly stem from issue reporters anddevelopers interpreting “bug” differently) occured frequentlyand systematically they would introduce bias in data miningmodels threatening the external validity of any study thatbuilds on such data: Predicting the most error-prone files, forinstance, may actually yield files most prone to new features.But how often does such misclassification occur? And does itactually bias analysis and prediction?
TABLE IPROJECT DETAILS.
Maintainer Tracker type # reports
HTTPClient APACHE Jira 746Jackrabbit APACHE Jira 2,402Lucene-Java APACHE Jira 2,443Rhino MOZILLA Bugzilla 1,226Tomcat5 APACHE Bugzilla 584
These are the questions we address in this paper. Fromfive open source projects (Section II), we manually classifiedmore than 7,000 issue reports into a fixed set of issue reportcategories clearly distinguishing the kind of maintenance workrequired to resolve the task (Section III). Our findings indicatesubstantial data quality issues:Issue report classifications are unreliable. In the five bug
databases investigated, more than 40% of issue reportsare inaccurately classified (Section IV)
Every third bug is not a bug. 33.8% of all bug reports donot refer to corrective code maintenance (Section V).
After discussing the possible sources of these misclassifica-tions (Section VI), we turn to the consequences. We find thatthe validity of studies regarding the distribution and predictionof bugs in code is threatened:Files are wrongly marked to be error-prone. Due to mis-
classifications, 39% of files marked as defective actuallyhave never had a bug (Section VII).
Files are wrongly predicted to be error-prone. Between16% and 40% of the top 10% most defect-prone filesdo not belong in this category after reclassification(Section VIII).
Section IX details studies affected and unaffected by theseissues. After discussing related work (Section X) and threatsto validity (Section XI), we close with conclusion and conse-quences (Section XII).
II. STUDY SUBJECTS
We conducted our study on five open-source JAVA projectsdescribed in Table I. We aimed to select projects that wereunder active development and were developed by teams thatfollow strict commit and bug fixing procedures similar to in-dustry. We also aimed to have a more or less homogenous data
Kim Herzig, Sascha Just, Andreas Zeller : It's not a bug, it's a feature: how misclassification impacts bug prediction. ICSE 2013: 392-401
![Page 68: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/68.jpg)
IT’S NOT A BUG…
Confirmed our results: 1/3 of bug reports are not about bugs
When predicting the top 10% defect-prone files, 16% to 40% do not belong to that category
![Page 69: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/69.jpg)
HINTS
Do not trust data from software repositories
Validate, validate, validate, validate!
Use heuristics to prevent problems
![Page 70: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/70.jpg)
CHALLENGE II
![Page 71: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/71.jpg)
No question, when you develop a recommender, you need to evaluate it
![Page 72: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/72.jpg)
TYPICAL QUESTIONS YOU ASK
![Page 73: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/73.jpg)
HOW ACCURATE IS IT?
![Page 74: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/74.jpg)
HOW FAST IS IT?
![Page 75: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/75.jpg)
IS IT ANY BETTER THAN COMPETITOR’S TOOL?
![Page 76: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/76.jpg)
WHAT ARE WE MISSING HERE?
![Page 77: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/77.jpg)
BACK TO THE DEFINITION…
“A software application that provides informationitems estimated to be valuable for a software engineering task in a given context”
Martin P. Robillard, Robert J. Walker, Thomas Zimmermann:Recommendation Systems for Software Engineering. IEEE Software 27(4): 80-86 (2010)
![Page 78: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/78.jpg)
BACK TO THE DEFINITION…
“A software application that provides informationitems estimated to be valuable for a software engineering task in a given context”
Martin P. Robillard, Robert J. Walker, Thomas Zimmermann:Recommendation Systems for Software Engineering. IEEE Software 27(4): 80-86 (2010)
![Page 79: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/79.jpg)
IS THE TOOL GOING TO HELP A DEVELOPER
FOR A GIVEN TASK?
![Page 80: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/80.jpg)
CHALLENGE II: EVALUATION
![Page 81: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/81.jpg)
DIFFERENT KINDS OF EVALUATIONS
Surveys
Controlled Experiments
Case Studies
![Page 82: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/82.jpg)
SURVEY
Retrospective (post mortem), e.g. about a technology/tool being adopted for a period of time
![Page 83: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/83.jpg)
CONTROLLED EXPERIMENT
Study performed in a laboratory setting, with a high degree of control
![Page 84: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/84.jpg)
CASE STUDY
Aims at monitoring a (real) project in a realistic environment
![Page 85: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/85.jpg)
SCALE VS. RISK
Risk
Scale
Survey
Experiment
CaseStudy
[Linkerman and Rombach, 2000]
![Page 86: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/86.jpg)
QUANTITATIVE VS QUALITATIVE STUDIES
Quantitative: to get numerical relations among variables
Qualitative: to interpret a phenomenon just observing it in its context
![Page 87: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/87.jpg)
EXAMPLES
![Page 88: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/88.jpg)
ARENA
PROMPTER
1
2
MUSE
01 File source;02 File target;03 ZipFile zip=new ZipFile(source);04 Enumeration<? extends ZipEntry> entries=zip.entries();05 while(entries.hasMoreElements()) {06 ZipEntry entry=entries.nextElement();07 File file=new File(target,entry.getName());08 //zip.getInputStream(entry)->the InputStream to copy bytes from, 09 //must not be null10 //file->the non-directory File to write bytes to (possibly 11 //overwriting), must not be null12 FileUtils.copyInputStreamToFile(zip.getInputStream(entry),file);13 }
![Page 89: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/89.jpg)
PROMPTER
1
2
MUSE
01 File source;02 File target;03 ZipFile zip=new ZipFile(source);04 Enumeration<? extends ZipEntry> entries=zip.entries();05 while(entries.hasMoreElements()) {06 ZipEntry entry=entries.nextElement();07 File file=new File(target,entry.getName());08 //zip.getInputStream(entry)->the InputStream to copy bytes from, 09 //must not be null10 //file->the non-directory File to write bytes to (possibly 11 //overwriting), must not be null12 FileUtils.copyInputStreamToFile(zip.getInputStream(entry),file);13 }
ARENA
![Page 90: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/90.jpg)
PROMPTER
1
2
MUSE
01 File source;02 File target;03 ZipFile zip=new ZipFile(source);04 Enumeration<? extends ZipEntry> entries=zip.entries();05 while(entries.hasMoreElements()) {06 ZipEntry entry=entries.nextElement();07 File file=new File(target,entry.getName());08 //zip.getInputStream(entry)->the InputStream to copy bytes from, 09 //must not be null10 //file->the non-directory File to write bytes to (possibly 11 //overwriting), must not be null12 FileUtils.copyInputStreamToFile(zip.getInputStream(entry),file);13 }
ARENA
![Page 91: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/91.jpg)
ARENA
Study I: Completeness of release notes
Study II: Importance of different pieces
Study III: Comparison with human generated release notes
Study IV: Field Study
![Page 92: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/92.jpg)
EVALUATION I: COMPLETENESS
Design: We asked study participants to compare the generated release note with the original one
Study participants: 10 among students, faculties, industrial developers
Objects: 10 releases of open source projects
![Page 93: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/93.jpg)
EVALUATION I: RESULTS
Apache Cayenne 3.0
Apache Cayenne 3.1
Apache Commons Codec
Apache Lucene
Jackson-Core 2.1.0
Jackson-Core 2.1.3
Janino 2.5
Janino 2.6
% of Answers
0 25 50 75 100
Absent Less Same More
![Page 94: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/94.jpg)
EVALUATION II: IMPORTANCEOnline survey, in which we showed to participant release notes (they didn’t know whether these were generated or not)
We asked participants of a online survey to evaluate:
• What they consider important in a release note and often
• What they considered important in both ARENA release notes and actual release notes
![Page 95: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/95.jpg)
WHAT DO DEVELOPERS INCLUDE IN RELEASE NOTES?
●●● ●●
●
●● ●
Fixedbugs
Newfeatures
Enhancedfeatures
New codecomp.
Modifiedcode comp.
Deprecatedcode comp.
Deletedcode comp.
Replacedcode comp.
Changes vis. code comp.
Changes toconf. files
Changes totest suites
Refactoringoperations
Architecturalchanges
Changes todocum.
Changes tolicenses
LibrariesUpgrades
KnownIssues
1.0
2.0
3.0
4.0
What kind of content do you include in release notes?
Often
Sometimes
Rarely
Never
●●● ●●
●
●● ●
Fixedbugs
Newfeatures
Enhancedfeatures
New codecomp.
Modifiedcode comp.
Deprecatedcode comp.
Deletedcode comp.
Replacedcode comp.
Changes vis. code comp.
Changes toconf. files
Changes totest suites
Refactoringoperations
Architecturalchanges
Changes todocum.
Changes tolicenses
LibrariesUpgrades
KnownIssues
1.0
2.0
3.0
4.0
What kind of content do you include in release notes?
Often
Sometimes
Rarely
Never
●●● ●●
●
●● ●
Fixedbugs
Newfeatures
Enhancedfeatures
New codecomp.
Modifiedcode comp.
Deprecatedcode comp.
Deletedcode comp.
Replacedcode comp.
Changes vis. code comp.
Changes toconf. files
Changes totest suites
Refactoringoperations
Architecturalchanges
Changes todocum.
Changes tolicenses
LibrariesUpgrades
KnownIssues
1.0
2.0
3.0
4.0
What kind of content do you include in release notes?
Often
Sometimes
Rarely
Never
![Page 96: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/96.jpg)
IMPORTANCE OF RELEASE NOTE CONTENT
![Page 97: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/97.jpg)
ARENA
MUSE
01 File source;02 File target;03 ZipFile zip=new ZipFile(source);04 Enumeration<? extends ZipEntry> entries=zip.entries();05 while(entries.hasMoreElements()) {06 ZipEntry entry=entries.nextElement();07 File file=new File(target,entry.getName());08 //zip.getInputStream(entry)->the InputStream to copy bytes from, 09 //must not be null10 //file->the non-directory File to write bytes to (possibly 11 //overwriting), must not be null12 FileUtils.copyInputStreamToFile(zip.getInputStream(entry),file);13 }
PROMPTER
1
2
![Page 98: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/98.jpg)
ARENA
MUSE
01 File source;02 File target;03 ZipFile zip=new ZipFile(source);04 Enumeration<? extends ZipEntry> entries=zip.entries();05 while(entries.hasMoreElements()) {06 ZipEntry entry=entries.nextElement();07 File file=new File(target,entry.getName());08 //zip.getInputStream(entry)->the InputStream to copy bytes from, 09 //must not be null10 //file->the non-directory File to write bytes to (possibly 11 //overwriting), must not be null12 FileUtils.copyInputStreamToFile(zip.getInputStream(entry),file);13 }
PROMPTER
1
2
![Page 99: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/99.jpg)
PROMPTER
First evaluation: assess the correctness of the provided recommendations
Second evaluation: ask developers to perform a task with and without Prompter
![Page 100: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/100.jpg)
ARE RECOMMENDATIONS RELEVANT?
12
34
5
5 12 17 29 32 10 11 25 1 2 7 9 22 30 8 21 24 4 6 35 26 31 33 34 36 13 19 20 23 27 28 14 18 3 37 15 16
Questions
Rat
ing
UI & Graphics Concurrency Basic Data Manipulation Compression File System Networking System Info Language Additional APIs
![Page 101: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/101.jpg)
EXPERIMENT DESIGN
Group I Group 2 Group 3 Group 4
Lab I Task 1Prompter
Task IJust Web
Task 1IJust Web
Task IIPrompter
Lab II Task IIJust Web
Task 1IPrompter
Task 1Prompter
Task IJust Web
![Page 102: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/102.jpg)
DOES IT HELP?
NP P
100
90
80
70
60
50
40
30
20
10
0Overall Maintenance Task (MT) Development Task (DT)
NP P NP P
Com
plet
enes
s
![Page 103: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/103.jpg)
ARENA
PROMPTER
1
2
MUSE
01 File source;02 File target;03 ZipFile zip=new ZipFile(source);04 Enumeration<? extends ZipEntry> entries=zip.entries();05 while(entries.hasMoreElements()) {06 ZipEntry entry=entries.nextElement();07 File file=new File(target,entry.getName());08 //zip.getInputStream(entry)->the InputStream to copy bytes from, 09 //must not be null10 //file->the non-directory File to write bytes to (possibly 11 //overwriting), must not be null12 FileUtils.copyInputStreamToFile(zip.getInputStream(entry),file);13 }
![Page 104: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/104.jpg)
ARENA
PROMPTER
1
2
MUSE
01 File source;02 File target;03 ZipFile zip=new ZipFile(source);04 Enumeration<? extends ZipEntry> entries=zip.entries();05 while(entries.hasMoreElements()) {06 ZipEntry entry=entries.nextElement();07 File file=new File(target,entry.getName());08 //zip.getInputStream(entry)->the InputStream to copy bytes from, 09 //must not be null10 //file->the non-directory File to write bytes to (possibly 11 //overwriting), must not be null12 FileUtils.copyInputStreamToFile(zip.getInputStream(entry),file);13 }
![Page 105: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/105.jpg)
MUSE
Study I: (intrinsic) manual evaluation of produced examples
Study II: (extrinsic) controlled experiment
![Page 106: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/106.jpg)
INTRINSIC EVALUATION
Research question: Are MUSE’s usage examples considered useful by developers?
Context:
• 60 code examples from 6 Apache libraries
• 119 developers recruited among those who developed the libraries and from open source projects using such libraries
![Page 107: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/107.jpg)
EXAMPLES OF RESULTS
Not usefulat all
Slightlyuseful
Useful
Veryuseful
1 2 3 4 5 6 7 8 9 10
commons-io
●●
Not usefulat all
Slightlyuseful
Useful
Veryuseful
1 2 3 4 5 6 7 8 9 10
commons-lang
●
●
● ● ●
●
● ●Not usefulat all
Slightlyuseful
Useful
Veryuseful
1 2 3 4 5 6 7 8 9 10
http-client
● ●
●
Not usefulat all
Slightlyuseful
Useful
Veryuseful
1 2 3 4 5 6 7 8 9 10
poi
●
●
●
● ●
●
Not usefulat all
Slightlyuseful
Useful
Veryuseful
1 2 3 4 5 6 7 8 9 10
tika
● ● ●
●●
●Not usefulat all
Slightlyuseful
Useful
Veryuseful
1 2 3 4 5 6 7 8 9 10
xerces
Not usefulat all
Slightlyuseful
Useful
Veryuseful
1 2 3 4 5 6 7 8 9 10
commons-io
●●
Not usefulat all
Slightlyuseful
Useful
Veryuseful
1 2 3 4 5 6 7 8 9 10
commons-lang
●
●
● ● ●
●
● ●Not usefulat all
Slightlyuseful
Useful
Veryuseful
1 2 3 4 5 6 7 8 9 10
http-client
● ●
●
Not usefulat all
Slightlyuseful
Useful
Veryuseful
1 2 3 4 5 6 7 8 9 10
poi
●
●
●
● ●
●
Not usefulat all
Slightlyuseful
Useful
Veryuseful
1 2 3 4 5 6 7 8 9 10
tika
● ● ●
●●
●Not usefulat all
Slightlyuseful
Useful
Veryuseful
1 2 3 4 5 6 7 8 9 10
xerces
![Page 108: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/108.jpg)
EXTRINSIC EVALUATIONResearch Question: Do MUSE’s examples help developers to complete their programming tasks?
Participants: 12 Industrial developers
Study design: similar to PrompterGroup I Group 2 Group 3 Group 4
Lab I Task 1Without Examples
Task IWith Examples
Task 1IWithout Examples
Task IIWith Examples
Lab II Task IIWith Examples
Task 1IWithout Examples
Task 1With Examples
Task IWithout Examples
![Page 109: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/109.jpg)
RESULTS
Difference between CE and NCE statistically significant (p- value=0.03) with a medium effect size (d=0.472)
●
●●
●
●
●
OverallNCE
0%
50%
100%
75%
25%
OverallCE
T1NCE
T1CE
T2NCE
T2CE
![Page 110: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/110.jpg)
CHALLENGE III
![Page 111: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/111.jpg)
SO FAR…
![Page 112: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/112.jpg)
SO FAR…
Our approach is very accurate
![Page 113: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/113.jpg)
SO FAR…
Our approach is very accurate
It is very fast
![Page 114: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/114.jpg)
SO FAR…
Our approach is very accurate
It is very fast
It is better than other tools
![Page 115: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/115.jpg)
SO FAR…
Our approach is very accurate
It is very fast
It is better than other tools
It HELPS developers
![Page 116: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/116.jpg)
WHAT ARE WE MISSING?
![Page 117: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/117.jpg)
BACK TO THE DEFINITION…
“A software application that provides informationitems estimated to be valuable for a software engineering task in a given context”
Martin P. Robillard, Robert J. Walker, Thomas Zimmermann:Recommendation Systems for Software Engineering. IEEE Software 27(4): 80-86 (2010)
![Page 118: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/118.jpg)
BACK TO THE DEFINITION…
“A software application that provides informationitems estimated to be valuable for a software engineering task in a given context”
Martin P. Robillard, Robert J. Walker, Thomas Zimmermann:Recommendation Systems for Software Engineering. IEEE Software 27(4): 80-86 (2010)
![Page 119: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/119.jpg)
CHALLENGE III:PRACTICAL APPLICABILITY
![Page 120: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/120.jpg)
INFORMED INTERVIEWS
Easy solution
1. Let developers play with the tool
2. Then, interview them…
![Page 121: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/121.jpg)
IN-FIELD CASE STUDY
1. Let developers use your tool in a real task
2. Observe…
3. Interview…
![Page 122: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/122.jpg)
ARENA: 6-MONTHS IN-FIELD STUDY
We let developers of a E-health project to use ARENA for 6 months
![Page 123: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/123.jpg)
FEEDBACK
“The possibility to expand and retract the details about any of the different items allows the reader of the release note to control in some way the level of redundancy she wants.”
![Page 124: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/124.jpg)
FEEDBACK
“ARENA helps to save time, especially when you need to create release notes and you cannot really allocate too much time on such a task.”
![Page 125: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/125.jpg)
SUMMARY
![Page 126: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/126.jpg)
![Page 127: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/127.jpg)
data extraction
Emails
MINING SOFTWARE REPOSITORIES
Versioning
Issue Reports
Software History
change propagation
evolution visualization
change patterns
software complexity
fault prediction
effort estimation
Knowledge inference Classification
Association rules Clustering
![Page 128: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/128.jpg)
data extraction
Emails
MINING SOFTWARE REPOSITORIES
Versioning
Issue Reports
Software History
change propagation
evolution visualization
change patterns
software complexity
fault prediction
effort estimation
Knowledge inference Classification
Association rules Clustering
CHALLENGE I:NOISY AND
INCOMPLETE DATA
![Page 129: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/129.jpg)
data extraction
Emails
MINING SOFTWARE REPOSITORIES
Versioning
Issue Reports
Software History
change propagation
evolution visualization
change patterns
software complexity
fault prediction
effort estimation
Knowledge inference Classification
Association rules Clustering
CHALLENGE II: EVALUATION
CHALLENGE I:NOISY AND
INCOMPLETE DATA
![Page 130: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/130.jpg)
data extraction
Emails
MINING SOFTWARE REPOSITORIES
Versioning
Issue Reports
Software History
change propagation
evolution visualization
change patterns
software complexity
fault prediction
effort estimation
Knowledge inference Classification
Association rules Clustering
CHALLENGE II: EVALUATION
CHALLENGE I:NOISY AND
INCOMPLETE DATA
CHALLENGE III:PRACTICAL APPLICABILITY
![Page 131: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/131.jpg)
TAKEAWAYS
![Page 132: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/132.jpg)
![Page 133: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/133.jpg)
Don’t trust data you’re using
![Page 134: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/134.jpg)
Don’t trust data you’re using
Never get tired about manual validation
![Page 135: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/135.jpg)
Don’t trust data you’re using
Never get tired about manual validation
One study is not enough → multiple studies give you different perspectives
![Page 136: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/136.jpg)
Don’t trust data you’re using
Never get tired about manual validation
One study is not enough → multiple studies give you different perspectives
Humans need to be involved in tool evaluation
![Page 137: MINING SOFTWARE DATA TO HELP DEVELOPERS: CHALLENGES … · {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Open source development projects typically support an open bug repository to](https://reader035.vdocuments.site/reader035/viewer/2022071008/5fc56e01281576426931ec1c/html5/thumbnails/137.jpg)
Don’t trust data you’re using
Never get tired about manual validation
One study is not enough → multiple studies give you different perspectives
Humans need to be involved in tool evaluation
Ask questions: [email protected] @mdipenta