The Road Ahead for Mining Software Repositories
Ahmed E. Hassan
Queen’s University
Canada
Historical Repositories: Source Control (CVS/SVN), Bugzilla, Mailing lists
Runtime Repos: Field Logs, Crash Repos
Code Repos: Sourceforge, Google Code
Mining Software Repositories (MSR)
• Transforms static record-keeping repositories to active repositories
• Makes repos data actionable by uncovering hidden patterns and trends
Repositories: Mailing list, Bugzilla, Crashes, Field logs, CVS/SVN
MSR researchers analyze and cross-link repositories
Repositories: Bugzilla, CVS/SVN, Mailing list, Crashes
Linked data: fixed bug, discussions, buggy change & fixing change, field crashes
New Bug Report:
• Estimate fix effort
• Mark duplicates
• Suggest experts and fix
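One common way to cross-link a bug repository with source control is to look for bug identifiers in commit log messages. The sketch below is a minimal illustration of that heuristic, not the method used in the talk; the regular expression, the function name `link_commits_to_bugs`, and the toy commits are all assumptions made for this example.

```python
import re
from collections import defaultdict

# Hypothetical pattern: matches "bug 123", "Bug #123", "fixes #123", "fixed 123".
BUG_REF = re.compile(r"\b(?:bug|fix(?:es|ed)?)\s*#?\s*(\d+)", re.IGNORECASE)


def link_commits_to_bugs(commits, known_bug_ids):
    """Map each known bug ID to the revisions whose log message mentions it."""
    links = defaultdict(list)
    for revision, message in commits:
        for match in BUG_REF.finditer(message):
            bug_id = int(match.group(1))
            # Keep only numbers that correspond to real bug reports.
            if bug_id in known_bug_ids:
                links[bug_id].append(revision)
    return dict(links)


# Toy data, made up for illustration only.
commits = [
    ("1.4", "Fix bug #101: null check in parser"),
    ("1.5", "Refactor logging"),
    ("1.6", "fixes 102 and cleans up after bug 101"),
    ("1.7", "Bump copyright year, fixed 2009 typo"),
]
known_bug_ids = {101, 102}

links = link_commits_to_bugs(commits, known_bug_ids)
print(links)
```

Checking candidate numbers against the set of known bug IDs filters out false matches such as the year in the last message. Heuristics like this one still miss links and produce wrong ones, so they should be verified empirically before the linked data is used.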
MSR researchers analyze and cross-link repositories
Repositories: Bugzilla, CVS/SVN, Mailing list, Crashes
Linked data: fixed bug, discussions, buggy change & fixing change, field crashes
New Change:
• Suggest APIs
• Warn about risky code or bugs
• Suggest locations to co-change
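Suggesting locations to co-change can be approximated by counting how often files were committed together in the past. The following is a small sketch under that assumption; the function names, the `min_support` threshold, and the toy transactions are hypothetical and chosen only to illustrate the idea.

```python
from collections import Counter
from itertools import combinations


def count_co_changes(transactions):
    """Count how often each unordered pair of files changed in the same transaction."""
    pair_counts = Counter()
    for files in transactions:
        for a, b in combinations(sorted(set(files)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


def suggest_co_changes(changed_file, transactions, min_support=2):
    """Return files that co-changed with `changed_file` at least `min_support` times."""
    suggestions = []
    for (a, b), count in count_co_changes(transactions).items():
        if count >= min_support and changed_file in (a, b):
            other = b if a == changed_file else a
            suggestions.append((other, count))
    # Most frequent partners first; ties are broken alphabetically.
    return sorted(suggestions, key=lambda item: (-item[1], item[0]))


# Toy change history, made up for illustration only.
transactions = [
    ["parser.c", "parser.h", "lexer.c"],
    ["parser.c", "parser.h"],
    ["lexer.c", "README"],
    ["parser.c", "lexer.c"],
]

suggestions = suggest_co_changes("parser.c", transactions)
print(suggestions)
```

Raw co-change counts are only a starting point: a real tool would also have to reconstruct transactions from the repository and decide how much support is enough before interrupting a developer with a suggestion.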
Supporting software understanding (NETBSD)
Conceptual (proposed) Concrete (reality)
Why? Who? When? Where?
Mining supports software understanding (NETBSD)
• Eight unexpected dependencies
• All except two dependencies existed since day one:
– Virtual Address Maintenance → Pager
– Pager → Hardware Translations
Auto-generated from CVS repository
Which? vm_map_entry_create (in src/sys/vm/Attic/vm_map.c) depends on pager_map (in /src/sys/uvm/uvm_pager.c)
Who? cgd
When? 1993/04/09 15:54:59 Revision 1.2 of src/sys/vm/Attic/vm_map.c
Why?
from sean eric fagan: it seems to keep the vm system from deadlocking the system when it runs out of swap + physical memory. prevents the system from giving the last page(s) to anything but the referenced "processes" (especially important is the pager process, which should never have to wait for a free page).
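The Which/Who/When/Why record above can be thought of as the answer to one query: find the earliest revision in which the dependency appears, then read off its author, date, and log message. Below is a minimal sketch of that query. The `Revision` type, the function name, and the toy history are hypothetical, and a plain substring check stands in for real dependency extraction.

```python
from dataclasses import dataclass


@dataclass
class Revision:
    number: str
    author: str
    date: str
    log: str
    source: str  # full text of the file at this revision


def find_dependency_origin(revisions, symbol):
    """Return the earliest revision whose source mentions `symbol`, or None.

    `revisions` must be ordered oldest first.
    """
    for rev in revisions:
        if symbol in rev.source:
            return rev
    return None


# Toy history, made up for illustration only.
history = [
    Revision("1.1", "alice", "2001/01/01 10:00:00", "initial import",
             "int create(void) { return 0; }"),
    Revision("1.2", "bob", "2001/02/03 12:30:00", "avoid deadlock when memory runs out",
             "int create(void) { return use(shared_map); }"),
    Revision("1.3", "alice", "2001/03/05 09:15:00", "style cleanup",
             "int create(void)\n{ return use(shared_map); }"),
]

origin = find_dependency_origin(history, "shared_map")
if origin is not None:
    print("Who?", origin.author)
    print("When?", origin.date, "Revision", origin.number)
    print("Why?", origin.log)
```

The log message of the originating revision is what supplies the "Why?", which is why the quality of commit messages matters for this kind of analysis.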
Opportunities in the Road Ahead
Repository Extract Analyze Adopt Results
Show Value
• Going beyond code and bugs
• Taming the complexity of MSR
• Showing the value of repositories
• Easing the adoption of MSR
Going beyond code and bugs
MSR 2004-2008: ~80% of publications focus on code and bugs
• Explore non-structured data
  – Social aspects: emails and comments
• Link data between repos
• Seek non-traditional repos
  – Demonstrate the value of IDE interactions or build failures repos
• Understand the limitation of repos
  – Causation vs. Correlation
    • Small number of committers in OS projects
Taming the complexity of MSR
• Simplify the extraction of high quality data

V1: Undefined func. (Link Error)
main() {
  int a;
  /*call help*/
  helpInfo();
}

V2: Syntax error
helpInfo() {
  errorString!
}
main() {
  int a;
  /*call help*/
  helpInfo();
}

V3: Valid code
helpInfo() {
  int b;
}
main() {
  int a;
  /*call help*/
  helpInfo();
}
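The three versions above show why extraction from history is hard: intermediate versions may not link or even parse, so a compiler-based extractor would fail on V1 and V2. One possible workaround is a tolerant, pattern-based extractor. The sketch below is only an illustration of that idea and not the extractor discussed in the talk; the regular expressions and the function name `extract_facts` are assumptions, and the patterns handle only simple cases like the ones on the slide.

```python
import re

# Rough patterns: a name followed by "(...)" and "{" is treated as a definition,
# a name followed by "(...)" and ";" is treated as a call.
DEFINITION = re.compile(r"\b(\w+)\s*\([^)]*\)\s*\{")
CALL = re.compile(r"\b(\w+)\s*\([^)]*\)\s*;")
KEYWORDS = {"if", "for", "while", "switch", "return"}


def extract_facts(source):
    """Return (defined functions, called functions) without compiling the code."""
    defined = {name for name in DEFINITION.findall(source) if name not in KEYWORDS}
    called = {name for name in CALL.findall(source) if name not in KEYWORDS}
    return defined, called


# The three versions from the slide, written on one line each.
versions = {
    "V1": "main() { int a; /*call help*/ helpInfo(); }",
    "V2": "helpInfo() { errorString! } main() { int a; /*call help*/ helpInfo(); }",
    "V3": "helpInfo() { int b; } main() { int a; /*call help*/ helpInfo(); }",
}

facts = {name: extract_facts(src) for name, src in versions.items()}
undefined = {name: called - defined for name, (defined, called) in facts.items()}
print(undefined)
```

The extractor reports the undefined function in V1 and still recovers both definitions from V2 despite the syntax error. The price is precision: patterns like these are heuristics, which is one reason the slides ask for heuristics to be empirically verified.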
Taming the complexity of MSR
• Simplify the extraction of high quality data
  – Toolkits and extracted data (e.g. FLOSSMetrics) are needed
  – Heuristics should be empirically verified
  – Acknowledgement mechanism needed for extractors
• Deal with skew in repository data
  – Visualization can help spot skew
  – Guidelines and re-sampling/robust techniques are needed
• Improve the quality of repository data
  – Provide tools for annotation of repos data at creation
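As a small illustration of why skew calls for robust and re-sampling techniques, the sketch below compares the mean and the median of a skewed sample and computes a percentile bootstrap interval for the median. The data, the function name `bootstrap_median_interval`, and its parameters are assumptions made for this example, not something taken from the talk.

```python
import random
import statistics


def bootstrap_median_interval(values, n_resamples=2000, confidence=0.95, seed=0):
    """Percentile bootstrap interval for the median of `values`."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    medians = sorted(
        statistics.median(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    )
    lower_index = int((1 - confidence) / 2 * n_resamples)
    upper_index = int((1 + confidence) / 2 * n_resamples) - 1
    return medians[lower_index], medians[upper_index]


# Toy data, made up for illustration: changes per file, with one extreme file.
changes_per_file = [1, 1, 1, 2, 2, 2, 3, 3, 4, 5, 250]

mean_changes = statistics.mean(changes_per_file)
median_changes = statistics.median(changes_per_file)
low, high = bootstrap_median_interval(changes_per_file)

print("mean:", mean_changes)
print("median:", median_changes)
print("bootstrap interval for the median:", (low, high))
```

A single heavily changed file pulls the mean far above what a typical file looks like, while the median and its bootstrap interval stay close to the bulk of the data.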
Showing the value of MSR
• Understand the needs of practitioners
  – Predicting buggy modules:
    • Buggy modules are well-known
  – Predicting fault occurrences at module level is too coarse
• Study the performance in practice
  – Tools affecting the repos data
• Show the practical benefits
  – Statistical improvements not sufficient
  – Cost of maintenance should be evaluated
• Evaluate on non-open source systems
Easing the adoption of MSR
• Simplify access to techniques
  – Integration into IDEs (HATARI, Hipikat, Mylyn, eRose)
  – A web service demonstration for an open source project
    • A continuously updating MSR Challenge
• Help practitioners make decisions
  – MSR should aim to support, not replace, practitioners
Mining Software Repositories
http://msrconf.org