intensive metrics software evolution
DESCRIPTION
In natural sciences, intensive properties do not depend on the size of the system. These slides summarize how we have found intensive metrics for the case of open source software, and how to use these metrics to evaluate open source evolution. These slides were presented at MSR 2013. There is a preprint of the paper at http://oa.upm.es/14698/
TRANSCRIPT
Intensive Metrics for the Study of the Evolution of Open Source Projects: Case Studies from the ASF
Santiago Gala-Pérez (ASF), Gregorio Robles (URJC), Jesús M. González-Barahona (URJC), Israel Herraiz (UPM)
10th Working Conference on Mining Software Repositories
SF, California, May 18th, 2013
Preprint available at http://oa.upm.es/14698/
Slides at http://slideshare.net/herraiz/intensive-metrics-software-evolution
, Intensive metrics for open source evolution – http://oa.upm.es/14698/ 1/13
Metrics for Software Evolution
Common metrics are extensive
Difficult to compare projects of different size
Successful projects undergo large size changes over their lifetime
Intensive metrics in natural sciences
Metrics not depending on the size of system
Scale invariant
Is there any intensive metric for software?
Can we find intensive metrics to study software evolution?
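To make the extensive/intensive distinction concrete, the sketch below contrasts a total (extensive) with a ratio of two totals (a candidate intensive quantity). The function names and the sample numbers are hypothetical and only serve as an illustration; they are not taken from the paper.

```python
def total(values):
    """Extensive quantity: grows with the number of parts in the system."""
    return sum(values)


def ratio(numerators, denominators):
    """Candidate intensive quantity: a ratio of two extensive totals."""
    return total(numerators) / total(denominators)


# Hypothetical system made of two parts, then the same system "doubled".
a = [40, 60]
b = [10, 10]
a_doubled = a * 2  # list repetition: twice as many parts
b_doubled = b * 2

extensive_small = total(b)
extensive_large = total(b_doubled)      # doubles with the system
intensive_small = ratio(a, b)
intensive_large = ratio(a_doubled, b_doubled)  # unchanged by the doubling
```

Doubling the system doubles the total but leaves the ratio untouched, which is the scale-invariance property the slides look for in a software metric.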
The case of the Apache Software Foundation
ASF members mailing list, November 29 2008
Joe Schaeffer says something IMO interesting about the ASF: the fact that the number of commits and the number of mailing list posts have grown in linear relationship [...] over the years.
Goal of the paper
Ratio Communication flow / development activity
Hypothesis: the ratio is an intensive metric for software evolution
It varies with
Maturity, technology, community composition
But not with project source code size
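A minimal sketch of how such a ratio could be computed, assuming that communication flow is measured as mailing-list messages and development activity as commits, both bucketed per calendar month. The monthly granularity, the function name `monthly_ratio`, and the sample dates are assumptions made for this illustration, not details given in the slides.

```python
from collections import Counter
from datetime import date


def monthly_ratio(message_dates, commit_dates):
    """Map each (year, month) with any activity to messages / commits."""
    messages = Counter((d.year, d.month) for d in message_dates)
    commits = Counter((d.year, d.month) for d in commit_dates)
    ratios = {}
    for month in sorted(set(messages) | set(commits)):
        n_commits = commits[month]
        # A month with messages but no commits has an unbounded ratio.
        ratios[month] = messages[month] / n_commits if n_commits else float("inf")
    return ratios


# Hypothetical data: 6 messages and 2 commits in January, 4 messages
# and no commits in February.
message_dates = [date(2013, 1, 5)] * 6 + [date(2013, 2, 10)] * 4
commit_dates = [date(2013, 1, 7), date(2013, 1, 20)]
ratios = monthly_ratio(message_dates, commit_dates)
```

With this definition, a month without commits yields an infinite ratio rather than an error, so such months remain visible in the series.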
Case study: the ASF
Broad and diverse range of projects
Size, scope, technology, maturity
If it didn’t happen on-list, it didn’t happen
Communications between developers (decisions)
Issue trackers
Code review tools, automated builds, wiki page edits
Commits
ASF projects under study
Project      | kSLOC | Technology         | Maturity           | Scope
HTTPD        |   156 | Web server         | Active, long-lived | Users
APR          |    66 | Library            | Active, long-lived | Devs
Lucene       |   414 | Index & search     | Active, long-lived | Users
Turbine      |    41 | Java web fwork     | Stagnated          | Devs
Tomcat       |   213 | Servlet API        | Active, long-lived | Devs
Jackrabbit   |   344 | JSR-170 ref. impl. | Active             | Devs
Hadoop       |  1270 | Big Data           | Very active        | Devs
Geronimo     |   370 | JavaEE app. srv.   | Active, long-lived | Devs
SpamAssassin |    54 | Spam filter        | Mature             | End users
Portals      |   202 | Web fwork          | Nearly dead        | Devs
Beehive      |    88 | J2EE Struts        | Attic              | Devs
Ratio
How does the ratio evolve for these projects?
Apache httpd
156 kSLOC, active and long-lived web server
Apache Portable Runtime (APR)
66 kSLOC, active and long-lived library used by httpd and Subversion
Apache Hadoop
1270 kSLOC, very active development and community, higher presence of non-human emails
Apache SpamAssassin
54 kSLOC, spam filter, intended for end users, maturing project
Apache Beehive
88 kSLOC, project in the Attic (no longer under development)
Overall comparison
Allows for comparison of projects with large differences in size, scope, technology, maturity
Overall comparison
Lessons learned
Healthy Apache projects have smooth ratios
Projects with little activity, or small core group, are noisier
Peaks to infinity are evidence of stagnation
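These observations could be checked on a ratio series along the following lines. The slides do not define smoothness quantitatively, so the coefficient of variation used here as a noisiness proxy is an assumption of this sketch, as are the function name `summarize_ratio_series` and the sample values.

```python
import math
from statistics import mean, pstdev


def summarize_ratio_series(ratios):
    """Count peaks to infinity and estimate how noisy the finite part is."""
    finite = [r for r in ratios if math.isfinite(r)]
    peak_periods = len(ratios) - len(finite)  # periods with an unbounded ratio
    # Coefficient of variation of the finite values: an assumed proxy
    # for noisiness, where lower values mean a smoother series.
    if len(finite) >= 2 and mean(finite) > 0:
        noisiness = pstdev(finite) / mean(finite)
    else:
        noisiness = None  # too little data to say anything
    return {"peak_periods": peak_periods, "noisiness": noisiness}


# Hypothetical series: two steady periods followed by one without commits.
summary = summarize_ratio_series([4.0, 4.0, float("inf")])
```

A growing `peak_periods` count would point to periods with discussion but no commits, which is the stagnation signal described above.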
User-oriented projects
Evolution:
Starts with high values
Stabilizes and matures with 3 < ratio < 8
Developer-oriented projects
Evolution:
Smaller community, no peaks
Always within 3 < ratio < 8
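As a rough check of the band reported above, one could measure which share of a project's periods falls strictly between 3 and 8. The function name `fraction_in_band` and the sample series are hypothetical; only the two bounds come from the slides.

```python
def fraction_in_band(ratios, low=3.0, high=8.0):
    """Share of periods whose ratio lies strictly between low and high."""
    if not ratios:
        return 0.0
    inside = sum(1 for r in ratios if low < r < high)
    return inside / len(ratios)


# Hypothetical series: two of the four periods fall inside the band.
share = fraction_in_band([2.0, 4.0, 5.0, 9.0])
```

Infinite ratios fail the comparison and therefore count as outside the band, so stagnating periods lower the share.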
Conclusions and further work
Metric
Intensive and expressive metric. Not depending on size, maturity, scope or technology.
End-users
More suitable for user-oriented projects. Ratio works better with large and active communities.
Stagnation
Can identify stagnated projects. Can signal potential stagnation threats.
Other ratios, other cases
Devel-only messages, issues, commit complexity. Study beyond the ASF.
Get a preprint of the paper at http://oa.upm.es/14698
Replication package: http://gsyc.es/~grex/repro/2013-apache-intensive/