Creating and Analyzing Source Code Repository Models - A Model-based Approach to Mining Software Repositories

Model-based Analysis of Source Code Repositories
Markus Scheidgen, Martin Schmidt, Joachim Fischer
Humboldt Universität zu Berlin


Post on 11-Apr-2017



Agenda

▶ Software Evolution, Reverse Engineering, and Mining Software Repositories

▶ Model-based Mining of Software Repositories

▶ srcrepo: a framework for model-based analysis of software repositories

▶ Experiments with Eclipse’s software repositories

▶ Conclusions

Software Evolution

[Diagram: software engineering (user problem, requirements, models M, code {C}, system S, user); software maintenance over the repository revisions R0 … RHEAD; software modernization via reverse engineering, transformation, and forward engineering]

▶ Mining software repositories [1]

■ quality assessment

■ implicit dependencies

▶ Software modernization: OMG's ADM (Architecture-Driven Modernization)

■ AST Meta-Model (ASTM)

■ Knowledge Discovery Meta-Model (KDM)

■ Software Metrics Meta-Model (SMM)

1. H. Kagdi, M.L. Collard, J.I. Maletic: A survey and taxonomy of approaches for mining software repositories in the context of software evolution; Journal of Software Maintenance and Evolution: Research and Practice; Vol.19/Nr.2/2007

Mining Software Repositories – In General

▶ The term mining software repositories (MSR) has been coined to describe a broad class of investigations into the examination of software repositories.

▶ The premise of MSR is that empirical and systematic investigations of repositories will shed new light on the process of software evolution. [1]

▶ Different scopes, e.g. single software projects vs. many software projects

▶ Different goals, e.g. quality assessments and implicit dependencies vs. generalizations about software evolution

1. H. Kagdi, M.L. Collard, J.I. Maletic: A survey and taxonomy of approaches for mining software repositories in the context of software evolution; Journal of Software Maintenance and Evolution: Research and Practice; Vol.19/Nr.2/2007

Model-based Mining of Software Repositories

[Diagram: reverse engineering from the code {C} of a system S to models M, repeated for each repository revision R0 … RHEAD with the code states {C0} … {Cn-1}, {Cn}]

1. E.J. Chikofsky, J.H. Cross: Reverse engineering and design recovery: A taxonomy; IEEE Software; Vol.7/Nr.1/1990

2. R. Lincke, J. Lundberg, W. Löwe: Comparing Software Metrics Tools; 8th International Symp. on Software Testing and Analysis; 2008

Model-based Mining of Software Repositories

▶ Scope

■ depends on the concrete MSR application and its goals

■ number of software projects: single repositories, large repositories, ultra-large repositories

■ Sources as text and text-based metrics, e.g. LOC

■ Declarations only: packages, classes, methods, but no statements, expressions, etc.

■ Full AST with or without cross-references

Model-based Mining of Software Repositories

▶ MSR tools are already “model-based”, but in a proprietary manner

▶ Idea: use an existing reverse engineering framework with its standard meta-models and modeling frameworks instead of proprietary solutions

▶ Goals

■ deal with heterogeneity (different version control systems, different languages)

■ reuse of existing meta-models, transformations, and languages

■ interoperability with existing analysis tools

■ retaining meaningful scalability

Model-based MSR Strategies

[Diagram: the version control system holds the changes A1, B1, A1-A2, B1-B3. Checkout(r) produces one snapshot per revision r (A1 B1; A2 B1; A2 B3). Parsing a snapshot, $\sum_{d \in \mathit{CUs}(r)} \mathit{Parse}(d)$, gives a snapshot model, and Analysis(r) yields M1, M2, M3. Later builds of the slide add parsing of changed compilation units only, $\sum_{d \in \Delta\mathit{CUs}(r)} \mathit{Parse}(d)$, followed by Merge(r) (diagram labels: f, B.f), then Save(r) and Load(r), and finally Analysis′(r).]

$$\sum_{R} \mathit{Checkout} + \sum_{\mathit{CUs}} \mathit{Parse} + \mathit{Analysis}$$

$$\rightarrow \sum_{R} \mathit{Checkout} + \sum_{\Delta\mathit{CUs}} (\mathit{Parse} + \mathit{Merge}) + \mathit{Analysis}$$

$$\rightarrow \sum_{R} \sum_{\Delta\mathit{CUs}} (\mathit{Load} + \mathit{Merge}) + \mathit{Analysis}$$

$$\rightarrow \sum_{R} \sum_{\Delta\mathit{CUs}} (\mathit{Load} + \mathit{Analysis}')$$

Research Questions

▶ Assumptions

■ Development of MSR-applications based on models, transformation languages and standardized meta-models is favorable

■ Some MSR-applications need to analyze source code on a deep (AST) level

■ MSR-analysis is performed iteratively

▶ Hypotheses

■ Models of source code repositories can be created and persisted

■ Traversing existing persistent models of source code repositories is much faster than traversing transient models that are created from the version control system on the fly
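The second hypothesis can be made concrete with a toy cost model. The sketch below is an illustration only: the function names, the per-step costs, and the revision data are made up, and the four functions follow one reading of the strategies compared in this talk (checkout and parse everything; checkout and parse only the changed compilation units, then merge; load changed units from a persisted model and merge; load changed units and analyze them incrementally).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StepCosts:
    """Hypothetical per-step costs in milliseconds (not measured values)."""
    checkout: int       # per revision
    parse: int          # per compilation unit (CU)
    merge: int          # per changed CU
    load: int           # per changed CU, read from a persisted model
    analysis: int       # full analysis, per revision
    analysis_incr: int  # incremental analysis, per changed CU


# A revision is modelled as a pair (total CUs, changed CUs).

def cost_full_parse(c, revisions):
    """Checkout, parse every CU, analyze: repeated for each revision."""
    return sum(c.checkout + total * c.parse + c.analysis
               for total, _ in revisions)


def cost_incremental_parse(c, revisions):
    """Checkout, parse and merge only the changed CUs, then analyze."""
    return sum(c.checkout + changed * (c.parse + c.merge) + c.analysis
               for _, changed in revisions)


def cost_persisted(c, revisions):
    """Load and merge changed CUs from a persisted model, then analyze."""
    return sum(changed * (c.load + c.merge) + c.analysis
               for _, changed in revisions)


def cost_incremental_analysis(c, revisions):
    """Load changed CUs and run an incremental analysis on each of them."""
    return sum(changed * (c.load + c.analysis_incr)
               for _, changed in revisions)


costs = StepCosts(checkout=10, parse=4, merge=1, load=1,
                  analysis=5, analysis_incr=1)
# The first revision changes every CU; later revisions touch only a few.
history = [(100, 100), (100, 3), (101, 2)]

for strategy in (cost_full_parse, cost_incremental_parse,
                 cost_persisted, cost_incremental_analysis):
    print(strategy.__name__, strategy(costs, history))
```

With these made-up numbers each strategy is cheaper than the one before it, but the ranking depends entirely on the assumed costs; the measured times reported later in the talk are what actually bear on the hypothesis.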

srcrepo – A Framework for Model-based MSR

▶ Eclipse’s MoDisco as reverse engineering framework

■ reverse engineering for Java, based on EMF

■ Support for many JRE-based languages: Java, Xtext, JSP, XML

■ creates instances of a Java EMF meta-model that corresponds to the handwritten JDT AST-model

■ provides transformation to language independent artifacts, e.g. KDM

▶ EMF-Fragments [1] to store very large models

■ uses NoSQL databases and stores larger model fragments within database entries

■ in contrast to object-by-object stores such as ORM-based CDO or NoSQL-based Morsa or Neo4j

▶ Xtend programming with higher-order functions to mimic OCL-style definitions of software metrics [2]

1. M. Scheidgen, A. Zubow, J. Fischer, T.H. Kolbe: Automated and Transparent Model Fragmentation for Persisting Large Models; ACM/IEEE 15th International Conference on Model Driven Engineering Languages & Systems (MODELS); Innsbruck; 2012

2. M. Scheidgen, J. Fischer: Model-based Mining of Software Repositories; 8th Systems and Modeling Conference, Valencia, Spain, September 29th, 2014

“OCL” to Calculate Metrics of AST-Models

// Weighted number of methods per class.
def wmc(AbstractTypeDeclaration type, (Block)=>int weight) {
  type.bodyDeclarations.sum[
    if (body != null) weight.apply(it.body) else 1
  ]
}
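For readers who do not use Xtend, the same metric can be sketched in Python. The classes `Block`, `Method`, and `TypeDecl` are hypothetical stand-ins for the MoDisco Java meta-model, and `statement_count` is just one possible weight function; none of this is srcrepo's API.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Union


@dataclass
class Block:
    # A statement is either a nested Block or an opaque statement (a string).
    statements: List[Union["Block", str]] = field(default_factory=list)


@dataclass
class Method:
    name: str
    body: Optional[Block] = None  # None for abstract or interface methods


@dataclass
class TypeDecl:
    name: str
    body_declarations: List[Method] = field(default_factory=list)


def wmc(type_decl: TypeDecl, weight: Callable[[Block], int]) -> int:
    """Weighted methods per class: sum a weight over all method bodies.

    Methods without a body count as 1, mirroring the Xtend definition.
    """
    return sum(weight(m.body) if m.body is not None else 1
               for m in type_decl.body_declarations)


def statement_count(block: Block) -> int:
    """Example weight: number of non-block statements, counted recursively."""
    return sum(statement_count(s) if isinstance(s, Block) else 1
               for s in block.statements)


example = TypeDecl("A", [
    Method("f", Block(["a = 1", "b = 2", Block(["return a + b"])])),
    Method("g"),             # no body, counts as 1
    Method("h", Block([])),  # empty body, weight 0
])

print(wmc(example, statement_count))  # weight by statement count
print(wmc(example, lambda body: 1))   # plain number of methods
```

Passing the weight as a function keeps the traversal separate from the metric, so a Halstead-style weight could be swapped in without touching `wmc`.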

Experiments

▶ Eclipse Foundation sources, i.e. Eclipse platform and plug-ins (large scale software repository)

▶ Organized into a couple hundred projects: jdt, cdt, emf, ...

▶ Available via GitHub

▶ Git repositories can be gathered automatically via GitHub’s RESTful API

▶ 200 largest Eclipse repositories that actually contained Java code: 6.6 GB Git, 400 MLOC, 250 GB model with 4 billion objects.
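How the 200 largest Java repositories were selected is not stated here. One plausible approach, sketched below, filters a repository listing as returned by GitHub's REST API by its `language` field and sorts it by its `size` field (in kilobytes). The helper name and the sample records are hypothetical, and `language` only reports a repository's primary language, so this merely approximates "actually contained Java code".

```python
def largest_java_repos(repos, limit=200):
    """Return clone URLs of the `limit` largest repositories marked as Java.

    `repos` is a list of dicts shaped like the repository objects in a
    GitHub REST API listing (only `language`, `size`, `clone_url` are used).
    """
    java = [r for r in repos if r.get("language") == "Java"]
    java.sort(key=lambda r: r.get("size", 0), reverse=True)
    return [r["clone_url"] for r in java[:limit]]


# Made-up sample records; real listings would be fetched page by page.
sample = [
    {"name": "docs", "language": None, "size": 900,
     "clone_url": "https://example.org/docs.git"},
    {"name": "small", "language": "Java", "size": 120,
     "clone_url": "https://example.org/small.git"},
    {"name": "native", "language": "C", "size": 5000,
     "clone_url": "https://example.org/native.git"},
    {"name": "big", "language": "Java", "size": 48000,
     "clone_url": "https://example.org/big.git"},
]

print(largest_java_repos(sample, limit=2))
```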

Example Plot: Halstead-length for each Revision; Eclipse CDT

[Plot: WMC with Halstead length (x 10^6, 0 to 10) over time (years), 2004 to 2016]

Model Creation vs. Analysis Times

[Bar chart: total time (hours, 0 to 14) per repository (jdt.ui, xtext, eclipselink, jdt.core, swt, cdt, ocl, ptp, org.aspectj, cdo), split into checkout, parse, save, load, merge/increment, and udf]

[Bar chart: avg. time per revision (ms, 0 to 800) for checkout, parse, save, load, merge/increment, and udf]

Diskspace

[Bar chart: Git repository vs. model size in GB (0 to 15) for jdt.core, cdt, jdt.ui, cdo, emf.emfstore.core, emf, jdt.debug, emf.compare, emf.texo, and emf.diffmerge.core; series: Git size, model size]

Delta-Compression [1]

[Bar chart "Named element matching": size in GB (0 to 14) for cdt, webtools, gmf-tooling, rap, jdt.core; series: delta-models, initial revisions, uncompressed]

[Bar chart "Meta-class matching": size in GB (0 to 14) for the same repositories; series: delta-models, initial revisions, uncompressed]

[Bar chart "Line matching": size in MLines (0 to 100) for the same repositories; series: delta-lines, initial revisions, uncompressed]

[Bar chart "Compressed relative to full size (%)" (0 to 80): initial revisions, delta models for meta-class, delta lines, delta models for named elements]

[Bar charts "Avg. execution times" (linear, 0 to 1500) and "Avg. execution times (logarithmic)" (10 to 1000): avg. time per revision (ms) for parse, compress named elements, compress meta-class, decompr. meta-class, decompr. named elements]

1. M. Scheidgen: Evaluation of Model Comparison for Delta-Compression in Model Persistence; BigMDE 2016 (at STAF 2016), Vienna, Austria, July 6-7, 2016
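To illustrate what matching by named elements means for delta-compression, here is a minimal sketch. It assumes that each revision of a model can be flattened into a mapping from a qualified element name to some content value; the function names and the sample data are hypothetical, and the comparison used in the cited evaluation operates on EMF models rather than dictionaries.

```python
def delta_by_name(old, new):
    """Match elements of two revisions by qualified name and keep the changes."""
    return {
        "added": {n: new[n] for n in new.keys() - old.keys()},
        "removed": sorted(old.keys() - new.keys()),
        "changed": {n: new[n] for n in old.keys() & new.keys()
                    if old[n] != new[n]},
    }


def apply_delta(old, delta):
    """Rebuild the newer revision from the older one plus its delta."""
    result = {n: v for n, v in old.items() if n not in delta["removed"]}
    result.update(delta["added"])
    result.update(delta["changed"])
    return result


rev1 = {"pkg.A": "class A v1", "pkg.A.f": "int f", "pkg.B": "class B v1"}
rev2 = {"pkg.A": "class A v2", "pkg.A.f": "int f", "pkg.C": "class C v1"}

delta = delta_by_name(rev1, rev2)
print(delta)
print(apply_delta(rev1, delta) == rev2)
```

Only the initial revision is stored in full; every later revision is stored as a delta, and unchanged elements such as `pkg.A.f` cost nothing. Decompression is then a chain of `apply_delta` calls, which is the trade-off between disk space and execution time that the charts on this slide compare.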

Model Creation vs. Analysis with Delta-Compression

[Bar charts: total time (hours, 0 to 14) per repository (jdt.ui, xtext, eclipselink, jdt.core, swt, cdt, ocl, ptp, org.aspectj, cdo); without compression split into checkout, parse, save, load, merge/increment, and udf; with compression split into checkout, parse, compress, save, load, merge/increment, and udf]

Conclusions

▶ MSR can support software evolution and help to understand it

▶ Traversing a source code repository to gather information (MSR) is very time-consuming, especially with iterative analysis

▶ Most of this time can be saved by persisting the data as models, at the cost of comparatively large models that need to be stored

▶ The MSR analysis execution time savings are considerable
