assessing the use of eclipse mde technologies in open-source software projects

18
1 Assessing the Use of Eclipse MDE Technologies in Open-Source Software Projects D. Kolovos 1 , N. Matragkas 2 , I. Korkontzelos 3 , S. Ananiadou 3 , and R. Paige 1 1 University of York, 2 University of Hull 3 University of Manchester

Upload: dimitris-kolovos

Post on 16-Apr-2017

1.528 views

Category:

Software


0 download

TRANSCRIPT

Page 1: Assessing the Use of Eclipse MDE Technologies in Open-Source Software Projects

1

Assessing the Use of Eclipse MDE Technologies in Open-Source Software Projects

D. Kolovos1, N. Matragkas2, I. Korkontzelos3, S. Ananiadou3, and R. Paige1

1University of York, 2University of Hull

3University of Manchester

Page 2: Assessing the Use of Eclipse MDE Technologies in Open-Source Software Projects

2

Page 3: Assessing the Use of Eclipse MDE Technologies in Open-Source Software Projects

3 Questions

•  Which of these technologies demonstrate significant adoption in open-source projects?

•  Is the activity related to these technologies increasing or decreasing over time?

•  What is the size of the community of GitHub developers who are familiar with these technologies?

Page 4: Assessing the Use of Eclipse MDE Technologies in Open-Source Software Projects

4 Methodology

•  Identify concrete technologies of interest •  Compose GitHub queries based on file

extensions and keywords •  Filter out irrelevant results – e.g file extension clashes

•  Retrieve all commits involving files of interest

•  Unify data and analyse

Page 5: Assessing the Use of Eclipse MDE Technologies in Open-Source Software Projects

5 Technologies

Eclipse MDE Technologies on GitHub 3

– Henshin (http://www.eclipse.org/henshin/): EMF-based graph transformation lan-guage.

Projects that were initially considered but eventually left out of the studyafter a preliminary evaluation phase included VIATRA2 (http://wiki.eclipse.org/VIATRA2) (very few search results on GitHub, replaced by a new version whichis currently under heavy development), CDO (http://eclipse.org/cdo/) (very fewresults outside of the tool’s own repository), QVTd (currently, only an incom-plete prototype is provided by the respective Eclipse project (https://projects.eclipse.org/projects/modeling.mmt.qvtd), and UML/Papyrus (UML models can be usedin both MDE and non-MDE projects - i.e. exclusively for documentation andcommunication purposes). Including a wider range of both Eclipse-based (e.g.Spoofax (http://strategoxt.org/Spoofax)) and non-Eclipse based (e.g. MetaEdit+ (http://www.metacase.com/), JetBrains MPS (http://www.jetbrains.com/mps/)) technologies andtools would clearly be an interesting direction for future work.

2.2 Mining GitHub for Files of Interest

Having identified the technologies of interest, the next step was to use GitHub’ssearch capabilities to find files strongly related to each technology (e.g. fileswith a .atl extension for ATL). GitHub’s search interface allows searching forfiles with specific extensions (across all public GitHub repositories), however italso requires a search term (that must appear in the content of the retrievedfiles) to also be included in the query. After manual experimentation we iden-tified the extensions and search terms in Table 1, which provided the largestnumber of relevant results for each technology. We also manually sampled thereturned results to ensure that no irrelevant files were returned. This was the

Table 1: File extensions and search terms per technology

Techn. Ext. Search Term Techn. Ext. Search Term

ATL atl rule QVTo qvto transformationEmfatic emf class Acceleo mtl templateEGL* egl var IncQuery eiq patternEugenia* ecore ”gmf.diagram” GMF gmfgraph figureEOL* eol var Xtext xtext grammarETL* etl transform Ecore ecore EClassEVL* evl context OCL ocl contextSirius odesign node Henshin henshin ruleMOFScript m2t texttransformation Kermeta kmt classXcore xcore class JET javajet jetEMFText cs syntaxdef Xpand xpt for

case for all technologies except EGL and Acceleo. Files with a .egl extensionthat contain the search term var can be either Epsilon Generation Language

Page 6: Assessing the Use of Eclipse MDE Technologies in Open-Source Software Projects

6 1000 Results/Query L

•  GitHub returns the number of files that meet the query's conditions – … but only details about 1,000 of them

•  Repository •  Path in the repository •  Snippet

•  In the first round we collected all repositories we could and we queried them again

Page 7: Assessing the Use of Eclipse MDE Technologies in Open-Source Software Projects

7 Search Results

4 Kolovos et. al.

model-to-text transformation templates or IBM’s Enterprise Generation Lan-guage (http://www.eclipse.org/edt/) programs. The latter are not of interest to thisstudy and were filtered out by also requiring that the returned snippet containsthe [% characters, which are used by Epsilon’s EGL to denote dynamic tem-plate sections. Similarly, files with a .mtl extension containing the search termtemplate can be Acceleo templates or Blender 3D material files. The latter areof no interest to this study and had to be filtered out in a similar fashion.

It is worth noting that GitHub’s search interface returns the number of filesthat meet the query’s conditions but only the details (i.e. repository, path inthe repository, snippet) of up to 1,000 files returned by the query (for example,the query for Ecore files (http://github.com/search?q=EClass+extension:ecore&type=Code), re-turns 15,038 files but the search interface allows only for the details of 1,000 ofthem to be retrieved). To retrieve details for additional files of interest, we col-lected all repositories returned by a first round of queries, and we queried theserepositories again for all file extensions of interest. The number of search resultsand the number of actual files that we managed to retrieve the details of aredisplayed in Table 2. This endeavour revealed that GitHub’s search interface

Table 2: Number of search results and retrieved files per technology

Technology Search

Results

Retrieved

Files

Technology Search

Results

Retrieved

Files

Ecore 15038 9629 QVTo 1247 1396Xpand 7005 4974 GMF 1138 1100Acceleo 6091 3158 Emfatic 525 510JET 5323 3682 Sirius 428 503OCL 3131 2418 EMFText 375 365Xtext 2272 1605 Kermeta 339 343ATL 1404 1481 Xcore 305 313Epsilon 1352 1765 IncQuery 187 192Henshin 1281 1142 MOFScript 34 34

is not particularly reliable. For example, after the dataset expansion phase wehave collected details for 1481 ATL files, 77 more than the number returnedby GitHub’s search interface in our first search (1404)). Similar discrepancieshave been observed for other technologies including Sirius, IncQuery, Kermeta,and Epsilon – however the scale of these discrepancies is not prohibitive for theextraction of useful insights. There are also technologies where we were able tocollect the details of a smaller number of files than the number returned byGitHub’s search interface – however, this is expected due to GitHub’s 1000-results-per-query limitation discussed above.

2.3 Collecting Commits

Having collected the details (i.e. repository, path and snippet) of files of interest,the next step was to use GitHub’s API to collect additional information about

Page 8: Assessing the Use of Eclipse MDE Technologies in Open-Source Software Projects

8 Commit Collection

•  Collected all commits for all files of interest – Commit time – Developer and committer ids/email

addresses •  Grouped them by developer – Used the committers' email addresses as IDs

Page 9: Assessing the Use of Eclipse MDE Technologies in Open-Source Software Projects

9 Unification

Eclipse MDE Technologies on GitHub 5

the commits in which each of these files was involved since its creation. Each“commit” API response contains information about the committer and the au-

thor of the commit (Git distinguishes between the author of a commit – i.e. theperson that did the job – and the person that performed the commit in the repos-itory). For each of these roles, the response contains the real name and emailaddress of the person, (optionally) their GitHub ID, and the commit date. Sincenot all commits have associated GitHub IDs (in fact, from the 74K commits wecollected in this study, more than 31K are not associated with GitHub IDs), inthe remainder of this work, we identify developers using their email addresses(as opposed to their GitHub ID).

2.4 Unification

The last step of the process involved unifying the collected information in theform of a model that conforms to the Ecore metamodel illustrated in Figure 1.In its current state, the model contains data about 1,928 repositories, 34,442 filesand 74,403 commits. We have made the model and its metamodel available underhttp: // github. com/ kolovos/ datasets/ tree/ master/ github-mde so that the results presentedin the next section can be independently reproduced and verified.

Fig. 1: The Unification Metamodel

3 Data Analysis and Results

Having collected and unified the data of interest in the form of a model con-forming to the Ecore metamodel of Figure 1, our next step was to analyse thedata and obtain insights related to the use of the MDE technologies involved inour study, both individually and as a whole.

3.1 Number of Relevant Files per Technology

We first report on the number of relevant files found across GitHub. Figure 2provides an overview of the obtained measurements. We can observe the domi-nance of Ecore which accounts for more than twice the files compared to the next

Metamodel and data available under http://github.com/kolovos/datasets/tree/master/github-mde

Page 10: Assessing the Use of Eclipse MDE Technologies in Open-Source Software Projects

10

6 Kolovos et. al.

technology (Xpand). We also observe that files related to model-to-text trans-formation languages (Xpand, Acceleo and JET) appear much more frequentlycompared to files related to model-to-model transformation (ATL, QVTo, Hen-shin, Kermeta) and model query (OCL, IncQuery) languages.

Ecore

(15038)

Xpa

nd(7005)

Acceleo

(6091)

JET

(5323)

OCL(3131)

Xtext

(2272)

Epsilon

(1765)

ATL(1481)

QVTo(1396)

Henshin

(1281)

GMF(1138)

Emfatic(525)

Sirius

(503)

EMFText(375)

Kermeta(343)

Xcore

(313)

IncQ

uery

(192)

MOFScript

(34)

0

5,000

10,000

15,000Num

ber

ofFiles

on

GitHub

Fig. 2: Number of files per technology

Ecore

(1342)

Xtext

(660)

Xpa

nd(231)

GMF(228)

Acceleo

(216)

JET

(164)

ATL(151)

Epsilon

(147)

Emfatic(104)

Xcore

(102)

QVTo(76)

Sirius

(75)

EMFText(58)

IncQ

uery

(45)

OCL(42)

Kermeta(19)

Henshin

(13)

MOFScript

(7)

0

500

1,000

1,500

Estim

ated

num

ber

ofG

itHub

Repositorie

s

Fig. 3: Number of repositories per tech-nology

3.2 Number of Repositories per Technology

Conceivably, a small number of repositories that contain a large number of filesrelated to a particular technology could lead to misleading interpretations of thechart in Figure 2. Therefore, in Figure 3 we present the estimated number ofrepositories in which files related to each technology appear.

To compute these estimates, we accept that the files related to each tech-nology for which we have been able to retrieve details comprise a representativesample of the entire population of files related to this technology. Consequently,we can assume that the ratio of repositories over files in the sample generalisesover the entire population. To estimate the error in this computation, we usedbinomial proportion confidence intervals.

We modelled the occurrence of a new, previously unseen repository in eachof the files in the sample as a Bernoulli trial, since files are independent as farthe repository that they belong to is concerned. We computed normal approxi-mation (Wald), exact binomial (Clopper-Pearson), Wilson score, adjusted Wald(Agresti-Coull) and Je↵reys intervals for 90%, 95% and 99% confidence levels.We observed that in all cases, normal approximation, exact binomial, Wilsonscore and Je↵reys intervals coincide or exhibit a di↵erence equal or less than1%. Figure 3 displays the normal (Wald) approximation intervals for 95% con-fidence level.

While Ecore continues to dominate the chart, there are a few interestingdi↵erences. First, Xtext and GMF (6th and 11th in Figure 2), climb up to 2nd

Page 11: Assessing the Use of Eclipse MDE Technologies in Open-Source Software Projects

11

6 Kolovos et. al.

technology (Xpand). We also observe that files related to model-to-text trans-formation languages (Xpand, Acceleo and JET) appear much more frequentlycompared to files related to model-to-model transformation (ATL, QVTo, Hen-shin, Kermeta) and model query (OCL, IncQuery) languages.

Ecore

(15038)

Xpa

nd(7005)

Acceleo

(6091)

JET

(5323)

OCL(3131)

Xtext

(2272)

Epsilon

(1765)

ATL(1481)

QVTo(1396)

Henshin

(1281)

GMF(1138)

Emfatic(525)

Sirius

(503)

EMFText(375)

Kermeta(343)

Xcore

(313)

IncQ

uery

(192)

MOFScript

(34)

0

5,000

10,000

15,000

Num

ber

ofFiles

on

GitHub

Fig. 2: Number of files per technology

Ecore

(1342)

Xtext

(660)

Xpa

nd(231)

GMF(228)

Acceleo

(216)

JET

(164)

ATL(151)

Epsilon

(147)

Emfatic(104)

Xcore

(102)

QVTo(76)

Sirius

(75)

EMFText(58)

IncQ

uery

(45)

OCL(42)

Kermeta(19)

Henshin

(13)

MOFScript

(7)

0

500

1,000

1,500

Estim

ated

num

ber

ofG

itHub

Repositorie

s

Fig. 3: Number of repositories per tech-nology

3.2 Number of Repositories per Technology

Conceivably, a small number of repositories that contain a large number of filesrelated to a particular technology could lead to misleading interpretations of thechart in Figure 2. Therefore, in Figure 3 we present the estimated number ofrepositories in which files related to each technology appear.

To compute these estimates, we accept that the files related to each tech-nology for which we have been able to retrieve details comprise a representativesample of the entire population of files related to this technology. Consequently,we can assume that the ratio of repositories over files in the sample generalisesover the entire population. To estimate the error in this computation, we usedbinomial proportion confidence intervals.

We modelled the occurrence of a new, previously unseen repository in eachof the files in the sample as a Bernoulli trial, since files are independent as farthe repository that they belong to is concerned. We computed normal approxi-mation (Wald), exact binomial (Clopper-Pearson), Wilson score, adjusted Wald(Agresti-Coull) and Je↵reys intervals for 90%, 95% and 99% confidence levels.We observed that in all cases, normal approximation, exact binomial, Wilsonscore and Je↵reys intervals coincide or exhibit a di↵erence equal or less than1%. Figure 3 displays the normal (Wald) approximation intervals for 95% con-fidence level.

While Ecore continues to dominate the chart, there are a few interestingdi↵erences. First, Xtext and GMF (6th and 11th in Figure 2), climb up to 2nd

Page 12: Assessing the Use of Eclipse MDE Technologies in Open-Source Software Projects

12

Eclipse MDE Technologies on GitHub 7

and 4th places in this chart. This is expected as model-to-text transformationstypically consist of many files (templates), while graphical and textual syntaxesare often limited to one file per language. In the same spirit, while model-to-texttransformation languages still appear to be more popular compared to M2Mlanguages, the gap is substantially smaller in this chart.

We also observe that OCL and Henshin (5th and 9th in Figure 2), slip downto the 15th and 17th position in this chart. This suggests that there are a fewrepositories that contain a large number of OCL constraints and Henshin trans-formations. Further analysis revealed that for OCL, two repositories (dresden-ocl/dresdenocl and regressiontesttool/rtt) account for nearly 50% of the OCLfiles on GitHub. Similarly, a single repository (CoWolf/CoWolf ) accounts formore than 80% of all Henshin files on GitHub.

3.3 Number of MDE-literate Developers

Overall, we have identified 2,195 developers (developers with unique email ad-dresses) who are responsible for at least one commit related to files of interestacross all technologies included in this study. The breakdown for each technol-ogy is presented in Figure 4. Compared to Figure 3, we don’t observe any majorchanges in the relative order of technologies. Estimates and confidence intervalswere computed in the same way used in Figure 3.

Ecore

(1481)

Xtext

(646)

Xpa

nd(308)

GMF(285)

Acceleo

(222)

JET

(205)

ATL(173)

Epsilon

(158)

Emfatic(115)

QVTo(104)

Sirius

(90)

Xcore

(77)

OCL(59)

EMFText(52)

IncQ

uery

(38)

Henshin

(24)

Kermeta(21)

MOFScript

(9)

0

500

1,000

1,500Estim

ated

num

ber

ofD

evelo

pers

Fig. 4: Estimated number of developersper technology

2003

(3)

2004

(208)

2005

(480)

2006

(539)

2007

(1700)

2008

(1180)

2009

(2602)

2010

(4402)

2011

(7654)

2012

(10511)

2013

(14560)

2014

(23979)

0

5,000

10,000

15,000

20,000

25,000

Com

mits

per

Year

Fig. 5: Number of commits per yearacross all technologies

3.4 Activity per Year

Figure 5 illustrates the number of commits that involve files of interest (acrossall technologies) over time. We observe a steep increase, from 4.4K commits in2010 to 24K commits in 20142. This can be attributed to di↵erent extents to the2We intentionally excluded results for 2015 as at the time of writing this paper we are only a few

months into the year.

Page 13: Assessing the Use of Eclipse MDE Technologies in Open-Source Software Projects

13

Eclipse MDE Technologies on GitHub 7

and 4th places in this chart. This is expected as model-to-text transformationstypically consist of many files (templates), while graphical and textual syntaxesare often limited to one file per language. In the same spirit, while model-to-texttransformation languages still appear to be more popular compared to M2Mlanguages, the gap is substantially smaller in this chart.

We also observe that OCL and Henshin (5th and 9th in Figure 2), slip downto the 15th and 17th position in this chart. This suggests that there are a fewrepositories that contain a large number of OCL constraints and Henshin trans-formations. Further analysis revealed that for OCL, two repositories (dresden-ocl/dresdenocl and regressiontesttool/rtt) account for nearly 50% of the OCLfiles on GitHub. Similarly, a single repository (CoWolf/CoWolf ) accounts formore than 80% of all Henshin files on GitHub.

3.3 Number of MDE-literate Developers

Overall, we have identified 2,195 developers (developers with unique email ad-dresses) who are responsible for at least one commit related to files of interestacross all technologies included in this study. The breakdown for each technol-ogy is presented in Figure 4. Compared to Figure 3, we don’t observe any majorchanges in the relative order of technologies. Estimates and confidence intervalswere computed in the same way used in Figure 3.

Ecore

(1481)

Xtext

(646)

Xpa

nd(308)

GMF(285)

Acceleo

(222)

JET

(205)

ATL(173)

Epsilon

(158)

Emfatic(115)

QVTo(104)

Sirius

(90)

Xcore

(77)

OCL(59)

EMFText(52)

IncQ

uery

(38)

Henshin

(24)

Kermeta(21)

MOFScript

(9)

0

500

1,000

1,500

Estim

ated

num

ber

ofD

evelo

pers

Fig. 4: Estimated number of developersper technology

2003

(3)

2004

(208)

2005

(480)

2006

(539)

2007

(1700)

2008

(1180)

2009

(2602)

2010

(4402)

2011

(7654)

2012

(10511)

2013

(14560)

2014

(23979)

0

5,000

10,000

15,000

20,000

25,000Com

mits

per

Year

Fig. 5: Number of commits per yearacross all technologies

3.4 Activity per Year

Figure 5 illustrates the number of commits that involve files of interest (acrossall technologies) over time. We observe a steep increase, from 4.4K commits in2010 to 24K commits in 20142. This can be attributed to di↵erent extents to the2We intentionally excluded results for 2015 as at the time of writing this paper we are only a few

months into the year.

Page 14: Assessing the Use of Eclipse MDE Technologies in Open-Source Software Projects

14

8 Kolovos et. al.

increasing popularity of Git and GitHub and to an increase of the use of EclipseMDE technologies in open-source software development over the last few years.

3.5 Last-updated Files per Year

On a related measurement, Figure 6 illustrates the number of files that werelast updated on di↵erent years (as with the previous chart, we omit data forthe current year). In this figure we observe that more than 40% of the files ofinterest can be considered as active as they have been updated at least once overthe last year.

2003

(1)

2004

(4)

2005

(47)

2006

(54)

2007

(372)

2008

(315)

2009

(1014)

2010

(1262)

2011

(3418)

2012

(5144)

2013

(5933)

2014

(13208)

0

5,000

10,000

Last

updated

file

sper

year

Fig. 6: Number of files last updated oneach year

Ecore

(715)

Xtext

(325)

Epsilon

(104)

GMF(103)

Acceleo

(95)

ATL(83)

Xpa

nd(73)

Emfatic(62)

Sirius

(60)

JET

(57)

Xcore

(51)

QVTo(45)

IncQ

uery

(28)

OCL(26)

EMFText(15)

Henshin

(11)

MOFScript

(6)

Kermeta(5)

0

200

400

600

Estim

ated

num

ber

ofActiv

eG

itHub

Repositorie

s

Fig. 7: Estimated number of activerepositories per technology

4 Threats to Validity

The following threats to the validity of the results reported in this paper havebeen identified:

– The results reported largely rely on the accuracy of the number of resultsreported by GitHub’s interface in response to search queries. As previouslydiscussed, there is evidence to suggest indications that GitHub’s search inter-face does not return all matching files for a query. In the most extreme case,while according to the GitHub search interface, 1352 Epsilon-related files ex-ist on GitHub, our extended search has revealed 1765 files. Smaller deviationsof similar nature were identified for ATL (1404/1481), QVTo (1247/1396)and Sirius (428/503). Substantial deviations can have significant ripple ef-fects in dependent metrics such as the number of (active) repositories pertechnology, the number of commits etc.

– Although we extensively sampled the results returned by GitHub’s searchinterface to avoid including irrelevant files in our unified model, it is con-ceivable that a (small) number of irrelevant files may have still found theirway in, thus skewing the results obtained in favour of specific technologies.

Page 15: Assessing the Use of Eclipse MDE Technologies in Open-Source Software Projects

15

8 Kolovos et. al.

increasing popularity of Git and GitHub and to an increase of the use of EclipseMDE technologies in open-source software development over the last few years.

3.5 Last-updated Files per Year

On a related measurement, Figure 6 illustrates the number of files that werelast updated on di↵erent years (as with the previous chart, we omit data forthe current year). In this figure we observe that more than 40% of the files ofinterest can be considered as active as they have been updated at least once overthe last year.

2003

(1)

2004

(4)

2005

(47)

2006

(54)

2007

(372)

2008

(315)

2009

(1014)

2010

(1262)

2011

(3418)

2012

(5144)

2013

(5933)

2014

(13208)

0

5,000

10,000

Last

updated

file

sper

year

Fig. 6: Number of files last updated oneach year

Ecore

(715)

Xtext

(325)

Epsilon

(104)

GMF(103)

Acceleo

(95)

ATL(83)

Xpa

nd(73)

Emfatic(62)

Sirius

(60)

JET

(57)

Xcore

(51)

QVTo(45)

IncQ

uery

(28)

OCL(26)

EMFText(15)

Henshin

(11)

MOFScript

(6)

Kermeta(5)

0

200

400

600

Estim

ated

num

ber

ofActiv

eG

itHub

Repositorie

s

Fig. 7: Estimated number of activerepositories per technology

4 Threats to Validity

The following threats to the validity of the results reported in this paper havebeen identified:

– The results reported largely rely on the accuracy of the number of resultsreported by GitHub’s interface in response to search queries. As previouslydiscussed, there is evidence to suggest indications that GitHub’s search inter-face does not return all matching files for a query. In the most extreme case,while according to the GitHub search interface, 1352 Epsilon-related files ex-ist on GitHub, our extended search has revealed 1765 files. Smaller deviationsof similar nature were identified for ATL (1404/1481), QVTo (1247/1396)and Sirius (428/503). Substantial deviations can have significant ripple ef-fects in dependent metrics such as the number of (active) repositories pertechnology, the number of commits etc.

– Although we extensively sampled the results returned by GitHub’s searchinterface to avoid including irrelevant files in our unified model, it is con-ceivable that a (small) number of irrelevant files may have still found theirway in, thus skewing the results obtained in favour of specific technologies.

Page 16: Assessing the Use of Eclipse MDE Technologies in Open-Source Software Projects

TABLE III% OF SHARED REPOSITORIES FOR EACH PAIR OF TECHNOLOGIES

ATL

Emfa

tic

Siriu

s

MO

FScr

ipt

Xco

re

EMFT

ext

QV

To

Acc

eleo

IncQ

uery

GM

F

Xte

xt

Ecor

e

OC

L

Hen

shin

Ker

met

a

JET

Xpa

nd

Epsi

lon

ATL n/a 11 0 1 0 5 4 9 0 11 20 83 3 1 0 3 7 16

Emfatic 16 n/a 0 1 0 4 6 7 0 34 9 85 1 0 0 1 10 58

Sirius 0 1 n/a 0 1 0 1 25 2 1 30 80 1 1 1 4 4 5

MOFScript 42 28 0 n/a 0 0 14 0 0 14 0 85 0 0 0 0 14 0

Xcore 0 0 0 0 n/a 0 0 1 1 1 42 36 1 0 0 4 5 3

EMFText 15 8 0 0 0 n/a 1 8 0 10 5 87 3 0 1 5 5 10

QVTo 9 9 1 1 0 1 n/a 23 0 34 22 85 2 1 0 11 26 13

Acceleo 6 3 8 0 0 2 8 n/a 0 11 14 53 3 0 0 2 5 7

IncQuery 2 2 4 0 4 0 0 2 n/a 4 17 73 2 2 0 0 6 4

GMF 7 15 0 0 0 2 11 10 0 n/a 14 89 1 2 0 10 18 21

Xtext 4 1 3 0 6 0 2 4 1 5 n/a 68 1 0 0 1 17 3

Ecore 9 6 4 0 2 3 4 8 2 15 33 n/a 2 0 1 5 11 9

OCL 14 4 2 0 4 4 4 16 2 9 23 80 n/a 0 7 9 0 2

Henshin 15 0 7 0 0 0 7 15 7 38 15 92 0 n/a 0 0 38 15

Kermeta 5 0 5 0 0 5 0 5 0 0 21 73 15 0 n/a 0 5 0

JET 3 1 1 0 3 1 5 3 0 15 4 48 2 0 0 n/a 6 4

Xpand 4 4 1 0 2 1 8 4 1 18 51 65 0 2 0 4 n/a 6

Epsilon 17 41 2 0 2 4 6 11 1 32 17 87 0 1 0 5 10 n/a

Fig. 6. Number of files last updated on each year

2003

(1)

2004

(4)

2005

(47)

2006

(54)

2007

(372)

2008

(315)

2009

(1014

)

2010

(1262

)

2011

(3418

)

2012

(5144

)

2013

(5933

)

2014

(1320

8)

0

5,000

10,000

Last

upda

ted

files

per

year

appears to be Acceleo (Acceleo: 25%, Xpand: 4%), whilefor projects using Xtext, Xpand is more popular (Xpand:17%, Acceleo: 4%). Although all four technologies areself-contained, integrate equally well with EMF and anypair should work equally well, there appears to be some

Fig. 7. Estimated number of active repositories per technology

Ecore

(715)

Xtext (32

5)

Epsilon

(104)

GMF(10

3)

Acceleo

(95)

ATL(83

)

Xpand

(73)

Emfatic

(62)

Sirius (60

)

JET

(57)

Xcore

(51)

QVTo (45)

IncQue

ry(28

)

OCL(26

)

EMFText (15

)

Henshi

n (11)

MOFScript

(6)

Kermeta

(5)

0

200

400

600

Estim

ated

num

ber

ofA

ctiv

eG

itHub

Rep

osito

ries

bias related to the developers of these technologies (Siriusand Acceleo are developed by Obeo, while Xtext andXpand are developed by Itemis);

• There is remarkably strong co-occurrence between Em-fatic and Epsilon (both of which are developed predom-

Page 17: Assessing the Use of Eclipse MDE Technologies in Open-Source Software Projects

17 Observations (1/2)

•  A healthy number of GitHub repositories (1,928)

•  A substantial community of 2,195 developers

•  Increasing number of commits on files related to these technologies over the last decade

Page 18: Assessing the Use of Eclipse MDE Technologies in Open-Source Software Projects

18 Observations (2/2)

•  Model-to-text transformation languages (XPand, Acceleo and JET) appear to be more widely used than model-to-model transformation languages (ATL, QVTo, Henshin)

•  Technologies led by industry (Ecore, Xtext, Xpand, GMF, Acceleo and JET) are more widely used than those developed in academia (Epsilon, ATL, Kermeta, Henshin).