Early Identification of Future Committers in Open Source Software Projects


Post on 12-Apr-2017


Page 1: Early Identification of Future Committers in Open Source Software Projects

Early Identification of Future Committers in Open Source Software Projects

Akinori Ihara (NAIST, Japan)

Yasutaka Kamei (Kyushu Univ., Japan)

Masao Ohira (Wakayama Univ., Japan)

Ahmed E. Hassan (Queen’s Univ., Canada)

Naoyasu Ubayashi (Kyushu Univ., Japan)

Kenichi Matsumoto (NAIST, Japan)

Future Committer?

Page 2: Early Identification of Future Committers in Open Source Software Projects

What is a Committer?

[Diagram labels: developers, patches, patch verification, committers, version control system]

Page 3: Early Identification of Future Committers in Open Source Software Projects

What is a Committer’s Work?

Triaging tasks, assigning tasks, patch verification, advising on coding

[Diagram labels: developers, patches, committers, version control system; a list of requirements ordered from HIGH to LOW]

Page 4: Early Identification of Future Committers in Open Source Software Projects

What is a Committer’s Work?

Triaging tasks, assigning tasks, patch verification, advising on coding

Too few!!

Sometimes, leave the project!!

[Diagram labels: developers, patches, committers, version control system; a list of requirements ordered from HIGH to LOW]

Page 5: Early Identification of Future Committers in Open Source Software Projects

How do they get new committers?

Candidate committer

Committer Community

developers

Over 10,000 developers

Page 6: Early Identification of Future Committers in Open Source Software Projects

How do they get new committers?

Candidate committer

Committer Community

developers

Over 10,000 developers

GUIDELINE

Page 7: Early Identification of Future Committers in Open Source Software Projects

The evaluated activities

Need more contribution!

[Timeline labels: time, comment, patch creation]

Page 8: Early Identification of Future Committers in Open Source Software Projects

The evaluated activities

Good work! Contribute as a committer!

[Timeline labels: time, repeated comments and patch creations]

Page 9: Early Identification of Future Committers in Open Source Software Projects

Future Committers and Developers

Future committers: 53 / 51

[Diagram labels: comment, patch creation, commit, VCS]

Page 10: Early Identification of Future Committers in Open Source Software Projects

Future Committers and Developers

Future committers: 53 / 51

Developers: 8,964 / 12,287

[Diagram labels: comment, patch creation, commit, VCS]

Page 11: Early Identification of Future Committers in Open Source Software Projects

Future Committers and Developers

Future committers: 53 / 51

Developers: 8,964 / 12,287

Existing committers: 36 / 96

[Diagram labels: comment, patch creation, commit, VCS]

Page 13: Early Identification of Future Committers in Open Source Software Projects

Research Questions

RQ1: Are there any differences in the activities of future committers and developers?

RQ2: Which developer activities lead to early promotion to a committer role?

RQ3: How accurate is a committer-identification model built using developer activities?

Page 14: Early Identification of Future Committers in Open Source Software Projects

RQ1: Are there any differences in the activities of future committers and developers?

Future committers perform more activities than developers.

[Figure panels: patch submission, comment submission]

Page 15: Early Identification of Future Committers in Open Source Software Projects

RQ2: Which developer activities lead to early promotion to a committer role?

A developer who has contributed for one year should become a committer [Bird’07]

Rapidly-promoted committer

Regularly-promoted committer

[Figure: two panels; x-axis: activity period before becoming a committer; y-axis: % number of committers; labels: Rapid, Regular]
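The one-year rule of thumb above suggests a simple way to separate the two groups. The following is a minimal sketch, assuming that a committer counts as rapidly promoted when promotion comes less than 12 months after the first recorded activity. The threshold constant, the function names, and the calendar-month arithmetic are illustrative assumptions, not the procedure used in the study.

```python
from datetime import date

# Assumed cut-off in months, taken from the one-year rule of thumb quoted above.
RAPID_THRESHOLD_MONTHS = 12


def months_between(first_activity: date, promoted: date) -> int:
    """Whole calendar months from the first activity to the promotion."""
    return (promoted.year - first_activity.year) * 12 + (
        promoted.month - first_activity.month
    )


def promotion_type(first_activity: date, promoted: date) -> str:
    """Label a committer as 'rapid' or 'regular' (hypothetical rule)."""
    months = months_between(first_activity, promoted)
    return "rapid" if months < RAPID_THRESHOLD_MONTHS else "regular"
```

Days within a month are ignored here, so a developer first active on 31 January and promoted on 1 February counts as one month of activity. A study with finer-grained timestamps might prefer day-level arithmetic.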

Page 16: Early Identification of Future Committers in Open Source Software Projects

RQ2: Which developer activities lead to early promotion to a committer role?

RQ2-1: Is there a difference between the activities of rapidly-promoted committers and those of regularly-promoted committers?

Rapidly-promoted committers perform more activities than regularly-promoted committers.

[Figure panels: patch submission, comment submission; each compares Regular and Rapid]

Page 17: Early Identification of Future Committers in Open Source Software Projects

RQ2: Which developer activities lead to early promotion to a committer role?

RQ2-2: What do regularly-promoted committers do more than rapidly-promoted committers?

[Figure: per-developer activity plots; x-axis: activity period (month); y-axes: the number of patches, the number of comments; legend: rapidly-promoted committer, regularly-promoted committer. Panels: Dev1@Eclipse platform for 10 months; Dev2@Mozilla Firefox for 17 months; Dev3@Eclipse platform; Dev4@Mozilla Firefox]

Regularly-promoted committers have actively worked for 1-1.5 years before they became committers.

Page 18: Early Identification of Future Committers in Open Source Software Projects

RQ3: How accurate is a committer-identification model built using developer activities?

Random Forest

Patch creation: SumNumPatch, MedNumPatch

Comment: SumNumComment, MedNumComment

Period

Sampled the same number of developers as committers

[Figure: two panels showing Precision, Recall, and F1 at thresholds 0.2, 0.5, and 0.8, plus AUC; y-axis from 0.0 to 1.0]
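The metric names on these slides (SumNumPatch, MedNumPatch, SumNumComment, MedNumComment, Period) suggest per-developer summaries of monthly activity. The sketch below assumes they are the sum and median of monthly patch and comment counts plus the number of active months; these definitions are inferred from the names, and the helper names `activity_metrics` and `balanced_sample` are hypothetical. The second helper illustrates drawing as many developers as there are committers.

```python
import random
from statistics import median


def activity_metrics(monthly_patches, monthly_comments):
    """Summarise one developer's monthly activity counts.

    Both lists are assumed to cover the same months, one entry per month.
    """
    if len(monthly_patches) != len(monthly_comments):
        raise ValueError("both series must cover the same months")
    return {
        "SumNumPatch": sum(monthly_patches),
        "MedNumPatch": median(monthly_patches),
        "SumNumComment": sum(monthly_comments),
        "MedNumComment": median(monthly_comments),
        "Period": len(monthly_patches),  # assumed: activity period in months
    }


def balanced_sample(committers, developers, seed=0):
    """Draw as many developers as there are committers (undersampling)."""
    rng = random.Random(seed)
    return list(committers), rng.sample(list(developers), k=len(committers))
```

Undersampling the much larger developer group keeps the two classes the same size, so a classifier cannot look accurate simply by predicting "developer" for everyone.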

Page 19: Early Identification of Future Committers in Open Source Software Projects

RQ3: How accurate is a committer-identification model built using developer activities?

Random Forest

Patch creation: SumNumPatch, MedNumPatch

Comment: SumNumComment, MedNumComment

Period

Sampled the same number of developers as committers

[Figure: two panels showing Precision, Recall, and F1 at thresholds 0.2, 0.5, and 0.8, plus AUC; y-axis from 0.0 to 1.0]

The committer prediction model has higher accuracy than a random predictor.
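Precision, recall, and F1 at a given threshold can be computed directly from a model's scores. The sketch below assumes the model outputs a score between 0 and 1 per developer and that a score at or above the threshold is read as "future committer". The scores and labels are made-up illustration data, not results from the study, and `evaluate` is a hypothetical helper name.

```python
def evaluate(scores, labels, threshold):
    """Precision, recall and F1 when scores >= threshold predict a committer."""
    predicted = [s >= threshold for s in scores]
    tp = sum(p and y for p, y in zip(predicted, labels))
    fp = sum(p and not y for p, y in zip(predicted, labels))
    fn = sum((not p) and y for p, y in zip(predicted, labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return precision, recall, f1


# Hypothetical model scores and true labels (True = became a committer).
scores = [0.9, 0.7, 0.6, 0.3, 0.1, 0.85]
labels = [True, True, False, False, False, True]

for threshold in (0.2, 0.5, 0.8):
    print(threshold, evaluate(scores, labels, threshold))
```

Raising the threshold trades recall for precision: in this toy data the strictest threshold makes no false positive predictions but misses one true committer.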

Page 20: Early Identification of Future Committers in Open Source Software Projects

RQ3: How accurate is a committer-identification model built using developer activities?

Rank  Eclipse        Firefox
1     SumNumComment  Period
2     Period         SumNumComment
3     SumNumPatch    SumNumPatch
4     MedNumComment  MedNumPatch
5     MedNumPatch    MedNumComment

Page 21: Early Identification of Future Committers in Open Source Software Projects

Discussion

Activities after becoming a committer

Level UP!

Rapidly-promoted committers worked more actively than regularly-promoted committers after their promotion.

[Figure panels: commits, comment submission; each compares Regular and Rapid]

Page 22: Early Identification of Future Committers in Open Source Software Projects