ReDA: A Web-based Visualization Tool for Analyzing Modern Code Review Dataset

A Web-based Visualization Tool for Analyzing Modern Code Review Dataset. Patanamon Thongtanunam, Xin Yang, Norihiro Yoshida, Raula Gaikovina Kula, Ana Erika Camargo Cruz, Kenji Fujiwara, Hajimu Iida

Upload: nara-institute-of-science-and-technology

Post on 02-Dec-2014

DESCRIPTION

ReDA (http://reda.naist.jp/) is a web-based visualization tool for analyzing Modern Code Review (MCR) datasets of large Open Source Software (OSS) projects. MCR is a commonly practiced, lightweight inspection of source code using a support tool such as the Gerrit system. Recently, mining the code review history of such systems has received attention as a potentially effective method of ensuring software quality. However, due to the increasing size and complexity of the software being developed, these datasets are becoming unmanageable. ReDA aims to assist researchers who mine code review data by enabling a better understanding of the dataset context and by identifying abnormalities. Through real-time data interaction, users can quickly gain insight into the data and home in on interesting areas to investigate. A video highlighting the main features can be found at: http://youtu.be/ fEoTRRas0U

TRANSCRIPT

Page 1: ReDa: A Web-based Visualization Tool for Analyzing Modern Code Review Dataset

A Web-based Visualization Tool for Analyzing Modern Code Review Dataset

Patanamon Thongtanunam, Xin Yang, Norihiro Yoshida, Raula Gaikovina Kula, Ana Erika Camargo Cruz, Kenji Fujiwara, Hajimu Iida

1

Page 2: ReDa: A Web-based Visualization Tool for Analyzing Modern Code Review Dataset

Modern Code Review is tool-based and occurs regularly in practice nowadays at companies and OSS projects [Bacchelli et al.]

Gerrit Code Review

Rietveld

2

Page 11: ReDa: A Web-based Visualization Tool for Analyzing Modern Code Review Dataset

Modern Code Review

Determine Quality

Find Defects

Comment

Reviewers

Developer

Gerrit Code Review

Code change

Comments

Review Scores

Patch Sets (Commits)

Modified Files

3
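The entities on this slide (a developer's code change, its patch sets, the modified files, review scores, and comments) can be pictured as a small data model. The following is a minimal sketch in Python; the class and field names are hypothetical and are not taken from Gerrit or ReDA.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class PatchSet:
    """One revision (commit) of a code change. Hypothetical structure."""
    number: int
    modified_files: List[str] = field(default_factory=list)


@dataclass
class Review:
    """One code review entry with the elements named on the slide."""
    review_id: int
    owner: str                                          # the developer
    patch_sets: List[PatchSet] = field(default_factory=list)
    scores: List[int] = field(default_factory=list)     # review scores
    comments: List[str] = field(default_factory=list)   # reviewer comments

    def modified_files(self) -> Set[str]:
        """Return every file touched by any patch set of this review."""
        return {path for ps in self.patch_sets for path in ps.modified_files}


# Made-up example record, for illustration only.
review = Review(
    review_id=1,
    owner="dev@example.org",
    patch_sets=[PatchSet(1, ["a.java"]), PatchSet(2, ["a.java", "b.java"])],
    scores=[1, 2],
    comments=["Looks good"],
)
```

Keeping patch sets as a list inside the review mirrors the idea that one review collects several revisions of the same change.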

Page 18: ReDa: A Web-based Visualization Tool for Analyzing Modern Code Review Dataset

Modern Code Review

?

New & Rich source

Code Quality

Popular

Reliable?

“Publicly available data from support tools is a rich source on mining, various potential perils should also be taken into consideration.” [Kalliamvakou et al.]

Can we easily find a characteristic in the dataset?

4

Page 22: ReDa: A Web-based Visualization Tool for Analyzing Modern Code Review Dataset

ReDA is for facilitating researchers in:

Extracting a dataset

Showing a basic statistical summary

Observing interesting patterns & identifying problems

5

Page 26: ReDa: A Web-based Visualization Tool for Analyzing Modern Code Review Dataset

ReDA provides three visualizations

Review Statistic

Activity Statistic

Contributor Activities

6

Page 27: ReDa: A Web-based Visualization Tool for Analyzing Modern Code Review Dataset

Demo http://reda.naist.jp

7

Android Open Source Software Project (AOSP) dataset

* Data was captured from October 2008 to January 2012

Page 28: ReDa: A Web-based Visualization Tool for Analyzing Modern Code Review Dataset

Example Findings

8

Android Open Source Software Project (AOSP) dataset

Page 29: ReDa: A Web-based Visualization Tool for Analyzing Modern Code Review Dataset

Review Statistic

Activity Statistic

Review Statistic

Reviews without code changes

Unusual Peaks in Graphs

Data Observation using #1

9

Page 31: ReDa: A Web-based Visualization Tool for Analyzing Modern Code Review Dataset

Review Statistic

Activity Statistic

Review Statistic

Reviews without code changes

Unusual Peak in Graphs

Data Observation by #1

Submit wrong code version

Accidentally submit changes

Gerrit Code Review

Developers' mistakes

10

Page 34: ReDa: A Web-based Visualization Tool for Analyzing Modern Code Review Dataset

Review Statistic

Activity Statistic

Review Statistic

Reviews without code changes

Unusual Peak in Graphs

Data Observation by #1

Submit wrong code version

Accidentally submit changes

Gerrit Code Review

Developers' mistakes

Reviews related to VCS transactions

Gerrit Code Review

E.g., merging branch transactions

F1: 10% of all reviews were created for purposes other than code review

11
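Finding F1 amounts to counting the reviews that carry no code changes. A minimal sketch of that check follows, assuming each review is a dict with a hypothetical `modified_files` list; the sample records are made up for illustration and are not AOSP data.

```python
def reviews_without_code_changes(reviews):
    """Return the reviews whose list of modified files is empty."""
    return [r for r in reviews if not r["modified_files"]]


def share_without_code_changes(reviews):
    """Fraction of reviews that contain no code changes (0.0 if empty input)."""
    if not reviews:
        return 0.0
    return len(reviews_without_code_changes(reviews)) / len(reviews)


# Made-up sample: nine reviews with a modified file, one without
# (for instance, a review created by a branch merge transaction).
sample_reviews = [{"id": i, "modified_files": ["file.java"]} for i in range(9)]
sample_reviews.append({"id": 9, "modified_files": []})

share = share_without_code_changes(sample_reviews)
```

Reviews flagged this way would still need manual inspection to tell a developer's mistake from a VCS transaction.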

Page 38: ReDa: A Web-based Visualization Tool for Analyzing Modern Code Review Dataset

Review Statistic

Missing data

Git and Gerrit servers were down.

Reported by Google Developers

Incomplete dataset can bias the results.

Data Observation using #2

F2: Review history in AOSP is incomplete.

12
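A period of missing data like the one behind F2 shows up as a run of days with no reviews at all. A rough sketch of such a check follows; the function name, the 7-day threshold, and the sample dates are assumptions chosen for illustration, not values from ReDA or the AOSP dataset.

```python
from datetime import date, timedelta


def find_gaps(review_dates, min_gap_days=7):
    """Return (first_missing_day, last_missing_day) pairs for every stretch
    longer than min_gap_days between two consecutive days with reviews."""
    days = sorted(set(review_dates))
    gaps = []
    for prev, cur in zip(days, days[1:]):
        if (cur - prev).days > min_gap_days:
            gaps.append((prev + timedelta(days=1), cur - timedelta(days=1)))
    return gaps


# Made-up creation dates with a silent stretch in the middle.
sample_dates = [
    date(2011, 1, 3),
    date(2011, 1, 4),
    date(2011, 1, 5),
    date(2011, 2, 1),
    date(2011, 2, 2),
]

gaps = find_gaps(sample_dates)
```

A gap found this way is only a hint; as on the slide, an external report (here, of a server outage) is what confirms that data is actually missing.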

Page 40: ReDa: A Web-based Visualization Tool for Analyzing Modern Code Review Dataset

Activity Statistic

Data Observation using #3

Low activity number

Contributor

Contributor Activities

13

Page 46: ReDa: A Web-based Visualization Tool for Analyzing Modern Code Review Dataset

Activity Statistic

Data Observation using #3

F3: Developers usually do code reviews on weekdays.

Weekend

Contributor

Contributor Activities

Main Contributor

F4: Main contributors are from Google and Android teams.

13
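Observations such as F3 and F4 can be approximated with two simple tallies: activity per weekday versus weekend, and contributors grouped by e-mail domain. The sketch below assumes activity timestamps are available as dates and contributors as e-mail addresses; both helper functions and all sample values are hypothetical.

```python
from collections import Counter
from datetime import date


def weekday_weekend_counts(activity_dates):
    """Count activities that fall on weekdays versus weekends."""
    counts = Counter()
    for d in activity_dates:
        # date.weekday(): Monday is 0, Saturday is 5, Sunday is 6.
        counts["weekend" if d.weekday() >= 5 else "weekday"] += 1
    return counts


def top_domains(emails, n=1):
    """Return the n most frequent e-mail domains among contributors."""
    domains = Counter(e.rsplit("@", 1)[-1].lower() for e in emails)
    return domains.most_common(n)


# Made-up sample data: three weekday activities and one on a Saturday.
sample_dates = [date(2011, 3, 7), date(2011, 3, 8), date(2011, 3, 9), date(2011, 3, 12)]
sample_emails = ["a@google.com", "b@google.com", "c@android.com", "d@example.org"]

counts = weekday_weekend_counts(sample_dates)
leading = top_domains(sample_emails, n=1)
```

Grouping by e-mail domain is only a proxy for affiliation, so results of this kind are better read as a starting point for inspection than as a conclusion.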

Page 51: ReDa: A Web-based Visualization Tool for Analyzing Modern Code Review Dataset

Modern Code Review is tool-based and occurs regularly in practice nowadays at companies and OSS projects [Bacchelli et al.]

Gerrit Code Review

Rietveld

Modern Code Review

?

New & Rich source

Code Quality

Popular

Reliable?

“Publicly available data from support tools is a rich source on mining, various potential perils should also be taken into consideration.” [Kalliamvakou et al.]

Can we easily find a characteristic in the dataset?

ReDA provides three visualizations

Review Statistic

Activity Statistic

Contributor Activities

Example Findings from Android Open Source Software Project (AOSP) dataset

* Data was captured from October 2008 to January 2012

F1: 10% of all reviews were created for purposes other than code review

F2: Review history in AOSP is incomplete.

F3: Developers usually do code reviews on weekdays.

F4: Main contributors are from Google and Android teams.

14