ReDA: A Web-based Visualization Tool for Analyzing Modern Code Review Dataset

Post on 02-Dec-2014

DESCRIPTION

ReDA (http://reda.naist.jp/) is a web-based visualization tool for analyzing Modern Code Review (MCR) datasets for large Open Source Software (OSS) projects. MCR is a commonly practiced, lightweight inspection of source code using a support tool such as the Gerrit system. Recently, mining the code review history of such systems has received attention as a potentially effective method of ensuring software quality. However, due to the increasing size and complexity of the software being developed, these datasets are becoming unmanageable. ReDA aims to assist researchers who mine code review data by enabling a better understanding of dataset context and by identifying abnormalities. Through real-time data interaction, users can quickly gain insight into the data and home in on interesting areas to investigate. A video highlighting the main features can be found at: http://youtu.be/ fEoTRRas0U

TRANSCRIPT

A Web-based Visualization Tool for Analyzing Modern Code Review Dataset

Patanamon Thongtanunam, Xin Yang, Norihiro Yoshida, Raula Gaikovina Kula, Ana Erika Camargo Cruz, Kenji Fujiwara, Hajimu Iida

1

Modern Code Review is tool-based and occurs regularly in practice nowadays at companies and OSS projects [Bacchelli et al.]

Gerrit Code Review

Rietveld

2

Modern Code Review

Developer: Code change

Gerrit Code Review: Patch Sets (Commits), Modified Files, Review Scores, Comments

Reviewers: Comment, Find Defects, Determine Quality

3

Modern Code Review

New & Rich source

Code Quality

Popular

Reliable?

“Publicly available data from support tools is a rich source on mining, various potential perils should also be taken into consideration.” [Kalliamvakou et al.]

Can we easily find a characteristic in the dataset?

4

ReDA is for facilitating researchers

Extracting a dataset

Showing a basic statistical summary

Observing interesting patterns & identifying problems

5

ReDA provides three visualizations

Review Statistic

Activity Statistic

Contributor Activities

6

Demo http://reda.naist.jp

7

Example Findings from Android Open Source Software Project (AOSP) dataset*

* Data was captured from October 2008 to January 2012

8

Data Observation using #1 (Review Statistic)

Unusual Peaks in Graphs

Reviews without code changes

9

Data Observation using #1 (Review Statistic)

Unusual Peak in Graphs: Reviews without code changes

Developers' mistakes (Gerrit Code Review): Submit wrong code version, Accidentally submit changes

Reviews related to VCS transactions (Gerrit Code Review): E.g., Merging branch transactions

F1: 10% of all reviews were created but not for code review

11
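A check like F1 can be sketched in a few lines. This is only an illustration, not how ReDA computes it: the `Review` record, its `modified_files` field, and the sample values are hypothetical and do not come from the AOSP dataset.

```python
from dataclasses import dataclass


@dataclass
class Review:
    # Hypothetical record; the field names are assumptions, not ReDA's schema.
    review_id: int
    modified_files: int  # number of files changed across the review's patch sets


def share_without_code_changes(reviews):
    """Return the fraction of reviews that modified no files."""
    if not reviews:
        return 0.0
    empty = sum(1 for r in reviews if r.modified_files == 0)
    return empty / len(reviews)


# Synthetic sample, for illustration only.
sample = [Review(1, 3), Review(2, 0), Review(3, 1), Review(4, 5), Review(5, 0)]
share = share_without_code_changes(sample)
print(f"{share:.0%} of sample reviews have no code changes")
```

Reviews flagged this way would still need manual inspection to tell developer mistakes from VCS transactions such as branch merges.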

Data Observation using #2 (Review Statistic)

Missing data: Git and Gerrit servers were down. Reported by Google Developers

Incomplete dataset can bias the results.

F2: Review history in AOSP is incomplete.

12
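Missing periods like the one behind F2 can also be looked for programmatically. The sketch below assumes each review has a creation date and reports runs of days with no reviews at all. The function name, the `min_gap_days` threshold, and the sample dates are illustrative assumptions, not part of ReDA or the AOSP data.

```python
from datetime import date, timedelta


def find_gaps(created_dates, min_gap_days=3):
    """Return (first_missing_day, last_missing_day) for each run of at
    least min_gap_days consecutive days without any review."""
    days = sorted(set(created_dates))
    gaps = []
    for prev, nxt in zip(days, days[1:]):
        missing = (nxt - prev).days - 1  # days strictly between the two dates
        if missing >= min_gap_days:
            gaps.append((prev + timedelta(days=1), nxt - timedelta(days=1)))
    return gaps


# Synthetic creation dates, for illustration only.
sample = [date(2010, 1, 4), date(2010, 1, 5), date(2010, 1, 13), date(2010, 1, 14)]
for start, end in find_gaps(sample):
    print(f"No reviews between {start} and {end}")
```

A gap found this way is only a candidate: it may reflect a server outage, as in the AOSP case, or simply a quiet period in the project.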

Data Observation using #3 (Activity Statistic and Contributor Activities)

Low activity number

Weekend

F3: Developers usually do code reviews on weekdays.

Main Contributor

F4: Main contributors are from Google and Android teams.

13
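The weekday pattern in F3 amounts to grouping review activity by day type. A minimal sketch, assuming each activity (comment, score, patch set) carries a date; the function name and sample dates are hypothetical.

```python
from collections import Counter
from datetime import date


def activity_by_day_type(activity_dates):
    """Count activities on weekdays versus weekends."""
    # date.weekday() returns 0 for Monday through 6 for Sunday.
    return Counter(
        "weekend" if d.weekday() >= 5 else "weekday" for d in activity_dates
    )


# Synthetic activity dates, for illustration only.
sample = [
    date(2010, 3, 1),
    date(2010, 3, 2),
    date(2010, 3, 3),
    date(2010, 3, 5),
    date(2010, 3, 6),
]
counts = activity_by_day_type(sample)
print(counts["weekday"], "weekday activities,", counts["weekend"], "weekend activities")
```

The dates here are taken as given. For a globally distributed project, the time zone used to record each activity would affect which day it falls on.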

Modern Code Review is tool-based and occurs regularly in practice nowadays at companies and OSS projects [Bacchelli et al.]

Gerrit Code Review

Rietveld

Modern Code Review

New & Rich source

Code Quality

Popular

Reliable?

“Publicly available data from support tools is a rich source on mining, various potential perils should also be taken into consideration.” [Kalliamvakou et al.]

Can we easily find a characteristic in the dataset?

ReDA provides three visualizations

Review Statistic

Activity Statistic

Contributor Activities

Example Findings from Android Open Source Software Project (AOSP) dataset*

F1: 10% of all reviews were created but not for code review

F2: Review history in AOSP is incomplete.

F3: Developers usually do code reviews on weekdays.

F4: Main contributors are from Google and Android teams.

* Data was captured from October 2008 to January 2012

14
