msr presentation final
1
An Empirical Study of the Copy and Paste Behavior during Development
Tarek M. Ahmed, Weiyi Shang, Ahmed E. Hassan
3
[Figure: a code fragment is copied, then pasted as a second code fragment]
Copy & Paste leads to clones
Large number of code clones
4
Clone detection tools detect C&P after it is performed
[Figure: C&P, C&P → Source Code → Clone Detection → Code Clones]
5
No large-scale C&P study on developers exists
• Controlled experiment
• Small number of participants
• Experienced developers only
6
A larger-scale study exists on regular users
• Regular computer users
• Non-software-development tasks
A large-scale C&P study is needed for software development tasks
8
How to detect C&P in Eclipse UDC
User ID   What       Kind      …   Description
104526    Executed   Command   …   org.eclipse.ui.edit.copy
User performs Copy
9
How to detect C&P in Eclipse UDC
User ID   What       Kind      …   Description
104526    Executed   Command   …   org.eclipse.ui.edit.copy
104526    Executed   Command   …   org.eclipse.ui.edit.paste
User performs Paste
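A minimal sketch of how the copy and paste records above could be picked out of a UDC export. The CSV column names (`userId`, `what`, `kind`, `description`) and the third sample row are assumptions made for illustration; only the two command identifiers and the user ID come from the slide.

```python
import csv
import io

# Command identifiers shown on the slide.
COPY_ID = "org.eclipse.ui.edit.copy"
PASTE_ID = "org.eclipse.ui.edit.paste"


def find_copy_paste(rows):
    """Yield (user_id, "copy" | "paste") for each executed copy/paste command."""
    for row in rows:
        # Keep only records of the form: what = executed, kind = command.
        if row["what"].lower() != "executed" or row["kind"].lower() != "command":
            continue
        if row["description"] == COPY_ID:
            yield row["userId"], "copy"
        elif row["description"] == PASTE_ID:
            yield row["userId"], "paste"


# Hypothetical sample in an assumed CSV layout; the last row is an
# unrelated command added only to show that it is filtered out.
SAMPLE = """userId,what,kind,description
104526,executed,command,org.eclipse.ui.edit.copy
104526,executed,command,org.eclipse.ui.edit.paste
104526,executed,command,org.eclipse.ui.file.save
"""

events = list(find_copy_paste(csv.DictReader(io.StringIO(SAMPLE))))
print(events)
```

Matching on the command identifier alone means a C&P is detected from usage data, without inspecting the source code that was copied.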
10
Our study focuses on users who frequently and actively use Eclipse
Create Development Sessions → Find Active Sessions → Find Frequent Users
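The three steps on this slide could be sketched roughly as follows. The slide does not say how a session, an active session, or a frequent user is defined, so the inactivity gap and both minimum counts below are hypothetical parameters, not the study's actual values.

```python
from collections import defaultdict


def build_sessions(events, max_gap_s):
    """Split each user's events into sessions wherever the gap exceeds max_gap_s.

    events: iterable of (user_id, timestamp_in_seconds) pairs.
    """
    by_user = defaultdict(list)
    for user, ts in events:
        by_user[user].append(ts)

    sessions = {}
    for user, stamps in by_user.items():
        stamps.sort()
        user_sessions = []
        current = [stamps[0]]
        for ts in stamps[1:]:
            if ts - current[-1] > max_gap_s:
                user_sessions.append(current)
                current = [ts]
            else:
                current.append(ts)
        user_sessions.append(current)
        sessions[user] = user_sessions
    return sessions


def active_sessions(sessions, min_events):
    """Keep only sessions with at least min_events events (assumed criterion)."""
    return {u: [s for s in ss if len(s) >= min_events] for u, ss in sessions.items()}


def frequent_users(sessions, min_sessions):
    """Users with at least min_sessions remaining sessions (assumed criterion)."""
    return sorted(u for u, ss in sessions.items() if len(ss) >= min_sessions)


# Hypothetical data: u1 works in two bursts, u2 triggers two isolated events.
events = [("u1", 0), ("u1", 60), ("u1", 120),
          ("u1", 10000), ("u1", 10030), ("u1", 10060),
          ("u2", 0), ("u2", 5000)]

sessions = build_sessions(events, max_gap_s=1800)
active = active_sessions(sessions, min_events=3)
frequent = frequent_users(active, min_sessions=2)
print(frequent)
```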
13
Average number of C&P per hour is different from recent studies
Our finding: 2.73
Previous finding: 16
Sessions                      Threshold                                               C&P per hour
Heavy editing sessions        #Commands > average #Commands + 1 standard deviation    11.39
Very heavy editing sessions   #Commands > average #Commands + 2 standard deviations   13.18
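The two thresholds on this slide can be illustrated with a short sketch. It assumes one command count per session and uses the population standard deviation; the slide does not say which variant was used, and the sample counts are made up. A session above the 2-standard-deviation threshold also exceeds the 1-standard-deviation threshold, so the sketch reports only the stricter label for it.

```python
from statistics import mean, pstdev


def label_sessions(command_counts):
    """Label each session by how far its command count lies above the average."""
    avg = mean(command_counts)
    sd = pstdev(command_counts)  # assumption: population standard deviation
    labels = []
    for n in command_counts:
        if n > avg + 2 * sd:
            labels.append("very heavy")
        elif n > avg + sd:
            labels.append("heavy")
        else:
            labels.append("regular")
    return labels


# Hypothetical command counts for ten sessions.
counts = [10, 10, 10, 10, 10, 10, 10, 10, 60, 100]
labels = label_sessions(counts)
print(list(zip(counts, labels)))
```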
14
Do IDE users follow the same C&P patterns as regular users?
How do IDE users copy and paste code across different file formats?
17
[Figure: copy and paste of a code fragment inside the same file vs. between different files]
IDE users often C&P within the same file
                          IDE    Regular
Inside the same file      80%    23%
Between different files   20%    77%
21
IDE users often perform relay C&P
[Figure: diagrams of the Repeat, Distribution, and Relay patterns over locations A, B, C, D]
Pattern        IDE    Regular
Repeat         9%     32%
Distribution   1%     36%
Relay          33%    2%
Others         57%    30%
22
C&P behavior of IDE users is different from regular users
                 Within/Between    Relay     Distribution
IDE users        Higher within     Higher    Lower
Regular users    Higher between    Lower     Higher
Eclipse IDE requires tailored C&P support tools that differ from regular users' C&P tools
24
Do IDE users follow the same C&P patterns as regular users?
How do IDE users copy and paste code across different file formats?
There are major differences between the C&P behavior of Eclipse IDE users and that of regular users.
There is a large number of C&P operations between editors; hence, clone detection techniques should consider detecting clones across different languages.