andrea mocci: beautiful design, beautiful coding at i t.a.k.e. unconference 2015

@andreamocci

R AE E LV

Beautiful CodingBeautiful Design

Software Design

There are two ways of constructing a software design.“

One way is to make it so simple that there are obviously no deficiencies

There are two ways of constructing a software design.“

One way is to make it so simple that there are obviously no deficiencies

and the other way is to make it so complicated that there are no

obvious deficiencies.

“ The first method is far more difficult.

One way is to make it soobviously

“One way is to make it soobviously

The first method is far more difficult.

It demands the same skill, devotion, insight, and even inspiration as the discovery of the

simple physical laws which underlie the complex phenomena of nature.

“One way is to make it soobviously

It demands the same skill, devotion, insight, and even inspiration as the discovery of the

simple physical laws which underlie the complex phenomena of nature.

http://earthobservatory.nasa.gov/IOTD/view.php?id=5693 (C) Nasa

Software Evolution

R AE E LV

City Metaphor

Richard Wettel and Michele Lanza Visual Exploration of Large-Scale System Evolution

Decoding a city: ArgoUML

City Metaphor

The city metaphor

domain mapping

class building

package district

system city

City Metaphor

domain mapping

class building

The city metaphor

domain mapping

class building

package district

system city

City Metaphor

domain mapping

class building

package district

The city metaphor

domain mapping

class building

package district

system city

City Metaphor

domain mapping

class building

package district

system city

The city metaphor

domain mapping

class building

package district

system city

City Metaphor

domain mapping

class building

package district

system city

number of methods (NOM) heightnumber of attributes (NOA) base size

The city metaphor

domain mapping

class building

package district

system city

City Metaphor

domain mapping

class building

package district

system city

number of methods (NOM) heightnumber of attributes (NOA) base size

nesting level color

ArgoUML

Skyscrapers

ArgoUML

Skyscrapers

Parking Lots

ArgoUML

Skyscrapers

Parking Lots Office

Buildings

Code cities

System Language NOP NOC kLOC

Azureus Java 457 4734 274

ArgoUML Java 144 2542 137

JHotDraw Java 72 998 30

iText Java 149 1250 80

Jmol Java 105 1032 85

JDK 1.5 Java 137 4715 160

Moose Smalltalk 278 961 32

Jun Smalltalk 288 2236 351

CodeCity Smalltalk 129 291 18

ScumVM C++ 18 1331 105

JDK 1.5

MooseRichard Wettel and Michele Lanza Visual Exploration of Large-Scale System Evolution

Code cities

Jmol Java 105 1032 85

JDK 1.5 Java 137 4715 160

ScumVM C++ 18 1331 105

MooseRichard Wettel and Michele Lanza Visual Exploration of Large-Scale System Evolution

Code cities

Jmol Java 105 1032 85

JDK 1.5 Java 137 4715 160

ScumVM C++ 18 1331 105

Software Systems as Cities: A Controlled Experiment

Richard Wettel and Michele Lanza

REVEAL @ Faculty of Informatics

University of Lugano

{richard.wettel,michele.lanza}@usi.ch

Romain Robbes

PLEIAD @ DCC

University of Chile

rrobbes@dcc.uchile.cl

ABSTRACTSoftware visualization is a popular program comprehension tech-nique used in the context of software maintenance, reverse engineer-ing, and software evolution analysis. While there is a broad rangeof software visualization approaches, only few have been empiri-cally evaluated. This is detrimental to the acceptance of softwarevisualization in both the academic and the industrial world.

We present a controlled experiment for the empirical evaluationof a 3D software visualization approach based on a city metaphorand implemented in a tool called CodeCity. The goal is to provideexperimental evidence of the viability of our approach in the contextof program comprehension by having subjects perform tasks relatedto program comprehension. We designed our experiment based onlessons extracted from the current body of research. We conductedthe experiment in four locations across three countries, involving41 participants from both academia and industry. The experimentshows that CodeCity leads to a statistically significant increase interms of task correctness and decrease in task completion time. Wedetail the experiment we performed, discuss its results and reflecton the many lessons learned.

Categories and Subject DescriptorsD.2.7 [Distribution, Maintenance and Enhancement]: Restruc-turing, reverse engineering, and reengineering

General TermsExperimentation, Human Factors, Measurement

KeywordsSoftware visualization, Empirical validation

1. INTRODUCTIONSoftware visualization is defined by Stasko et al. as “The use

of the crafts of typography, graphic design, animation, and cine-matography with modern human-computer interaction and com-puter graphics technology to facilitate both the human understand-ing and effective use of computer software” [13].

Permission to make digital or hard copies of all or part of this work forpersonal or classroom use is granted without fee provided that copies arenot made or distributed for profit or commercial advantage and that copiesbear this notice and the full citation on the first page. To copy otherwise, torepublish, to post on servers or to redistribute to lists, requires prior specificpermission and/or a fee.ICSE ’11, May 21–28 2011, Honolulu, HawaiiCopyright 2011 ACM 978-1-4503-0445-0/11/05 ...$10.00.

It has earned a reputation as an effective program comprehensiontechnique, and is widely used in the context of software maintenance,reverse engineering and reengineering [2].

The last decade has witnessed a proliferation of software visual-ization approaches, aimed at supporting a broad range of softwareengineering activities. As a consequence, there is a growing need forobjective assessments of visualization approaches to demonstratetheir effectiveness. Unfortunately, only few software visualizationapproaches have been empirically validated so far, which is detri-mental to the development of the field [2].

One of the reasons behind the shortage of empirical evaluation liesin the difficulty of performing a controlled experiment for softwarevisualization. On the one hand, the variety of software visualizationapproaches makes it almost impossible to reuse the design of anexperiment from the current body of research. On the other hand, or-ganizing and conducting a proper controlled experiment is, in itself,a difficult endeavor, which can fail in many ways: data which doesnot support a true hypothesis, conclusions which are not statisticallysignificant due to insufficient data points, etc.

We present a controlled experiment for the empirical evaluationof a software visualization approach based on a city metaphor [16].The aim is to show that our approach, implemented in a tool calledCodeCity, is at least as effective and efficient as the state of thepractice at supporting reverse engineering and program comprehen-sion activities. We conceived a set of tasks and measured both thecorrectness of the task solutions and the task completion time. Weconducted the experiment in four locations across three countries,with participants from both industry and academia, with a broadrange of experience.

In this paper we make the following two major contributions:

1. we detail the experiment design and its operation, reportingon a number of lessons learned regarding the many pitfallsthat this type of experiment entails, and

2. we discuss the results of the experiment, which show thatour approach is a viable alternative to existing non-visualtechniques.

Structure of the paper. In Section 2 we describe our softwarevisualization approach. In Section 3 we present the related work,followed by a list of desiderata in Section 4, extracted from anextensive study of the existing body of research. In Section 5 wedescribe the design of our experiment and in Section 6 we detail theexperiment’s operational phase. In Section 7 we present how wecollected the data on which we performed the analysis presentedin Section 8. In Section 9 we present the results of the experiment,followed by a presentation of the threats to validity in Section 10,and the conclusions in Section 11.

Software Systems as Cities: A Controlled Experiment

Richard Wettel and Michele Lanza

REVEAL @ Faculty of Informatics

University of Lugano

{richard.wettel,michele.lanza}@usi.ch

Romain Robbes

PLEIAD @ DCC

University of Chile

rrobbes@dcc.uchile.cl

ABSTRACTSoftware visualization is a popular program comprehension tech-nique used in the context of software maintenance, reverse engineer-ing, and software evolution analysis. While there is a broad rangeof software visualization approaches, only few have been empiri-cally evaluated. This is detrimental to the acceptance of softwarevisualization in both the academic and the industrial world.

We present a controlled experiment for the empirical evaluationof a 3D software visualization approach based on a city metaphorand implemented in a tool called CodeCity. The goal is to provideexperimental evidence of the viability of our approach in the contextof program comprehension by having subjects perform tasks relatedto program comprehension. We designed our experiment based onlessons extracted from the current body of research. We conductedthe experiment in four locations across three countries, involving41 participants from both academia and industry. The experimentshows that CodeCity leads to a statistically significant increase interms of task correctness and decrease in task completion time. Wedetail the experiment we performed, discuss its results and reflecton the many lessons learned.

Categories and Subject Descriptors

It has earned a reputation as an effective program comprehensiontechnique, and is widely used in the context of software maintenance,reverse engineering and reengineering [2].

The last decade has witnessed a proliferation of software visual-ization approaches, aimed at supporting a broad range of softwareengineering activities. As a consequence, there is a growing need forobjective assessments of visualization approaches to demonstratetheir effectiveness. Unfortunately, only few software visualizationapproaches have been empirically validated so far, which is detri-mental to the development of the field [2].

One of the reasons behind the shortage of empirical evaluation liesin the difficulty of performing a controlled experiment for softwarevisualization. On the one hand, the variety of software visualizationapproaches makes it almost impossible to reuse the design of anexperiment from the current body of research. On the other hand, or-ganizing and conducting a proper controlled experiment is, in itself,a difficult endeavor, which can fail in many ways: data which doesnot support a true hypothesis, conclusions which are not statisticallysignificant due to insufficient data points, etc.

We present a controlled experiment for the empirical evaluationof a software visualization approach based on a city metaphor [16].The aim is to show that our approach, implemented in a tool calledCodeCity, is at least as effective and efficient as the state of the

Tasks. Our choice of tasks may not reflect real reverse engineer-ing situations. We could not match our analysis with any of theexisting frameworks, because they do not support design problemassessment and, in addition, are either too low-level (e.g., the ques-tions asked by practitioners during a programming change task bySillito et al. [12]), or biased towards dynamic analysis tools (e.g.,Pacione’s framework [8]). To alleviate this threat, we complementedour tasks with usage scenarios and targeted users.

Object systems. The representativeness of our object systems isanother threat. We chose to perform the experiment with two differ-ent object systems, in spite of the added complexity in organizingthe experiment and analyzing the data introduced by a second inde-pendent variable. The two object systems we chose are well-knownopen-source systems of different, realistic sizes and of orthogonalapplication domains. It is not known how appropriate they are forthe reverse-engineering tasks we designed, but the variation in thesolutions to the same task shows that the systems are quite different.

Experimenter effect. One of the experimenters is also the authorof the approach and of the tool. This may have influenced anysubjective aspect of the experiment. One instance of this threatis that the task solutions may not have been graded correctly. Tomitigate this threat, the three authors built a model of the answersand a grading scheme and then reached consensus. Moreover, thegrading was performed in a similar manner and two of the threeexperimenters graded the solutions blinded, i.e., without knowingthe treatments (e.g., tool) used to obtain the solutions. Even if wetried to mitigate this threat extensively, we cannot exclude all thepossible influences of this factor on the results of the experiment.

11. CONCLUSIONWe presented a controlled experiment aimed at evaluating our

software visualization approach based on a city metaphor. Wedesigned our experiment from a list of desiderata built during anextensive study of the current body of research. We recruited largesamples of our target population, with subjects from both academiaand industry with a broad range of experience.

The main result of the experiment is that our approach leadsto an improvement, in both correctness (+24%) and completiontime (-12%), over the state-of-the-practice exploration tools. Thisresult is statistically significant. We believe this is due to both thevisualization as such, as well as the metaphor used by CodeCity, butwe can not measure the exact contribution of each factor.

Apart from an aggregated analysis, we performed a detailed anal-ysis of each task, which provided a number of insights on the typeof tasks that our approach best supports. Unsurprisingly, in thecase of focused tasks, i.e., tasks which require very precise answers,CodeCity did not perform better than Eclipse+Excel. Surprisingly,for most of these tasks, our approach managed to be on a par withExcel. As for the tasks that benefit from an overview of the system,CodeCity constantly outperformed the baseline.

Finally, we designed our experiment with repeatability in mind.In [20], we provided the complete raw and processed data of thisexperiment (i.e., the pre-experiment, in-experiment and debriefingquestionnaires, solution oracles and grading systems, correctionscores and measured completion time) to allow reviewers to betterevaluate the experiment’s design and results, and fellow researchersto repeat the experiment, or reuse its design as a base for their own.

Acknowledgments. We gratefully acknowledge Radu Marinescu, MirceaLungu, Alberto Bacchelli, Lile Hattori, Oscar Nierstrasz, Serge Demeyer,Fabrizio Perin, and Quinten Soetens for helping us with the experiment. Wethank our participants: the developers in Bologna and Lugano, the SCG inBern, and the students in Lugano and Antwerp. We thank the EuropeanSmalltalk User Group (http://esug.org) for financial support.

12. REFERENCES[1] B. Cornelissen, A. Zaidman, B. V. Rompaey, and A. van Deursen.

Trace visualization for program comprehension: A controlledexperiment. In Proceedings of the 17th IEEE International Conferenceon Program Comprehension, pages 100–109. IEEE CS Press, 2009.

[2] R. Koschke. Software visualization in software maintenance, reverseengineering, and re-engineering: a research survey. Journal ofSoftware Maintenance, 15(2):87–109, 2003.

[3] C. F. J. Lange and M. R. V. Chaudron. Interactive views to improvethe comprehension of UML models - an experimental validation. InICPC ’07: Proceedings of the 15th International Conference onProgram Comprehension, pages 221–230. IEEE CS Press, 2007.

[4] M. Lanza and S. Ducasse. Polymetric views — a lightweight visualapproach to reverse engineering. Transactions on SoftwareEngineering (TSE), 29(9):782–795, Sept. 2003.

[5] H. Levene. Robust tests for equality of variances. In I. Olkin, editor,Contributions to Probability and Statistics: Essays in Honor of HaroldHotelling, pages 278–292. Stanford University Press, 1960.

[6] A. Marcus, D. Comorski, and A. Sergeyev. Supporting the evolution ofa software visualization tool through usability studies. In Proceedingsof the 13th International Workshop on Program Comprehension,pages 307–316. IEEE CS Press, 2005.

[7] R. Marinescu. Detection strategies: Metrics-based rules for detectingdesign flaws. In ICSM ’04: Proceedings of the 20th IEEEInternational Conference on Software Maintenance, pages 350–359.IEEE CS Press, 2004.

[8] M. J. Pacione, M. Roper, and M. Wood. A novel software visualisationmodel to support software comprehension. In Proceedings of WCRE2004, pages 70–79. IEEE CS Press, 2004.

[9] M. D. Penta, R. Stirewalt, and E. Kraemer. Designing your nextempirical study on program comprehension. In ICPC ’07:Proceedings of the 15th International Conference on ProgramComprehension, pages 281–285. IEEE CS Press, 2007.

[10] J. Quante. Do dynamic object process graphs support programunderstanding? - a controlled experiment. In ICPC ’08: Proceedingsof the 16th IEEE International Conference on ProgramComprehension, pages 73–82. IEEE CS Press, 2008.

[11] S. Shapiro and M. Wilk. An analysis of variance test for normality(complete samples). Biometrika, 52(3–4):591–611, 1965.

[12] J. Sillito, G. C. Murphy, and K. De Volder. Questions programmersask during software evolution tasks. In SIGSOFT ’06/FSE-14:Proceedings of the 14th ACM SIGSOFT international symposium onFoundations of Software Engineering, pages 23–34. ACM Press, 2006.

[13] J. T. Stasko, J. Domingue, M. H. Brown, and B. A. Price, editors.Software Visualization - Programming as a Multimedia Experience.The MIT Press, 1998.

[14] M.-A. D. Storey, K. Wong, and H. A. Müller. How do programunderstanding tools affect how programmers understand programs? InWCRE ’97: Proceedings of the Fourth Working Conference on ReverseEngineering, pages 12–21. IEEE CS Press, 1997.

[15] R. Wettel and M. Lanza. Program comprehension through softwarehabitability. In Proceedings of ICPC 2007, pages 231–240. IEEE CSPress, 2007.

[16] R. Wettel and M. Lanza. Visualizing software systems as cities. InProceedings of VISSOFT 2007, pages 92–99. IEEE CS Press, 2007.

[17] R. Wettel and M. Lanza. CodeCity: 3D visualization of large-scalesoftware. In Proceedings of ICSE 2008, Tool Demo Track, pages921–922. ACM Press, 2008.

[18] R. Wettel and M. Lanza. Visual exploration of large-scale systemevolution. In Proceedings of the 15th Working Conference on ReverseEngineering, pages 219–228. IEEE CS Press, 2008.

[19] R. Wettel and M. Lanza. Visually localizing design problems withdisharmony maps. In Proceedings of the 4th Symposium on SoftwareVisualization, pages 155–164. ACM Press, 2008.

[20] R. Wettel, M. Lanza, and R. Robbes. Empirical validation ofCodeCity: A controlled experiment. Technical Report 2010/05,University of Lugano, June 2010. http://www.inf.usi.ch/research_publication.htm?id=60.

[21] C. Wohlin, P. Runeson, M. Höst, M. C. Ohlsson, B. Regnell, andA. Wesslén. Experimentation in software engineering: an introduction.Kluwer Academic Publishers, 2000.

Coarse-grained age map: ArgoUML

Age MapsRichard Wettel and Michele Lanza Visual Exploration of Large-Scale System Evolution

Glimpse in the past

entities colored according to age

Principle of age map

age (number of survived versions)

new-bornveteran

Refining the granularity

NOA = 2

class C

class Cm4

Granularity

GranularityRichard Wettel and Michele Lanza Visual Exploration of Large-Scale System Evolution

Fine-grained age map: JHotDraw

library packages:

org.w3c.dom

(classes) AllTests

CH.ifa.draw.standard

CH.ifa.draw.frameworkCH.ifa.draw.figures

CH.ifa.draw.test

class DrawApplication

in CH.ifa.draw.application

class StandardDrawingView

in CH.ifa.draw.standard.

GranularityRichard Wettel and Michele Lanza Visual Exploration of Large-Scale System Evolution

Buildings in age maps

age: 1 2 3 4 5 6 7 8

stable

youngoldvery old

rarely updated

highly unstable

very old

updated often,

rather unstable

Design Inspection

Few Violations

Lots of Violations!

Review (Code Review)

Review of Code Review

Fig. 6. View of the city with all the activities

SC+STSC+IE

IE+STSC+ST+IE

PresentPast

SCIEST

Fig. 5. Aging Process: Example in the Timeline

In the “present” interval (i.e., the one selected by the user),colors are at their default saturation. In the “past” interval,instead, the color saturation fades. At the end of this interval,the nodes have the default color, i.e., light gray.

V. TELLING EVOLUTIONARY STORIES

To assess the usefulness of our approach we inspected therecent history of the PHARO system. In the following wepresent four stories that narrate the evolution of the systemsupported by our blended view.

A. Those Awkward Neighbors

By selecting the full available timespan of the data we obtaina visualization that displays all the activities that involved thePHARO system over a period of five months. This enablesto obtain a comprehensive view of the system evolution andderive long-term considerations and properties. Figure 6 showsthe overall view of the available data.

One interesting example is represented by what we call theawkward neighbors, i.e.,, big but silent packages that havelittle or no activity.

Multiple Concerns

BeautifulCoding

Smalltalk IDE

Interaction Data

developer

interactions

Interaction Data

developer

interactions

MetaUser InterfaceLow-Level

Opening/closing a window Activating a window, i.e., window in focus Resizing/moving/minimize/maximize a window class

Mouse button up/down Scroll wheel up/down Mouse move Mouse-out/in Keystroke pressed

Smalltalk IDE

Recorder Analyzer …

events

sessionsdevelopersrec. time

windows

197h13’ 54”5,052,386

13,691

Visualization

Color LegendUI View

OpenActivate

Minimize / ExpandResize / MoveClose Minimized backbone

Activity TimelineNavigationInspection

EditUnderstanding

Vertical LinesExplicit pause (sub-session)Implicit pause (idle)

Commit

10:20 20:12

3:00 6:00 18:35 21:00 23:00 45:43 48:00 51:00 54:00 56:33

SessionStart

Duration ofpause between two

sub-sessions[d.hh:mm:ss]

Explicit pause(sub-sessions)

Implicit pause(idle time) Minimization

CommitSessionEnd

UI View

Activity Timeline

Color LegendUI View

OpenActivate

EditUnderstanding

Commit

10:20 20:12

3:00 6:00 18:35 21:00 23:00 45:43 48:00 51:00 54:00 56:33

TOPTra

Color LegendUI View

OpenActivate

EditUnderstanding

Commit

10:20 20:12

3:00 6:00 18:35 21:00 23:00 45:43 48:00 51:00 54:00 56:33

Track 2 (container)

Color LegendUI View

OpenActivate

EditUnderstanding

Commit

10:20 20:12

3:00 6:00 18:35 21:00 23:00 45:43 48:00 51:00 54:00 56:33

Track 2 (container)

Color LegendUI View

OpenActivate

EditUnderstanding

Commit

10:20 20:12

3:00 6:00 18:35 21:00 23:00 45:43 48:00 51:00 54:00 56:33

MainTrack 2 (container)

Color LegendUI View

OpenActivate

EditUnderstanding

Commit

10:20 20:12

3:00 6:00 18:35 21:00 23:00 45:43 48:00 51:00 54:00 56:33

Window Interactions

Minimized

Color LegendUI View

OpenActivate

EditUnderstanding

Commit

10:20 20:12

3:00 6:00 18:35 21:00 23:00 45:43 48:00 51:00 54:00 56:33

Window Interactions

Minimized

Color LegendUI View

OpenActivate

EditUnderstanding

Commit

10:20 20:12

3:00 6:00 18:35 21:00 23:00 45:43 48:00 51:00 54:00 56:33

Development Activities

Telling Development Stories

Home Sweet Home

Window Plague

Curing The Plague

Inspection Valley

CategorizingDevelopmentSessions

BOTTOM

Dominant Tracks

A track with a privileged role in the development.

i.e., with predominant focus time and concentration of edit events.

Dominant Tracks

- Single Track - Multi Track- Fragmented Track

BOTTOM

Dominant Tracks

- Single Track- Multi Track - Fragmented Track

BOTTOM

Dominant Tracks

- Single Track- Multi Track- Fragmented Track

BOTTOM

Dominant Tracks Track Flow

Describes the way the developer alternates from different window tracks

- Sequential Flow - Ping-Pong

- Sequential Flow- Ping-Pong

Fragmented 40%

Single-Track 39%

Multi-Track 14%

Fragmented 40%

Single-Track 39%

Multi-Track 14%

Ping-Pong 16%

Sequential 76%

>50% sessions are Single-Track

>65% sessions are Fragmented

>50% sessions are Single-Track

>65% sessions are Fragmented

All prefer Sequential flow

Has 43% Ping-Pong sessions

Program understanding:Challenge for the 1990s

In the Program Understanding Project at IBM's Re-search Division, work began in late 1986 on toolswhich could help programmers in two key areas: staticanalysis (reading the code) and dynamic analysis (run-ning the code). The work is reported in the companionpapers by Cleveland and by Pazel in this issue. Thehistory and background which motivated and whichled to the start of this research on tools to assistprogrammers in understanding existing program codeis reported here.

" If the poor workman hates his tools, the goodworkman hates poor tools. The work of theworkingman is, in a sense, defined by his tool-witness the way in which the tool is so often taken tosymbolize the worker: the tri-squarefor the carpenter,the trowel for the mason, the transit for the surveyor,the camera for the photographer, the hammer for thelaborer, and the sickle for the farmer.

"Working with defective or poorly designed tools,even the finest craftsman is reduced to producinginferior work, and is thereby reduced to being aninferior craftsman. No craftsman, if he aspires to thehighest work in his profession, will accept such tools;and no employer, if he appreciates the quality ofwork, will ask the craftsman to accept them. "I

Today a variety of motivators are causing corpora-tions to invest in software tools to increase softwareproductivity, including: (1) increased demand forsoftware, (2) limited supply of software engineers, (3)rising expectations of support from software engi-neers, and (4) reduced hardware costs.' A key moti-

294 CORBI

by T. A. Corbi

vator for software tools in the 1990swill be the resultof having software evolve over the previous decadesfrom several-thousand-line, sequential programmingsystems into multimillion-line, multitasking "busi-ness-critical" systems. As the programming systemswritten in the 1960s and 1970s continue to mature,the focus for software tools will shift from tools thathelp develop new programming systems to tools thathelp us understand and enhance aging programmingsystems.

In the 1970s, the work of Belady and Lehman"?strongly suggested that all large programs wouldundergo significant change during the in-servicephase of their life cycle, regardless of the a prioriintentions of the organization. Clearly, they wereright. As an industry, we have continued to growand change our large software systems to:

• Remove defects• Address new requirements• Improve design and/or performance• Interface to new programs• Adjust to changes in data structures or formats• Exploit new hardware and software featuresAs we extended the lifetimes of our systems bycontinuing to modify and enhance them, we alsoe Copyright 1989by International BusinessMachines Corporation.Copying in printed form for private use is permitted withoutpayment of royalty provided that (I) each reproduction is donewithout alteration and (2) the Journalreference and IBMcopyrightnotice are included on the first page. The title and abstract, but noother portions, of this paper may be copied or distributed royaltyfree without further permission by computer-based and otherinformation-service systems. Permission to republish any otherportion of this paper must be obtained from the Editor.

IBM SYSTEMS JOURNAL, VOL 28, NO 2, 1989

…up to 50%

navigate

user interface

program understanding

Program understanding:Challenge for the 1990s

In the Program Understanding Project at IBM's Re-search Division, work began in late 1986 on toolswhich could help programmers in two key areas: staticanalysis (reading the code) and dynamic analysis (run-ning the code). The work is reported in the companionpapers by Cleveland and by Pazel in this issue. Thehistory and background which motivated and whichled to the start of this research on tools to assistprogrammers in understanding existing program codeis reported here.

" If the poor workman hates his tools, the goodworkman hates poor tools. The work of theworkingman is, in a sense, defined by his tool-witness the way in which the tool is so often taken tosymbolize the worker: the tri-squarefor the carpenter,the trowel for the mason, the transit for the surveyor,the camera for the photographer, the hammer for thelaborer, and the sickle for the farmer.

"Working with defective or poorly designed tools,even the finest craftsman is reduced to producinginferior work, and is thereby reduced to being aninferior craftsman. No craftsman, if he aspires to thehighest work in his profession, will accept such tools;and no employer, if he appreciates the quality ofwork, will ask the craftsman to accept them. "I

Today a variety of motivators are causing corpora-tions to invest in software tools to increase softwareproductivity, including: (1) increased demand forsoftware, (2) limited supply of software engineers, (3)rising expectations of support from software engi-neers, and (4) reduced hardware costs.' A key moti-

294 CORBI

by T. A. Corbi

vator for software tools in the 1990swill be the resultof having software evolve over the previous decadesfrom several-thousand-line, sequential programmingsystems into multimillion-line, multitasking "busi-ness-critical" systems. As the programming systemswritten in the 1960s and 1970s continue to mature,the focus for software tools will shift from tools thathelp develop new programming systems to tools thathelp us understand and enhance aging programmingsystems.

In the 1970s, the work of Belady and Lehman"?strongly suggested that all large programs wouldundergo significant change during the in-servicephase of their life cycle, regardless of the a prioriintentions of the organization. Clearly, they wereright. As an industry, we have continued to growand change our large software systems to:

• Remove defects• Address new requirements• Improve design and/or performance• Interface to new programs• Adjust to changes in data structures or formats• Exploit new hardware and software featuresAs we extended the lifetimes of our systems bycontinuing to modify and enhance them, we alsoe Copyright 1989by International BusinessMachines Corporation.Copying in printed form for private use is permitted withoutpayment of royalty provided that (I) each reproduction is donewithout alteration and (2) the Journalreference and IBMcopyrightnotice are included on the first page. The title and abstract, but noother portions, of this paper may be copied or distributed royaltyfree without further permission by computer-based and otherinformation-service systems. Permission to republish any otherportion of this paper must be obtained from the Editor.

IBM SYSTEMS JOURNAL, VOL 28, NO 2, 1989

I Know What You Did Last SummerAn Investigation of How Developers Spend Their Time

Roberto Minelli, Andrea Mocci and Michele LanzaREVEAL @ Faculty of Informatics — University of Lugano, Switzerland

Abstract—Developing software is a complex mental activity,requiring extensive technical knowledge and abstraction capa-bilities. The tangible part of development is the use of tools toread, inspect, edit, and manipulate source code, usually throughan IDE (integrated development environment). Common claimsabout software development include that program comprehensiontakes up half of the time of a developer, or that certain UI(user interface) paradigms of IDEs offer insufficient support todevelopers. Such claims are often based on anecdotal evidence,throwing up the question of whether they can be corroboratedon more solid grounds.

We present an in-depth analysis of how developers spend theirtime, based on a fine-grained IDE interaction dataset consistingof ca. 740 development sessions by 18 developers, amounting to200 hours of development time and 5 million of IDE events. Wepropose an inference model of development activities to preciselymeasure the time spent in editing, navigating and searching forartifacts, interacting with the UI of the IDE, and performingcorollary activities, such as inspection and debugging. We reportseveral interesting findings which in part confirm and reinforcesome common claims, but also disconfirm other beliefs aboutsoftware development.

I. INTRODUCTION

Software development is a complex activity that requiresboth technical knowledge and extensive abstraction capabili-ties [1]. Even if the outcome of software development is code,the development process is far from being just code writing[2]. In fact, software systems are so large and complex [3]that developers need to build links between the source codeand their mental model (e.g., [4], [5], [6]).

Constructing the mental model is the base for programcomprehension, and has a key role in the development process:Researchers estimated that it absorbs half of the time of adeveloper [7], [8], [9]. Such a claim, however, is mainly basedon anecdotal evidence, throwing up the question of whether itcan be corroborated or confuted on more solid grounds.

While being a fundamental part of software development, itis unclear how program comprehension is supported by IDEs(Integrated Development Environments): IDEs offer varioustools and facilities to support the development process, like(1) Code Editors, (2) Code Browsers, and (3) Debuggers[10], [11]. None of these components is dedicated to programcomprehension. Instead, comprehension emerges from thecomplex interleaving of such activities. Moreover, researchersdiscovered that some UI paradigms (e.g., windows- or tabs-based IDEs) may negatively influence development, hinder-ing comprehension and generally developer productivity [12].While this claim is intuitively convincing, there is no quanti-tative evidence on how much time is spent on simply fiddlingwith the UI of an IDE.

While interacting with IDEs, developers generate a largeamount of data [13], [14]. These interactions happen at differ-ent levels of abstraction: they can be conceptual events, likeadding a method to a class, or low level events like pressingkeys or moving the mouse to navigate between entities. Whileresearchers believe that this data is valuable [14], [15], mostof these interactions are not captured.

Our aim is to record interaction data and measure the timeeffectively devoted to different activities. With this data, weprovide insights on the distribution of development activitiesand we quantitatively answer questions like:

1) What is the effective time spent in program comprehen-sion/understanding? What about the other activities likeediting, inspecting and navigating through code entities?

2) How much time do developers spend in fiddling with theUI of an IDE?

3) What is the impact of the fragmentation of the develop-ment flow?

We present an in-depth analysis on how developers spendtheir time. We collected fine-grained interaction data usingDFLOW [16], our non-intrusive profiler for the PHARO IDE1.Our dataset consists of 740 development sessions of 18 devel-opers. DFLOW collected about 200 hours of development time,amounting to more than 5 million of IDE events. We proposean inference model of development activities to preciselymeasure the time spent in editing, navigating and searching forartifacts, interacting with the UI of the IDE, and performingactivities such as inspection and debugging. Among our find-ings, we discovered that program comprehension absorbs moretime than generally assumed, and that fiddling with the UI ofIDEs can substantially hinder the productivity of developers.We make the following contributions:

• An in-depth analysis of how developers spend their time;• An inference model of development activities to precisely

measure the time of different activities;• A comparison between our findings and the common

claims and beliefs available in the literature;• A brief presentation of DFLOW, the tool with which we

collect interaction data.Structure of the Paper: Section II presents our dataset

and the target IDE. In Section III we explain how we makesense of the data and present the results, and in Section IVwe illustrate our findings. Section V discusses the related workwhile Section VI concludes our work.

1See http://pharo.org

Searchstarts

Searchends

ActiveWindows t

Windowactivated

Out / Inin the IDE

Windowactivated

W1 W2 W3 W2 W4

Methodsaved

>RT >RT

tMS1 MS2 KS4 MS3 MS4 MS5KS1 KS2 KS3

tA1 A3 A4 A5 A6 A7A2

Editing

Understanding

Navigation

User Interface

Outside the IDE

Time spent outside the IDE

Understanding time

Number of times outside the IDE

Understanding time

Number of times outside the IDE

Understanding time

User Interface time

Fragmentation: Outside the IDE

User Interface time

library packages:

org.w3c.dom

(classes) AllTests

CH.ifa.draw.test

library packages:

org.w3c.dom

(classes) AllTests

CH.ifa.draw.test

10:20 20:12

3:00 6:00 18:35 21:00 23:00 45:43 48:00 51:00 54:00 56:33

SC+STSC+IE

IE+STSC+ST+IE

PresentPast

SCIEST

library packages:

org.w3c.dom

(classes) AllTests

CH.ifa.draw.test

10:20 20:12

3:00 6:00 18:35 21:00 23:00 45:43 48:00 51:00 54:00 56:33

@andreamocciR AE E LV

@michelelanza

@richardwettel

@yuriy_tymchuk

@robertominelli

SC+STSC+IE

IE+STSC+ST+IE

PresentPast

SCIEST

https://creativecommons.org/licenses/by/3.0/

Some slides based on http://www.slideshare.net/wettel/visual-exploration-of-largescale-evolving-software

andrea mocci: beautiful design, beautiful coding at i t.a.k.e. unconference 2015

Software

t.a.k.e public relations

beautiful in my eyes – beautiful girls with beautiful...

james lewis: microservices - systems that are #neverdone at...

adi bolboacă: architecture for disaster resistant systems...

interno, internazionale comunitario e...

alex bolboacă: why you should start using docker at i...

andrei petcu: rocket vs docker: battle for the linux...

cyrille martraire: living documentation jumpstart at i...

abstract data types - massachusetts institute of...

andrew hall: power up: learn how to recharge your energy bar...

bdd with cucumber-jvm as presented at i t.a.k.e....

mocci – it – rif session 5 – paper 999

cyrille martraire: monoids, monoids everywhere! at i...

la principessa che amava i film horror – a. de santa, d....

coordinatori: giammarco mocci, giorgia bodini, raffaele ......

problematiche attuali in medicina del lavoro: stress, burn...

8 mocci cibien 2009

igor popov: mutation testing at i t.a.k.e. unconference 2015

beautiful one beautiful one i love beautiful one i adore...

walking skeleton i t.a.k.e. 2015