a live coding editor · prototype is very low fidelity and has some basic usabil-ity flaws (most...

69
by Björn Heinen Thesis at the Media Computing Group Prof. Dr. Jan Borchers Computer Science Department RWTH Aachen University Thesis advisor: Second examiner: Registration date: 12.06.2012 Submission date: 12.09.2012 A Live Coding Editor Prof. Dr. Bernhard Rumpe Prof. Dr. Jan Borchers

Upload: others

Post on 15-Oct-2020

7 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

byBjoumlrn Heinen

Thesis at theMedia Computing GroupProf Dr Jan BorchersComputer Science DepartmentRWTH Aachen University

Thesis advisor

Second examiner

Registration date 12062012Submission date 12092012

A Live Coding Editor

Prof Dr Bernhard Rumpe

Prof Dr Jan Borchers

iii

I hereby declare that I have created this work completely onmy own and used no other sources or tools than the oneslisted and that I have marked any citations accordingly

Hiermit versichere ich dass ich die vorliegende Arbeitselbstandig verfasst und keine anderen als die angegebe-nen Quellen und Hilfsmittel benutzt sowie Zitate kenntlichgemacht habe

Aachen September2012Bjorn Heinen

v

Contents

Abstract ix

Acknowledgements xi

1 Introduction 1

2 Related Work 5

3 First User Study 11

31 The Prototypes 11

311 Interface Version 1 12

312 Interface Version 2 13

313 Interface Version 3 14

314 Interface Version 4 14

315 Interface Version 5 14

32 Course of the User Study 15

33 Results 16

331 Further Comments and Suggestionsfor Improvement 20

vi Contents

332 Conclusions 21

4 Implementation 27

41 First Approach (discharged) 27

42 Basic Evaluation Technique 28

43 How Values Are Being Calculated 29

44 Problems 31

5 Second User Study 35

51 Course of the User Study 35

52 Results 37

521 Debugging Times 37

522 System Usability Scale 38

523 Qualitative Results 39

524 Discussion 41

6 Summary and future work 43

61 Summary 43

62 Future Work 44

A Code for the Second User Study 47

B System Usability Scales 49

Bibliography 53

vii

List of Figures

11 A Live Coding Editor presented by Bret Vic-tor [Victor] 2

12 The key parameter has been changed 3

21 rdquoA screenshot of the Rehearse editor (1)The function declaration parameter namesand current values (2) a statement that hasbeen executed and (3) the result of execu-tion (4) an undone statement (5) the currentlinerdquo [Brandt et al] 6

22 A screenshot of the plug-in presentedby [Edwards 2004] 7

31 Version 1 The evaluated version of the codeis on the right-hand side and keeps its struc-ture and syntax highlighting 22

32 Version 2 The evaluated version of the codeis on the right-hand side but is ranged right 23

33 Version 3 Small pop-ups indicate the valueof a variable 23

34 Version 4 The evaluated version of a line isinserted right behind the line itself 24

viii List of Figures

35 Version 5 Only the values of the variablesare in between the lines 25

36 Interface after the first user study 26

41 Screenshot of Brackets with our extension 34

51 Box plot for the different task completion times 38

52 Box plot for the SUS-values with and with-out extension 39

53 Box plot for the additional points on thequestionnaire 40

61 Version 1 The evaluated version of the codeis on the right-hand side and keeps its struc-ture and syntax highlighting 45

B1 SUS questionnaire for debugging usingBrackets 50

B2 SUS questionnaire for the Live Coding exten-sion 51

ix

Abstract

Live Coding seeks to provide an environment that helps programmers to under-stand the internal behavior of a program with the help of examples The goal ofthis thesis is to create an editor that takes exemplary parameters for a function andcontinuously executes the current version of the code with those parameters Tohelp the programmer understand the control flow of the program not only is theaccording return value of the function continually updated also the environmentshows what happens with the exemplary parameters inside the code between in-put and output A qualitative study with 6 users has been conducted to find thebest way of displaying the link between the code and the constantly updated eval-uations The result an environment that keeps a traditional editor on the left-handside and an evaluated version of the current code (keeping syntax highlighting andindents) on the right-hand side has been implemented in form of an extension forthe open source editor Brackets Finally a study with 20 users has been conductedto find out if there is an improvement in the debugging speed of experienced pro-grammers Although the user study yielded non-significant results there is evi-dence that there is an improvement in the debugging speed and that significantresults can be obtained

xi

Acknowledgements

First of all I would like to thank everybody who participated in my user studiesYour valuable informations and feedback enlarged my opportunities

Furthermore Jan-Peter Kramer deserves my thanksWithout whom this work would consist of nothing but blanks

I would like to thank all my good friends tooWhose long-standing amity I am looking forward to

Last but not least special thanks to my family particularly my parentsAnd unfortunately this is where my English ends

1

Chapter 1

Introduction

Most of todaysrsquos IDEs make a clear separation betweenediting the code and debugging it To switch from pro- Overhead results

from the separationbetween editing anddebugging

gramming to debugging and vice versa usually theprogrammer has to use another mode in the IDE This waynot only looses the programmer a significant amount oftime through the emerging overhead but the working flowgets interrupted too [Edwards 2004]

Permanently having to switch between modes has anotherdrawback The probability of loosing time because the pro- Wasted development

time throughIgnorance Time andFix Time

grammer continues coding after a bug has been introducedincreases considerably [Saff and Ernst 2004b] reported onthis phenomenon They said that larger Ignorance Times(the time between introducing a bug and becoming awareof it) yield larger Fix Times (the time between becomingaware of a bug and fixing it) Therefor an IDE shouldprovide tools to keep the Ignorance Time as well as theoverhead produced by switching between different modesas low as possible

When Bret Victor gave his speech rdquoInventing on Principlerdquohe was talking about something very similar Direct delay-free feedback To show what he was talking about Victorpresented an editor that continuously executes the code assoon as it has been entered and not only shows the actual

2 1 Introduction

result but also reveals information about its internal behav-ior Figure 11 shows how his interface looks like using theexample of the binary search algorithm

Figure 11 A Live Coding Editor presented by Bret Vic-tor [Victor]

On the left-hand side the code can be changed as usual Atthe top of the right-hand side exemplary parameters canbe entered for which the code then gets executed Furtherdown the values for each variable in each iteration andthe return value are shown For example in the line varBret Victor presented

an editor that givesimmediate

information about aprogramrsquos internal

behavior

high = arraylength - 1 the editor says high =5 This way the programmer does not have to imaginewhat happens between executing the code with exemplaryparameters and getting back the according result but canactually see it As mentioned above this happens as thecode gets edited so there is no delay between a change andseeing its impact on the code Figure 12 shows an exampleof how helpful this can be The key the algorithm searchesfor has been changed to rsquogrsquo which is not in the arraySince the invariant of the loop is 1 the algorithm can onlyterminate when the key has been found in the array In thiscase however the loop iterates endlessly The solution isto replace the invariant by low le high A solution thatis obvious once the programmer sees that in the fourthiteration low gets greater than high and nothing elsechanges anymore

The direct immediate feedback Victor was talking about isthe basic idea of my thesis Since he designed his editor

3

Figure 12 The key parameter has been changed

only to show what he was talking about in his speech hisprototype is very low fidelity and has some basic usabil-ity flaws (most importantly Only 5 iterations can be dis-played and the functionality is very limited) Furthermorehis editor has never been evaluated neither in a qualitativeuser study nor in a quantitative one My thesis will pickup his idea develop it and finally evaluate it To do sotwo iterations of the DIA-cycle (that is Design-Implement-Analyze) have been done We started with a literature re-search (see Chapter 2) to get an overview what research hasalready been done with regards to Live Coding We then Overview of the

thesisimplemented a low fidelity prototype and evaluated it in aqualitative user study to find out which basic interface de-sign provides as much analogy between the code and itsevaluation (see Chapter 3) Afterwards we implemented asoftware-prototype (see Chapter 4) to conduct the secondquantitative user study which should reveal if the interfaceenhances the work flow significantly (see Chapter 5)

5

Chapter 2

Related Work

The idea of maintaining a continuous connection betweencode and output is not not new Already back in 1985 The idea of

continuouslyexecuting code hasalready beenpresented in 1985

[Henderson and Weiser 1985] stated that the classicalEdit-(Compile-)Execute cycle where program editing andexecution are independent activities is not optimal Theystated that the relation between input and output shouldcontinuously be kept consistent

[Pane and Myers 1996] proposed that a system shouldsupport incremental testing and feedback and help detect Incremental testing

and feedbackimprove productivity

diagnose and recover from errors Although they wererecommending heuristics to improve novice programmingsystems these advices also help improving the productiv-ity of experts

This concurs with [Ko et al 2004] who reported on learn-ing barriers in programming systems Among others they 2 learning barriers

UnderstandingBarriers andInformation Barriers

reported on the so called Understanding Barriers and Infor-mation Barriers Understanding Barriers happen when codebehaves differently from what the user expected If forexample there is a compile-time error and the user cannotextract which part of the code is deemed wrong by the com-piler from the error message then this is an UnderstandingBarrier In the user study Ko et al conducted most (34 of38) of these Understanding Barriers were insurmountable

6 2 Related Work

Information Barriers are elements in the environmentthat make the internal behavior of the program hard tounderstand They happen when a user builds a hypothesisabout the internal behavior of the program but cannot use(or find) a tool the environment provides to verify thathypothesis In Ko et alrsquos user study most of these under-standing barriers (10 of 14) were insurmountable To helpthe programmer an IDE should reveal what a programdid or did not do at compile or runtime and in addition itshould be possible to inspect a programrsquos internal behavior

Although the idea of helping the programmer under-stand the internal behavior of a program by example isnot new current environments only offer little help forCurrent

environments offerlittle help for

understanding andadapting examples

understanding and adapting examples Therefore [Brandtet al] have implemented Rehearse an editor that providesimmediate evaluation and infinite undo of executionA screenshot of this editor can be seen in Figure 21Although Rehearse embodies the immediate feedback

Figure 21 rdquoA screenshot of the Rehearse editor (1) Thefunction declaration parameter names and current val-ues (2) a statement that has been executed and (3) the re-sult of execution (4) an undone statement (5) the currentlinerdquo [Brandt et al]

and the maximal illumination we are striving for thereare issues with it First Rehearse disturbs the user inhis usual programming practices For example it is notAdvantages and

disadvantages ofRehearse

possible to edit a function in the usual editor while it isopen in Rehearse This means that there are two ways of

7

editing a function (either the usual way with a standardeditor or interactively with Rehearse) and they are mutualexclusive (this issue has also been reported as a resultof the user study Brandt et al conducted) We think itis best to integrate the interactive functionality into thestandard one so that the user does not again have toswitch between different modes Additionally Rehearseprovides functionality to help the programmer explorealternative paths by undoing arbitrarily many executedlines while the goal of our editor is to show what hap-pens in the code with exemplary parameters to help theprogrammer understand what internal behavior the codehas This way the programmer can work uninterruptedand still get information about the behavior of his program

Similarly [Edwards 2004] stated that it improves the workflow if the programmer does not have to leave the environ-ment to see what result the execution of an example yieldsHe says the best way to understand code is by thinkingof concrete examples To solve the problem that the pro- Examples help

understanding codegrammer has to emulate the behavior of the program in hishead as much code as possible should be illuminated withexamples Edwards built a plug-in for Eclipse for whicha screenshot can be seen in Figure 22 The results on the

Figure 22 A screenshot of the plug-in presented by [Ed-wards 2004]

left-hand side of the interface get updated as soon as thereis a change in the code From a theoretical point of viewthis is exactly what we intend to do The code is continu-

8 2 Related Work

ously executed for exemplary parameters while the user isworking on it From a practical point of view though thisplug-in has several drawbacks First the analogy betweenIdea and drawbacks

of Edwardsrsquo plug-in code and evaluation is not very strong (eg not as strongas the one in Rehearse) so it is not as easy for the userto follow the control flow of the program as it should beThere are further usability issues Skipped code is greyedout This is a good idea since the user can see which partof the code has been executed and which part has not Theproblem is to see for which parameters the code has beenskipped In Figure 22 for example fact has been calledwith the values 1 and 3 and in addition subsequentlywith the parameters 0 and 2 (resulting from the recursivecall of fact in the else statement) For which of theparameters the else statement has been skipped is not asobvious as it could be Furthermore it is suboptimal tocall the function with exemplary parameters in the codeitself This way a programmer cannot write code as usualand just extract the additional information but has to insertrdquoforeignrdquo code to be able to see what happens inside thecode and then later has to remove the additional codeagain because it has nothing to do with the functionalitythe programmer wanted to write in the first place FinallyEdwards presented the plug-in without evaluation of anykind so a remaining task is to find out if or how much thiskind of feedback improves the work flow (see Chapter 5)

In [Boehm 1979] it was reported that modifying softwarebasically consists of three phasesHow modifying

software works

bull Understanding the existing software

bull Modifying it

bull Revalidating the modified version

We believe that our interface helps improve (thereby speedup) this whole process (our second user studyrsquos goal wasto find out whether this is true see Chapter 5) Thissupposition is supported by [Erdogmus et al 2005] whereit is said that instant feedback whether new functionality

9

has been implemented properly helps reducing wasteddevelopment time

Finally our approach concurs with the idea of Opportunis-tic Programming [Brandt et al 2008] [Brandt et al 2009a][Brandt et al 2009b] Brandt et al say that frequent itera-tion is a necessary part of learning and understanding un-familiar code Therefore an opportunistic programmer will Opportunistic

programmers choosetools that speed upiteration

select a tool that speeds up iteration They also state that theEdit-Debug cycles in opportunistic programming are veryshort About 50 of these cycles are shorter than 30 sec-onds and even 80 are shorter than 5 minutes Our toolwould help reducing the resulting overhead since the out-come is instantly visible without having to change modesor compile and execute

11

Chapter 3

First User Study

31 The Prototypes

The main goal of our first user study was to find out whichbasic interface design is the optimal one especially with Goal of the first user

studyregards to the link between the code and the evaluatedversion of the code It should work in a way such that theprogrammer does not get disturbed in his usual work butthat the interface still provides as much information asaccessible as possible

To find out how such an interface would look like we im-plemented low fidelity prototypes for 5 different interfacesin Ruby Basically these prototypes consisted of pictures 5 low fidelity

prototypes for Rubyhave beenimplemented

that show how the interface would look like for some ex-emplary code with exemplary parameters Each interfacehas been illustrated using 5 pieces of code as example (Fig-ures 31-35) Note The font used in all prototypes was sup-posed to emphasize that these prototypes are low fidelityso that the users had not inhibitions telling us fundamentalflaws in the interface design

12 3 First User Study

311 Interface Version 1

Figure 31 shows the first interface with an implementationof the binary search algorithm In this version of theinterface an evaluated version of the code is inserted in anadditional section in the right half of the window At thetop the exemplary parameters can be manipulated Thenin the line high=arraylength-1 the first evaluationtakes place On the right-hand side of the window theright-hand side of the assignment has been replaced bythe according evaluation 5 Afterwards all values are re-Interface variant 1

divides theenvironment into twoThe left-hand side is

a usual editor andthe right-hand side isthe evaluated version

of the code

placed in a similar way That is in simple assignments theright-hand side is substituted completely and in invariantsthe condition has been computed up to the penultimatestep This way eg while(low lt= high) gets evaluatedto while(0 lt= 5) instead of while(true)

EXCURSUSWe also thought about showing what would happen in-side an if statement even if it fails so that the user couldsee what the code could do alternatively We dischargedthis idea however because of two problems Firstly wesuspected that this might quickly become a confusing ex-perience for the user Secondly (and most importantly)the Halting Problem would have to be decidable Sup-pose the following code had to be evaluatedif(xgt0)while(x=0)x--else while(x=0)x++Then the interface would fall into an infinite loop (for anyvalue of x except 0) although the code would not Forx=1 for example the actual code would only make oneiteration in the first loop The interface however wouldtry to calculate the values of the second loop too

Further down in Figure 31(a) there is the following ele-ment

1This element will subsequently be called iterator Theiterator indicates which iteration of a loop is being shownPressing the right arrow increments the shown iteration(Figure 31(b)) pressing the left arrow decrements itaccordingly Furthermore the code for the elsif and theelse statements is not visible on the right-hand side of the

31 The Prototypes 13

window The reason is that the first if statement holdsand therefore the following ones are not executed InFigure 31(b) the first if statement failed (greyed out) thesecond one holds and the third one is hidden again Thisway all code that gets executed is shown (corresponding toJonathan Edwards who said that as much code as possibleshould be rdquoilluminatedrdquo[Edwards 2004])

312 Interface Version 2

Figure 32 shows the second version of the interface withan in place implementation of the selection sort algorithm(note that each interface variant has been implemented Interface variant 2

copies a nothighlightedevaluated version ofthe code to the rightpart of the window

with each algorithm) The differences to the previous in-terface variant are

bull The parameter names are red to signal manipulability

bull No syntax highlighting

bull Evaluated code has no indents and is right aligned

bull No dividing line

EXCURSUSNote that the iterator in this interface looks like follows

i=1This design reveals what variable the iterator refers toThis kind of iterator does not depend on the type of in-terface but on the type of loop In for and upto loopsthis kind of iterator has been used since it is possible tocorrelate the values of one variable and the iterations ofa loop In the later prototypes however we dischargedthis idea again because it only was this easy to correlatea variable and the iterations of one loop because the ex-amples were quite simple In realistic situations this cor-relation will oftentimes not be possible

This version was supposed to take up as little space aspossible while still having a complete (evaluated) versionof the code in a different place

14 3 First User Study

313 Interface Version 3

Figure 33 shows version 3 of the interface with an im-plementation of Bubblesort Small pop-ups indicate theInterface variant 3

uses small pop-upsthat indicate the

value of a statement

value of all statements that are evaluable For examplethe (manipulable and therefore red) value of array is[1423716] and the value of arraylength-2 is 2There are no pop-ups in the code when it is not executedand failed invariants are greyed out The iterator workslike the one in the example for version 2 of the interface butis now inserted into the code This interfacersquos purpose wasto consume as little space as possible while still providingall possible evaluations

314 Interface Version 4

Figure 34 shows interface version 4 with an algorithm thatconverts the fractional part of a decimal number into its bi-Interface variant 4

puts the evaluatedversion of a line

between that line andthe next one

nary representation and outputs the result to the consoleThis interface variant slots the evaluated version of a lineof code right between that line and the next one The nameof manipulable parameters is highlighted in red and thereis no syntax highlighting Failed invariants are greyed outand not executed code is not displayed as usual The goalof this interface version was to find out if the link betweencode and evaluation becomes clearer when the evaluationis very close and very similar to the code it results from

315 Interface Version 5

The last variant of the interface can be seen in Figure 35with an algorithm that checks whether a given number nInterface variant 5

indicates the value ofa statement by

writing it underneaththat statement

without keepingkeywords

is a prime number or not It is similar to the previous onebut only shows rdquorelevantrdquo information This includes right-hand sides of equations and for conditions the actual in-variant but excludes left-hand sides of equations and key-words like else or while Apart from this version 4 and5 are exactly the same (parameters in red no syntax high-lighting not executed code is not shown failed invariants

32 Course of the User Study 15

are greyed out) We developed this variant because in ver-sion 4 the user might confuse actual code with evaluatedcode We wanted to check if a variant without replicatingcontrol structures is better (less confusing)

32 Course of the User Study

The first user study was a qualitative user study It wassupposed to reveal what basic interface design is preferredby users and where there might be flaws in the interfacesDifferent techniques for qualitative user studies have beenused in particular the Model Extraction method and theThink Aloud method The basic structure was the following

bull First the experimenter introduced himself and theproject He assured the participant that he would Introduction to the

user studystay anonymous and that he could ask questions ormake comments whenever he likes To make sure theModel Extraction worked fine the part concerningwhat the thesis and the user study where about wasvery firm The solution we have come up with wasto explain the general idea of Live Coding (that weare designing a kind of editor that takes exemplaryparameters for a method the programmer works onand then evaluates it continuously while changes arebeing made) without revealing anything further likerdquoand then there is going to be a button with whichyou can flick from iteration to iterationrdquo

bull Afterwards one version of the interface was pre-sented to the user together with the following ques- Model Extractiontions rdquoWhat do you think you see hererdquo and rdquoWhatdo you think you can do with itrdquo

bull After the user told everything he or she knew or if theuser had a gulf of understanding [Norman 2002] the Explanation of the

interfaceparts of the interface which the user did not under-stand were explained and we went through all itera-tions once to make clear how the interface works

bull The next goal was to find out if the user had prob- Suggestions forimprovementlems in understanding elements of the interface if he

16 3 First User Study

thinks that there are things that can be done better etcetera

bull Lastly the other interface variants were shown to theuser with the same piece of code and it was explainedGoing through the

remaining interfacevariants

how they worked how they differ from the previousones and we went through all iterations once so thatthe user could get a feeling for the interface Aftereach variant the user was asked what he thought ofthe interface (and why) and what he thought was bet-ter or worse compared to the previous ones

Whenever the user had a question or there was an inter-esting topic that lead to more interesting information thispoint was discussed in more depth

33 Results

The user study was conducted with 6 different users All ofthem studied Computer Science were in the sixth semesterand had between 4 and 10 years of programming experi-ence Only one of them had worked with Ruby before Allusers took part voluntarily and were not rewarded in anyway The conditions were randomized and looked like fol-lows

User ID Algorithm Order of versions1 Bubblesort 312452 Selection sort 453213 Fractional part10 rarr Fractional part2 253414 Prime number test 135425 Binary search 524316 Binary search 34152

This means that for example user 1 saw interface version3 first and then versions 1245 in that order In all variantshe worked with the Bubblesort algorithm

The user study yielded two major results

33 Results 17

bull Most users liked version 1 better than the other ones

bull Not a single user understood what the iterator is goodfor or what it might do

In more detail Three users (users 1 2 and 4) stated clearlythat they liked version 1 the most User 3 was not quite 3 users liked version

1 the most 1 userliked version 4 themost 1 user couldnot decide betweenversion 1 and 2 and1 user could notdecide betweenversion 1 and 3

sure whether he liked version 1 or version 2 more Hestated that a mix between the two variants where thereis no syntax highlighting would be the best because hegot confused the first time he saw version 1 and did notdirectly see that the code on the right-hand side of thewindow is not actually code but the evaluation User 6 wasnot sure if he preferred version 1 or version 3 He statedthat if there is enough space on the screen version 1 isbetter but with all the other elements of a typical IDE aline of code might oftentimes be too long so that there is noplace to show the line itself as well as its evaluated version(a concern that user 3 communicated as well) Withoutthis problem though user 6 said that version 3 is worsebecause the interface might get messy and distractingquickly because of all the pop-ups right next to the codeIn addition he said with more complex code it might beproblematic to see which pop-up refers to which part ofthe code Finally user 5 stated that he likes version 4 themost The reason was that he did not have to look to theright if he wanted to see the value of a variable and thensearch it there but that in version 5 the information wasexactly at its source He only said that this version mightbe problematic to work with since the reading flow is influ-enced negatively and he feared that he would loose muchspace on the screen and would therefore have to scroll a lot

Each user stated that the main benefit of version 1 and2 was that the actual code remained unaffected Thiscorresponds to what [DeLine et al 2006] reported on theperceptual model a programmer builds of the code When Perceptual model of

the code shouldremain unaffected

navigating through source code a programmer uses aperceptual model to do so This way the programmer doesnot need to read the code (cognitive) to know which part heis seeing but can do so by looking at its rdquovisual landmarksrdquo(perceptual) While version 3 affects this perceptual modelnegatively but does not break it version 4 and 5 have the

18 3 First User Study

disadvantage to do so

To most of the users the basic structure of the interfaceswas clear after only a few seconds Only two of themconfused actual code with the evaluation for a short timeUser 5 when he saw version 5 the first time and user 4when he saw version 1 the first time However neither ofthem needed help or much time to figure how the interfaceworked after they started to read more precisely Eachuserrsquos first question or comment concerned the font andhow awful they (every one) thought it was After a briefexplanation why we chose this particular font each usergot used to it quickly

No user understood the iterator None of them got theconnection to the while upto or for next to the iteratorIterator is not self

explaining After the explanation what the iterator does some userssaid they thought the left and right buttons were rdquoltrdquo andrdquogtrdquo signs

The following tables summarize the pros and contras foreach interface

Version 1Pros Contrasmiddot The control structure (in-dents) is still presentmiddot Because of the indentsand the syntax highlightingthere is a strong analogy be-tween the code and its eval-uated versionmiddot There is a clear separationbetween code and evalua-tionmiddot The editor remains un-changed

middot Needs much spacemiddot User has to look from theleft to the right and viceversa to see two pieces thatare correlated In otherwords a line of code and itsevaluated version are too faraway from each other

33 Results 19

Version 2Pros Contrasmiddot The editor remains un-changedmiddot Code and evaluation do notget confused because they donot look the samemiddot Needs less space than ver-sion 1

middot Less analogy than in ver-sion 1middot User needs to look fromthe left to the right and viceversa This is being compli-cated by the fact that with-out indents and syntax high-lighting it is even more diffi-cult to find the point of inter-estmiddot Needs a lot of space

Version 3Pros ContrasmiddotNeeds minimalistic amountof spacemiddot The code structure itself re-mains untouched

middot Oftentimes not clear whichpop-up refers to which partof the codemiddot Can get messy quickly

Version 4Pros Contrasmiddot Strong analogymiddot Better for lines of code thatare too long for version 1 and2middot Evaluations right next totheir source

middot Code and evaluation getconfused quickly especiallyusual reading might be prob-lematic since one has to skipevery second line dependingon what he wants to readmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

20 3 First User Study

Version 5Pros Contrasmiddot Evaluations right next totheir sourcemiddot Better for lines of code thatare too long for version 1 and2

middot Oftentimes it is not clearwhat a value refers tomiddot Still distracting during codereadingmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

331 Further Comments and Suggestions for Im-provement

One of the users pointed out that in version 3 the iteratorhas been inserted into the code and therefore it now isInterface version 3

can producesyntactically wrong

code

syntactically wrong (for i in 14 becomes for i=2in 14 for the second iteration for example)

Two users suggested that for version 4 and 5 curly bracescould indicate to which part of the code the evaluationrefers

Some users suggested to make the number on the iteratordirectly manipulable so that if one would want to go toSuggestions for

improving the iterator iteration 42 one would not have to press 41 times a buttonbut could just enter 42 In addition some users said theywanted a button for jumping to the first and a button forjumping to the last iteration Two users said it would begood if there is something that indicates directly that theiterator actually shows iterations like the term rdquoIteration rdquoor anything similar

We noticed that the manipulable parameters and theevaluation results have to be distinguishable more easily

33 Results 21

One user proposed that if the cursor moves over somevariable in the evaluation part a small pop-up couldindicate to which variable a value belongs

One user said that it might be helpful that for biggerboolean expressions that fail it would be visible which partof the expression is responsible for the failure

332 Conclusions

With large screens (24rdquo and more) becoming more andmore common in modern work environments of program- Version 1 is the

favored onemers we can override the concerns about loosing too muchspace with version 1 which is the reason why it is the clearfavorite Figure 36 shows the basic interface after the firstuser study

The iterator has been improved in several ways

bull New buttons for jumping to the first and the last iter-ation have been added

bull There is a text field with which the iteration can bemanipulated directly

bull A tooltip now indicates the variable to which avalue belongs when hovering over the value with themouse

bull The label Iteration has been added at the top of theiterator

The way of manipulating the parameters has been changedThe function header now also is copied to the right side A new way of

manipulating theparameters

but the names of the parameters are replaced with a textfield in which the according value can be entered This wayboth the analogy to the code and the separation betweenparameters and evaluation are being enhanced

22 3 First User Study

(a) First iteration

(b) Second iteration

Figure 31 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

33 Results 23

Figure 32 Version 2 The evaluated version of the code ison the right-hand side but is ranged right

Figure 33 Version 3 Small pop-ups indicate the value of avariable

24 3 First User Study

Figure 34 Version 4 The evaluated version of a line isinserted right behind the line itself

33 Results 25

Figure 35 Version 5 Only the values of the variables arein between the lines

26 3 First User Study

Figure 36 Interface after the first user study

27

Chapter 4

Implementation

After the first user study had been finished and the resultshad been analyzed a software prototype with sufficient A software prototype

had to beimplemented

functionality had to be implemented to conduct the seconduser study Hence we wrote an extension for Brackets1 anopen source editor that has recently been released officially

The language we wrote the extension for was JavaScriptWhat we needed was an extension that could calculate thevalues of the variables that had to been substituted

41 First Approach (discharged)

The initial idea was to write an extension that connects viaV8-Node2 (an extension that maintains a connection to a Initial idea Use

V8-Node an Nodejsto get the locals

Nodejs3 V8 debugger) to an external debugger and lets itevaluate the code Our extension was supposed to supplythe actual code to the debugger and set a breakpoint in thefirst line of actual code Then it read the locals (that is thelocal values for all active variables) and stepped over to the

1httpsgithubcomadobebrackets2httpsgithubcomDennisKehrigbrackets-v8-node-

livetreenodeChildProcess3httpnodejsorg

28 4 Implementation

next line After all locals had been collected the extensionshould substitute all necessary values There were prob-lems with this approach Firstly the extension is not stableThis extension was

not usable for a userstudy

enough to be usable in a user study Secondly it is too slowIt needs more than half a second for one evaluation Thisis way too much since we wanted something that couldevaluate the code as the user edits it without any furtherdelay (hence the name rdquoLive Codingrdquo) With this extensionit would only have been possible to update the evaluationusing a special shortcut or to attach this function to anothershortcut like the Ctrl+S shortcut for saving Since thiswas by far not the optimal way we decided to start all overagain

42 Basic Evaluation Technique

In the second approach we did not try to externalize thescript execution but instead made vast use of JavaScriptrsquosbuilt-in eval function The eval function takes a StringSecond approach

using eval toevaluate dynamicallygenerated code thatreturns the value of

interest

as parameter If this String contains valid JavaScript codeeval will evaluate the code and return the last executedstatement eval(var i=1i=i+5) will return 6 forexample The extension basically works as follows Aloop that runs over all DOM-elements in the highlightedcode searches for substitutable variables While searchingfor these elements the loop captures in which contextthe actual DOM-element is The three basic contexts aresimple substitutions substitutions within one loop andsubstitutions within a nested loop (why this is will beexplained below) But there are also other things the loopchecks for For simple assignments like i=x+y x andy have to be substituted while i does not In lines wherea loop begins the iterator has to be appended invariantshave to be treated differently et cetera

Note that right-hand sides of assignments are handledslightly different from what has been presented in theprototypes of the first user study They do not get evalu-ated to the last step and invariants do not get evaluated

43 How Values Are Being Calculated 29

to the penultimate step anymore Both of them get onlyevaluated one step For i=1 and z=2 the right-hand sideof x=i+z does not get evaluated to 3 but to 1+2 Wedecided to do it this way because some users in the firstuser study stated that they did not only want to see whatthe result of an operation is but also how it got there

When a substitutable variable has been found its value willbe computed For that the eval function is used (see sec-tion 43 for details) and the value will be inserted Thiswhole process takes place each time there is a change in thecode or if one of the parameters gets changed If another it- How the extension

reacts to changeseration of a loop that is being shown is selected all valuesinside that loop will be calculated again

43 How Values Are Being Calculated

Depending on the context of the variable a piece ofcode will be generated (and then evaluated) that whenevaluated with eval returns the value of the variable ofinterest For simple substitutions outside of any loops thisworks like follows

Suppose the following code had to be evaluated for x=0

function foo(x)var i=0var j=x+1

Then x in the third line would have to be evaluated To do How the values arecalculated for simplesubstitutions

so the following String would be generatedvar x=0var i=0xSo for simple substitutions the whole code preceding thevariable that has to be substituted is copied and each pa-rameter is initialized as var with its exemplary value at the

30 4 Implementation

beginning

For simple loops this method has to be extended First ithas to be found out how often the loop is executed ThisSubstitutions within

one loop serves two purposes First the iterator now shows thenumber of iterations of the loop Second we have to checkif the user wants to see an iteration that does not exist Thisis being done by introducing a counter that is incrementedin each iteration of the loop and then returned when theloop terminated The following example shows how thevalue of a variable within a loop is being calculated

function test()var i=0var j=0while (ilt5)

i++j=j+2

Suppose we want to obtain the value for j (highlightedin red) in the second iteration of the loop Then theevaluation basically works as follows

var myCounter=0var myResultvar i=0var j=0while (ilt5)

myCounter++i++if(myCounter==2)myResult=j breakj=j+2

myResult

A counter gets inserted to check which iteration the loop is

44 Problems 31

in If it is in the right iteration the assignment myResult=jtakes place so that eval can return the according valueFor a loop within a loop two counters have to be inserted Other casesand both of them have to be checked accordingly Thereare further variants of contexts like the different parts of afor(firstPartsecondPartthirdPart) statementThe first part gets only executed once at the beginningwhile the second part gets evaluated one time more thanthe rest of the loop (the third part does not get evaluated)For all cases the dynamic generation of the code that re-turns the value of interest works similar to the techniquespresented

Obviously this is a brute-force approach For each valuethat has to be calculated all preceding code has to be eval-uated We used this approach anyway since we spent quite Summary Very

inefficient limitedfunctionality stillcompletely sufficientfor the user study

some time developing the first version of the extension (us-ing V8-Node) and were under considerable time pressureThe technique we chose is simple and fast to develop Fur-thermore the prototype only had to work for the seconduser study which is the reason why the limited function-ality (loop nesting depth of 2 for example) was sufficientAnd finally since the tasks in the second user study werenot very expensive with regards to calculation costs eventhis very inefficient version of the extension ran live with-out any delays Some tests showed that it would even havebeen quick enough to operate with hundreds of iterationsmore without any delay and with some few thousand iter-ations more without dispatching the interface

44 Problems

The biggest problem we encountered were infinite loopsBecause the interface runs completely live and tries to The Halting problemevaluate the whole code while the user is editing it infiniteloops may occur while the user is typing Additionallyit might of course be possible that there actually is aninfinite loop even though the user thinks the code iscorrect Without modification the interface crashed in

32 4 Implementation

either of these cases To prevent this from happeningtwo minor modifications had to be done Firstly if wewant to calculate the number of iterations a loop doeswe check whether the counter we use for that calculationis still below a certain limit (we used 999 here becauseit proved to be sufficient and efficient enough to runlive) Now the Halting Problem for this particular loop isapproximated The problem of the infinite loop still existsthough Suppose we have the following code

var x=0var i=0while(x=10)

i++x=i

We now know that the loop does more then 999 iterationsso we do not try to calculate the value for x in the invariantSubstitutions after

infinite loops or for i within the loop for any further iterations The onlyproblem that remains is that we still try to get the value fori in the last line When we try to calculate this value wedo not yet know that there is an infinite loop somewhereabove To find this out a new global counter gets insertedand limited to 5000 iterations

var globalCounter=0var x=0var i=0while(x=10)

if(globalCounter++ gt= 5000)breaki++

i

This way we could keep the incrementation and the checkwhether there are more then 5000 iterations within onestatement (two statements are too many since after a

44 Problems 33

while without curly braces only the first statement wouldbe executed) The incrementation and test get inserted intoevery loop ahead of the position we are checking for Itwould of course have been possible to set a flag if there isan infinite loop somewhere ahead but this way we alsohad the possibility to limit the overall number of iterationsand still get a value The second user study showed that Flaw in the

implementationthis was the wrong approach A user introduced an infiniteloop in a sorting algorithm The condition for the loop wasnever violated although the array was sorted With thetechnique of assigning a value to a variable although thelimitation of iterations has been hit the sorted array wasshown in the return statement Since the user only paidattention to the output he did not see that the number ofiterations of the loop was 999 and thought that the code iscorrect

Another minor problem was that of course while the useris typing the extension tries to evaluate code that is syntac- Suppress errors so

that user does notget disturbed

tically not correct The solution is to set windowonerrorto a function that always returns true This way errorsget just suppressed and the execution stops If there is anew change in the editor the execution will start again andwork exactly the same way so that working code always isexecuted and not working code is not

The last problem concerned hiding code that does notget evaluated For loops this could easily be done If Hiding code within

failed if statementsthe loop is in its very last iteration (the invariant fails)the rest of the loop has to be hidden otherwise it doesnot Finding out if an if statement fails (and hence theaccording code has to be hidden) however was moredifficult because we had not enough information to be ableto do so (remember we only compute the first step whichincludes nothing but substituting all variables) To find outif an if statement holds or not we do the following allvariables in the invariant get substituted and then we leteval evaluate just the invariant For if(xlty) with x=5and y=6 we substitute all values if(5lt6) and then calleval((5lt6))

34 4 Implementation

Figure 41 Screenshot showing Brackets with our extension and a dummy script

Finally Figure 41 shows how the prototype for our seconduser study looks like

35

Chapter 5

Second User Study

51 Course of the User Study

To find out whether our extension brings significantimprovement a second quantitative user study with 20 Hypothesis

Compared to usualdebugging methodsour technique bringssignificantimprovement

participants has been conducted The hypothesis was thatthe debugging speed (in this case the time between the userreading the code the first time and fixing it) compared toa usual debugging engine could significantly be reducedwith our extension If this hypothesis would be supportedby the results of the user study this would indicate that theunderstanding speed of foreign code would be increasedby our method

No user had worked with Brackets or JavaScript before(notwithstanding that all of them had worked with Javabefore) Users were Computer Science students with 4-9(average of 58) years of programming experience Onlythree of them had professional experience (2-4 years) Allusers took part voluntarily and were not rewarded in anyway

To test the hypothesis two searching algorithms (animplementation of Shakersort and an implementation ofBubblesort) have been implemented and a small bug has

36 5 Second User Study

been introduced in either of them (see Appendix A) Thetask of the user was to find both bugs - one time with2 algorithms 1 bug

each 2 tasks debugone time with

Bracketsrsquo tools andone time with our

extension

our extension and one time using the traditional way(that is using the console breakpoints variable viewerweb kit inspector and the step over functionality Bracketsprovides) Each user received an intro to the function-ality of both our extension and the Brackets debuggingengine For both a small dummy script has been shownwith which it has been explained how each part of thefunctionality works until the user confirmed that he or sheunderstood every part of the functionality The task forthe user was to find the bug in the code (users have beentold that there is exactly one bug) fix it and give noticewhen he or she thought that the code is correct (task tocondition assignment and task order were counterbalancedto eliminate side effects)

After the two debugging tasks had been accomplishedeach user filled out two versions of the System UsabilitySystem Usability

Scale to find outwhether users liked

working with thesystem

Scale [Brooke 1996] (SUS) one for our extension and onefor the standard debugging Brackets provides The first10 questions are identical to the initial version that hasbeen presented in 1996 except that the word rdquocumbersomerdquoin point 8 has been changed to rdquoawkwardrdquo as suggestedby [Finstad 2006] For the extension the following ques-tions have been added to the questionnaire

bull I found the additional information very helpful formy task

bull I was distracted by the additional information

bull I found it very easy to follow the control flow of theprogram

bull I found complex control structures hard to under-stand

For the debugging using Brackets the following two ques-tions have been added

bull I found it very easy to follow the control flow of theprogram

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 2: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

iii

I hereby declare that I have created this work completely onmy own and used no other sources or tools than the oneslisted and that I have marked any citations accordingly

Hiermit versichere ich dass ich die vorliegende Arbeitselbstandig verfasst und keine anderen als die angegebe-nen Quellen und Hilfsmittel benutzt sowie Zitate kenntlichgemacht habe

Aachen September2012Bjorn Heinen

v

Contents

Abstract ix

Acknowledgements xi

1 Introduction 1

2 Related Work 5

3 First User Study 11

31 The Prototypes 11

311 Interface Version 1 12

312 Interface Version 2 13

313 Interface Version 3 14

314 Interface Version 4 14

315 Interface Version 5 14

32 Course of the User Study 15

33 Results 16

331 Further Comments and Suggestionsfor Improvement 20

vi Contents

332 Conclusions 21

4 Implementation 27

41 First Approach (discharged) 27

42 Basic Evaluation Technique 28

43 How Values Are Being Calculated 29

44 Problems 31

5 Second User Study 35

51 Course of the User Study 35

52 Results 37

521 Debugging Times 37

522 System Usability Scale 38

523 Qualitative Results 39

524 Discussion 41

6 Summary and future work 43

61 Summary 43

62 Future Work 44

A Code for the Second User Study 47

B System Usability Scales 49

Bibliography 53

vii

List of Figures

11 A Live Coding Editor presented by Bret Vic-tor [Victor] 2

12 The key parameter has been changed 3

21 rdquoA screenshot of the Rehearse editor (1)The function declaration parameter namesand current values (2) a statement that hasbeen executed and (3) the result of execu-tion (4) an undone statement (5) the currentlinerdquo [Brandt et al] 6

22 A screenshot of the plug-in presentedby [Edwards 2004] 7

31 Version 1 The evaluated version of the codeis on the right-hand side and keeps its struc-ture and syntax highlighting 22

32 Version 2 The evaluated version of the codeis on the right-hand side but is ranged right 23

33 Version 3 Small pop-ups indicate the valueof a variable 23

34 Version 4 The evaluated version of a line isinserted right behind the line itself 24

viii List of Figures

35 Version 5 Only the values of the variablesare in between the lines 25

36 Interface after the first user study 26

41 Screenshot of Brackets with our extension 34

51 Box plot for the different task completion times 38

52 Box plot for the SUS-values with and with-out extension 39

53 Box plot for the additional points on thequestionnaire 40

61 Version 1 The evaluated version of the codeis on the right-hand side and keeps its struc-ture and syntax highlighting 45

B1 SUS questionnaire for debugging usingBrackets 50

B2 SUS questionnaire for the Live Coding exten-sion 51

ix

Abstract

Live Coding seeks to provide an environment that helps programmers to under-stand the internal behavior of a program with the help of examples The goal ofthis thesis is to create an editor that takes exemplary parameters for a function andcontinuously executes the current version of the code with those parameters Tohelp the programmer understand the control flow of the program not only is theaccording return value of the function continually updated also the environmentshows what happens with the exemplary parameters inside the code between in-put and output A qualitative study with 6 users has been conducted to find thebest way of displaying the link between the code and the constantly updated eval-uations The result an environment that keeps a traditional editor on the left-handside and an evaluated version of the current code (keeping syntax highlighting andindents) on the right-hand side has been implemented in form of an extension forthe open source editor Brackets Finally a study with 20 users has been conductedto find out if there is an improvement in the debugging speed of experienced pro-grammers Although the user study yielded non-significant results there is evi-dence that there is an improvement in the debugging speed and that significantresults can be obtained

xi

Acknowledgements

First of all I would like to thank everybody who participated in my user studiesYour valuable informations and feedback enlarged my opportunities

Furthermore Jan-Peter Kramer deserves my thanksWithout whom this work would consist of nothing but blanks

I would like to thank all my good friends tooWhose long-standing amity I am looking forward to

Last but not least special thanks to my family particularly my parentsAnd unfortunately this is where my English ends

1

Chapter 1

Introduction

Most of todaysrsquos IDEs make a clear separation betweenediting the code and debugging it To switch from pro- Overhead results

from the separationbetween editing anddebugging

gramming to debugging and vice versa usually theprogrammer has to use another mode in the IDE This waynot only looses the programmer a significant amount oftime through the emerging overhead but the working flowgets interrupted too [Edwards 2004]

Permanently having to switch between modes has anotherdrawback The probability of loosing time because the pro- Wasted development

time throughIgnorance Time andFix Time

grammer continues coding after a bug has been introducedincreases considerably [Saff and Ernst 2004b] reported onthis phenomenon They said that larger Ignorance Times(the time between introducing a bug and becoming awareof it) yield larger Fix Times (the time between becomingaware of a bug and fixing it) Therefor an IDE shouldprovide tools to keep the Ignorance Time as well as theoverhead produced by switching between different modesas low as possible

When Bret Victor gave his speech rdquoInventing on Principlerdquohe was talking about something very similar Direct delay-free feedback To show what he was talking about Victorpresented an editor that continuously executes the code assoon as it has been entered and not only shows the actual

2 1 Introduction

result but also reveals information about its internal behav-ior Figure 11 shows how his interface looks like using theexample of the binary search algorithm

Figure 11 A Live Coding Editor presented by Bret Vic-tor [Victor]

On the left-hand side the code can be changed as usual Atthe top of the right-hand side exemplary parameters canbe entered for which the code then gets executed Furtherdown the values for each variable in each iteration andthe return value are shown For example in the line varBret Victor presented

an editor that givesimmediate

information about aprogramrsquos internal

behavior

high = arraylength - 1 the editor says high =5 This way the programmer does not have to imaginewhat happens between executing the code with exemplaryparameters and getting back the according result but canactually see it As mentioned above this happens as thecode gets edited so there is no delay between a change andseeing its impact on the code Figure 12 shows an exampleof how helpful this can be The key the algorithm searchesfor has been changed to rsquogrsquo which is not in the arraySince the invariant of the loop is 1 the algorithm can onlyterminate when the key has been found in the array In thiscase however the loop iterates endlessly The solution isto replace the invariant by low le high A solution thatis obvious once the programmer sees that in the fourthiteration low gets greater than high and nothing elsechanges anymore

The direct immediate feedback Victor was talking about isthe basic idea of my thesis Since he designed his editor

3

Figure 12 The key parameter has been changed

only to show what he was talking about in his speech hisprototype is very low fidelity and has some basic usabil-ity flaws (most importantly Only 5 iterations can be dis-played and the functionality is very limited) Furthermorehis editor has never been evaluated neither in a qualitativeuser study nor in a quantitative one My thesis will pickup his idea develop it and finally evaluate it To do sotwo iterations of the DIA-cycle (that is Design-Implement-Analyze) have been done We started with a literature re-search (see Chapter 2) to get an overview what research hasalready been done with regards to Live Coding We then Overview of the

thesisimplemented a low fidelity prototype and evaluated it in aqualitative user study to find out which basic interface de-sign provides as much analogy between the code and itsevaluation (see Chapter 3) Afterwards we implemented asoftware-prototype (see Chapter 4) to conduct the secondquantitative user study which should reveal if the interfaceenhances the work flow significantly (see Chapter 5)

5

Chapter 2

Related Work

The idea of maintaining a continuous connection betweencode and output is not not new Already back in 1985 The idea of

continuouslyexecuting code hasalready beenpresented in 1985

[Henderson and Weiser 1985] stated that the classicalEdit-(Compile-)Execute cycle where program editing andexecution are independent activities is not optimal Theystated that the relation between input and output shouldcontinuously be kept consistent

[Pane and Myers 1996] proposed that a system shouldsupport incremental testing and feedback and help detect Incremental testing

and feedbackimprove productivity

diagnose and recover from errors Although they wererecommending heuristics to improve novice programmingsystems these advices also help improving the productiv-ity of experts

This concurs with [Ko et al 2004] who reported on learn-ing barriers in programming systems Among others they 2 learning barriers

UnderstandingBarriers andInformation Barriers

reported on the so called Understanding Barriers and Infor-mation Barriers Understanding Barriers happen when codebehaves differently from what the user expected If forexample there is a compile-time error and the user cannotextract which part of the code is deemed wrong by the com-piler from the error message then this is an UnderstandingBarrier In the user study Ko et al conducted most (34 of38) of these Understanding Barriers were insurmountable

6 2 Related Work

Information Barriers are elements in the environmentthat make the internal behavior of the program hard tounderstand They happen when a user builds a hypothesisabout the internal behavior of the program but cannot use(or find) a tool the environment provides to verify thathypothesis In Ko et alrsquos user study most of these under-standing barriers (10 of 14) were insurmountable To helpthe programmer an IDE should reveal what a programdid or did not do at compile or runtime and in addition itshould be possible to inspect a programrsquos internal behavior

Although the idea of helping the programmer under-stand the internal behavior of a program by example isnot new current environments only offer little help forCurrent

environments offerlittle help for

understanding andadapting examples

understanding and adapting examples Therefore [Brandtet al] have implemented Rehearse an editor that providesimmediate evaluation and infinite undo of executionA screenshot of this editor can be seen in Figure 21Although Rehearse embodies the immediate feedback

Figure 21 rdquoA screenshot of the Rehearse editor (1) Thefunction declaration parameter names and current val-ues (2) a statement that has been executed and (3) the re-sult of execution (4) an undone statement (5) the currentlinerdquo [Brandt et al]

and the maximal illumination we are striving for thereare issues with it First Rehearse disturbs the user inhis usual programming practices For example it is notAdvantages and

disadvantages ofRehearse

possible to edit a function in the usual editor while it isopen in Rehearse This means that there are two ways of

7

editing a function (either the usual way with a standardeditor or interactively with Rehearse) and they are mutualexclusive (this issue has also been reported as a resultof the user study Brandt et al conducted) We think itis best to integrate the interactive functionality into thestandard one so that the user does not again have toswitch between different modes Additionally Rehearseprovides functionality to help the programmer explorealternative paths by undoing arbitrarily many executedlines while the goal of our editor is to show what hap-pens in the code with exemplary parameters to help theprogrammer understand what internal behavior the codehas This way the programmer can work uninterruptedand still get information about the behavior of his program

Similarly [Edwards 2004] stated that it improves the workflow if the programmer does not have to leave the environ-ment to see what result the execution of an example yieldsHe says the best way to understand code is by thinkingof concrete examples To solve the problem that the pro- Examples help

understanding codegrammer has to emulate the behavior of the program in hishead as much code as possible should be illuminated withexamples Edwards built a plug-in for Eclipse for whicha screenshot can be seen in Figure 22 The results on the

Figure 22 A screenshot of the plug-in presented by [Ed-wards 2004]

left-hand side of the interface get updated as soon as thereis a change in the code From a theoretical point of viewthis is exactly what we intend to do The code is continu-

8 2 Related Work

ously executed for exemplary parameters while the user isworking on it From a practical point of view though thisplug-in has several drawbacks First the analogy betweenIdea and drawbacks

of Edwardsrsquo plug-in code and evaluation is not very strong (eg not as strongas the one in Rehearse) so it is not as easy for the userto follow the control flow of the program as it should beThere are further usability issues Skipped code is greyedout This is a good idea since the user can see which partof the code has been executed and which part has not Theproblem is to see for which parameters the code has beenskipped In Figure 22 for example fact has been calledwith the values 1 and 3 and in addition subsequentlywith the parameters 0 and 2 (resulting from the recursivecall of fact in the else statement) For which of theparameters the else statement has been skipped is not asobvious as it could be Furthermore it is suboptimal tocall the function with exemplary parameters in the codeitself This way a programmer cannot write code as usualand just extract the additional information but has to insertrdquoforeignrdquo code to be able to see what happens inside thecode and then later has to remove the additional codeagain because it has nothing to do with the functionalitythe programmer wanted to write in the first place FinallyEdwards presented the plug-in without evaluation of anykind so a remaining task is to find out if or how much thiskind of feedback improves the work flow (see Chapter 5)

In [Boehm 1979] it was reported that modifying softwarebasically consists of three phasesHow modifying

software works

bull Understanding the existing software

bull Modifying it

bull Revalidating the modified version

We believe that our interface helps improve (thereby speedup) this whole process (our second user studyrsquos goal wasto find out whether this is true see Chapter 5) Thissupposition is supported by [Erdogmus et al 2005] whereit is said that instant feedback whether new functionality

9

has been implemented properly helps reducing wasteddevelopment time

Finally our approach concurs with the idea of Opportunis-tic Programming [Brandt et al 2008] [Brandt et al 2009a][Brandt et al 2009b] Brandt et al say that frequent itera-tion is a necessary part of learning and understanding un-familiar code Therefore an opportunistic programmer will Opportunistic

programmers choosetools that speed upiteration

select a tool that speeds up iteration They also state that theEdit-Debug cycles in opportunistic programming are veryshort About 50 of these cycles are shorter than 30 sec-onds and even 80 are shorter than 5 minutes Our toolwould help reducing the resulting overhead since the out-come is instantly visible without having to change modesor compile and execute

11

Chapter 3

First User Study

31 The Prototypes

The main goal of our first user study was to find out whichbasic interface design is the optimal one especially with Goal of the first user

studyregards to the link between the code and the evaluatedversion of the code It should work in a way such that theprogrammer does not get disturbed in his usual work butthat the interface still provides as much information asaccessible as possible

To find out how such an interface would look like we im-plemented low fidelity prototypes for 5 different interfacesin Ruby Basically these prototypes consisted of pictures 5 low fidelity

prototypes for Rubyhave beenimplemented

that show how the interface would look like for some ex-emplary code with exemplary parameters Each interfacehas been illustrated using 5 pieces of code as example (Fig-ures 31-35) Note The font used in all prototypes was sup-posed to emphasize that these prototypes are low fidelityso that the users had not inhibitions telling us fundamentalflaws in the interface design

12 3 First User Study

311 Interface Version 1

Figure 31 shows the first interface with an implementationof the binary search algorithm In this version of theinterface an evaluated version of the code is inserted in anadditional section in the right half of the window At thetop the exemplary parameters can be manipulated Thenin the line high=arraylength-1 the first evaluationtakes place On the right-hand side of the window theright-hand side of the assignment has been replaced bythe according evaluation 5 Afterwards all values are re-Interface variant 1

divides theenvironment into twoThe left-hand side is

a usual editor andthe right-hand side isthe evaluated version

of the code

placed in a similar way That is in simple assignments theright-hand side is substituted completely and in invariantsthe condition has been computed up to the penultimatestep This way eg while(low lt= high) gets evaluatedto while(0 lt= 5) instead of while(true)

EXCURSUSWe also thought about showing what would happen in-side an if statement even if it fails so that the user couldsee what the code could do alternatively We dischargedthis idea however because of two problems Firstly wesuspected that this might quickly become a confusing ex-perience for the user Secondly (and most importantly)the Halting Problem would have to be decidable Sup-pose the following code had to be evaluatedif(xgt0)while(x=0)x--else while(x=0)x++Then the interface would fall into an infinite loop (for anyvalue of x except 0) although the code would not Forx=1 for example the actual code would only make oneiteration in the first loop The interface however wouldtry to calculate the values of the second loop too

Further down in Figure 31(a) there is the following ele-ment

1This element will subsequently be called iterator Theiterator indicates which iteration of a loop is being shownPressing the right arrow increments the shown iteration(Figure 31(b)) pressing the left arrow decrements itaccordingly Furthermore the code for the elsif and theelse statements is not visible on the right-hand side of the

31 The Prototypes 13

window The reason is that the first if statement holdsand therefore the following ones are not executed InFigure 31(b) the first if statement failed (greyed out) thesecond one holds and the third one is hidden again Thisway all code that gets executed is shown (corresponding toJonathan Edwards who said that as much code as possibleshould be rdquoilluminatedrdquo[Edwards 2004])

312 Interface Version 2

Figure 32 shows the second version of the interface withan in place implementation of the selection sort algorithm(note that each interface variant has been implemented Interface variant 2

copies a nothighlightedevaluated version ofthe code to the rightpart of the window

with each algorithm) The differences to the previous in-terface variant are

bull The parameter names are red to signal manipulability

bull No syntax highlighting

bull Evaluated code has no indents and is right aligned

bull No dividing line

EXCURSUSNote that the iterator in this interface looks like follows

i=1This design reveals what variable the iterator refers toThis kind of iterator does not depend on the type of in-terface but on the type of loop In for and upto loopsthis kind of iterator has been used since it is possible tocorrelate the values of one variable and the iterations ofa loop In the later prototypes however we dischargedthis idea again because it only was this easy to correlatea variable and the iterations of one loop because the ex-amples were quite simple In realistic situations this cor-relation will oftentimes not be possible

This version was supposed to take up as little space aspossible while still having a complete (evaluated) versionof the code in a different place

14 3 First User Study

313 Interface Version 3

Figure 33 shows version 3 of the interface with an im-plementation of Bubblesort Small pop-ups indicate theInterface variant 3

uses small pop-upsthat indicate the

value of a statement

value of all statements that are evaluable For examplethe (manipulable and therefore red) value of array is[1423716] and the value of arraylength-2 is 2There are no pop-ups in the code when it is not executedand failed invariants are greyed out The iterator workslike the one in the example for version 2 of the interface butis now inserted into the code This interfacersquos purpose wasto consume as little space as possible while still providingall possible evaluations

314 Interface Version 4

Figure 34 shows interface version 4 with an algorithm thatconverts the fractional part of a decimal number into its bi-Interface variant 4

puts the evaluatedversion of a line

between that line andthe next one

nary representation and outputs the result to the consoleThis interface variant slots the evaluated version of a lineof code right between that line and the next one The nameof manipulable parameters is highlighted in red and thereis no syntax highlighting Failed invariants are greyed outand not executed code is not displayed as usual The goalof this interface version was to find out if the link betweencode and evaluation becomes clearer when the evaluationis very close and very similar to the code it results from

315 Interface Version 5

The last variant of the interface can be seen in Figure 35with an algorithm that checks whether a given number nInterface variant 5

indicates the value ofa statement by

writing it underneaththat statement

without keepingkeywords

is a prime number or not It is similar to the previous onebut only shows rdquorelevantrdquo information This includes right-hand sides of equations and for conditions the actual in-variant but excludes left-hand sides of equations and key-words like else or while Apart from this version 4 and5 are exactly the same (parameters in red no syntax high-lighting not executed code is not shown failed invariants

32 Course of the User Study 15

are greyed out) We developed this variant because in ver-sion 4 the user might confuse actual code with evaluatedcode We wanted to check if a variant without replicatingcontrol structures is better (less confusing)

32 Course of the User Study

The first user study was a qualitative user study It wassupposed to reveal what basic interface design is preferredby users and where there might be flaws in the interfacesDifferent techniques for qualitative user studies have beenused in particular the Model Extraction method and theThink Aloud method The basic structure was the following

bull First the experimenter introduced himself and theproject He assured the participant that he would Introduction to the

user studystay anonymous and that he could ask questions ormake comments whenever he likes To make sure theModel Extraction worked fine the part concerningwhat the thesis and the user study where about wasvery firm The solution we have come up with wasto explain the general idea of Live Coding (that weare designing a kind of editor that takes exemplaryparameters for a method the programmer works onand then evaluates it continuously while changes arebeing made) without revealing anything further likerdquoand then there is going to be a button with whichyou can flick from iteration to iterationrdquo

bull Afterwards one version of the interface was pre-sented to the user together with the following ques- Model Extractiontions rdquoWhat do you think you see hererdquo and rdquoWhatdo you think you can do with itrdquo

bull After the user told everything he or she knew or if theuser had a gulf of understanding [Norman 2002] the Explanation of the

interfaceparts of the interface which the user did not under-stand were explained and we went through all itera-tions once to make clear how the interface works

bull The next goal was to find out if the user had prob- Suggestions forimprovementlems in understanding elements of the interface if he

16 3 First User Study

thinks that there are things that can be done better etcetera

bull Lastly the other interface variants were shown to theuser with the same piece of code and it was explainedGoing through the

remaining interfacevariants

how they worked how they differ from the previousones and we went through all iterations once so thatthe user could get a feeling for the interface Aftereach variant the user was asked what he thought ofthe interface (and why) and what he thought was bet-ter or worse compared to the previous ones

Whenever the user had a question or there was an inter-esting topic that lead to more interesting information thispoint was discussed in more depth

33 Results

The user study was conducted with 6 different users All ofthem studied Computer Science were in the sixth semesterand had between 4 and 10 years of programming experi-ence Only one of them had worked with Ruby before Allusers took part voluntarily and were not rewarded in anyway The conditions were randomized and looked like fol-lows

User ID Algorithm Order of versions1 Bubblesort 312452 Selection sort 453213 Fractional part10 rarr Fractional part2 253414 Prime number test 135425 Binary search 524316 Binary search 34152

This means that for example user 1 saw interface version3 first and then versions 1245 in that order In all variantshe worked with the Bubblesort algorithm

The user study yielded two major results

33 Results 17

bull Most users liked version 1 better than the other ones

bull Not a single user understood what the iterator is goodfor or what it might do

In more detail Three users (users 1 2 and 4) stated clearlythat they liked version 1 the most User 3 was not quite 3 users liked version

1 the most 1 userliked version 4 themost 1 user couldnot decide betweenversion 1 and 2 and1 user could notdecide betweenversion 1 and 3

sure whether he liked version 1 or version 2 more Hestated that a mix between the two variants where thereis no syntax highlighting would be the best because hegot confused the first time he saw version 1 and did notdirectly see that the code on the right-hand side of thewindow is not actually code but the evaluation User 6 wasnot sure if he preferred version 1 or version 3 He statedthat if there is enough space on the screen version 1 isbetter but with all the other elements of a typical IDE aline of code might oftentimes be too long so that there is noplace to show the line itself as well as its evaluated version(a concern that user 3 communicated as well) Withoutthis problem though user 6 said that version 3 is worsebecause the interface might get messy and distractingquickly because of all the pop-ups right next to the codeIn addition he said with more complex code it might beproblematic to see which pop-up refers to which part ofthe code Finally user 5 stated that he likes version 4 themost The reason was that he did not have to look to theright if he wanted to see the value of a variable and thensearch it there but that in version 5 the information wasexactly at its source He only said that this version mightbe problematic to work with since the reading flow is influ-enced negatively and he feared that he would loose muchspace on the screen and would therefore have to scroll a lot

Each user stated that the main benefit of version 1 and2 was that the actual code remained unaffected Thiscorresponds to what [DeLine et al 2006] reported on theperceptual model a programmer builds of the code When Perceptual model of

the code shouldremain unaffected

navigating through source code a programmer uses aperceptual model to do so This way the programmer doesnot need to read the code (cognitive) to know which part heis seeing but can do so by looking at its rdquovisual landmarksrdquo(perceptual) While version 3 affects this perceptual modelnegatively but does not break it version 4 and 5 have the

18 3 First User Study

disadvantage to do so

To most of the users the basic structure of the interfaceswas clear after only a few seconds Only two of themconfused actual code with the evaluation for a short timeUser 5 when he saw version 5 the first time and user 4when he saw version 1 the first time However neither ofthem needed help or much time to figure how the interfaceworked after they started to read more precisely Eachuserrsquos first question or comment concerned the font andhow awful they (every one) thought it was After a briefexplanation why we chose this particular font each usergot used to it quickly

No user understood the iterator None of them got theconnection to the while upto or for next to the iteratorIterator is not self

explaining After the explanation what the iterator does some userssaid they thought the left and right buttons were rdquoltrdquo andrdquogtrdquo signs

The following tables summarize the pros and contras foreach interface

Version 1Pros Contrasmiddot The control structure (in-dents) is still presentmiddot Because of the indentsand the syntax highlightingthere is a strong analogy be-tween the code and its eval-uated versionmiddot There is a clear separationbetween code and evalua-tionmiddot The editor remains un-changed

middot Needs much spacemiddot User has to look from theleft to the right and viceversa to see two pieces thatare correlated In otherwords a line of code and itsevaluated version are too faraway from each other

33 Results 19

Version 2Pros Contrasmiddot The editor remains un-changedmiddot Code and evaluation do notget confused because they donot look the samemiddot Needs less space than ver-sion 1

middot Less analogy than in ver-sion 1middot User needs to look fromthe left to the right and viceversa This is being compli-cated by the fact that with-out indents and syntax high-lighting it is even more diffi-cult to find the point of inter-estmiddot Needs a lot of space

Version 3Pros ContrasmiddotNeeds minimalistic amountof spacemiddot The code structure itself re-mains untouched

middot Oftentimes not clear whichpop-up refers to which partof the codemiddot Can get messy quickly

Version 4Pros Contrasmiddot Strong analogymiddot Better for lines of code thatare too long for version 1 and2middot Evaluations right next totheir source

middot Code and evaluation getconfused quickly especiallyusual reading might be prob-lematic since one has to skipevery second line dependingon what he wants to readmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

20 3 First User Study

Version 5Pros Contrasmiddot Evaluations right next totheir sourcemiddot Better for lines of code thatare too long for version 1 and2

middot Oftentimes it is not clearwhat a value refers tomiddot Still distracting during codereadingmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

331 Further Comments and Suggestions for Im-provement

One of the users pointed out that in version 3 the iteratorhas been inserted into the code and therefore it now isInterface version 3

can producesyntactically wrong

code

syntactically wrong (for i in 14 becomes for i=2in 14 for the second iteration for example)

Two users suggested that for version 4 and 5 curly bracescould indicate to which part of the code the evaluationrefers

Some users suggested to make the number on the iteratordirectly manipulable so that if one would want to go toSuggestions for

improving the iterator iteration 42 one would not have to press 41 times a buttonbut could just enter 42 In addition some users said theywanted a button for jumping to the first and a button forjumping to the last iteration Two users said it would begood if there is something that indicates directly that theiterator actually shows iterations like the term rdquoIteration rdquoor anything similar

We noticed that the manipulable parameters and theevaluation results have to be distinguishable more easily

33 Results 21

One user proposed that if the cursor moves over somevariable in the evaluation part a small pop-up couldindicate to which variable a value belongs

One user said that it might be helpful that for biggerboolean expressions that fail it would be visible which partof the expression is responsible for the failure

332 Conclusions

With large screens (24rdquo and more) becoming more andmore common in modern work environments of program- Version 1 is the

favored onemers we can override the concerns about loosing too muchspace with version 1 which is the reason why it is the clearfavorite Figure 36 shows the basic interface after the firstuser study

The iterator has been improved in several ways

bull New buttons for jumping to the first and the last iter-ation have been added

bull There is a text field with which the iteration can bemanipulated directly

bull A tooltip now indicates the variable to which avalue belongs when hovering over the value with themouse

bull The label Iteration has been added at the top of theiterator

The way of manipulating the parameters has been changedThe function header now also is copied to the right side A new way of

manipulating theparameters

but the names of the parameters are replaced with a textfield in which the according value can be entered This wayboth the analogy to the code and the separation betweenparameters and evaluation are being enhanced

22 3 First User Study

(a) First iteration

(b) Second iteration

Figure 31 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

33 Results 23

Figure 32 Version 2 The evaluated version of the code ison the right-hand side but is ranged right

Figure 33 Version 3 Small pop-ups indicate the value of avariable

24 3 First User Study

Figure 34 Version 4 The evaluated version of a line isinserted right behind the line itself

33 Results 25

Figure 35 Version 5 Only the values of the variables arein between the lines

26 3 First User Study

Figure 36 Interface after the first user study

27

Chapter 4

Implementation

After the first user study had been finished and the resultshad been analyzed a software prototype with sufficient A software prototype

had to beimplemented

functionality had to be implemented to conduct the seconduser study Hence we wrote an extension for Brackets1 anopen source editor that has recently been released officially

The language we wrote the extension for was JavaScriptWhat we needed was an extension that could calculate thevalues of the variables that had to been substituted

41 First Approach (discharged)

The initial idea was to write an extension that connects viaV8-Node2 (an extension that maintains a connection to a Initial idea Use

V8-Node an Nodejsto get the locals

Nodejs3 V8 debugger) to an external debugger and lets itevaluate the code Our extension was supposed to supplythe actual code to the debugger and set a breakpoint in thefirst line of actual code Then it read the locals (that is thelocal values for all active variables) and stepped over to the

1httpsgithubcomadobebrackets2httpsgithubcomDennisKehrigbrackets-v8-node-

livetreenodeChildProcess3httpnodejsorg

28 4 Implementation

next line After all locals had been collected the extensionshould substitute all necessary values There were prob-lems with this approach Firstly the extension is not stableThis extension was

not usable for a userstudy

enough to be usable in a user study Secondly it is too slowIt needs more than half a second for one evaluation Thisis way too much since we wanted something that couldevaluate the code as the user edits it without any furtherdelay (hence the name rdquoLive Codingrdquo) With this extensionit would only have been possible to update the evaluationusing a special shortcut or to attach this function to anothershortcut like the Ctrl+S shortcut for saving Since thiswas by far not the optimal way we decided to start all overagain

42 Basic Evaluation Technique

In the second approach we did not try to externalize thescript execution but instead made vast use of JavaScriptrsquosbuilt-in eval function The eval function takes a StringSecond approach

using eval toevaluate dynamicallygenerated code thatreturns the value of

interest

as parameter If this String contains valid JavaScript codeeval will evaluate the code and return the last executedstatement eval(var i=1i=i+5) will return 6 forexample The extension basically works as follows Aloop that runs over all DOM-elements in the highlightedcode searches for substitutable variables While searchingfor these elements the loop captures in which contextthe actual DOM-element is The three basic contexts aresimple substitutions substitutions within one loop andsubstitutions within a nested loop (why this is will beexplained below) But there are also other things the loopchecks for For simple assignments like i=x+y x andy have to be substituted while i does not In lines wherea loop begins the iterator has to be appended invariantshave to be treated differently et cetera

Note that right-hand sides of assignments are handledslightly different from what has been presented in theprototypes of the first user study They do not get evalu-ated to the last step and invariants do not get evaluated

43 How Values Are Being Calculated 29

to the penultimate step anymore Both of them get onlyevaluated one step For i=1 and z=2 the right-hand sideof x=i+z does not get evaluated to 3 but to 1+2 Wedecided to do it this way because some users in the firstuser study stated that they did not only want to see whatthe result of an operation is but also how it got there

When a substitutable variable has been found its value willbe computed For that the eval function is used (see sec-tion 43 for details) and the value will be inserted Thiswhole process takes place each time there is a change in thecode or if one of the parameters gets changed If another it- How the extension

reacts to changeseration of a loop that is being shown is selected all valuesinside that loop will be calculated again

43 How Values Are Being Calculated

Depending on the context of the variable a piece ofcode will be generated (and then evaluated) that whenevaluated with eval returns the value of the variable ofinterest For simple substitutions outside of any loops thisworks like follows

Suppose the following code had to be evaluated for x=0

function foo(x)var i=0var j=x+1

Then x in the third line would have to be evaluated To do How the values arecalculated for simplesubstitutions

so the following String would be generatedvar x=0var i=0xSo for simple substitutions the whole code preceding thevariable that has to be substituted is copied and each pa-rameter is initialized as var with its exemplary value at the

30 4 Implementation

beginning

For simple loops this method has to be extended First ithas to be found out how often the loop is executed ThisSubstitutions within

one loop serves two purposes First the iterator now shows thenumber of iterations of the loop Second we have to checkif the user wants to see an iteration that does not exist Thisis being done by introducing a counter that is incrementedin each iteration of the loop and then returned when theloop terminated The following example shows how thevalue of a variable within a loop is being calculated

function test()var i=0var j=0while (ilt5)

i++j=j+2

Suppose we want to obtain the value for j (highlightedin red) in the second iteration of the loop Then theevaluation basically works as follows

var myCounter=0var myResultvar i=0var j=0while (ilt5)

myCounter++i++if(myCounter==2)myResult=j breakj=j+2

myResult

A counter gets inserted to check which iteration the loop is

44 Problems 31

in If it is in the right iteration the assignment myResult=jtakes place so that eval can return the according valueFor a loop within a loop two counters have to be inserted Other casesand both of them have to be checked accordingly Thereare further variants of contexts like the different parts of afor(firstPartsecondPartthirdPart) statementThe first part gets only executed once at the beginningwhile the second part gets evaluated one time more thanthe rest of the loop (the third part does not get evaluated)For all cases the dynamic generation of the code that re-turns the value of interest works similar to the techniquespresented

Obviously this is a brute-force approach For each valuethat has to be calculated all preceding code has to be eval-uated We used this approach anyway since we spent quite Summary Very

inefficient limitedfunctionality stillcompletely sufficientfor the user study

some time developing the first version of the extension (us-ing V8-Node) and were under considerable time pressureThe technique we chose is simple and fast to develop Fur-thermore the prototype only had to work for the seconduser study which is the reason why the limited function-ality (loop nesting depth of 2 for example) was sufficientAnd finally since the tasks in the second user study werenot very expensive with regards to calculation costs eventhis very inefficient version of the extension ran live with-out any delays Some tests showed that it would even havebeen quick enough to operate with hundreds of iterationsmore without any delay and with some few thousand iter-ations more without dispatching the interface

44 Problems

The biggest problem we encountered were infinite loopsBecause the interface runs completely live and tries to The Halting problemevaluate the whole code while the user is editing it infiniteloops may occur while the user is typing Additionallyit might of course be possible that there actually is aninfinite loop even though the user thinks the code iscorrect Without modification the interface crashed in

32 4 Implementation

either of these cases To prevent this from happeningtwo minor modifications had to be done Firstly if wewant to calculate the number of iterations a loop doeswe check whether the counter we use for that calculationis still below a certain limit (we used 999 here becauseit proved to be sufficient and efficient enough to runlive) Now the Halting Problem for this particular loop isapproximated The problem of the infinite loop still existsthough Suppose we have the following code

var x=0var i=0while(x=10)

i++x=i

We now know that the loop does more then 999 iterationsso we do not try to calculate the value for x in the invariantSubstitutions after

infinite loops or for i within the loop for any further iterations The onlyproblem that remains is that we still try to get the value fori in the last line When we try to calculate this value wedo not yet know that there is an infinite loop somewhereabove To find this out a new global counter gets insertedand limited to 5000 iterations

var globalCounter=0var x=0var i=0while(x=10)

if(globalCounter++ gt= 5000)breaki++

i

This way we could keep the incrementation and the checkwhether there are more then 5000 iterations within onestatement (two statements are too many since after a

44 Problems 33

while without curly braces only the first statement wouldbe executed) The incrementation and test get inserted intoevery loop ahead of the position we are checking for Itwould of course have been possible to set a flag if there isan infinite loop somewhere ahead but this way we alsohad the possibility to limit the overall number of iterationsand still get a value The second user study showed that Flaw in the

implementationthis was the wrong approach A user introduced an infiniteloop in a sorting algorithm The condition for the loop wasnever violated although the array was sorted With thetechnique of assigning a value to a variable although thelimitation of iterations has been hit the sorted array wasshown in the return statement Since the user only paidattention to the output he did not see that the number ofiterations of the loop was 999 and thought that the code iscorrect

Another minor problem was that of course while the useris typing the extension tries to evaluate code that is syntac- Suppress errors so

that user does notget disturbed

tically not correct The solution is to set windowonerrorto a function that always returns true This way errorsget just suppressed and the execution stops If there is anew change in the editor the execution will start again andwork exactly the same way so that working code always isexecuted and not working code is not

The last problem concerned hiding code that does notget evaluated For loops this could easily be done If Hiding code within

failed if statementsthe loop is in its very last iteration (the invariant fails)the rest of the loop has to be hidden otherwise it doesnot Finding out if an if statement fails (and hence theaccording code has to be hidden) however was moredifficult because we had not enough information to be ableto do so (remember we only compute the first step whichincludes nothing but substituting all variables) To find outif an if statement holds or not we do the following allvariables in the invariant get substituted and then we leteval evaluate just the invariant For if(xlty) with x=5and y=6 we substitute all values if(5lt6) and then calleval((5lt6))

34 4 Implementation

Figure 41 Screenshot showing Brackets with our extension and a dummy script

Finally Figure 41 shows how the prototype for our seconduser study looks like

35

Chapter 5

Second User Study

51 Course of the User Study

To find out whether our extension brings significantimprovement a second quantitative user study with 20 Hypothesis

Compared to usualdebugging methodsour technique bringssignificantimprovement

participants has been conducted The hypothesis was thatthe debugging speed (in this case the time between the userreading the code the first time and fixing it) compared toa usual debugging engine could significantly be reducedwith our extension If this hypothesis would be supportedby the results of the user study this would indicate that theunderstanding speed of foreign code would be increasedby our method

No user had worked with Brackets or JavaScript before(notwithstanding that all of them had worked with Javabefore) Users were Computer Science students with 4-9(average of 58) years of programming experience Onlythree of them had professional experience (2-4 years) Allusers took part voluntarily and were not rewarded in anyway

To test the hypothesis two searching algorithms (animplementation of Shakersort and an implementation ofBubblesort) have been implemented and a small bug has

36 5 Second User Study

been introduced in either of them (see Appendix A) Thetask of the user was to find both bugs - one time with2 algorithms 1 bug

each 2 tasks debugone time with

Bracketsrsquo tools andone time with our

extension

our extension and one time using the traditional way(that is using the console breakpoints variable viewerweb kit inspector and the step over functionality Bracketsprovides) Each user received an intro to the function-ality of both our extension and the Brackets debuggingengine For both a small dummy script has been shownwith which it has been explained how each part of thefunctionality works until the user confirmed that he or sheunderstood every part of the functionality The task forthe user was to find the bug in the code (users have beentold that there is exactly one bug) fix it and give noticewhen he or she thought that the code is correct (task tocondition assignment and task order were counterbalancedto eliminate side effects)

After the two debugging tasks had been accomplishedeach user filled out two versions of the System UsabilitySystem Usability

Scale to find outwhether users liked

working with thesystem

Scale [Brooke 1996] (SUS) one for our extension and onefor the standard debugging Brackets provides The first10 questions are identical to the initial version that hasbeen presented in 1996 except that the word rdquocumbersomerdquoin point 8 has been changed to rdquoawkwardrdquo as suggestedby [Finstad 2006] For the extension the following ques-tions have been added to the questionnaire

bull I found the additional information very helpful formy task

bull I was distracted by the additional information

bull I found it very easy to follow the control flow of theprogram

bull I found complex control structures hard to under-stand

For the debugging using Brackets the following two ques-tions have been added

bull I found it very easy to follow the control flow of theprogram

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 3: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

v

Contents

Abstract ix

Acknowledgements xi

1 Introduction 1

2 Related Work 5

3 First User Study 11

31 The Prototypes 11

311 Interface Version 1 12

312 Interface Version 2 13

313 Interface Version 3 14

314 Interface Version 4 14

315 Interface Version 5 14

32 Course of the User Study 15

33 Results 16

331 Further Comments and Suggestionsfor Improvement 20

vi Contents

332 Conclusions 21

4 Implementation 27

41 First Approach (discharged) 27

42 Basic Evaluation Technique 28

43 How Values Are Being Calculated 29

44 Problems 31

5 Second User Study 35

51 Course of the User Study 35

52 Results 37

521 Debugging Times 37

522 System Usability Scale 38

523 Qualitative Results 39

524 Discussion 41

6 Summary and future work 43

61 Summary 43

62 Future Work 44

A Code for the Second User Study 47

B System Usability Scales 49

Bibliography 53

vii

List of Figures

11 A Live Coding Editor presented by Bret Vic-tor [Victor] 2

12 The key parameter has been changed 3

21 rdquoA screenshot of the Rehearse editor (1)The function declaration parameter namesand current values (2) a statement that hasbeen executed and (3) the result of execu-tion (4) an undone statement (5) the currentlinerdquo [Brandt et al] 6

22 A screenshot of the plug-in presentedby [Edwards 2004] 7

31 Version 1 The evaluated version of the codeis on the right-hand side and keeps its struc-ture and syntax highlighting 22

32 Version 2 The evaluated version of the codeis on the right-hand side but is ranged right 23

33 Version 3 Small pop-ups indicate the valueof a variable 23

34 Version 4 The evaluated version of a line isinserted right behind the line itself 24

viii List of Figures

35 Version 5 Only the values of the variablesare in between the lines 25

36 Interface after the first user study 26

41 Screenshot of Brackets with our extension 34

51 Box plot for the different task completion times 38

52 Box plot for the SUS-values with and with-out extension 39

53 Box plot for the additional points on thequestionnaire 40

61 Version 1 The evaluated version of the codeis on the right-hand side and keeps its struc-ture and syntax highlighting 45

B1 SUS questionnaire for debugging usingBrackets 50

B2 SUS questionnaire for the Live Coding exten-sion 51

ix

Abstract

Live Coding seeks to provide an environment that helps programmers to under-stand the internal behavior of a program with the help of examples The goal ofthis thesis is to create an editor that takes exemplary parameters for a function andcontinuously executes the current version of the code with those parameters Tohelp the programmer understand the control flow of the program not only is theaccording return value of the function continually updated also the environmentshows what happens with the exemplary parameters inside the code between in-put and output A qualitative study with 6 users has been conducted to find thebest way of displaying the link between the code and the constantly updated eval-uations The result an environment that keeps a traditional editor on the left-handside and an evaluated version of the current code (keeping syntax highlighting andindents) on the right-hand side has been implemented in form of an extension forthe open source editor Brackets Finally a study with 20 users has been conductedto find out if there is an improvement in the debugging speed of experienced pro-grammers Although the user study yielded non-significant results there is evi-dence that there is an improvement in the debugging speed and that significantresults can be obtained

xi

Acknowledgements

First of all I would like to thank everybody who participated in my user studiesYour valuable informations and feedback enlarged my opportunities

Furthermore Jan-Peter Kramer deserves my thanksWithout whom this work would consist of nothing but blanks

I would like to thank all my good friends tooWhose long-standing amity I am looking forward to

Last but not least special thanks to my family particularly my parentsAnd unfortunately this is where my English ends

1

Chapter 1

Introduction

Most of todaysrsquos IDEs make a clear separation betweenediting the code and debugging it To switch from pro- Overhead results

from the separationbetween editing anddebugging

gramming to debugging and vice versa usually theprogrammer has to use another mode in the IDE This waynot only looses the programmer a significant amount oftime through the emerging overhead but the working flowgets interrupted too [Edwards 2004]

Permanently having to switch between modes has anotherdrawback The probability of loosing time because the pro- Wasted development

time throughIgnorance Time andFix Time

grammer continues coding after a bug has been introducedincreases considerably [Saff and Ernst 2004b] reported onthis phenomenon They said that larger Ignorance Times(the time between introducing a bug and becoming awareof it) yield larger Fix Times (the time between becomingaware of a bug and fixing it) Therefor an IDE shouldprovide tools to keep the Ignorance Time as well as theoverhead produced by switching between different modesas low as possible

When Bret Victor gave his speech rdquoInventing on Principlerdquohe was talking about something very similar Direct delay-free feedback To show what he was talking about Victorpresented an editor that continuously executes the code assoon as it has been entered and not only shows the actual

2 1 Introduction

result but also reveals information about its internal behav-ior Figure 11 shows how his interface looks like using theexample of the binary search algorithm

Figure 11 A Live Coding Editor presented by Bret Vic-tor [Victor]

On the left-hand side the code can be changed as usual Atthe top of the right-hand side exemplary parameters canbe entered for which the code then gets executed Furtherdown the values for each variable in each iteration andthe return value are shown For example in the line varBret Victor presented

an editor that givesimmediate

information about aprogramrsquos internal

behavior

high = arraylength - 1 the editor says high =5 This way the programmer does not have to imaginewhat happens between executing the code with exemplaryparameters and getting back the according result but canactually see it As mentioned above this happens as thecode gets edited so there is no delay between a change andseeing its impact on the code Figure 12 shows an exampleof how helpful this can be The key the algorithm searchesfor has been changed to rsquogrsquo which is not in the arraySince the invariant of the loop is 1 the algorithm can onlyterminate when the key has been found in the array In thiscase however the loop iterates endlessly The solution isto replace the invariant by low le high A solution thatis obvious once the programmer sees that in the fourthiteration low gets greater than high and nothing elsechanges anymore

The direct immediate feedback Victor was talking about isthe basic idea of my thesis Since he designed his editor

3

Figure 12 The key parameter has been changed

only to show what he was talking about in his speech hisprototype is very low fidelity and has some basic usabil-ity flaws (most importantly Only 5 iterations can be dis-played and the functionality is very limited) Furthermorehis editor has never been evaluated neither in a qualitativeuser study nor in a quantitative one My thesis will pickup his idea develop it and finally evaluate it To do sotwo iterations of the DIA-cycle (that is Design-Implement-Analyze) have been done We started with a literature re-search (see Chapter 2) to get an overview what research hasalready been done with regards to Live Coding We then Overview of the

thesisimplemented a low fidelity prototype and evaluated it in aqualitative user study to find out which basic interface de-sign provides as much analogy between the code and itsevaluation (see Chapter 3) Afterwards we implemented asoftware-prototype (see Chapter 4) to conduct the secondquantitative user study which should reveal if the interfaceenhances the work flow significantly (see Chapter 5)

5

Chapter 2

Related Work

The idea of maintaining a continuous connection betweencode and output is not not new Already back in 1985 The idea of

continuouslyexecuting code hasalready beenpresented in 1985

[Henderson and Weiser 1985] stated that the classicalEdit-(Compile-)Execute cycle where program editing andexecution are independent activities is not optimal Theystated that the relation between input and output shouldcontinuously be kept consistent

[Pane and Myers 1996] proposed that a system shouldsupport incremental testing and feedback and help detect Incremental testing

and feedbackimprove productivity

diagnose and recover from errors Although they wererecommending heuristics to improve novice programmingsystems these advices also help improving the productiv-ity of experts

This concurs with [Ko et al 2004] who reported on learn-ing barriers in programming systems Among others they 2 learning barriers

UnderstandingBarriers andInformation Barriers

reported on the so called Understanding Barriers and Infor-mation Barriers Understanding Barriers happen when codebehaves differently from what the user expected If forexample there is a compile-time error and the user cannotextract which part of the code is deemed wrong by the com-piler from the error message then this is an UnderstandingBarrier In the user study Ko et al conducted most (34 of38) of these Understanding Barriers were insurmountable

6 2 Related Work

Information Barriers are elements in the environmentthat make the internal behavior of the program hard tounderstand They happen when a user builds a hypothesisabout the internal behavior of the program but cannot use(or find) a tool the environment provides to verify thathypothesis In Ko et alrsquos user study most of these under-standing barriers (10 of 14) were insurmountable To helpthe programmer an IDE should reveal what a programdid or did not do at compile or runtime and in addition itshould be possible to inspect a programrsquos internal behavior

Although the idea of helping the programmer under-stand the internal behavior of a program by example isnot new current environments only offer little help forCurrent

environments offerlittle help for

understanding andadapting examples

understanding and adapting examples Therefore [Brandtet al] have implemented Rehearse an editor that providesimmediate evaluation and infinite undo of executionA screenshot of this editor can be seen in Figure 21Although Rehearse embodies the immediate feedback

Figure 21 rdquoA screenshot of the Rehearse editor (1) Thefunction declaration parameter names and current val-ues (2) a statement that has been executed and (3) the re-sult of execution (4) an undone statement (5) the currentlinerdquo [Brandt et al]

and the maximal illumination we are striving for thereare issues with it First Rehearse disturbs the user inhis usual programming practices For example it is notAdvantages and

disadvantages ofRehearse

possible to edit a function in the usual editor while it isopen in Rehearse This means that there are two ways of

7

editing a function (either the usual way with a standardeditor or interactively with Rehearse) and they are mutualexclusive (this issue has also been reported as a resultof the user study Brandt et al conducted) We think itis best to integrate the interactive functionality into thestandard one so that the user does not again have toswitch between different modes Additionally Rehearseprovides functionality to help the programmer explorealternative paths by undoing arbitrarily many executedlines while the goal of our editor is to show what hap-pens in the code with exemplary parameters to help theprogrammer understand what internal behavior the codehas This way the programmer can work uninterruptedand still get information about the behavior of his program

Similarly [Edwards 2004] stated that it improves the workflow if the programmer does not have to leave the environ-ment to see what result the execution of an example yieldsHe says the best way to understand code is by thinkingof concrete examples To solve the problem that the pro- Examples help

understanding codegrammer has to emulate the behavior of the program in hishead as much code as possible should be illuminated withexamples Edwards built a plug-in for Eclipse for whicha screenshot can be seen in Figure 22 The results on the

Figure 22 A screenshot of the plug-in presented by [Ed-wards 2004]

left-hand side of the interface get updated as soon as thereis a change in the code From a theoretical point of viewthis is exactly what we intend to do The code is continu-

8 2 Related Work

ously executed for exemplary parameters while the user isworking on it From a practical point of view though thisplug-in has several drawbacks First the analogy betweenIdea and drawbacks

of Edwardsrsquo plug-in code and evaluation is not very strong (eg not as strongas the one in Rehearse) so it is not as easy for the userto follow the control flow of the program as it should beThere are further usability issues Skipped code is greyedout This is a good idea since the user can see which partof the code has been executed and which part has not Theproblem is to see for which parameters the code has beenskipped In Figure 22 for example fact has been calledwith the values 1 and 3 and in addition subsequentlywith the parameters 0 and 2 (resulting from the recursivecall of fact in the else statement) For which of theparameters the else statement has been skipped is not asobvious as it could be Furthermore it is suboptimal tocall the function with exemplary parameters in the codeitself This way a programmer cannot write code as usualand just extract the additional information but has to insertrdquoforeignrdquo code to be able to see what happens inside thecode and then later has to remove the additional codeagain because it has nothing to do with the functionalitythe programmer wanted to write in the first place FinallyEdwards presented the plug-in without evaluation of anykind so a remaining task is to find out if or how much thiskind of feedback improves the work flow (see Chapter 5)

In [Boehm 1979] it was reported that modifying softwarebasically consists of three phasesHow modifying

software works

bull Understanding the existing software

bull Modifying it

bull Revalidating the modified version

We believe that our interface helps improve (thereby speedup) this whole process (our second user studyrsquos goal wasto find out whether this is true see Chapter 5) Thissupposition is supported by [Erdogmus et al 2005] whereit is said that instant feedback whether new functionality

9

has been implemented properly helps reducing wasteddevelopment time

Finally our approach concurs with the idea of Opportunis-tic Programming [Brandt et al 2008] [Brandt et al 2009a][Brandt et al 2009b] Brandt et al say that frequent itera-tion is a necessary part of learning and understanding un-familiar code Therefore an opportunistic programmer will Opportunistic

programmers choosetools that speed upiteration

select a tool that speeds up iteration They also state that theEdit-Debug cycles in opportunistic programming are veryshort About 50 of these cycles are shorter than 30 sec-onds and even 80 are shorter than 5 minutes Our toolwould help reducing the resulting overhead since the out-come is instantly visible without having to change modesor compile and execute

11

Chapter 3

First User Study

31 The Prototypes

The main goal of our first user study was to find out whichbasic interface design is the optimal one especially with Goal of the first user

studyregards to the link between the code and the evaluatedversion of the code It should work in a way such that theprogrammer does not get disturbed in his usual work butthat the interface still provides as much information asaccessible as possible

To find out how such an interface would look like we im-plemented low fidelity prototypes for 5 different interfacesin Ruby Basically these prototypes consisted of pictures 5 low fidelity

prototypes for Rubyhave beenimplemented

that show how the interface would look like for some ex-emplary code with exemplary parameters Each interfacehas been illustrated using 5 pieces of code as example (Fig-ures 31-35) Note The font used in all prototypes was sup-posed to emphasize that these prototypes are low fidelityso that the users had not inhibitions telling us fundamentalflaws in the interface design

12 3 First User Study

311 Interface Version 1

Figure 31 shows the first interface with an implementationof the binary search algorithm In this version of theinterface an evaluated version of the code is inserted in anadditional section in the right half of the window At thetop the exemplary parameters can be manipulated Thenin the line high=arraylength-1 the first evaluationtakes place On the right-hand side of the window theright-hand side of the assignment has been replaced bythe according evaluation 5 Afterwards all values are re-Interface variant 1

divides theenvironment into twoThe left-hand side is

a usual editor andthe right-hand side isthe evaluated version

of the code

placed in a similar way That is in simple assignments theright-hand side is substituted completely and in invariantsthe condition has been computed up to the penultimatestep This way eg while(low lt= high) gets evaluatedto while(0 lt= 5) instead of while(true)

EXCURSUSWe also thought about showing what would happen in-side an if statement even if it fails so that the user couldsee what the code could do alternatively We dischargedthis idea however because of two problems Firstly wesuspected that this might quickly become a confusing ex-perience for the user Secondly (and most importantly)the Halting Problem would have to be decidable Sup-pose the following code had to be evaluatedif(xgt0)while(x=0)x--else while(x=0)x++Then the interface would fall into an infinite loop (for anyvalue of x except 0) although the code would not Forx=1 for example the actual code would only make oneiteration in the first loop The interface however wouldtry to calculate the values of the second loop too

Further down in Figure 31(a) there is the following ele-ment

1This element will subsequently be called iterator Theiterator indicates which iteration of a loop is being shownPressing the right arrow increments the shown iteration(Figure 31(b)) pressing the left arrow decrements itaccordingly Furthermore the code for the elsif and theelse statements is not visible on the right-hand side of the

31 The Prototypes 13

window The reason is that the first if statement holdsand therefore the following ones are not executed InFigure 31(b) the first if statement failed (greyed out) thesecond one holds and the third one is hidden again Thisway all code that gets executed is shown (corresponding toJonathan Edwards who said that as much code as possibleshould be rdquoilluminatedrdquo[Edwards 2004])

312 Interface Version 2

Figure 32 shows the second version of the interface withan in place implementation of the selection sort algorithm(note that each interface variant has been implemented Interface variant 2

copies a nothighlightedevaluated version ofthe code to the rightpart of the window

with each algorithm) The differences to the previous in-terface variant are

bull The parameter names are red to signal manipulability

bull No syntax highlighting

bull Evaluated code has no indents and is right aligned

bull No dividing line

EXCURSUSNote that the iterator in this interface looks like follows

i=1This design reveals what variable the iterator refers toThis kind of iterator does not depend on the type of in-terface but on the type of loop In for and upto loopsthis kind of iterator has been used since it is possible tocorrelate the values of one variable and the iterations ofa loop In the later prototypes however we dischargedthis idea again because it only was this easy to correlatea variable and the iterations of one loop because the ex-amples were quite simple In realistic situations this cor-relation will oftentimes not be possible

This version was supposed to take up as little space aspossible while still having a complete (evaluated) versionof the code in a different place

14 3 First User Study

313 Interface Version 3

Figure 33 shows version 3 of the interface with an im-plementation of Bubblesort Small pop-ups indicate theInterface variant 3

uses small pop-upsthat indicate the

value of a statement

value of all statements that are evaluable For examplethe (manipulable and therefore red) value of array is[1423716] and the value of arraylength-2 is 2There are no pop-ups in the code when it is not executedand failed invariants are greyed out The iterator workslike the one in the example for version 2 of the interface butis now inserted into the code This interfacersquos purpose wasto consume as little space as possible while still providingall possible evaluations

314 Interface Version 4

Figure 34 shows interface version 4 with an algorithm thatconverts the fractional part of a decimal number into its bi-Interface variant 4

puts the evaluatedversion of a line

between that line andthe next one

nary representation and outputs the result to the consoleThis interface variant slots the evaluated version of a lineof code right between that line and the next one The nameof manipulable parameters is highlighted in red and thereis no syntax highlighting Failed invariants are greyed outand not executed code is not displayed as usual The goalof this interface version was to find out if the link betweencode and evaluation becomes clearer when the evaluationis very close and very similar to the code it results from

315 Interface Version 5

The last variant of the interface can be seen in Figure 35with an algorithm that checks whether a given number nInterface variant 5

indicates the value ofa statement by

writing it underneaththat statement

without keepingkeywords

is a prime number or not It is similar to the previous onebut only shows rdquorelevantrdquo information This includes right-hand sides of equations and for conditions the actual in-variant but excludes left-hand sides of equations and key-words like else or while Apart from this version 4 and5 are exactly the same (parameters in red no syntax high-lighting not executed code is not shown failed invariants

32 Course of the User Study 15

are greyed out) We developed this variant because in ver-sion 4 the user might confuse actual code with evaluatedcode We wanted to check if a variant without replicatingcontrol structures is better (less confusing)

32 Course of the User Study

The first user study was a qualitative user study It wassupposed to reveal what basic interface design is preferredby users and where there might be flaws in the interfacesDifferent techniques for qualitative user studies have beenused in particular the Model Extraction method and theThink Aloud method The basic structure was the following

bull First the experimenter introduced himself and theproject He assured the participant that he would Introduction to the

user studystay anonymous and that he could ask questions ormake comments whenever he likes To make sure theModel Extraction worked fine the part concerningwhat the thesis and the user study where about wasvery firm The solution we have come up with wasto explain the general idea of Live Coding (that weare designing a kind of editor that takes exemplaryparameters for a method the programmer works onand then evaluates it continuously while changes arebeing made) without revealing anything further likerdquoand then there is going to be a button with whichyou can flick from iteration to iterationrdquo

bull Afterwards one version of the interface was pre-sented to the user together with the following ques- Model Extractiontions rdquoWhat do you think you see hererdquo and rdquoWhatdo you think you can do with itrdquo

bull After the user told everything he or she knew or if theuser had a gulf of understanding [Norman 2002] the Explanation of the

interfaceparts of the interface which the user did not under-stand were explained and we went through all itera-tions once to make clear how the interface works

bull The next goal was to find out if the user had prob- Suggestions forimprovementlems in understanding elements of the interface if he

16 3 First User Study

thinks that there are things that can be done better etcetera

bull Lastly the other interface variants were shown to theuser with the same piece of code and it was explainedGoing through the

remaining interfacevariants

how they worked how they differ from the previousones and we went through all iterations once so thatthe user could get a feeling for the interface Aftereach variant the user was asked what he thought ofthe interface (and why) and what he thought was bet-ter or worse compared to the previous ones

Whenever the user had a question or there was an inter-esting topic that lead to more interesting information thispoint was discussed in more depth

33 Results

The user study was conducted with 6 different users All ofthem studied Computer Science were in the sixth semesterand had between 4 and 10 years of programming experi-ence Only one of them had worked with Ruby before Allusers took part voluntarily and were not rewarded in anyway The conditions were randomized and looked like fol-lows

User ID Algorithm Order of versions1 Bubblesort 312452 Selection sort 453213 Fractional part10 rarr Fractional part2 253414 Prime number test 135425 Binary search 524316 Binary search 34152

This means that for example user 1 saw interface version3 first and then versions 1245 in that order In all variantshe worked with the Bubblesort algorithm

The user study yielded two major results

33 Results 17

bull Most users liked version 1 better than the other ones

bull Not a single user understood what the iterator is goodfor or what it might do

In more detail Three users (users 1 2 and 4) stated clearlythat they liked version 1 the most User 3 was not quite 3 users liked version

1 the most 1 userliked version 4 themost 1 user couldnot decide betweenversion 1 and 2 and1 user could notdecide betweenversion 1 and 3

sure whether he liked version 1 or version 2 more Hestated that a mix between the two variants where thereis no syntax highlighting would be the best because hegot confused the first time he saw version 1 and did notdirectly see that the code on the right-hand side of thewindow is not actually code but the evaluation User 6 wasnot sure if he preferred version 1 or version 3 He statedthat if there is enough space on the screen version 1 isbetter but with all the other elements of a typical IDE aline of code might oftentimes be too long so that there is noplace to show the line itself as well as its evaluated version(a concern that user 3 communicated as well) Withoutthis problem though user 6 said that version 3 is worsebecause the interface might get messy and distractingquickly because of all the pop-ups right next to the codeIn addition he said with more complex code it might beproblematic to see which pop-up refers to which part ofthe code Finally user 5 stated that he likes version 4 themost The reason was that he did not have to look to theright if he wanted to see the value of a variable and thensearch it there but that in version 5 the information wasexactly at its source He only said that this version mightbe problematic to work with since the reading flow is influ-enced negatively and he feared that he would loose muchspace on the screen and would therefore have to scroll a lot

Each user stated that the main benefit of version 1 and2 was that the actual code remained unaffected Thiscorresponds to what [DeLine et al 2006] reported on theperceptual model a programmer builds of the code When Perceptual model of

the code shouldremain unaffected

navigating through source code a programmer uses aperceptual model to do so This way the programmer doesnot need to read the code (cognitive) to know which part heis seeing but can do so by looking at its rdquovisual landmarksrdquo(perceptual) While version 3 affects this perceptual modelnegatively but does not break it version 4 and 5 have the

18 3 First User Study

disadvantage to do so

To most of the users the basic structure of the interfaceswas clear after only a few seconds Only two of themconfused actual code with the evaluation for a short timeUser 5 when he saw version 5 the first time and user 4when he saw version 1 the first time However neither ofthem needed help or much time to figure how the interfaceworked after they started to read more precisely Eachuserrsquos first question or comment concerned the font andhow awful they (every one) thought it was After a briefexplanation why we chose this particular font each usergot used to it quickly

No user understood the iterator None of them got theconnection to the while upto or for next to the iteratorIterator is not self

explaining After the explanation what the iterator does some userssaid they thought the left and right buttons were rdquoltrdquo andrdquogtrdquo signs

The following tables summarize the pros and contras foreach interface

Version 1Pros Contrasmiddot The control structure (in-dents) is still presentmiddot Because of the indentsand the syntax highlightingthere is a strong analogy be-tween the code and its eval-uated versionmiddot There is a clear separationbetween code and evalua-tionmiddot The editor remains un-changed

middot Needs much spacemiddot User has to look from theleft to the right and viceversa to see two pieces thatare correlated In otherwords a line of code and itsevaluated version are too faraway from each other

33 Results 19

Version 2Pros Contrasmiddot The editor remains un-changedmiddot Code and evaluation do notget confused because they donot look the samemiddot Needs less space than ver-sion 1

middot Less analogy than in ver-sion 1middot User needs to look fromthe left to the right and viceversa This is being compli-cated by the fact that with-out indents and syntax high-lighting it is even more diffi-cult to find the point of inter-estmiddot Needs a lot of space

Version 3Pros ContrasmiddotNeeds minimalistic amountof spacemiddot The code structure itself re-mains untouched

middot Oftentimes not clear whichpop-up refers to which partof the codemiddot Can get messy quickly

Version 4Pros Contrasmiddot Strong analogymiddot Better for lines of code thatare too long for version 1 and2middot Evaluations right next totheir source

middot Code and evaluation getconfused quickly especiallyusual reading might be prob-lematic since one has to skipevery second line dependingon what he wants to readmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

20 3 First User Study

Version 5Pros Contrasmiddot Evaluations right next totheir sourcemiddot Better for lines of code thatare too long for version 1 and2

middot Oftentimes it is not clearwhat a value refers tomiddot Still distracting during codereadingmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

331 Further Comments and Suggestions for Im-provement

One of the users pointed out that in version 3 the iteratorhas been inserted into the code and therefore it now isInterface version 3

can producesyntactically wrong

code

syntactically wrong (for i in 14 becomes for i=2in 14 for the second iteration for example)

Two users suggested that for version 4 and 5 curly bracescould indicate to which part of the code the evaluationrefers

Some users suggested to make the number on the iteratordirectly manipulable so that if one would want to go toSuggestions for

improving the iterator iteration 42 one would not have to press 41 times a buttonbut could just enter 42 In addition some users said theywanted a button for jumping to the first and a button forjumping to the last iteration Two users said it would begood if there is something that indicates directly that theiterator actually shows iterations like the term rdquoIteration rdquoor anything similar

We noticed that the manipulable parameters and theevaluation results have to be distinguishable more easily

33 Results 21

One user proposed that if the cursor moves over somevariable in the evaluation part a small pop-up couldindicate to which variable a value belongs

One user said that it might be helpful that for biggerboolean expressions that fail it would be visible which partof the expression is responsible for the failure

332 Conclusions

With large screens (24rdquo and more) becoming more andmore common in modern work environments of program- Version 1 is the

favored onemers we can override the concerns about loosing too muchspace with version 1 which is the reason why it is the clearfavorite Figure 36 shows the basic interface after the firstuser study

The iterator has been improved in several ways

bull New buttons for jumping to the first and the last iter-ation have been added

bull There is a text field with which the iteration can bemanipulated directly

bull A tooltip now indicates the variable to which avalue belongs when hovering over the value with themouse

bull The label Iteration has been added at the top of theiterator

The way of manipulating the parameters has been changedThe function header now also is copied to the right side A new way of

manipulating theparameters

but the names of the parameters are replaced with a textfield in which the according value can be entered This wayboth the analogy to the code and the separation betweenparameters and evaluation are being enhanced

22 3 First User Study

(a) First iteration

(b) Second iteration

Figure 31 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

33 Results 23

Figure 32 Version 2 The evaluated version of the code ison the right-hand side but is ranged right

Figure 33 Version 3 Small pop-ups indicate the value of avariable

24 3 First User Study

Figure 34 Version 4 The evaluated version of a line isinserted right behind the line itself

33 Results 25

Figure 35 Version 5 Only the values of the variables arein between the lines

26 3 First User Study

Figure 36 Interface after the first user study

27

Chapter 4

Implementation

After the first user study had been finished and the resultshad been analyzed a software prototype with sufficient A software prototype

had to beimplemented

functionality had to be implemented to conduct the seconduser study Hence we wrote an extension for Brackets1 anopen source editor that has recently been released officially

The language we wrote the extension for was JavaScriptWhat we needed was an extension that could calculate thevalues of the variables that had to been substituted

41 First Approach (discharged)

The initial idea was to write an extension that connects viaV8-Node2 (an extension that maintains a connection to a Initial idea Use

V8-Node an Nodejsto get the locals

Nodejs3 V8 debugger) to an external debugger and lets itevaluate the code Our extension was supposed to supplythe actual code to the debugger and set a breakpoint in thefirst line of actual code Then it read the locals (that is thelocal values for all active variables) and stepped over to the

1httpsgithubcomadobebrackets2httpsgithubcomDennisKehrigbrackets-v8-node-

livetreenodeChildProcess3httpnodejsorg

28 4 Implementation

next line After all locals had been collected the extensionshould substitute all necessary values There were prob-lems with this approach Firstly the extension is not stableThis extension was

not usable for a userstudy

enough to be usable in a user study Secondly it is too slowIt needs more than half a second for one evaluation Thisis way too much since we wanted something that couldevaluate the code as the user edits it without any furtherdelay (hence the name rdquoLive Codingrdquo) With this extensionit would only have been possible to update the evaluationusing a special shortcut or to attach this function to anothershortcut like the Ctrl+S shortcut for saving Since thiswas by far not the optimal way we decided to start all overagain

42 Basic Evaluation Technique

In the second approach we did not try to externalize thescript execution but instead made vast use of JavaScriptrsquosbuilt-in eval function The eval function takes a StringSecond approach

using eval toevaluate dynamicallygenerated code thatreturns the value of

interest

as parameter If this String contains valid JavaScript codeeval will evaluate the code and return the last executedstatement eval(var i=1i=i+5) will return 6 forexample The extension basically works as follows Aloop that runs over all DOM-elements in the highlightedcode searches for substitutable variables While searchingfor these elements the loop captures in which contextthe actual DOM-element is The three basic contexts aresimple substitutions substitutions within one loop andsubstitutions within a nested loop (why this is will beexplained below) But there are also other things the loopchecks for For simple assignments like i=x+y x andy have to be substituted while i does not In lines wherea loop begins the iterator has to be appended invariantshave to be treated differently et cetera

Note that right-hand sides of assignments are handledslightly different from what has been presented in theprototypes of the first user study They do not get evalu-ated to the last step and invariants do not get evaluated

43 How Values Are Being Calculated 29

to the penultimate step anymore Both of them get onlyevaluated one step For i=1 and z=2 the right-hand sideof x=i+z does not get evaluated to 3 but to 1+2 Wedecided to do it this way because some users in the firstuser study stated that they did not only want to see whatthe result of an operation is but also how it got there

When a substitutable variable has been found its value willbe computed For that the eval function is used (see sec-tion 43 for details) and the value will be inserted Thiswhole process takes place each time there is a change in thecode or if one of the parameters gets changed If another it- How the extension

reacts to changeseration of a loop that is being shown is selected all valuesinside that loop will be calculated again

43 How Values Are Being Calculated

Depending on the context of the variable a piece ofcode will be generated (and then evaluated) that whenevaluated with eval returns the value of the variable ofinterest For simple substitutions outside of any loops thisworks like follows

Suppose the following code had to be evaluated for x=0

function foo(x)var i=0var j=x+1

Then x in the third line would have to be evaluated To do How the values arecalculated for simplesubstitutions

so the following String would be generatedvar x=0var i=0xSo for simple substitutions the whole code preceding thevariable that has to be substituted is copied and each pa-rameter is initialized as var with its exemplary value at the

30 4 Implementation

beginning

For simple loops this method has to be extended First ithas to be found out how often the loop is executed ThisSubstitutions within

one loop serves two purposes First the iterator now shows thenumber of iterations of the loop Second we have to checkif the user wants to see an iteration that does not exist Thisis being done by introducing a counter that is incrementedin each iteration of the loop and then returned when theloop terminated The following example shows how thevalue of a variable within a loop is being calculated

function test()var i=0var j=0while (ilt5)

i++j=j+2

Suppose we want to obtain the value for j (highlightedin red) in the second iteration of the loop Then theevaluation basically works as follows

var myCounter=0var myResultvar i=0var j=0while (ilt5)

myCounter++i++if(myCounter==2)myResult=j breakj=j+2

myResult

A counter gets inserted to check which iteration the loop is

44 Problems 31

in If it is in the right iteration the assignment myResult=jtakes place so that eval can return the according valueFor a loop within a loop two counters have to be inserted Other casesand both of them have to be checked accordingly Thereare further variants of contexts like the different parts of afor(firstPartsecondPartthirdPart) statementThe first part gets only executed once at the beginningwhile the second part gets evaluated one time more thanthe rest of the loop (the third part does not get evaluated)For all cases the dynamic generation of the code that re-turns the value of interest works similar to the techniquespresented

Obviously this is a brute-force approach For each valuethat has to be calculated all preceding code has to be eval-uated We used this approach anyway since we spent quite Summary Very

inefficient limitedfunctionality stillcompletely sufficientfor the user study

some time developing the first version of the extension (us-ing V8-Node) and were under considerable time pressureThe technique we chose is simple and fast to develop Fur-thermore the prototype only had to work for the seconduser study which is the reason why the limited function-ality (loop nesting depth of 2 for example) was sufficientAnd finally since the tasks in the second user study werenot very expensive with regards to calculation costs eventhis very inefficient version of the extension ran live with-out any delays Some tests showed that it would even havebeen quick enough to operate with hundreds of iterationsmore without any delay and with some few thousand iter-ations more without dispatching the interface

44 Problems

The biggest problem we encountered were infinite loopsBecause the interface runs completely live and tries to The Halting problemevaluate the whole code while the user is editing it infiniteloops may occur while the user is typing Additionallyit might of course be possible that there actually is aninfinite loop even though the user thinks the code iscorrect Without modification the interface crashed in

32 4 Implementation

either of these cases To prevent this from happeningtwo minor modifications had to be done Firstly if wewant to calculate the number of iterations a loop doeswe check whether the counter we use for that calculationis still below a certain limit (we used 999 here becauseit proved to be sufficient and efficient enough to runlive) Now the Halting Problem for this particular loop isapproximated The problem of the infinite loop still existsthough Suppose we have the following code

var x=0var i=0while(x=10)

i++x=i

We now know that the loop does more then 999 iterationsso we do not try to calculate the value for x in the invariantSubstitutions after

infinite loops or for i within the loop for any further iterations The onlyproblem that remains is that we still try to get the value fori in the last line When we try to calculate this value wedo not yet know that there is an infinite loop somewhereabove To find this out a new global counter gets insertedand limited to 5000 iterations

var globalCounter=0var x=0var i=0while(x=10)

if(globalCounter++ gt= 5000)breaki++

i

This way we could keep the incrementation and the checkwhether there are more then 5000 iterations within onestatement (two statements are too many since after a

44 Problems 33

while without curly braces only the first statement wouldbe executed) The incrementation and test get inserted intoevery loop ahead of the position we are checking for Itwould of course have been possible to set a flag if there isan infinite loop somewhere ahead but this way we alsohad the possibility to limit the overall number of iterationsand still get a value The second user study showed that Flaw in the

implementationthis was the wrong approach A user introduced an infiniteloop in a sorting algorithm The condition for the loop wasnever violated although the array was sorted With thetechnique of assigning a value to a variable although thelimitation of iterations has been hit the sorted array wasshown in the return statement Since the user only paidattention to the output he did not see that the number ofiterations of the loop was 999 and thought that the code iscorrect

Another minor problem was that of course while the useris typing the extension tries to evaluate code that is syntac- Suppress errors so

that user does notget disturbed

tically not correct The solution is to set windowonerrorto a function that always returns true This way errorsget just suppressed and the execution stops If there is anew change in the editor the execution will start again andwork exactly the same way so that working code always isexecuted and not working code is not

The last problem concerned hiding code that does notget evaluated For loops this could easily be done If Hiding code within

failed if statementsthe loop is in its very last iteration (the invariant fails)the rest of the loop has to be hidden otherwise it doesnot Finding out if an if statement fails (and hence theaccording code has to be hidden) however was moredifficult because we had not enough information to be ableto do so (remember we only compute the first step whichincludes nothing but substituting all variables) To find outif an if statement holds or not we do the following allvariables in the invariant get substituted and then we leteval evaluate just the invariant For if(xlty) with x=5and y=6 we substitute all values if(5lt6) and then calleval((5lt6))

34 4 Implementation

Figure 41 Screenshot showing Brackets with our extension and a dummy script

Finally Figure 41 shows how the prototype for our seconduser study looks like

35

Chapter 5

Second User Study

51 Course of the User Study

To find out whether our extension brings significantimprovement a second quantitative user study with 20 Hypothesis

Compared to usualdebugging methodsour technique bringssignificantimprovement

participants has been conducted The hypothesis was thatthe debugging speed (in this case the time between the userreading the code the first time and fixing it) compared toa usual debugging engine could significantly be reducedwith our extension If this hypothesis would be supportedby the results of the user study this would indicate that theunderstanding speed of foreign code would be increasedby our method

No user had worked with Brackets or JavaScript before(notwithstanding that all of them had worked with Javabefore) Users were Computer Science students with 4-9(average of 58) years of programming experience Onlythree of them had professional experience (2-4 years) Allusers took part voluntarily and were not rewarded in anyway

To test the hypothesis two searching algorithms (animplementation of Shakersort and an implementation ofBubblesort) have been implemented and a small bug has

36 5 Second User Study

been introduced in either of them (see Appendix A) Thetask of the user was to find both bugs - one time with2 algorithms 1 bug

each 2 tasks debugone time with

Bracketsrsquo tools andone time with our

extension

our extension and one time using the traditional way(that is using the console breakpoints variable viewerweb kit inspector and the step over functionality Bracketsprovides) Each user received an intro to the function-ality of both our extension and the Brackets debuggingengine For both a small dummy script has been shownwith which it has been explained how each part of thefunctionality works until the user confirmed that he or sheunderstood every part of the functionality The task forthe user was to find the bug in the code (users have beentold that there is exactly one bug) fix it and give noticewhen he or she thought that the code is correct (task tocondition assignment and task order were counterbalancedto eliminate side effects)

After the two debugging tasks had been accomplishedeach user filled out two versions of the System UsabilitySystem Usability

Scale to find outwhether users liked

working with thesystem

Scale [Brooke 1996] (SUS) one for our extension and onefor the standard debugging Brackets provides The first10 questions are identical to the initial version that hasbeen presented in 1996 except that the word rdquocumbersomerdquoin point 8 has been changed to rdquoawkwardrdquo as suggestedby [Finstad 2006] For the extension the following ques-tions have been added to the questionnaire

bull I found the additional information very helpful formy task

bull I was distracted by the additional information

bull I found it very easy to follow the control flow of theprogram

bull I found complex control structures hard to under-stand

For the debugging using Brackets the following two ques-tions have been added

bull I found it very easy to follow the control flow of theprogram

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 4: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

vi Contents

332 Conclusions 21

4 Implementation 27

41 First Approach (discharged) 27

42 Basic Evaluation Technique 28

43 How Values Are Being Calculated 29

44 Problems 31

5 Second User Study 35

51 Course of the User Study 35

52 Results 37

521 Debugging Times 37

522 System Usability Scale 38

523 Qualitative Results 39

524 Discussion 41

6 Summary and future work 43

61 Summary 43

62 Future Work 44

A Code for the Second User Study 47

B System Usability Scales 49

Bibliography 53

vii

List of Figures

11 A Live Coding Editor presented by Bret Vic-tor [Victor] 2

12 The key parameter has been changed 3

21 rdquoA screenshot of the Rehearse editor (1)The function declaration parameter namesand current values (2) a statement that hasbeen executed and (3) the result of execu-tion (4) an undone statement (5) the currentlinerdquo [Brandt et al] 6

22 A screenshot of the plug-in presentedby [Edwards 2004] 7

31 Version 1 The evaluated version of the codeis on the right-hand side and keeps its struc-ture and syntax highlighting 22

32 Version 2 The evaluated version of the codeis on the right-hand side but is ranged right 23

33 Version 3 Small pop-ups indicate the valueof a variable 23

34 Version 4 The evaluated version of a line isinserted right behind the line itself 24

viii List of Figures

35 Version 5 Only the values of the variablesare in between the lines 25

36 Interface after the first user study 26

41 Screenshot of Brackets with our extension 34

51 Box plot for the different task completion times 38

52 Box plot for the SUS-values with and with-out extension 39

53 Box plot for the additional points on thequestionnaire 40

61 Version 1 The evaluated version of the codeis on the right-hand side and keeps its struc-ture and syntax highlighting 45

B1 SUS questionnaire for debugging usingBrackets 50

B2 SUS questionnaire for the Live Coding exten-sion 51

ix

Abstract

Live Coding seeks to provide an environment that helps programmers to under-stand the internal behavior of a program with the help of examples The goal ofthis thesis is to create an editor that takes exemplary parameters for a function andcontinuously executes the current version of the code with those parameters Tohelp the programmer understand the control flow of the program not only is theaccording return value of the function continually updated also the environmentshows what happens with the exemplary parameters inside the code between in-put and output A qualitative study with 6 users has been conducted to find thebest way of displaying the link between the code and the constantly updated eval-uations The result an environment that keeps a traditional editor on the left-handside and an evaluated version of the current code (keeping syntax highlighting andindents) on the right-hand side has been implemented in form of an extension forthe open source editor Brackets Finally a study with 20 users has been conductedto find out if there is an improvement in the debugging speed of experienced pro-grammers Although the user study yielded non-significant results there is evi-dence that there is an improvement in the debugging speed and that significantresults can be obtained

xi

Acknowledgements

First of all I would like to thank everybody who participated in my user studiesYour valuable informations and feedback enlarged my opportunities

Furthermore Jan-Peter Kramer deserves my thanksWithout whom this work would consist of nothing but blanks

I would like to thank all my good friends tooWhose long-standing amity I am looking forward to

Last but not least special thanks to my family particularly my parentsAnd unfortunately this is where my English ends

1

Chapter 1

Introduction

Most of todaysrsquos IDEs make a clear separation betweenediting the code and debugging it To switch from pro- Overhead results

from the separationbetween editing anddebugging

gramming to debugging and vice versa usually theprogrammer has to use another mode in the IDE This waynot only looses the programmer a significant amount oftime through the emerging overhead but the working flowgets interrupted too [Edwards 2004]

Permanently having to switch between modes has anotherdrawback The probability of loosing time because the pro- Wasted development

time throughIgnorance Time andFix Time

grammer continues coding after a bug has been introducedincreases considerably [Saff and Ernst 2004b] reported onthis phenomenon They said that larger Ignorance Times(the time between introducing a bug and becoming awareof it) yield larger Fix Times (the time between becomingaware of a bug and fixing it) Therefor an IDE shouldprovide tools to keep the Ignorance Time as well as theoverhead produced by switching between different modesas low as possible

When Bret Victor gave his speech rdquoInventing on Principlerdquohe was talking about something very similar Direct delay-free feedback To show what he was talking about Victorpresented an editor that continuously executes the code assoon as it has been entered and not only shows the actual

2 1 Introduction

result but also reveals information about its internal behav-ior Figure 11 shows how his interface looks like using theexample of the binary search algorithm

Figure 11 A Live Coding Editor presented by Bret Vic-tor [Victor]

On the left-hand side the code can be changed as usual Atthe top of the right-hand side exemplary parameters canbe entered for which the code then gets executed Furtherdown the values for each variable in each iteration andthe return value are shown For example in the line varBret Victor presented

an editor that givesimmediate

information about aprogramrsquos internal

behavior

high = arraylength - 1 the editor says high =5 This way the programmer does not have to imaginewhat happens between executing the code with exemplaryparameters and getting back the according result but canactually see it As mentioned above this happens as thecode gets edited so there is no delay between a change andseeing its impact on the code Figure 12 shows an exampleof how helpful this can be The key the algorithm searchesfor has been changed to rsquogrsquo which is not in the arraySince the invariant of the loop is 1 the algorithm can onlyterminate when the key has been found in the array In thiscase however the loop iterates endlessly The solution isto replace the invariant by low le high A solution thatis obvious once the programmer sees that in the fourthiteration low gets greater than high and nothing elsechanges anymore

The direct immediate feedback Victor was talking about isthe basic idea of my thesis Since he designed his editor

3

Figure 12 The key parameter has been changed

only to show what he was talking about in his speech hisprototype is very low fidelity and has some basic usabil-ity flaws (most importantly Only 5 iterations can be dis-played and the functionality is very limited) Furthermorehis editor has never been evaluated neither in a qualitativeuser study nor in a quantitative one My thesis will pickup his idea develop it and finally evaluate it To do sotwo iterations of the DIA-cycle (that is Design-Implement-Analyze) have been done We started with a literature re-search (see Chapter 2) to get an overview what research hasalready been done with regards to Live Coding We then Overview of the

thesisimplemented a low fidelity prototype and evaluated it in aqualitative user study to find out which basic interface de-sign provides as much analogy between the code and itsevaluation (see Chapter 3) Afterwards we implemented asoftware-prototype (see Chapter 4) to conduct the secondquantitative user study which should reveal if the interfaceenhances the work flow significantly (see Chapter 5)

5

Chapter 2

Related Work

The idea of maintaining a continuous connection betweencode and output is not not new Already back in 1985 The idea of

continuouslyexecuting code hasalready beenpresented in 1985

[Henderson and Weiser 1985] stated that the classicalEdit-(Compile-)Execute cycle where program editing andexecution are independent activities is not optimal Theystated that the relation between input and output shouldcontinuously be kept consistent

[Pane and Myers 1996] proposed that a system shouldsupport incremental testing and feedback and help detect Incremental testing

and feedbackimprove productivity

diagnose and recover from errors Although they wererecommending heuristics to improve novice programmingsystems these advices also help improving the productiv-ity of experts

This concurs with [Ko et al 2004] who reported on learn-ing barriers in programming systems Among others they 2 learning barriers

UnderstandingBarriers andInformation Barriers

reported on the so called Understanding Barriers and Infor-mation Barriers Understanding Barriers happen when codebehaves differently from what the user expected If forexample there is a compile-time error and the user cannotextract which part of the code is deemed wrong by the com-piler from the error message then this is an UnderstandingBarrier In the user study Ko et al conducted most (34 of38) of these Understanding Barriers were insurmountable

6 2 Related Work

Information Barriers are elements in the environmentthat make the internal behavior of the program hard tounderstand They happen when a user builds a hypothesisabout the internal behavior of the program but cannot use(or find) a tool the environment provides to verify thathypothesis In Ko et alrsquos user study most of these under-standing barriers (10 of 14) were insurmountable To helpthe programmer an IDE should reveal what a programdid or did not do at compile or runtime and in addition itshould be possible to inspect a programrsquos internal behavior

Although the idea of helping the programmer under-stand the internal behavior of a program by example isnot new current environments only offer little help forCurrent

environments offerlittle help for

understanding andadapting examples

understanding and adapting examples Therefore [Brandtet al] have implemented Rehearse an editor that providesimmediate evaluation and infinite undo of executionA screenshot of this editor can be seen in Figure 21Although Rehearse embodies the immediate feedback

Figure 21 rdquoA screenshot of the Rehearse editor (1) Thefunction declaration parameter names and current val-ues (2) a statement that has been executed and (3) the re-sult of execution (4) an undone statement (5) the currentlinerdquo [Brandt et al]

and the maximal illumination we are striving for thereare issues with it First Rehearse disturbs the user inhis usual programming practices For example it is notAdvantages and

disadvantages ofRehearse

possible to edit a function in the usual editor while it isopen in Rehearse This means that there are two ways of

7

editing a function (either the usual way with a standardeditor or interactively with Rehearse) and they are mutualexclusive (this issue has also been reported as a resultof the user study Brandt et al conducted) We think itis best to integrate the interactive functionality into thestandard one so that the user does not again have toswitch between different modes Additionally Rehearseprovides functionality to help the programmer explorealternative paths by undoing arbitrarily many executedlines while the goal of our editor is to show what hap-pens in the code with exemplary parameters to help theprogrammer understand what internal behavior the codehas This way the programmer can work uninterruptedand still get information about the behavior of his program

Similarly [Edwards 2004] stated that it improves the workflow if the programmer does not have to leave the environ-ment to see what result the execution of an example yieldsHe says the best way to understand code is by thinkingof concrete examples To solve the problem that the pro- Examples help

understanding codegrammer has to emulate the behavior of the program in hishead as much code as possible should be illuminated withexamples Edwards built a plug-in for Eclipse for whicha screenshot can be seen in Figure 22 The results on the

Figure 22 A screenshot of the plug-in presented by [Ed-wards 2004]

left-hand side of the interface get updated as soon as thereis a change in the code From a theoretical point of viewthis is exactly what we intend to do The code is continu-

8 2 Related Work

ously executed for exemplary parameters while the user isworking on it From a practical point of view though thisplug-in has several drawbacks First the analogy betweenIdea and drawbacks

of Edwardsrsquo plug-in code and evaluation is not very strong (eg not as strongas the one in Rehearse) so it is not as easy for the userto follow the control flow of the program as it should beThere are further usability issues Skipped code is greyedout This is a good idea since the user can see which partof the code has been executed and which part has not Theproblem is to see for which parameters the code has beenskipped In Figure 22 for example fact has been calledwith the values 1 and 3 and in addition subsequentlywith the parameters 0 and 2 (resulting from the recursivecall of fact in the else statement) For which of theparameters the else statement has been skipped is not asobvious as it could be Furthermore it is suboptimal tocall the function with exemplary parameters in the codeitself This way a programmer cannot write code as usualand just extract the additional information but has to insertrdquoforeignrdquo code to be able to see what happens inside thecode and then later has to remove the additional codeagain because it has nothing to do with the functionalitythe programmer wanted to write in the first place FinallyEdwards presented the plug-in without evaluation of anykind so a remaining task is to find out if or how much thiskind of feedback improves the work flow (see Chapter 5)

In [Boehm 1979] it was reported that modifying softwarebasically consists of three phasesHow modifying

software works

bull Understanding the existing software

bull Modifying it

bull Revalidating the modified version

We believe that our interface helps improve (thereby speedup) this whole process (our second user studyrsquos goal wasto find out whether this is true see Chapter 5) Thissupposition is supported by [Erdogmus et al 2005] whereit is said that instant feedback whether new functionality

9

has been implemented properly helps reducing wasteddevelopment time

Finally our approach concurs with the idea of Opportunis-tic Programming [Brandt et al 2008] [Brandt et al 2009a][Brandt et al 2009b] Brandt et al say that frequent itera-tion is a necessary part of learning and understanding un-familiar code Therefore an opportunistic programmer will Opportunistic

programmers choosetools that speed upiteration

select a tool that speeds up iteration They also state that theEdit-Debug cycles in opportunistic programming are veryshort About 50 of these cycles are shorter than 30 sec-onds and even 80 are shorter than 5 minutes Our toolwould help reducing the resulting overhead since the out-come is instantly visible without having to change modesor compile and execute

11

Chapter 3

First User Study

31 The Prototypes

The main goal of our first user study was to find out whichbasic interface design is the optimal one especially with Goal of the first user

studyregards to the link between the code and the evaluatedversion of the code It should work in a way such that theprogrammer does not get disturbed in his usual work butthat the interface still provides as much information asaccessible as possible

To find out how such an interface would look like we im-plemented low fidelity prototypes for 5 different interfacesin Ruby Basically these prototypes consisted of pictures 5 low fidelity

prototypes for Rubyhave beenimplemented

that show how the interface would look like for some ex-emplary code with exemplary parameters Each interfacehas been illustrated using 5 pieces of code as example (Fig-ures 31-35) Note The font used in all prototypes was sup-posed to emphasize that these prototypes are low fidelityso that the users had not inhibitions telling us fundamentalflaws in the interface design

12 3 First User Study

311 Interface Version 1

Figure 31 shows the first interface with an implementationof the binary search algorithm In this version of theinterface an evaluated version of the code is inserted in anadditional section in the right half of the window At thetop the exemplary parameters can be manipulated Thenin the line high=arraylength-1 the first evaluationtakes place On the right-hand side of the window theright-hand side of the assignment has been replaced bythe according evaluation 5 Afterwards all values are re-Interface variant 1

divides theenvironment into twoThe left-hand side is

a usual editor andthe right-hand side isthe evaluated version

of the code

placed in a similar way That is in simple assignments theright-hand side is substituted completely and in invariantsthe condition has been computed up to the penultimatestep This way eg while(low lt= high) gets evaluatedto while(0 lt= 5) instead of while(true)

EXCURSUSWe also thought about showing what would happen in-side an if statement even if it fails so that the user couldsee what the code could do alternatively We dischargedthis idea however because of two problems Firstly wesuspected that this might quickly become a confusing ex-perience for the user Secondly (and most importantly)the Halting Problem would have to be decidable Sup-pose the following code had to be evaluatedif(xgt0)while(x=0)x--else while(x=0)x++Then the interface would fall into an infinite loop (for anyvalue of x except 0) although the code would not Forx=1 for example the actual code would only make oneiteration in the first loop The interface however wouldtry to calculate the values of the second loop too

Further down in Figure 31(a) there is the following ele-ment

1This element will subsequently be called iterator Theiterator indicates which iteration of a loop is being shownPressing the right arrow increments the shown iteration(Figure 31(b)) pressing the left arrow decrements itaccordingly Furthermore the code for the elsif and theelse statements is not visible on the right-hand side of the

31 The Prototypes 13

window The reason is that the first if statement holdsand therefore the following ones are not executed InFigure 31(b) the first if statement failed (greyed out) thesecond one holds and the third one is hidden again Thisway all code that gets executed is shown (corresponding toJonathan Edwards who said that as much code as possibleshould be rdquoilluminatedrdquo[Edwards 2004])

312 Interface Version 2

Figure 32 shows the second version of the interface withan in place implementation of the selection sort algorithm(note that each interface variant has been implemented Interface variant 2

copies a nothighlightedevaluated version ofthe code to the rightpart of the window

with each algorithm) The differences to the previous in-terface variant are

bull The parameter names are red to signal manipulability

bull No syntax highlighting

bull Evaluated code has no indents and is right aligned

bull No dividing line

EXCURSUSNote that the iterator in this interface looks like follows

i=1This design reveals what variable the iterator refers toThis kind of iterator does not depend on the type of in-terface but on the type of loop In for and upto loopsthis kind of iterator has been used since it is possible tocorrelate the values of one variable and the iterations ofa loop In the later prototypes however we dischargedthis idea again because it only was this easy to correlatea variable and the iterations of one loop because the ex-amples were quite simple In realistic situations this cor-relation will oftentimes not be possible

This version was supposed to take up as little space aspossible while still having a complete (evaluated) versionof the code in a different place

14 3 First User Study

313 Interface Version 3

Figure 33 shows version 3 of the interface with an im-plementation of Bubblesort Small pop-ups indicate theInterface variant 3

uses small pop-upsthat indicate the

value of a statement

value of all statements that are evaluable For examplethe (manipulable and therefore red) value of array is[1423716] and the value of arraylength-2 is 2There are no pop-ups in the code when it is not executedand failed invariants are greyed out The iterator workslike the one in the example for version 2 of the interface butis now inserted into the code This interfacersquos purpose wasto consume as little space as possible while still providingall possible evaluations

314 Interface Version 4

Figure 34 shows interface version 4 with an algorithm thatconverts the fractional part of a decimal number into its bi-Interface variant 4

puts the evaluatedversion of a line

between that line andthe next one

nary representation and outputs the result to the consoleThis interface variant slots the evaluated version of a lineof code right between that line and the next one The nameof manipulable parameters is highlighted in red and thereis no syntax highlighting Failed invariants are greyed outand not executed code is not displayed as usual The goalof this interface version was to find out if the link betweencode and evaluation becomes clearer when the evaluationis very close and very similar to the code it results from

315 Interface Version 5

The last variant of the interface can be seen in Figure 35with an algorithm that checks whether a given number nInterface variant 5

indicates the value ofa statement by

writing it underneaththat statement

without keepingkeywords

is a prime number or not It is similar to the previous onebut only shows rdquorelevantrdquo information This includes right-hand sides of equations and for conditions the actual in-variant but excludes left-hand sides of equations and key-words like else or while Apart from this version 4 and5 are exactly the same (parameters in red no syntax high-lighting not executed code is not shown failed invariants

32 Course of the User Study 15

are greyed out) We developed this variant because in ver-sion 4 the user might confuse actual code with evaluatedcode We wanted to check if a variant without replicatingcontrol structures is better (less confusing)

32 Course of the User Study

The first user study was a qualitative user study It wassupposed to reveal what basic interface design is preferredby users and where there might be flaws in the interfacesDifferent techniques for qualitative user studies have beenused in particular the Model Extraction method and theThink Aloud method The basic structure was the following

bull First the experimenter introduced himself and theproject He assured the participant that he would Introduction to the

user studystay anonymous and that he could ask questions ormake comments whenever he likes To make sure theModel Extraction worked fine the part concerningwhat the thesis and the user study where about wasvery firm The solution we have come up with wasto explain the general idea of Live Coding (that weare designing a kind of editor that takes exemplaryparameters for a method the programmer works onand then evaluates it continuously while changes arebeing made) without revealing anything further likerdquoand then there is going to be a button with whichyou can flick from iteration to iterationrdquo

bull Afterwards one version of the interface was pre-sented to the user together with the following ques- Model Extractiontions rdquoWhat do you think you see hererdquo and rdquoWhatdo you think you can do with itrdquo

bull After the user told everything he or she knew or if theuser had a gulf of understanding [Norman 2002] the Explanation of the

interfaceparts of the interface which the user did not under-stand were explained and we went through all itera-tions once to make clear how the interface works

bull The next goal was to find out if the user had prob- Suggestions forimprovementlems in understanding elements of the interface if he

16 3 First User Study

thinks that there are things that can be done better etcetera

bull Lastly the other interface variants were shown to theuser with the same piece of code and it was explainedGoing through the

remaining interfacevariants

how they worked how they differ from the previousones and we went through all iterations once so thatthe user could get a feeling for the interface Aftereach variant the user was asked what he thought ofthe interface (and why) and what he thought was bet-ter or worse compared to the previous ones

Whenever the user had a question or there was an inter-esting topic that lead to more interesting information thispoint was discussed in more depth

33 Results

The user study was conducted with 6 different users All ofthem studied Computer Science were in the sixth semesterand had between 4 and 10 years of programming experi-ence Only one of them had worked with Ruby before Allusers took part voluntarily and were not rewarded in anyway The conditions were randomized and looked like fol-lows

User ID Algorithm Order of versions1 Bubblesort 312452 Selection sort 453213 Fractional part10 rarr Fractional part2 253414 Prime number test 135425 Binary search 524316 Binary search 34152

This means that for example user 1 saw interface version3 first and then versions 1245 in that order In all variantshe worked with the Bubblesort algorithm

The user study yielded two major results

33 Results 17

bull Most users liked version 1 better than the other ones

bull Not a single user understood what the iterator is goodfor or what it might do

In more detail Three users (users 1 2 and 4) stated clearlythat they liked version 1 the most User 3 was not quite 3 users liked version

1 the most 1 userliked version 4 themost 1 user couldnot decide betweenversion 1 and 2 and1 user could notdecide betweenversion 1 and 3

sure whether he liked version 1 or version 2 more Hestated that a mix between the two variants where thereis no syntax highlighting would be the best because hegot confused the first time he saw version 1 and did notdirectly see that the code on the right-hand side of thewindow is not actually code but the evaluation User 6 wasnot sure if he preferred version 1 or version 3 He statedthat if there is enough space on the screen version 1 isbetter but with all the other elements of a typical IDE aline of code might oftentimes be too long so that there is noplace to show the line itself as well as its evaluated version(a concern that user 3 communicated as well) Withoutthis problem though user 6 said that version 3 is worsebecause the interface might get messy and distractingquickly because of all the pop-ups right next to the codeIn addition he said with more complex code it might beproblematic to see which pop-up refers to which part ofthe code Finally user 5 stated that he likes version 4 themost The reason was that he did not have to look to theright if he wanted to see the value of a variable and thensearch it there but that in version 5 the information wasexactly at its source He only said that this version mightbe problematic to work with since the reading flow is influ-enced negatively and he feared that he would loose muchspace on the screen and would therefore have to scroll a lot

Each user stated that the main benefit of version 1 and2 was that the actual code remained unaffected Thiscorresponds to what [DeLine et al 2006] reported on theperceptual model a programmer builds of the code When Perceptual model of

the code shouldremain unaffected

navigating through source code a programmer uses aperceptual model to do so This way the programmer doesnot need to read the code (cognitive) to know which part heis seeing but can do so by looking at its rdquovisual landmarksrdquo(perceptual) While version 3 affects this perceptual modelnegatively but does not break it version 4 and 5 have the

18 3 First User Study

disadvantage to do so

To most of the users the basic structure of the interfaceswas clear after only a few seconds Only two of themconfused actual code with the evaluation for a short timeUser 5 when he saw version 5 the first time and user 4when he saw version 1 the first time However neither ofthem needed help or much time to figure how the interfaceworked after they started to read more precisely Eachuserrsquos first question or comment concerned the font andhow awful they (every one) thought it was After a briefexplanation why we chose this particular font each usergot used to it quickly

No user understood the iterator None of them got theconnection to the while upto or for next to the iteratorIterator is not self

explaining After the explanation what the iterator does some userssaid they thought the left and right buttons were rdquoltrdquo andrdquogtrdquo signs

The following tables summarize the pros and contras foreach interface

Version 1Pros Contrasmiddot The control structure (in-dents) is still presentmiddot Because of the indentsand the syntax highlightingthere is a strong analogy be-tween the code and its eval-uated versionmiddot There is a clear separationbetween code and evalua-tionmiddot The editor remains un-changed

middot Needs much spacemiddot User has to look from theleft to the right and viceversa to see two pieces thatare correlated In otherwords a line of code and itsevaluated version are too faraway from each other

33 Results 19

Version 2Pros Contrasmiddot The editor remains un-changedmiddot Code and evaluation do notget confused because they donot look the samemiddot Needs less space than ver-sion 1

middot Less analogy than in ver-sion 1middot User needs to look fromthe left to the right and viceversa This is being compli-cated by the fact that with-out indents and syntax high-lighting it is even more diffi-cult to find the point of inter-estmiddot Needs a lot of space

Version 3Pros ContrasmiddotNeeds minimalistic amountof spacemiddot The code structure itself re-mains untouched

middot Oftentimes not clear whichpop-up refers to which partof the codemiddot Can get messy quickly

Version 4Pros Contrasmiddot Strong analogymiddot Better for lines of code thatare too long for version 1 and2middot Evaluations right next totheir source

middot Code and evaluation getconfused quickly especiallyusual reading might be prob-lematic since one has to skipevery second line dependingon what he wants to readmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

20 3 First User Study

Version 5Pros Contrasmiddot Evaluations right next totheir sourcemiddot Better for lines of code thatare too long for version 1 and2

middot Oftentimes it is not clearwhat a value refers tomiddot Still distracting during codereadingmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

331 Further Comments and Suggestions for Im-provement

One of the users pointed out that in version 3 the iteratorhas been inserted into the code and therefore it now isInterface version 3

can producesyntactically wrong

code

syntactically wrong (for i in 14 becomes for i=2in 14 for the second iteration for example)

Two users suggested that for version 4 and 5 curly bracescould indicate to which part of the code the evaluationrefers

Some users suggested to make the number on the iteratordirectly manipulable so that if one would want to go toSuggestions for

improving the iterator iteration 42 one would not have to press 41 times a buttonbut could just enter 42 In addition some users said theywanted a button for jumping to the first and a button forjumping to the last iteration Two users said it would begood if there is something that indicates directly that theiterator actually shows iterations like the term rdquoIteration rdquoor anything similar

We noticed that the manipulable parameters and theevaluation results have to be distinguishable more easily

33 Results 21

One user proposed that if the cursor moves over somevariable in the evaluation part a small pop-up couldindicate to which variable a value belongs

One user said that it might be helpful that for biggerboolean expressions that fail it would be visible which partof the expression is responsible for the failure

332 Conclusions

With large screens (24rdquo and more) becoming more andmore common in modern work environments of program- Version 1 is the

favored onemers we can override the concerns about loosing too muchspace with version 1 which is the reason why it is the clearfavorite Figure 36 shows the basic interface after the firstuser study

The iterator has been improved in several ways

bull New buttons for jumping to the first and the last iter-ation have been added

bull There is a text field with which the iteration can bemanipulated directly

bull A tooltip now indicates the variable to which avalue belongs when hovering over the value with themouse

bull The label Iteration has been added at the top of theiterator

The way of manipulating the parameters has been changedThe function header now also is copied to the right side A new way of

manipulating theparameters

but the names of the parameters are replaced with a textfield in which the according value can be entered This wayboth the analogy to the code and the separation betweenparameters and evaluation are being enhanced

22 3 First User Study

(a) First iteration

(b) Second iteration

Figure 31 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

33 Results 23

Figure 32 Version 2 The evaluated version of the code ison the right-hand side but is ranged right

Figure 33 Version 3 Small pop-ups indicate the value of avariable

24 3 First User Study

Figure 34 Version 4 The evaluated version of a line isinserted right behind the line itself

33 Results 25

Figure 35 Version 5 Only the values of the variables arein between the lines

26 3 First User Study

Figure 36 Interface after the first user study

27

Chapter 4

Implementation

After the first user study had been finished and the resultshad been analyzed a software prototype with sufficient A software prototype

had to beimplemented

functionality had to be implemented to conduct the seconduser study Hence we wrote an extension for Brackets1 anopen source editor that has recently been released officially

The language we wrote the extension for was JavaScriptWhat we needed was an extension that could calculate thevalues of the variables that had to been substituted

41 First Approach (discharged)

The initial idea was to write an extension that connects viaV8-Node2 (an extension that maintains a connection to a Initial idea Use

V8-Node an Nodejsto get the locals

Nodejs3 V8 debugger) to an external debugger and lets itevaluate the code Our extension was supposed to supplythe actual code to the debugger and set a breakpoint in thefirst line of actual code Then it read the locals (that is thelocal values for all active variables) and stepped over to the

1httpsgithubcomadobebrackets2httpsgithubcomDennisKehrigbrackets-v8-node-

livetreenodeChildProcess3httpnodejsorg

28 4 Implementation

next line After all locals had been collected the extensionshould substitute all necessary values There were prob-lems with this approach Firstly the extension is not stableThis extension was

not usable for a userstudy

enough to be usable in a user study Secondly it is too slowIt needs more than half a second for one evaluation Thisis way too much since we wanted something that couldevaluate the code as the user edits it without any furtherdelay (hence the name rdquoLive Codingrdquo) With this extensionit would only have been possible to update the evaluationusing a special shortcut or to attach this function to anothershortcut like the Ctrl+S shortcut for saving Since thiswas by far not the optimal way we decided to start all overagain

42 Basic Evaluation Technique

In the second approach we did not try to externalize thescript execution but instead made vast use of JavaScriptrsquosbuilt-in eval function The eval function takes a StringSecond approach

using eval toevaluate dynamicallygenerated code thatreturns the value of

interest

as parameter If this String contains valid JavaScript codeeval will evaluate the code and return the last executedstatement eval(var i=1i=i+5) will return 6 forexample The extension basically works as follows Aloop that runs over all DOM-elements in the highlightedcode searches for substitutable variables While searchingfor these elements the loop captures in which contextthe actual DOM-element is The three basic contexts aresimple substitutions substitutions within one loop andsubstitutions within a nested loop (why this is will beexplained below) But there are also other things the loopchecks for For simple assignments like i=x+y x andy have to be substituted while i does not In lines wherea loop begins the iterator has to be appended invariantshave to be treated differently et cetera

Note that right-hand sides of assignments are handledslightly different from what has been presented in theprototypes of the first user study They do not get evalu-ated to the last step and invariants do not get evaluated

43 How Values Are Being Calculated 29

to the penultimate step anymore Both of them get onlyevaluated one step For i=1 and z=2 the right-hand sideof x=i+z does not get evaluated to 3 but to 1+2 Wedecided to do it this way because some users in the firstuser study stated that they did not only want to see whatthe result of an operation is but also how it got there

When a substitutable variable has been found its value willbe computed For that the eval function is used (see sec-tion 43 for details) and the value will be inserted Thiswhole process takes place each time there is a change in thecode or if one of the parameters gets changed If another it- How the extension

reacts to changeseration of a loop that is being shown is selected all valuesinside that loop will be calculated again

43 How Values Are Being Calculated

Depending on the context of the variable a piece ofcode will be generated (and then evaluated) that whenevaluated with eval returns the value of the variable ofinterest For simple substitutions outside of any loops thisworks like follows

Suppose the following code had to be evaluated for x=0

function foo(x)var i=0var j=x+1

Then x in the third line would have to be evaluated To do How the values arecalculated for simplesubstitutions

so the following String would be generatedvar x=0var i=0xSo for simple substitutions the whole code preceding thevariable that has to be substituted is copied and each pa-rameter is initialized as var with its exemplary value at the

30 4 Implementation

beginning

For simple loops this method has to be extended First ithas to be found out how often the loop is executed ThisSubstitutions within

one loop serves two purposes First the iterator now shows thenumber of iterations of the loop Second we have to checkif the user wants to see an iteration that does not exist Thisis being done by introducing a counter that is incrementedin each iteration of the loop and then returned when theloop terminated The following example shows how thevalue of a variable within a loop is being calculated

function test()var i=0var j=0while (ilt5)

i++j=j+2

Suppose we want to obtain the value for j (highlightedin red) in the second iteration of the loop Then theevaluation basically works as follows

var myCounter=0var myResultvar i=0var j=0while (ilt5)

myCounter++i++if(myCounter==2)myResult=j breakj=j+2

myResult

A counter gets inserted to check which iteration the loop is

44 Problems 31

in If it is in the right iteration the assignment myResult=jtakes place so that eval can return the according valueFor a loop within a loop two counters have to be inserted Other casesand both of them have to be checked accordingly Thereare further variants of contexts like the different parts of afor(firstPartsecondPartthirdPart) statementThe first part gets only executed once at the beginningwhile the second part gets evaluated one time more thanthe rest of the loop (the third part does not get evaluated)For all cases the dynamic generation of the code that re-turns the value of interest works similar to the techniquespresented

Obviously this is a brute-force approach For each valuethat has to be calculated all preceding code has to be eval-uated We used this approach anyway since we spent quite Summary Very

inefficient limitedfunctionality stillcompletely sufficientfor the user study

some time developing the first version of the extension (us-ing V8-Node) and were under considerable time pressureThe technique we chose is simple and fast to develop Fur-thermore the prototype only had to work for the seconduser study which is the reason why the limited function-ality (loop nesting depth of 2 for example) was sufficientAnd finally since the tasks in the second user study werenot very expensive with regards to calculation costs eventhis very inefficient version of the extension ran live with-out any delays Some tests showed that it would even havebeen quick enough to operate with hundreds of iterationsmore without any delay and with some few thousand iter-ations more without dispatching the interface

44 Problems

The biggest problem we encountered were infinite loopsBecause the interface runs completely live and tries to The Halting problemevaluate the whole code while the user is editing it infiniteloops may occur while the user is typing Additionallyit might of course be possible that there actually is aninfinite loop even though the user thinks the code iscorrect Without modification the interface crashed in

32 4 Implementation

either of these cases To prevent this from happeningtwo minor modifications had to be done Firstly if wewant to calculate the number of iterations a loop doeswe check whether the counter we use for that calculationis still below a certain limit (we used 999 here becauseit proved to be sufficient and efficient enough to runlive) Now the Halting Problem for this particular loop isapproximated The problem of the infinite loop still existsthough Suppose we have the following code

var x=0var i=0while(x=10)

i++x=i

We now know that the loop does more then 999 iterationsso we do not try to calculate the value for x in the invariantSubstitutions after

infinite loops or for i within the loop for any further iterations The onlyproblem that remains is that we still try to get the value fori in the last line When we try to calculate this value wedo not yet know that there is an infinite loop somewhereabove To find this out a new global counter gets insertedand limited to 5000 iterations

var globalCounter=0var x=0var i=0while(x=10)

if(globalCounter++ gt= 5000)breaki++

i

This way we could keep the incrementation and the checkwhether there are more then 5000 iterations within onestatement (two statements are too many since after a

44 Problems 33

while without curly braces only the first statement wouldbe executed) The incrementation and test get inserted intoevery loop ahead of the position we are checking for Itwould of course have been possible to set a flag if there isan infinite loop somewhere ahead but this way we alsohad the possibility to limit the overall number of iterationsand still get a value The second user study showed that Flaw in the

implementationthis was the wrong approach A user introduced an infiniteloop in a sorting algorithm The condition for the loop wasnever violated although the array was sorted With thetechnique of assigning a value to a variable although thelimitation of iterations has been hit the sorted array wasshown in the return statement Since the user only paidattention to the output he did not see that the number ofiterations of the loop was 999 and thought that the code iscorrect

Another minor problem was that of course while the useris typing the extension tries to evaluate code that is syntac- Suppress errors so

that user does notget disturbed

tically not correct The solution is to set windowonerrorto a function that always returns true This way errorsget just suppressed and the execution stops If there is anew change in the editor the execution will start again andwork exactly the same way so that working code always isexecuted and not working code is not

The last problem concerned hiding code that does notget evaluated For loops this could easily be done If Hiding code within

failed if statementsthe loop is in its very last iteration (the invariant fails)the rest of the loop has to be hidden otherwise it doesnot Finding out if an if statement fails (and hence theaccording code has to be hidden) however was moredifficult because we had not enough information to be ableto do so (remember we only compute the first step whichincludes nothing but substituting all variables) To find outif an if statement holds or not we do the following allvariables in the invariant get substituted and then we leteval evaluate just the invariant For if(xlty) with x=5and y=6 we substitute all values if(5lt6) and then calleval((5lt6))

34 4 Implementation

Figure 41 Screenshot showing Brackets with our extension and a dummy script

Finally Figure 41 shows how the prototype for our seconduser study looks like

35

Chapter 5

Second User Study

51 Course of the User Study

To find out whether our extension brings significantimprovement a second quantitative user study with 20 Hypothesis

Compared to usualdebugging methodsour technique bringssignificantimprovement

participants has been conducted The hypothesis was thatthe debugging speed (in this case the time between the userreading the code the first time and fixing it) compared toa usual debugging engine could significantly be reducedwith our extension If this hypothesis would be supportedby the results of the user study this would indicate that theunderstanding speed of foreign code would be increasedby our method

No user had worked with Brackets or JavaScript before(notwithstanding that all of them had worked with Javabefore) Users were Computer Science students with 4-9(average of 58) years of programming experience Onlythree of them had professional experience (2-4 years) Allusers took part voluntarily and were not rewarded in anyway

To test the hypothesis two searching algorithms (animplementation of Shakersort and an implementation ofBubblesort) have been implemented and a small bug has

36 5 Second User Study

been introduced in either of them (see Appendix A) Thetask of the user was to find both bugs - one time with2 algorithms 1 bug

each 2 tasks debugone time with

Bracketsrsquo tools andone time with our

extension

our extension and one time using the traditional way(that is using the console breakpoints variable viewerweb kit inspector and the step over functionality Bracketsprovides) Each user received an intro to the function-ality of both our extension and the Brackets debuggingengine For both a small dummy script has been shownwith which it has been explained how each part of thefunctionality works until the user confirmed that he or sheunderstood every part of the functionality The task forthe user was to find the bug in the code (users have beentold that there is exactly one bug) fix it and give noticewhen he or she thought that the code is correct (task tocondition assignment and task order were counterbalancedto eliminate side effects)

After the two debugging tasks had been accomplishedeach user filled out two versions of the System UsabilitySystem Usability

Scale to find outwhether users liked

working with thesystem

Scale [Brooke 1996] (SUS) one for our extension and onefor the standard debugging Brackets provides The first10 questions are identical to the initial version that hasbeen presented in 1996 except that the word rdquocumbersomerdquoin point 8 has been changed to rdquoawkwardrdquo as suggestedby [Finstad 2006] For the extension the following ques-tions have been added to the questionnaire

bull I found the additional information very helpful formy task

bull I was distracted by the additional information

bull I found it very easy to follow the control flow of theprogram

bull I found complex control structures hard to under-stand

For the debugging using Brackets the following two ques-tions have been added

bull I found it very easy to follow the control flow of theprogram

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 5: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

vii

List of Figures

11 A Live Coding Editor presented by Bret Vic-tor [Victor] 2

12 The key parameter has been changed 3

21 rdquoA screenshot of the Rehearse editor (1)The function declaration parameter namesand current values (2) a statement that hasbeen executed and (3) the result of execu-tion (4) an undone statement (5) the currentlinerdquo [Brandt et al] 6

22 A screenshot of the plug-in presentedby [Edwards 2004] 7

31 Version 1 The evaluated version of the codeis on the right-hand side and keeps its struc-ture and syntax highlighting 22

32 Version 2 The evaluated version of the codeis on the right-hand side but is ranged right 23

33 Version 3 Small pop-ups indicate the valueof a variable 23

34 Version 4 The evaluated version of a line isinserted right behind the line itself 24

viii List of Figures

35 Version 5 Only the values of the variablesare in between the lines 25

36 Interface after the first user study 26

41 Screenshot of Brackets with our extension 34

51 Box plot for the different task completion times 38

52 Box plot for the SUS-values with and with-out extension 39

53 Box plot for the additional points on thequestionnaire 40

61 Version 1 The evaluated version of the codeis on the right-hand side and keeps its struc-ture and syntax highlighting 45

B1 SUS questionnaire for debugging usingBrackets 50

B2 SUS questionnaire for the Live Coding exten-sion 51

ix

Abstract

Live Coding seeks to provide an environment that helps programmers to under-stand the internal behavior of a program with the help of examples The goal ofthis thesis is to create an editor that takes exemplary parameters for a function andcontinuously executes the current version of the code with those parameters Tohelp the programmer understand the control flow of the program not only is theaccording return value of the function continually updated also the environmentshows what happens with the exemplary parameters inside the code between in-put and output A qualitative study with 6 users has been conducted to find thebest way of displaying the link between the code and the constantly updated eval-uations The result an environment that keeps a traditional editor on the left-handside and an evaluated version of the current code (keeping syntax highlighting andindents) on the right-hand side has been implemented in form of an extension forthe open source editor Brackets Finally a study with 20 users has been conductedto find out if there is an improvement in the debugging speed of experienced pro-grammers Although the user study yielded non-significant results there is evi-dence that there is an improvement in the debugging speed and that significantresults can be obtained

xi

Acknowledgements

First of all I would like to thank everybody who participated in my user studiesYour valuable informations and feedback enlarged my opportunities

Furthermore Jan-Peter Kramer deserves my thanksWithout whom this work would consist of nothing but blanks

I would like to thank all my good friends tooWhose long-standing amity I am looking forward to

Last but not least special thanks to my family particularly my parentsAnd unfortunately this is where my English ends

1

Chapter 1

Introduction

Most of todaysrsquos IDEs make a clear separation betweenediting the code and debugging it To switch from pro- Overhead results

from the separationbetween editing anddebugging

gramming to debugging and vice versa usually theprogrammer has to use another mode in the IDE This waynot only looses the programmer a significant amount oftime through the emerging overhead but the working flowgets interrupted too [Edwards 2004]

Permanently having to switch between modes has anotherdrawback The probability of loosing time because the pro- Wasted development

time throughIgnorance Time andFix Time

grammer continues coding after a bug has been introducedincreases considerably [Saff and Ernst 2004b] reported onthis phenomenon They said that larger Ignorance Times(the time between introducing a bug and becoming awareof it) yield larger Fix Times (the time between becomingaware of a bug and fixing it) Therefor an IDE shouldprovide tools to keep the Ignorance Time as well as theoverhead produced by switching between different modesas low as possible

When Bret Victor gave his speech rdquoInventing on Principlerdquohe was talking about something very similar Direct delay-free feedback To show what he was talking about Victorpresented an editor that continuously executes the code assoon as it has been entered and not only shows the actual

2 1 Introduction

result but also reveals information about its internal behav-ior Figure 11 shows how his interface looks like using theexample of the binary search algorithm

Figure 11 A Live Coding Editor presented by Bret Vic-tor [Victor]

On the left-hand side the code can be changed as usual Atthe top of the right-hand side exemplary parameters canbe entered for which the code then gets executed Furtherdown the values for each variable in each iteration andthe return value are shown For example in the line varBret Victor presented

an editor that givesimmediate

information about aprogramrsquos internal

behavior

high = arraylength - 1 the editor says high =5 This way the programmer does not have to imaginewhat happens between executing the code with exemplaryparameters and getting back the according result but canactually see it As mentioned above this happens as thecode gets edited so there is no delay between a change andseeing its impact on the code Figure 12 shows an exampleof how helpful this can be The key the algorithm searchesfor has been changed to rsquogrsquo which is not in the arraySince the invariant of the loop is 1 the algorithm can onlyterminate when the key has been found in the array In thiscase however the loop iterates endlessly The solution isto replace the invariant by low le high A solution thatis obvious once the programmer sees that in the fourthiteration low gets greater than high and nothing elsechanges anymore

The direct immediate feedback Victor was talking about isthe basic idea of my thesis Since he designed his editor

3

Figure 12 The key parameter has been changed

only to show what he was talking about in his speech hisprototype is very low fidelity and has some basic usabil-ity flaws (most importantly Only 5 iterations can be dis-played and the functionality is very limited) Furthermorehis editor has never been evaluated neither in a qualitativeuser study nor in a quantitative one My thesis will pickup his idea develop it and finally evaluate it To do sotwo iterations of the DIA-cycle (that is Design-Implement-Analyze) have been done We started with a literature re-search (see Chapter 2) to get an overview what research hasalready been done with regards to Live Coding We then Overview of the

thesisimplemented a low fidelity prototype and evaluated it in aqualitative user study to find out which basic interface de-sign provides as much analogy between the code and itsevaluation (see Chapter 3) Afterwards we implemented asoftware-prototype (see Chapter 4) to conduct the secondquantitative user study which should reveal if the interfaceenhances the work flow significantly (see Chapter 5)

5

Chapter 2

Related Work

The idea of maintaining a continuous connection betweencode and output is not not new Already back in 1985 The idea of

continuouslyexecuting code hasalready beenpresented in 1985

[Henderson and Weiser 1985] stated that the classicalEdit-(Compile-)Execute cycle where program editing andexecution are independent activities is not optimal Theystated that the relation between input and output shouldcontinuously be kept consistent

[Pane and Myers 1996] proposed that a system shouldsupport incremental testing and feedback and help detect Incremental testing

and feedbackimprove productivity

diagnose and recover from errors Although they wererecommending heuristics to improve novice programmingsystems these advices also help improving the productiv-ity of experts

This concurs with [Ko et al 2004] who reported on learn-ing barriers in programming systems Among others they 2 learning barriers

UnderstandingBarriers andInformation Barriers

reported on the so called Understanding Barriers and Infor-mation Barriers Understanding Barriers happen when codebehaves differently from what the user expected If forexample there is a compile-time error and the user cannotextract which part of the code is deemed wrong by the com-piler from the error message then this is an UnderstandingBarrier In the user study Ko et al conducted most (34 of38) of these Understanding Barriers were insurmountable

6 2 Related Work

Information Barriers are elements in the environmentthat make the internal behavior of the program hard tounderstand They happen when a user builds a hypothesisabout the internal behavior of the program but cannot use(or find) a tool the environment provides to verify thathypothesis In Ko et alrsquos user study most of these under-standing barriers (10 of 14) were insurmountable To helpthe programmer an IDE should reveal what a programdid or did not do at compile or runtime and in addition itshould be possible to inspect a programrsquos internal behavior

Although the idea of helping the programmer under-stand the internal behavior of a program by example isnot new current environments only offer little help forCurrent

environments offerlittle help for

understanding andadapting examples

understanding and adapting examples Therefore [Brandtet al] have implemented Rehearse an editor that providesimmediate evaluation and infinite undo of executionA screenshot of this editor can be seen in Figure 21Although Rehearse embodies the immediate feedback

Figure 21 rdquoA screenshot of the Rehearse editor (1) Thefunction declaration parameter names and current val-ues (2) a statement that has been executed and (3) the re-sult of execution (4) an undone statement (5) the currentlinerdquo [Brandt et al]

and the maximal illumination we are striving for thereare issues with it First Rehearse disturbs the user inhis usual programming practices For example it is notAdvantages and

disadvantages ofRehearse

possible to edit a function in the usual editor while it isopen in Rehearse This means that there are two ways of

7

editing a function (either the usual way with a standardeditor or interactively with Rehearse) and they are mutualexclusive (this issue has also been reported as a resultof the user study Brandt et al conducted) We think itis best to integrate the interactive functionality into thestandard one so that the user does not again have toswitch between different modes Additionally Rehearseprovides functionality to help the programmer explorealternative paths by undoing arbitrarily many executedlines while the goal of our editor is to show what hap-pens in the code with exemplary parameters to help theprogrammer understand what internal behavior the codehas This way the programmer can work uninterruptedand still get information about the behavior of his program

Similarly [Edwards 2004] stated that it improves the workflow if the programmer does not have to leave the environ-ment to see what result the execution of an example yieldsHe says the best way to understand code is by thinkingof concrete examples To solve the problem that the pro- Examples help

understanding codegrammer has to emulate the behavior of the program in hishead as much code as possible should be illuminated withexamples Edwards built a plug-in for Eclipse for whicha screenshot can be seen in Figure 22 The results on the

Figure 22 A screenshot of the plug-in presented by [Ed-wards 2004]

left-hand side of the interface get updated as soon as thereis a change in the code From a theoretical point of viewthis is exactly what we intend to do The code is continu-

8 2 Related Work

ously executed for exemplary parameters while the user isworking on it From a practical point of view though thisplug-in has several drawbacks First the analogy betweenIdea and drawbacks

of Edwardsrsquo plug-in code and evaluation is not very strong (eg not as strongas the one in Rehearse) so it is not as easy for the userto follow the control flow of the program as it should beThere are further usability issues Skipped code is greyedout This is a good idea since the user can see which partof the code has been executed and which part has not Theproblem is to see for which parameters the code has beenskipped In Figure 22 for example fact has been calledwith the values 1 and 3 and in addition subsequentlywith the parameters 0 and 2 (resulting from the recursivecall of fact in the else statement) For which of theparameters the else statement has been skipped is not asobvious as it could be Furthermore it is suboptimal tocall the function with exemplary parameters in the codeitself This way a programmer cannot write code as usualand just extract the additional information but has to insertrdquoforeignrdquo code to be able to see what happens inside thecode and then later has to remove the additional codeagain because it has nothing to do with the functionalitythe programmer wanted to write in the first place FinallyEdwards presented the plug-in without evaluation of anykind so a remaining task is to find out if or how much thiskind of feedback improves the work flow (see Chapter 5)

In [Boehm 1979] it was reported that modifying softwarebasically consists of three phasesHow modifying

software works

bull Understanding the existing software

bull Modifying it

bull Revalidating the modified version

We believe that our interface helps improve (thereby speedup) this whole process (our second user studyrsquos goal wasto find out whether this is true see Chapter 5) Thissupposition is supported by [Erdogmus et al 2005] whereit is said that instant feedback whether new functionality

9

has been implemented properly helps reducing wasteddevelopment time

Finally our approach concurs with the idea of Opportunis-tic Programming [Brandt et al 2008] [Brandt et al 2009a][Brandt et al 2009b] Brandt et al say that frequent itera-tion is a necessary part of learning and understanding un-familiar code Therefore an opportunistic programmer will Opportunistic

programmers choosetools that speed upiteration

select a tool that speeds up iteration They also state that theEdit-Debug cycles in opportunistic programming are veryshort About 50 of these cycles are shorter than 30 sec-onds and even 80 are shorter than 5 minutes Our toolwould help reducing the resulting overhead since the out-come is instantly visible without having to change modesor compile and execute

11

Chapter 3

First User Study

31 The Prototypes

The main goal of our first user study was to find out whichbasic interface design is the optimal one especially with Goal of the first user

studyregards to the link between the code and the evaluatedversion of the code It should work in a way such that theprogrammer does not get disturbed in his usual work butthat the interface still provides as much information asaccessible as possible

To find out how such an interface would look like we im-plemented low fidelity prototypes for 5 different interfacesin Ruby Basically these prototypes consisted of pictures 5 low fidelity

prototypes for Rubyhave beenimplemented

that show how the interface would look like for some ex-emplary code with exemplary parameters Each interfacehas been illustrated using 5 pieces of code as example (Fig-ures 31-35) Note The font used in all prototypes was sup-posed to emphasize that these prototypes are low fidelityso that the users had not inhibitions telling us fundamentalflaws in the interface design

12 3 First User Study

311 Interface Version 1

Figure 31 shows the first interface with an implementationof the binary search algorithm In this version of theinterface an evaluated version of the code is inserted in anadditional section in the right half of the window At thetop the exemplary parameters can be manipulated Thenin the line high=arraylength-1 the first evaluationtakes place On the right-hand side of the window theright-hand side of the assignment has been replaced bythe according evaluation 5 Afterwards all values are re-Interface variant 1

divides theenvironment into twoThe left-hand side is

a usual editor andthe right-hand side isthe evaluated version

of the code

placed in a similar way That is in simple assignments theright-hand side is substituted completely and in invariantsthe condition has been computed up to the penultimatestep This way eg while(low lt= high) gets evaluatedto while(0 lt= 5) instead of while(true)

EXCURSUSWe also thought about showing what would happen in-side an if statement even if it fails so that the user couldsee what the code could do alternatively We dischargedthis idea however because of two problems Firstly wesuspected that this might quickly become a confusing ex-perience for the user Secondly (and most importantly)the Halting Problem would have to be decidable Sup-pose the following code had to be evaluatedif(xgt0)while(x=0)x--else while(x=0)x++Then the interface would fall into an infinite loop (for anyvalue of x except 0) although the code would not Forx=1 for example the actual code would only make oneiteration in the first loop The interface however wouldtry to calculate the values of the second loop too

Further down in Figure 31(a) there is the following ele-ment

1This element will subsequently be called iterator Theiterator indicates which iteration of a loop is being shownPressing the right arrow increments the shown iteration(Figure 31(b)) pressing the left arrow decrements itaccordingly Furthermore the code for the elsif and theelse statements is not visible on the right-hand side of the

31 The Prototypes 13

window The reason is that the first if statement holdsand therefore the following ones are not executed InFigure 31(b) the first if statement failed (greyed out) thesecond one holds and the third one is hidden again Thisway all code that gets executed is shown (corresponding toJonathan Edwards who said that as much code as possibleshould be rdquoilluminatedrdquo[Edwards 2004])

312 Interface Version 2

Figure 32 shows the second version of the interface withan in place implementation of the selection sort algorithm(note that each interface variant has been implemented Interface variant 2

copies a nothighlightedevaluated version ofthe code to the rightpart of the window

with each algorithm) The differences to the previous in-terface variant are

bull The parameter names are red to signal manipulability

bull No syntax highlighting

bull Evaluated code has no indents and is right aligned

bull No dividing line

EXCURSUSNote that the iterator in this interface looks like follows

i=1This design reveals what variable the iterator refers toThis kind of iterator does not depend on the type of in-terface but on the type of loop In for and upto loopsthis kind of iterator has been used since it is possible tocorrelate the values of one variable and the iterations ofa loop In the later prototypes however we dischargedthis idea again because it only was this easy to correlatea variable and the iterations of one loop because the ex-amples were quite simple In realistic situations this cor-relation will oftentimes not be possible

This version was supposed to take up as little space aspossible while still having a complete (evaluated) versionof the code in a different place

14 3 First User Study

313 Interface Version 3

Figure 33 shows version 3 of the interface with an im-plementation of Bubblesort Small pop-ups indicate theInterface variant 3

uses small pop-upsthat indicate the

value of a statement

value of all statements that are evaluable For examplethe (manipulable and therefore red) value of array is[1423716] and the value of arraylength-2 is 2There are no pop-ups in the code when it is not executedand failed invariants are greyed out The iterator workslike the one in the example for version 2 of the interface butis now inserted into the code This interfacersquos purpose wasto consume as little space as possible while still providingall possible evaluations

314 Interface Version 4

Figure 34 shows interface version 4 with an algorithm thatconverts the fractional part of a decimal number into its bi-Interface variant 4

puts the evaluatedversion of a line

between that line andthe next one

nary representation and outputs the result to the consoleThis interface variant slots the evaluated version of a lineof code right between that line and the next one The nameof manipulable parameters is highlighted in red and thereis no syntax highlighting Failed invariants are greyed outand not executed code is not displayed as usual The goalof this interface version was to find out if the link betweencode and evaluation becomes clearer when the evaluationis very close and very similar to the code it results from

315 Interface Version 5

The last variant of the interface can be seen in Figure 35with an algorithm that checks whether a given number nInterface variant 5

indicates the value ofa statement by

writing it underneaththat statement

without keepingkeywords

is a prime number or not It is similar to the previous onebut only shows rdquorelevantrdquo information This includes right-hand sides of equations and for conditions the actual in-variant but excludes left-hand sides of equations and key-words like else or while Apart from this version 4 and5 are exactly the same (parameters in red no syntax high-lighting not executed code is not shown failed invariants

32 Course of the User Study 15

are greyed out) We developed this variant because in ver-sion 4 the user might confuse actual code with evaluatedcode We wanted to check if a variant without replicatingcontrol structures is better (less confusing)

32 Course of the User Study

The first user study was a qualitative user study It wassupposed to reveal what basic interface design is preferredby users and where there might be flaws in the interfacesDifferent techniques for qualitative user studies have beenused in particular the Model Extraction method and theThink Aloud method The basic structure was the following

bull First the experimenter introduced himself and theproject He assured the participant that he would Introduction to the

user studystay anonymous and that he could ask questions ormake comments whenever he likes To make sure theModel Extraction worked fine the part concerningwhat the thesis and the user study where about wasvery firm The solution we have come up with wasto explain the general idea of Live Coding (that weare designing a kind of editor that takes exemplaryparameters for a method the programmer works onand then evaluates it continuously while changes arebeing made) without revealing anything further likerdquoand then there is going to be a button with whichyou can flick from iteration to iterationrdquo

bull Afterwards one version of the interface was pre-sented to the user together with the following ques- Model Extractiontions rdquoWhat do you think you see hererdquo and rdquoWhatdo you think you can do with itrdquo

bull After the user told everything he or she knew or if theuser had a gulf of understanding [Norman 2002] the Explanation of the

interfaceparts of the interface which the user did not under-stand were explained and we went through all itera-tions once to make clear how the interface works

bull The next goal was to find out if the user had prob- Suggestions forimprovementlems in understanding elements of the interface if he

16 3 First User Study

thinks that there are things that can be done better etcetera

bull Lastly the other interface variants were shown to theuser with the same piece of code and it was explainedGoing through the

remaining interfacevariants

how they worked how they differ from the previousones and we went through all iterations once so thatthe user could get a feeling for the interface Aftereach variant the user was asked what he thought ofthe interface (and why) and what he thought was bet-ter or worse compared to the previous ones

Whenever the user had a question or there was an inter-esting topic that lead to more interesting information thispoint was discussed in more depth

33 Results

The user study was conducted with 6 different users All ofthem studied Computer Science were in the sixth semesterand had between 4 and 10 years of programming experi-ence Only one of them had worked with Ruby before Allusers took part voluntarily and were not rewarded in anyway The conditions were randomized and looked like fol-lows

User ID Algorithm Order of versions1 Bubblesort 312452 Selection sort 453213 Fractional part10 rarr Fractional part2 253414 Prime number test 135425 Binary search 524316 Binary search 34152

This means that for example user 1 saw interface version3 first and then versions 1245 in that order In all variantshe worked with the Bubblesort algorithm

The user study yielded two major results

33 Results 17

bull Most users liked version 1 better than the other ones

bull Not a single user understood what the iterator is goodfor or what it might do

In more detail Three users (users 1 2 and 4) stated clearlythat they liked version 1 the most User 3 was not quite 3 users liked version

1 the most 1 userliked version 4 themost 1 user couldnot decide betweenversion 1 and 2 and1 user could notdecide betweenversion 1 and 3

sure whether he liked version 1 or version 2 more Hestated that a mix between the two variants where thereis no syntax highlighting would be the best because hegot confused the first time he saw version 1 and did notdirectly see that the code on the right-hand side of thewindow is not actually code but the evaluation User 6 wasnot sure if he preferred version 1 or version 3 He statedthat if there is enough space on the screen version 1 isbetter but with all the other elements of a typical IDE aline of code might oftentimes be too long so that there is noplace to show the line itself as well as its evaluated version(a concern that user 3 communicated as well) Withoutthis problem though user 6 said that version 3 is worsebecause the interface might get messy and distractingquickly because of all the pop-ups right next to the codeIn addition he said with more complex code it might beproblematic to see which pop-up refers to which part ofthe code Finally user 5 stated that he likes version 4 themost The reason was that he did not have to look to theright if he wanted to see the value of a variable and thensearch it there but that in version 5 the information wasexactly at its source He only said that this version mightbe problematic to work with since the reading flow is influ-enced negatively and he feared that he would loose muchspace on the screen and would therefore have to scroll a lot

Each user stated that the main benefit of version 1 and2 was that the actual code remained unaffected Thiscorresponds to what [DeLine et al 2006] reported on theperceptual model a programmer builds of the code When Perceptual model of

the code shouldremain unaffected

navigating through source code a programmer uses aperceptual model to do so This way the programmer doesnot need to read the code (cognitive) to know which part heis seeing but can do so by looking at its rdquovisual landmarksrdquo(perceptual) While version 3 affects this perceptual modelnegatively but does not break it version 4 and 5 have the

18 3 First User Study

disadvantage to do so

To most of the users the basic structure of the interfaceswas clear after only a few seconds Only two of themconfused actual code with the evaluation for a short timeUser 5 when he saw version 5 the first time and user 4when he saw version 1 the first time However neither ofthem needed help or much time to figure how the interfaceworked after they started to read more precisely Eachuserrsquos first question or comment concerned the font andhow awful they (every one) thought it was After a briefexplanation why we chose this particular font each usergot used to it quickly

No user understood the iterator None of them got theconnection to the while upto or for next to the iteratorIterator is not self

explaining After the explanation what the iterator does some userssaid they thought the left and right buttons were rdquoltrdquo andrdquogtrdquo signs

The following tables summarize the pros and contras foreach interface

Version 1Pros Contrasmiddot The control structure (in-dents) is still presentmiddot Because of the indentsand the syntax highlightingthere is a strong analogy be-tween the code and its eval-uated versionmiddot There is a clear separationbetween code and evalua-tionmiddot The editor remains un-changed

middot Needs much spacemiddot User has to look from theleft to the right and viceversa to see two pieces thatare correlated In otherwords a line of code and itsevaluated version are too faraway from each other

33 Results 19

Version 2Pros Contrasmiddot The editor remains un-changedmiddot Code and evaluation do notget confused because they donot look the samemiddot Needs less space than ver-sion 1

middot Less analogy than in ver-sion 1middot User needs to look fromthe left to the right and viceversa This is being compli-cated by the fact that with-out indents and syntax high-lighting it is even more diffi-cult to find the point of inter-estmiddot Needs a lot of space

Version 3Pros ContrasmiddotNeeds minimalistic amountof spacemiddot The code structure itself re-mains untouched

middot Oftentimes not clear whichpop-up refers to which partof the codemiddot Can get messy quickly

Version 4Pros Contrasmiddot Strong analogymiddot Better for lines of code thatare too long for version 1 and2middot Evaluations right next totheir source

middot Code and evaluation getconfused quickly especiallyusual reading might be prob-lematic since one has to skipevery second line dependingon what he wants to readmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

20 3 First User Study

Version 5Pros Contrasmiddot Evaluations right next totheir sourcemiddot Better for lines of code thatare too long for version 1 and2

middot Oftentimes it is not clearwhat a value refers tomiddot Still distracting during codereadingmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

331 Further Comments and Suggestions for Im-provement

One of the users pointed out that in version 3 the iteratorhas been inserted into the code and therefore it now isInterface version 3

can producesyntactically wrong

code

syntactically wrong (for i in 14 becomes for i=2in 14 for the second iteration for example)

Two users suggested that for version 4 and 5 curly bracescould indicate to which part of the code the evaluationrefers

Some users suggested to make the number on the iteratordirectly manipulable so that if one would want to go toSuggestions for

improving the iterator iteration 42 one would not have to press 41 times a buttonbut could just enter 42 In addition some users said theywanted a button for jumping to the first and a button forjumping to the last iteration Two users said it would begood if there is something that indicates directly that theiterator actually shows iterations like the term rdquoIteration rdquoor anything similar

We noticed that the manipulable parameters and theevaluation results have to be distinguishable more easily

33 Results 21

One user proposed that if the cursor moves over somevariable in the evaluation part a small pop-up couldindicate to which variable a value belongs

One user said that it might be helpful that for biggerboolean expressions that fail it would be visible which partof the expression is responsible for the failure

332 Conclusions

With large screens (24rdquo and more) becoming more andmore common in modern work environments of program- Version 1 is the

favored onemers we can override the concerns about loosing too muchspace with version 1 which is the reason why it is the clearfavorite Figure 36 shows the basic interface after the firstuser study

The iterator has been improved in several ways

bull New buttons for jumping to the first and the last iter-ation have been added

bull There is a text field with which the iteration can bemanipulated directly

bull A tooltip now indicates the variable to which avalue belongs when hovering over the value with themouse

bull The label Iteration has been added at the top of theiterator

The way of manipulating the parameters has been changedThe function header now also is copied to the right side A new way of

manipulating theparameters

but the names of the parameters are replaced with a textfield in which the according value can be entered This wayboth the analogy to the code and the separation betweenparameters and evaluation are being enhanced

22 3 First User Study

(a) First iteration

(b) Second iteration

Figure 31 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

33 Results 23

Figure 32 Version 2 The evaluated version of the code ison the right-hand side but is ranged right

Figure 33 Version 3 Small pop-ups indicate the value of avariable

24 3 First User Study

Figure 34 Version 4 The evaluated version of a line isinserted right behind the line itself

33 Results 25

Figure 35 Version 5 Only the values of the variables arein between the lines

26 3 First User Study

Figure 36 Interface after the first user study

27

Chapter 4

Implementation

After the first user study had been finished and the resultshad been analyzed a software prototype with sufficient A software prototype

had to beimplemented

functionality had to be implemented to conduct the seconduser study Hence we wrote an extension for Brackets1 anopen source editor that has recently been released officially

The language we wrote the extension for was JavaScriptWhat we needed was an extension that could calculate thevalues of the variables that had to been substituted

41 First Approach (discharged)

The initial idea was to write an extension that connects viaV8-Node2 (an extension that maintains a connection to a Initial idea Use

V8-Node an Nodejsto get the locals

Nodejs3 V8 debugger) to an external debugger and lets itevaluate the code Our extension was supposed to supplythe actual code to the debugger and set a breakpoint in thefirst line of actual code Then it read the locals (that is thelocal values for all active variables) and stepped over to the

1httpsgithubcomadobebrackets2httpsgithubcomDennisKehrigbrackets-v8-node-

livetreenodeChildProcess3httpnodejsorg

28 4 Implementation

next line After all locals had been collected the extensionshould substitute all necessary values There were prob-lems with this approach Firstly the extension is not stableThis extension was

not usable for a userstudy

enough to be usable in a user study Secondly it is too slowIt needs more than half a second for one evaluation Thisis way too much since we wanted something that couldevaluate the code as the user edits it without any furtherdelay (hence the name rdquoLive Codingrdquo) With this extensionit would only have been possible to update the evaluationusing a special shortcut or to attach this function to anothershortcut like the Ctrl+S shortcut for saving Since thiswas by far not the optimal way we decided to start all overagain

42 Basic Evaluation Technique

In the second approach we did not try to externalize thescript execution but instead made vast use of JavaScriptrsquosbuilt-in eval function The eval function takes a StringSecond approach

using eval toevaluate dynamicallygenerated code thatreturns the value of

interest

as parameter If this String contains valid JavaScript codeeval will evaluate the code and return the last executedstatement eval(var i=1i=i+5) will return 6 forexample The extension basically works as follows Aloop that runs over all DOM-elements in the highlightedcode searches for substitutable variables While searchingfor these elements the loop captures in which contextthe actual DOM-element is The three basic contexts aresimple substitutions substitutions within one loop andsubstitutions within a nested loop (why this is will beexplained below) But there are also other things the loopchecks for For simple assignments like i=x+y x andy have to be substituted while i does not In lines wherea loop begins the iterator has to be appended invariantshave to be treated differently et cetera

Note that right-hand sides of assignments are handledslightly different from what has been presented in theprototypes of the first user study They do not get evalu-ated to the last step and invariants do not get evaluated

43 How Values Are Being Calculated 29

to the penultimate step anymore Both of them get onlyevaluated one step For i=1 and z=2 the right-hand sideof x=i+z does not get evaluated to 3 but to 1+2 Wedecided to do it this way because some users in the firstuser study stated that they did not only want to see whatthe result of an operation is but also how it got there

When a substitutable variable has been found its value willbe computed For that the eval function is used (see sec-tion 43 for details) and the value will be inserted Thiswhole process takes place each time there is a change in thecode or if one of the parameters gets changed If another it- How the extension

reacts to changeseration of a loop that is being shown is selected all valuesinside that loop will be calculated again

43 How Values Are Being Calculated

Depending on the context of the variable a piece ofcode will be generated (and then evaluated) that whenevaluated with eval returns the value of the variable ofinterest For simple substitutions outside of any loops thisworks like follows

Suppose the following code had to be evaluated for x=0

function foo(x)var i=0var j=x+1

Then x in the third line would have to be evaluated To do How the values arecalculated for simplesubstitutions

so the following String would be generatedvar x=0var i=0xSo for simple substitutions the whole code preceding thevariable that has to be substituted is copied and each pa-rameter is initialized as var with its exemplary value at the

30 4 Implementation

beginning

For simple loops this method has to be extended First ithas to be found out how often the loop is executed ThisSubstitutions within

one loop serves two purposes First the iterator now shows thenumber of iterations of the loop Second we have to checkif the user wants to see an iteration that does not exist Thisis being done by introducing a counter that is incrementedin each iteration of the loop and then returned when theloop terminated The following example shows how thevalue of a variable within a loop is being calculated

function test()var i=0var j=0while (ilt5)

i++j=j+2

Suppose we want to obtain the value for j (highlightedin red) in the second iteration of the loop Then theevaluation basically works as follows

var myCounter=0var myResultvar i=0var j=0while (ilt5)

myCounter++i++if(myCounter==2)myResult=j breakj=j+2

myResult

A counter gets inserted to check which iteration the loop is

44 Problems 31

in If it is in the right iteration the assignment myResult=jtakes place so that eval can return the according valueFor a loop within a loop two counters have to be inserted Other casesand both of them have to be checked accordingly Thereare further variants of contexts like the different parts of afor(firstPartsecondPartthirdPart) statementThe first part gets only executed once at the beginningwhile the second part gets evaluated one time more thanthe rest of the loop (the third part does not get evaluated)For all cases the dynamic generation of the code that re-turns the value of interest works similar to the techniquespresented

Obviously this is a brute-force approach For each valuethat has to be calculated all preceding code has to be eval-uated We used this approach anyway since we spent quite Summary Very

inefficient limitedfunctionality stillcompletely sufficientfor the user study

some time developing the first version of the extension (us-ing V8-Node) and were under considerable time pressureThe technique we chose is simple and fast to develop Fur-thermore the prototype only had to work for the seconduser study which is the reason why the limited function-ality (loop nesting depth of 2 for example) was sufficientAnd finally since the tasks in the second user study werenot very expensive with regards to calculation costs eventhis very inefficient version of the extension ran live with-out any delays Some tests showed that it would even havebeen quick enough to operate with hundreds of iterationsmore without any delay and with some few thousand iter-ations more without dispatching the interface

44 Problems

The biggest problem we encountered were infinite loopsBecause the interface runs completely live and tries to The Halting problemevaluate the whole code while the user is editing it infiniteloops may occur while the user is typing Additionallyit might of course be possible that there actually is aninfinite loop even though the user thinks the code iscorrect Without modification the interface crashed in

32 4 Implementation

either of these cases To prevent this from happeningtwo minor modifications had to be done Firstly if wewant to calculate the number of iterations a loop doeswe check whether the counter we use for that calculationis still below a certain limit (we used 999 here becauseit proved to be sufficient and efficient enough to runlive) Now the Halting Problem for this particular loop isapproximated The problem of the infinite loop still existsthough Suppose we have the following code

var x=0var i=0while(x=10)

i++x=i

We now know that the loop does more then 999 iterationsso we do not try to calculate the value for x in the invariantSubstitutions after

infinite loops or for i within the loop for any further iterations The onlyproblem that remains is that we still try to get the value fori in the last line When we try to calculate this value wedo not yet know that there is an infinite loop somewhereabove To find this out a new global counter gets insertedand limited to 5000 iterations

var globalCounter=0var x=0var i=0while(x=10)

if(globalCounter++ gt= 5000)breaki++

i

This way we could keep the incrementation and the checkwhether there are more then 5000 iterations within onestatement (two statements are too many since after a

44 Problems 33

while without curly braces only the first statement wouldbe executed) The incrementation and test get inserted intoevery loop ahead of the position we are checking for Itwould of course have been possible to set a flag if there isan infinite loop somewhere ahead but this way we alsohad the possibility to limit the overall number of iterationsand still get a value The second user study showed that Flaw in the

implementationthis was the wrong approach A user introduced an infiniteloop in a sorting algorithm The condition for the loop wasnever violated although the array was sorted With thetechnique of assigning a value to a variable although thelimitation of iterations has been hit the sorted array wasshown in the return statement Since the user only paidattention to the output he did not see that the number ofiterations of the loop was 999 and thought that the code iscorrect

Another minor problem was that of course while the useris typing the extension tries to evaluate code that is syntac- Suppress errors so

that user does notget disturbed

tically not correct The solution is to set windowonerrorto a function that always returns true This way errorsget just suppressed and the execution stops If there is anew change in the editor the execution will start again andwork exactly the same way so that working code always isexecuted and not working code is not

The last problem concerned hiding code that does notget evaluated For loops this could easily be done If Hiding code within

failed if statementsthe loop is in its very last iteration (the invariant fails)the rest of the loop has to be hidden otherwise it doesnot Finding out if an if statement fails (and hence theaccording code has to be hidden) however was moredifficult because we had not enough information to be ableto do so (remember we only compute the first step whichincludes nothing but substituting all variables) To find outif an if statement holds or not we do the following allvariables in the invariant get substituted and then we leteval evaluate just the invariant For if(xlty) with x=5and y=6 we substitute all values if(5lt6) and then calleval((5lt6))

34 4 Implementation

Figure 41 Screenshot showing Brackets with our extension and a dummy script

Finally Figure 41 shows how the prototype for our seconduser study looks like

35

Chapter 5

Second User Study

51 Course of the User Study

To find out whether our extension brings significantimprovement a second quantitative user study with 20 Hypothesis

Compared to usualdebugging methodsour technique bringssignificantimprovement

participants has been conducted The hypothesis was thatthe debugging speed (in this case the time between the userreading the code the first time and fixing it) compared toa usual debugging engine could significantly be reducedwith our extension If this hypothesis would be supportedby the results of the user study this would indicate that theunderstanding speed of foreign code would be increasedby our method

No user had worked with Brackets or JavaScript before(notwithstanding that all of them had worked with Javabefore) Users were Computer Science students with 4-9(average of 58) years of programming experience Onlythree of them had professional experience (2-4 years) Allusers took part voluntarily and were not rewarded in anyway

To test the hypothesis two searching algorithms (animplementation of Shakersort and an implementation ofBubblesort) have been implemented and a small bug has

36 5 Second User Study

been introduced in either of them (see Appendix A) Thetask of the user was to find both bugs - one time with2 algorithms 1 bug

each 2 tasks debugone time with

Bracketsrsquo tools andone time with our

extension

our extension and one time using the traditional way(that is using the console breakpoints variable viewerweb kit inspector and the step over functionality Bracketsprovides) Each user received an intro to the function-ality of both our extension and the Brackets debuggingengine For both a small dummy script has been shownwith which it has been explained how each part of thefunctionality works until the user confirmed that he or sheunderstood every part of the functionality The task forthe user was to find the bug in the code (users have beentold that there is exactly one bug) fix it and give noticewhen he or she thought that the code is correct (task tocondition assignment and task order were counterbalancedto eliminate side effects)

After the two debugging tasks had been accomplishedeach user filled out two versions of the System UsabilitySystem Usability

Scale to find outwhether users liked

working with thesystem

Scale [Brooke 1996] (SUS) one for our extension and onefor the standard debugging Brackets provides The first10 questions are identical to the initial version that hasbeen presented in 1996 except that the word rdquocumbersomerdquoin point 8 has been changed to rdquoawkwardrdquo as suggestedby [Finstad 2006] For the extension the following ques-tions have been added to the questionnaire

bull I found the additional information very helpful formy task

bull I was distracted by the additional information

bull I found it very easy to follow the control flow of theprogram

bull I found complex control structures hard to under-stand

For the debugging using Brackets the following two ques-tions have been added

bull I found it very easy to follow the control flow of theprogram

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 6: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

viii List of Figures

35 Version 5 Only the values of the variablesare in between the lines 25

36 Interface after the first user study 26

41 Screenshot of Brackets with our extension 34

51 Box plot for the different task completion times 38

52 Box plot for the SUS-values with and with-out extension 39

53 Box plot for the additional points on thequestionnaire 40

61 Version 1 The evaluated version of the codeis on the right-hand side and keeps its struc-ture and syntax highlighting 45

B1 SUS questionnaire for debugging usingBrackets 50

B2 SUS questionnaire for the Live Coding exten-sion 51

ix

Abstract

Live Coding seeks to provide an environment that helps programmers to under-stand the internal behavior of a program with the help of examples The goal ofthis thesis is to create an editor that takes exemplary parameters for a function andcontinuously executes the current version of the code with those parameters Tohelp the programmer understand the control flow of the program not only is theaccording return value of the function continually updated also the environmentshows what happens with the exemplary parameters inside the code between in-put and output A qualitative study with 6 users has been conducted to find thebest way of displaying the link between the code and the constantly updated eval-uations The result an environment that keeps a traditional editor on the left-handside and an evaluated version of the current code (keeping syntax highlighting andindents) on the right-hand side has been implemented in form of an extension forthe open source editor Brackets Finally a study with 20 users has been conductedto find out if there is an improvement in the debugging speed of experienced pro-grammers Although the user study yielded non-significant results there is evi-dence that there is an improvement in the debugging speed and that significantresults can be obtained

xi

Acknowledgements

First of all I would like to thank everybody who participated in my user studiesYour valuable informations and feedback enlarged my opportunities

Furthermore Jan-Peter Kramer deserves my thanksWithout whom this work would consist of nothing but blanks

I would like to thank all my good friends tooWhose long-standing amity I am looking forward to

Last but not least special thanks to my family particularly my parentsAnd unfortunately this is where my English ends

1

Chapter 1

Introduction

Most of todaysrsquos IDEs make a clear separation betweenediting the code and debugging it To switch from pro- Overhead results

from the separationbetween editing anddebugging

gramming to debugging and vice versa usually theprogrammer has to use another mode in the IDE This waynot only looses the programmer a significant amount oftime through the emerging overhead but the working flowgets interrupted too [Edwards 2004]

Permanently having to switch between modes has anotherdrawback The probability of loosing time because the pro- Wasted development

time throughIgnorance Time andFix Time

grammer continues coding after a bug has been introducedincreases considerably [Saff and Ernst 2004b] reported onthis phenomenon They said that larger Ignorance Times(the time between introducing a bug and becoming awareof it) yield larger Fix Times (the time between becomingaware of a bug and fixing it) Therefor an IDE shouldprovide tools to keep the Ignorance Time as well as theoverhead produced by switching between different modesas low as possible

When Bret Victor gave his speech rdquoInventing on Principlerdquohe was talking about something very similar Direct delay-free feedback To show what he was talking about Victorpresented an editor that continuously executes the code assoon as it has been entered and not only shows the actual

2 1 Introduction

result but also reveals information about its internal behav-ior Figure 11 shows how his interface looks like using theexample of the binary search algorithm

Figure 11 A Live Coding Editor presented by Bret Vic-tor [Victor]

On the left-hand side the code can be changed as usual Atthe top of the right-hand side exemplary parameters canbe entered for which the code then gets executed Furtherdown the values for each variable in each iteration andthe return value are shown For example in the line varBret Victor presented

an editor that givesimmediate

information about aprogramrsquos internal

behavior

high = arraylength - 1 the editor says high =5 This way the programmer does not have to imaginewhat happens between executing the code with exemplaryparameters and getting back the according result but canactually see it As mentioned above this happens as thecode gets edited so there is no delay between a change andseeing its impact on the code Figure 12 shows an exampleof how helpful this can be The key the algorithm searchesfor has been changed to rsquogrsquo which is not in the arraySince the invariant of the loop is 1 the algorithm can onlyterminate when the key has been found in the array In thiscase however the loop iterates endlessly The solution isto replace the invariant by low le high A solution thatis obvious once the programmer sees that in the fourthiteration low gets greater than high and nothing elsechanges anymore

The direct immediate feedback Victor was talking about isthe basic idea of my thesis Since he designed his editor

3

Figure 12 The key parameter has been changed

only to show what he was talking about in his speech hisprototype is very low fidelity and has some basic usabil-ity flaws (most importantly Only 5 iterations can be dis-played and the functionality is very limited) Furthermorehis editor has never been evaluated neither in a qualitativeuser study nor in a quantitative one My thesis will pickup his idea develop it and finally evaluate it To do sotwo iterations of the DIA-cycle (that is Design-Implement-Analyze) have been done We started with a literature re-search (see Chapter 2) to get an overview what research hasalready been done with regards to Live Coding We then Overview of the

thesisimplemented a low fidelity prototype and evaluated it in aqualitative user study to find out which basic interface de-sign provides as much analogy between the code and itsevaluation (see Chapter 3) Afterwards we implemented asoftware-prototype (see Chapter 4) to conduct the secondquantitative user study which should reveal if the interfaceenhances the work flow significantly (see Chapter 5)

5

Chapter 2

Related Work

The idea of maintaining a continuous connection betweencode and output is not not new Already back in 1985 The idea of

continuouslyexecuting code hasalready beenpresented in 1985

[Henderson and Weiser 1985] stated that the classicalEdit-(Compile-)Execute cycle where program editing andexecution are independent activities is not optimal Theystated that the relation between input and output shouldcontinuously be kept consistent

[Pane and Myers 1996] proposed that a system shouldsupport incremental testing and feedback and help detect Incremental testing

and feedbackimprove productivity

diagnose and recover from errors Although they wererecommending heuristics to improve novice programmingsystems these advices also help improving the productiv-ity of experts

This concurs with [Ko et al 2004] who reported on learn-ing barriers in programming systems Among others they 2 learning barriers

UnderstandingBarriers andInformation Barriers

reported on the so called Understanding Barriers and Infor-mation Barriers Understanding Barriers happen when codebehaves differently from what the user expected If forexample there is a compile-time error and the user cannotextract which part of the code is deemed wrong by the com-piler from the error message then this is an UnderstandingBarrier In the user study Ko et al conducted most (34 of38) of these Understanding Barriers were insurmountable

6 2 Related Work

Information Barriers are elements in the environmentthat make the internal behavior of the program hard tounderstand They happen when a user builds a hypothesisabout the internal behavior of the program but cannot use(or find) a tool the environment provides to verify thathypothesis In Ko et alrsquos user study most of these under-standing barriers (10 of 14) were insurmountable To helpthe programmer an IDE should reveal what a programdid or did not do at compile or runtime and in addition itshould be possible to inspect a programrsquos internal behavior

Although the idea of helping the programmer under-stand the internal behavior of a program by example isnot new current environments only offer little help forCurrent

environments offerlittle help for

understanding andadapting examples

understanding and adapting examples Therefore [Brandtet al] have implemented Rehearse an editor that providesimmediate evaluation and infinite undo of executionA screenshot of this editor can be seen in Figure 21Although Rehearse embodies the immediate feedback

Figure 21 rdquoA screenshot of the Rehearse editor (1) Thefunction declaration parameter names and current val-ues (2) a statement that has been executed and (3) the re-sult of execution (4) an undone statement (5) the currentlinerdquo [Brandt et al]

and the maximal illumination we are striving for thereare issues with it First Rehearse disturbs the user inhis usual programming practices For example it is notAdvantages and

disadvantages ofRehearse

possible to edit a function in the usual editor while it isopen in Rehearse This means that there are two ways of

7

editing a function (either the usual way with a standardeditor or interactively with Rehearse) and they are mutualexclusive (this issue has also been reported as a resultof the user study Brandt et al conducted) We think itis best to integrate the interactive functionality into thestandard one so that the user does not again have toswitch between different modes Additionally Rehearseprovides functionality to help the programmer explorealternative paths by undoing arbitrarily many executedlines while the goal of our editor is to show what hap-pens in the code with exemplary parameters to help theprogrammer understand what internal behavior the codehas This way the programmer can work uninterruptedand still get information about the behavior of his program

Similarly [Edwards 2004] stated that it improves the workflow if the programmer does not have to leave the environ-ment to see what result the execution of an example yieldsHe says the best way to understand code is by thinkingof concrete examples To solve the problem that the pro- Examples help

understanding codegrammer has to emulate the behavior of the program in hishead as much code as possible should be illuminated withexamples Edwards built a plug-in for Eclipse for whicha screenshot can be seen in Figure 22 The results on the

Figure 22 A screenshot of the plug-in presented by [Ed-wards 2004]

left-hand side of the interface get updated as soon as thereis a change in the code From a theoretical point of viewthis is exactly what we intend to do The code is continu-

8 2 Related Work

ously executed for exemplary parameters while the user isworking on it From a practical point of view though thisplug-in has several drawbacks First the analogy betweenIdea and drawbacks

of Edwardsrsquo plug-in code and evaluation is not very strong (eg not as strongas the one in Rehearse) so it is not as easy for the userto follow the control flow of the program as it should beThere are further usability issues Skipped code is greyedout This is a good idea since the user can see which partof the code has been executed and which part has not Theproblem is to see for which parameters the code has beenskipped In Figure 22 for example fact has been calledwith the values 1 and 3 and in addition subsequentlywith the parameters 0 and 2 (resulting from the recursivecall of fact in the else statement) For which of theparameters the else statement has been skipped is not asobvious as it could be Furthermore it is suboptimal tocall the function with exemplary parameters in the codeitself This way a programmer cannot write code as usualand just extract the additional information but has to insertrdquoforeignrdquo code to be able to see what happens inside thecode and then later has to remove the additional codeagain because it has nothing to do with the functionalitythe programmer wanted to write in the first place FinallyEdwards presented the plug-in without evaluation of anykind so a remaining task is to find out if or how much thiskind of feedback improves the work flow (see Chapter 5)

In [Boehm 1979] it was reported that modifying softwarebasically consists of three phasesHow modifying

software works

bull Understanding the existing software

bull Modifying it

bull Revalidating the modified version

We believe that our interface helps improve (thereby speedup) this whole process (our second user studyrsquos goal wasto find out whether this is true see Chapter 5) Thissupposition is supported by [Erdogmus et al 2005] whereit is said that instant feedback whether new functionality

9

has been implemented properly helps reducing wasteddevelopment time

Finally our approach concurs with the idea of Opportunis-tic Programming [Brandt et al 2008] [Brandt et al 2009a][Brandt et al 2009b] Brandt et al say that frequent itera-tion is a necessary part of learning and understanding un-familiar code Therefore an opportunistic programmer will Opportunistic

programmers choosetools that speed upiteration

select a tool that speeds up iteration They also state that theEdit-Debug cycles in opportunistic programming are veryshort About 50 of these cycles are shorter than 30 sec-onds and even 80 are shorter than 5 minutes Our toolwould help reducing the resulting overhead since the out-come is instantly visible without having to change modesor compile and execute

11

Chapter 3

First User Study

31 The Prototypes

The main goal of our first user study was to find out whichbasic interface design is the optimal one especially with Goal of the first user

studyregards to the link between the code and the evaluatedversion of the code It should work in a way such that theprogrammer does not get disturbed in his usual work butthat the interface still provides as much information asaccessible as possible

To find out how such an interface would look like we im-plemented low fidelity prototypes for 5 different interfacesin Ruby Basically these prototypes consisted of pictures 5 low fidelity

prototypes for Rubyhave beenimplemented

that show how the interface would look like for some ex-emplary code with exemplary parameters Each interfacehas been illustrated using 5 pieces of code as example (Fig-ures 31-35) Note The font used in all prototypes was sup-posed to emphasize that these prototypes are low fidelityso that the users had not inhibitions telling us fundamentalflaws in the interface design

12 3 First User Study

311 Interface Version 1

Figure 31 shows the first interface with an implementationof the binary search algorithm In this version of theinterface an evaluated version of the code is inserted in anadditional section in the right half of the window At thetop the exemplary parameters can be manipulated Thenin the line high=arraylength-1 the first evaluationtakes place On the right-hand side of the window theright-hand side of the assignment has been replaced bythe according evaluation 5 Afterwards all values are re-Interface variant 1

divides theenvironment into twoThe left-hand side is

a usual editor andthe right-hand side isthe evaluated version

of the code

placed in a similar way That is in simple assignments theright-hand side is substituted completely and in invariantsthe condition has been computed up to the penultimatestep This way eg while(low lt= high) gets evaluatedto while(0 lt= 5) instead of while(true)

EXCURSUSWe also thought about showing what would happen in-side an if statement even if it fails so that the user couldsee what the code could do alternatively We dischargedthis idea however because of two problems Firstly wesuspected that this might quickly become a confusing ex-perience for the user Secondly (and most importantly)the Halting Problem would have to be decidable Sup-pose the following code had to be evaluatedif(xgt0)while(x=0)x--else while(x=0)x++Then the interface would fall into an infinite loop (for anyvalue of x except 0) although the code would not Forx=1 for example the actual code would only make oneiteration in the first loop The interface however wouldtry to calculate the values of the second loop too

Further down in Figure 31(a) there is the following ele-ment

1This element will subsequently be called iterator Theiterator indicates which iteration of a loop is being shownPressing the right arrow increments the shown iteration(Figure 31(b)) pressing the left arrow decrements itaccordingly Furthermore the code for the elsif and theelse statements is not visible on the right-hand side of the

31 The Prototypes 13

window The reason is that the first if statement holdsand therefore the following ones are not executed InFigure 31(b) the first if statement failed (greyed out) thesecond one holds and the third one is hidden again Thisway all code that gets executed is shown (corresponding toJonathan Edwards who said that as much code as possibleshould be rdquoilluminatedrdquo[Edwards 2004])

312 Interface Version 2

Figure 32 shows the second version of the interface withan in place implementation of the selection sort algorithm(note that each interface variant has been implemented Interface variant 2

copies a nothighlightedevaluated version ofthe code to the rightpart of the window

with each algorithm) The differences to the previous in-terface variant are

bull The parameter names are red to signal manipulability

bull No syntax highlighting

bull Evaluated code has no indents and is right aligned

bull No dividing line

EXCURSUSNote that the iterator in this interface looks like follows

i=1This design reveals what variable the iterator refers toThis kind of iterator does not depend on the type of in-terface but on the type of loop In for and upto loopsthis kind of iterator has been used since it is possible tocorrelate the values of one variable and the iterations ofa loop In the later prototypes however we dischargedthis idea again because it only was this easy to correlatea variable and the iterations of one loop because the ex-amples were quite simple In realistic situations this cor-relation will oftentimes not be possible

This version was supposed to take up as little space aspossible while still having a complete (evaluated) versionof the code in a different place

14 3 First User Study

313 Interface Version 3

Figure 33 shows version 3 of the interface with an im-plementation of Bubblesort Small pop-ups indicate theInterface variant 3

uses small pop-upsthat indicate the

value of a statement

value of all statements that are evaluable For examplethe (manipulable and therefore red) value of array is[1423716] and the value of arraylength-2 is 2There are no pop-ups in the code when it is not executedand failed invariants are greyed out The iterator workslike the one in the example for version 2 of the interface butis now inserted into the code This interfacersquos purpose wasto consume as little space as possible while still providingall possible evaluations

314 Interface Version 4

Figure 34 shows interface version 4 with an algorithm thatconverts the fractional part of a decimal number into its bi-Interface variant 4

puts the evaluatedversion of a line

between that line andthe next one

nary representation and outputs the result to the consoleThis interface variant slots the evaluated version of a lineof code right between that line and the next one The nameof manipulable parameters is highlighted in red and thereis no syntax highlighting Failed invariants are greyed outand not executed code is not displayed as usual The goalof this interface version was to find out if the link betweencode and evaluation becomes clearer when the evaluationis very close and very similar to the code it results from

315 Interface Version 5

The last variant of the interface can be seen in Figure 35with an algorithm that checks whether a given number nInterface variant 5

indicates the value ofa statement by

writing it underneaththat statement

without keepingkeywords

is a prime number or not It is similar to the previous onebut only shows rdquorelevantrdquo information This includes right-hand sides of equations and for conditions the actual in-variant but excludes left-hand sides of equations and key-words like else or while Apart from this version 4 and5 are exactly the same (parameters in red no syntax high-lighting not executed code is not shown failed invariants

32 Course of the User Study 15

are greyed out) We developed this variant because in ver-sion 4 the user might confuse actual code with evaluatedcode We wanted to check if a variant without replicatingcontrol structures is better (less confusing)

32 Course of the User Study

The first user study was a qualitative user study It wassupposed to reveal what basic interface design is preferredby users and where there might be flaws in the interfacesDifferent techniques for qualitative user studies have beenused in particular the Model Extraction method and theThink Aloud method The basic structure was the following

bull First the experimenter introduced himself and theproject He assured the participant that he would Introduction to the

user studystay anonymous and that he could ask questions ormake comments whenever he likes To make sure theModel Extraction worked fine the part concerningwhat the thesis and the user study where about wasvery firm The solution we have come up with wasto explain the general idea of Live Coding (that weare designing a kind of editor that takes exemplaryparameters for a method the programmer works onand then evaluates it continuously while changes arebeing made) without revealing anything further likerdquoand then there is going to be a button with whichyou can flick from iteration to iterationrdquo

bull Afterwards one version of the interface was pre-sented to the user together with the following ques- Model Extractiontions rdquoWhat do you think you see hererdquo and rdquoWhatdo you think you can do with itrdquo

bull After the user told everything he or she knew or if theuser had a gulf of understanding [Norman 2002] the Explanation of the

interfaceparts of the interface which the user did not under-stand were explained and we went through all itera-tions once to make clear how the interface works

bull The next goal was to find out if the user had prob- Suggestions forimprovementlems in understanding elements of the interface if he

16 3 First User Study

thinks that there are things that can be done better etcetera

bull Lastly the other interface variants were shown to theuser with the same piece of code and it was explainedGoing through the

remaining interfacevariants

how they worked how they differ from the previousones and we went through all iterations once so thatthe user could get a feeling for the interface Aftereach variant the user was asked what he thought ofthe interface (and why) and what he thought was bet-ter or worse compared to the previous ones

Whenever the user had a question or there was an inter-esting topic that lead to more interesting information thispoint was discussed in more depth

33 Results

The user study was conducted with 6 different users All ofthem studied Computer Science were in the sixth semesterand had between 4 and 10 years of programming experi-ence Only one of them had worked with Ruby before Allusers took part voluntarily and were not rewarded in anyway The conditions were randomized and looked like fol-lows

User ID Algorithm Order of versions1 Bubblesort 312452 Selection sort 453213 Fractional part10 rarr Fractional part2 253414 Prime number test 135425 Binary search 524316 Binary search 34152

This means that for example user 1 saw interface version3 first and then versions 1245 in that order In all variantshe worked with the Bubblesort algorithm

The user study yielded two major results

33 Results 17

bull Most users liked version 1 better than the other ones

bull Not a single user understood what the iterator is goodfor or what it might do

In more detail Three users (users 1 2 and 4) stated clearlythat they liked version 1 the most User 3 was not quite 3 users liked version

1 the most 1 userliked version 4 themost 1 user couldnot decide betweenversion 1 and 2 and1 user could notdecide betweenversion 1 and 3

sure whether he liked version 1 or version 2 more Hestated that a mix between the two variants where thereis no syntax highlighting would be the best because hegot confused the first time he saw version 1 and did notdirectly see that the code on the right-hand side of thewindow is not actually code but the evaluation User 6 wasnot sure if he preferred version 1 or version 3 He statedthat if there is enough space on the screen version 1 isbetter but with all the other elements of a typical IDE aline of code might oftentimes be too long so that there is noplace to show the line itself as well as its evaluated version(a concern that user 3 communicated as well) Withoutthis problem though user 6 said that version 3 is worsebecause the interface might get messy and distractingquickly because of all the pop-ups right next to the codeIn addition he said with more complex code it might beproblematic to see which pop-up refers to which part ofthe code Finally user 5 stated that he likes version 4 themost The reason was that he did not have to look to theright if he wanted to see the value of a variable and thensearch it there but that in version 5 the information wasexactly at its source He only said that this version mightbe problematic to work with since the reading flow is influ-enced negatively and he feared that he would loose muchspace on the screen and would therefore have to scroll a lot

Each user stated that the main benefit of version 1 and2 was that the actual code remained unaffected Thiscorresponds to what [DeLine et al 2006] reported on theperceptual model a programmer builds of the code When Perceptual model of

the code shouldremain unaffected

navigating through source code a programmer uses aperceptual model to do so This way the programmer doesnot need to read the code (cognitive) to know which part heis seeing but can do so by looking at its rdquovisual landmarksrdquo(perceptual) While version 3 affects this perceptual modelnegatively but does not break it version 4 and 5 have the

18 3 First User Study

disadvantage to do so

To most of the users the basic structure of the interfaceswas clear after only a few seconds Only two of themconfused actual code with the evaluation for a short timeUser 5 when he saw version 5 the first time and user 4when he saw version 1 the first time However neither ofthem needed help or much time to figure how the interfaceworked after they started to read more precisely Eachuserrsquos first question or comment concerned the font andhow awful they (every one) thought it was After a briefexplanation why we chose this particular font each usergot used to it quickly

No user understood the iterator None of them got theconnection to the while upto or for next to the iteratorIterator is not self

explaining After the explanation what the iterator does some userssaid they thought the left and right buttons were rdquoltrdquo andrdquogtrdquo signs

The following tables summarize the pros and contras foreach interface

Version 1Pros Contrasmiddot The control structure (in-dents) is still presentmiddot Because of the indentsand the syntax highlightingthere is a strong analogy be-tween the code and its eval-uated versionmiddot There is a clear separationbetween code and evalua-tionmiddot The editor remains un-changed

middot Needs much spacemiddot User has to look from theleft to the right and viceversa to see two pieces thatare correlated In otherwords a line of code and itsevaluated version are too faraway from each other

33 Results 19

Version 2Pros Contrasmiddot The editor remains un-changedmiddot Code and evaluation do notget confused because they donot look the samemiddot Needs less space than ver-sion 1

middot Less analogy than in ver-sion 1middot User needs to look fromthe left to the right and viceversa This is being compli-cated by the fact that with-out indents and syntax high-lighting it is even more diffi-cult to find the point of inter-estmiddot Needs a lot of space

Version 3Pros ContrasmiddotNeeds minimalistic amountof spacemiddot The code structure itself re-mains untouched

middot Oftentimes not clear whichpop-up refers to which partof the codemiddot Can get messy quickly

Version 4Pros Contrasmiddot Strong analogymiddot Better for lines of code thatare too long for version 1 and2middot Evaluations right next totheir source

middot Code and evaluation getconfused quickly especiallyusual reading might be prob-lematic since one has to skipevery second line dependingon what he wants to readmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

20 3 First User Study

Version 5Pros Contrasmiddot Evaluations right next totheir sourcemiddot Better for lines of code thatare too long for version 1 and2

middot Oftentimes it is not clearwhat a value refers tomiddot Still distracting during codereadingmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

331 Further Comments and Suggestions for Im-provement

One of the users pointed out that in version 3 the iteratorhas been inserted into the code and therefore it now isInterface version 3

can producesyntactically wrong

code

syntactically wrong (for i in 14 becomes for i=2in 14 for the second iteration for example)

Two users suggested that for version 4 and 5 curly bracescould indicate to which part of the code the evaluationrefers

Some users suggested to make the number on the iteratordirectly manipulable so that if one would want to go toSuggestions for

improving the iterator iteration 42 one would not have to press 41 times a buttonbut could just enter 42 In addition some users said theywanted a button for jumping to the first and a button forjumping to the last iteration Two users said it would begood if there is something that indicates directly that theiterator actually shows iterations like the term rdquoIteration rdquoor anything similar

We noticed that the manipulable parameters and theevaluation results have to be distinguishable more easily

33 Results 21

One user proposed that if the cursor moves over somevariable in the evaluation part a small pop-up couldindicate to which variable a value belongs

One user said that it might be helpful that for biggerboolean expressions that fail it would be visible which partof the expression is responsible for the failure

332 Conclusions

With large screens (24rdquo and more) becoming more andmore common in modern work environments of program- Version 1 is the

favored onemers we can override the concerns about loosing too muchspace with version 1 which is the reason why it is the clearfavorite Figure 36 shows the basic interface after the firstuser study

The iterator has been improved in several ways

bull New buttons for jumping to the first and the last iter-ation have been added

bull There is a text field with which the iteration can bemanipulated directly

bull A tooltip now indicates the variable to which avalue belongs when hovering over the value with themouse

bull The label Iteration has been added at the top of theiterator

The way of manipulating the parameters has been changedThe function header now also is copied to the right side A new way of

manipulating theparameters

but the names of the parameters are replaced with a textfield in which the according value can be entered This wayboth the analogy to the code and the separation betweenparameters and evaluation are being enhanced

22 3 First User Study

(a) First iteration

(b) Second iteration

Figure 31 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

33 Results 23

Figure 32 Version 2 The evaluated version of the code ison the right-hand side but is ranged right

Figure 33 Version 3 Small pop-ups indicate the value of avariable

24 3 First User Study

Figure 34 Version 4 The evaluated version of a line isinserted right behind the line itself

33 Results 25

Figure 35 Version 5 Only the values of the variables arein between the lines

26 3 First User Study

Figure 36 Interface after the first user study

27

Chapter 4

Implementation

After the first user study had been finished and the resultshad been analyzed a software prototype with sufficient A software prototype

had to beimplemented

functionality had to be implemented to conduct the seconduser study Hence we wrote an extension for Brackets1 anopen source editor that has recently been released officially

The language we wrote the extension for was JavaScriptWhat we needed was an extension that could calculate thevalues of the variables that had to been substituted

41 First Approach (discharged)

The initial idea was to write an extension that connects viaV8-Node2 (an extension that maintains a connection to a Initial idea Use

V8-Node an Nodejsto get the locals

Nodejs3 V8 debugger) to an external debugger and lets itevaluate the code Our extension was supposed to supplythe actual code to the debugger and set a breakpoint in thefirst line of actual code Then it read the locals (that is thelocal values for all active variables) and stepped over to the

1httpsgithubcomadobebrackets2httpsgithubcomDennisKehrigbrackets-v8-node-

livetreenodeChildProcess3httpnodejsorg

28 4 Implementation

next line After all locals had been collected the extensionshould substitute all necessary values There were prob-lems with this approach Firstly the extension is not stableThis extension was

not usable for a userstudy

enough to be usable in a user study Secondly it is too slowIt needs more than half a second for one evaluation Thisis way too much since we wanted something that couldevaluate the code as the user edits it without any furtherdelay (hence the name rdquoLive Codingrdquo) With this extensionit would only have been possible to update the evaluationusing a special shortcut or to attach this function to anothershortcut like the Ctrl+S shortcut for saving Since thiswas by far not the optimal way we decided to start all overagain

42 Basic Evaluation Technique

In the second approach we did not try to externalize thescript execution but instead made vast use of JavaScriptrsquosbuilt-in eval function The eval function takes a StringSecond approach

using eval toevaluate dynamicallygenerated code thatreturns the value of

interest

as parameter If this String contains valid JavaScript codeeval will evaluate the code and return the last executedstatement eval(var i=1i=i+5) will return 6 forexample The extension basically works as follows Aloop that runs over all DOM-elements in the highlightedcode searches for substitutable variables While searchingfor these elements the loop captures in which contextthe actual DOM-element is The three basic contexts aresimple substitutions substitutions within one loop andsubstitutions within a nested loop (why this is will beexplained below) But there are also other things the loopchecks for For simple assignments like i=x+y x andy have to be substituted while i does not In lines wherea loop begins the iterator has to be appended invariantshave to be treated differently et cetera

Note that right-hand sides of assignments are handledslightly different from what has been presented in theprototypes of the first user study They do not get evalu-ated to the last step and invariants do not get evaluated

43 How Values Are Being Calculated 29

to the penultimate step anymore Both of them get onlyevaluated one step For i=1 and z=2 the right-hand sideof x=i+z does not get evaluated to 3 but to 1+2 Wedecided to do it this way because some users in the firstuser study stated that they did not only want to see whatthe result of an operation is but also how it got there

When a substitutable variable has been found its value willbe computed For that the eval function is used (see sec-tion 43 for details) and the value will be inserted Thiswhole process takes place each time there is a change in thecode or if one of the parameters gets changed If another it- How the extension

reacts to changeseration of a loop that is being shown is selected all valuesinside that loop will be calculated again

43 How Values Are Being Calculated

Depending on the context of the variable a piece ofcode will be generated (and then evaluated) that whenevaluated with eval returns the value of the variable ofinterest For simple substitutions outside of any loops thisworks like follows

Suppose the following code had to be evaluated for x=0

function foo(x)var i=0var j=x+1

Then x in the third line would have to be evaluated To do How the values arecalculated for simplesubstitutions

so the following String would be generatedvar x=0var i=0xSo for simple substitutions the whole code preceding thevariable that has to be substituted is copied and each pa-rameter is initialized as var with its exemplary value at the

30 4 Implementation

beginning

For simple loops this method has to be extended First ithas to be found out how often the loop is executed ThisSubstitutions within

one loop serves two purposes First the iterator now shows thenumber of iterations of the loop Second we have to checkif the user wants to see an iteration that does not exist Thisis being done by introducing a counter that is incrementedin each iteration of the loop and then returned when theloop terminated The following example shows how thevalue of a variable within a loop is being calculated

function test()var i=0var j=0while (ilt5)

i++j=j+2

Suppose we want to obtain the value for j (highlightedin red) in the second iteration of the loop Then theevaluation basically works as follows

var myCounter=0var myResultvar i=0var j=0while (ilt5)

myCounter++i++if(myCounter==2)myResult=j breakj=j+2

myResult

A counter gets inserted to check which iteration the loop is

44 Problems 31

in If it is in the right iteration the assignment myResult=jtakes place so that eval can return the according valueFor a loop within a loop two counters have to be inserted Other casesand both of them have to be checked accordingly Thereare further variants of contexts like the different parts of afor(firstPartsecondPartthirdPart) statementThe first part gets only executed once at the beginningwhile the second part gets evaluated one time more thanthe rest of the loop (the third part does not get evaluated)For all cases the dynamic generation of the code that re-turns the value of interest works similar to the techniquespresented

Obviously this is a brute-force approach For each valuethat has to be calculated all preceding code has to be eval-uated We used this approach anyway since we spent quite Summary Very

inefficient limitedfunctionality stillcompletely sufficientfor the user study

some time developing the first version of the extension (us-ing V8-Node) and were under considerable time pressureThe technique we chose is simple and fast to develop Fur-thermore the prototype only had to work for the seconduser study which is the reason why the limited function-ality (loop nesting depth of 2 for example) was sufficientAnd finally since the tasks in the second user study werenot very expensive with regards to calculation costs eventhis very inefficient version of the extension ran live with-out any delays Some tests showed that it would even havebeen quick enough to operate with hundreds of iterationsmore without any delay and with some few thousand iter-ations more without dispatching the interface

44 Problems

The biggest problem we encountered were infinite loopsBecause the interface runs completely live and tries to The Halting problemevaluate the whole code while the user is editing it infiniteloops may occur while the user is typing Additionallyit might of course be possible that there actually is aninfinite loop even though the user thinks the code iscorrect Without modification the interface crashed in

32 4 Implementation

either of these cases To prevent this from happeningtwo minor modifications had to be done Firstly if wewant to calculate the number of iterations a loop doeswe check whether the counter we use for that calculationis still below a certain limit (we used 999 here becauseit proved to be sufficient and efficient enough to runlive) Now the Halting Problem for this particular loop isapproximated The problem of the infinite loop still existsthough Suppose we have the following code

var x=0var i=0while(x=10)

i++x=i

We now know that the loop does more then 999 iterationsso we do not try to calculate the value for x in the invariantSubstitutions after

infinite loops or for i within the loop for any further iterations The onlyproblem that remains is that we still try to get the value fori in the last line When we try to calculate this value wedo not yet know that there is an infinite loop somewhereabove To find this out a new global counter gets insertedand limited to 5000 iterations

var globalCounter=0var x=0var i=0while(x=10)

if(globalCounter++ gt= 5000)breaki++

i

This way we could keep the incrementation and the checkwhether there are more then 5000 iterations within onestatement (two statements are too many since after a

44 Problems 33

while without curly braces only the first statement wouldbe executed) The incrementation and test get inserted intoevery loop ahead of the position we are checking for Itwould of course have been possible to set a flag if there isan infinite loop somewhere ahead but this way we alsohad the possibility to limit the overall number of iterationsand still get a value The second user study showed that Flaw in the

implementationthis was the wrong approach A user introduced an infiniteloop in a sorting algorithm The condition for the loop wasnever violated although the array was sorted With thetechnique of assigning a value to a variable although thelimitation of iterations has been hit the sorted array wasshown in the return statement Since the user only paidattention to the output he did not see that the number ofiterations of the loop was 999 and thought that the code iscorrect

Another minor problem was that of course while the useris typing the extension tries to evaluate code that is syntac- Suppress errors so

that user does notget disturbed

tically not correct The solution is to set windowonerrorto a function that always returns true This way errorsget just suppressed and the execution stops If there is anew change in the editor the execution will start again andwork exactly the same way so that working code always isexecuted and not working code is not

The last problem concerned hiding code that does notget evaluated For loops this could easily be done If Hiding code within

failed if statementsthe loop is in its very last iteration (the invariant fails)the rest of the loop has to be hidden otherwise it doesnot Finding out if an if statement fails (and hence theaccording code has to be hidden) however was moredifficult because we had not enough information to be ableto do so (remember we only compute the first step whichincludes nothing but substituting all variables) To find outif an if statement holds or not we do the following allvariables in the invariant get substituted and then we leteval evaluate just the invariant For if(xlty) with x=5and y=6 we substitute all values if(5lt6) and then calleval((5lt6))

34 4 Implementation

Figure 41 Screenshot showing Brackets with our extension and a dummy script

Finally Figure 41 shows how the prototype for our seconduser study looks like

35

Chapter 5

Second User Study

51 Course of the User Study

To find out whether our extension brings significantimprovement a second quantitative user study with 20 Hypothesis

Compared to usualdebugging methodsour technique bringssignificantimprovement

participants has been conducted The hypothesis was thatthe debugging speed (in this case the time between the userreading the code the first time and fixing it) compared toa usual debugging engine could significantly be reducedwith our extension If this hypothesis would be supportedby the results of the user study this would indicate that theunderstanding speed of foreign code would be increasedby our method

No user had worked with Brackets or JavaScript before(notwithstanding that all of them had worked with Javabefore) Users were Computer Science students with 4-9(average of 58) years of programming experience Onlythree of them had professional experience (2-4 years) Allusers took part voluntarily and were not rewarded in anyway

To test the hypothesis two searching algorithms (animplementation of Shakersort and an implementation ofBubblesort) have been implemented and a small bug has

36 5 Second User Study

been introduced in either of them (see Appendix A) Thetask of the user was to find both bugs - one time with2 algorithms 1 bug

each 2 tasks debugone time with

Bracketsrsquo tools andone time with our

extension

our extension and one time using the traditional way(that is using the console breakpoints variable viewerweb kit inspector and the step over functionality Bracketsprovides) Each user received an intro to the function-ality of both our extension and the Brackets debuggingengine For both a small dummy script has been shownwith which it has been explained how each part of thefunctionality works until the user confirmed that he or sheunderstood every part of the functionality The task forthe user was to find the bug in the code (users have beentold that there is exactly one bug) fix it and give noticewhen he or she thought that the code is correct (task tocondition assignment and task order were counterbalancedto eliminate side effects)

After the two debugging tasks had been accomplishedeach user filled out two versions of the System UsabilitySystem Usability

Scale to find outwhether users liked

working with thesystem

Scale [Brooke 1996] (SUS) one for our extension and onefor the standard debugging Brackets provides The first10 questions are identical to the initial version that hasbeen presented in 1996 except that the word rdquocumbersomerdquoin point 8 has been changed to rdquoawkwardrdquo as suggestedby [Finstad 2006] For the extension the following ques-tions have been added to the questionnaire

bull I found the additional information very helpful formy task

bull I was distracted by the additional information

bull I found it very easy to follow the control flow of theprogram

bull I found complex control structures hard to under-stand

For the debugging using Brackets the following two ques-tions have been added

bull I found it very easy to follow the control flow of theprogram

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 7: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

ix

Abstract

Live Coding seeks to provide an environment that helps programmers to under-stand the internal behavior of a program with the help of examples The goal ofthis thesis is to create an editor that takes exemplary parameters for a function andcontinuously executes the current version of the code with those parameters Tohelp the programmer understand the control flow of the program not only is theaccording return value of the function continually updated also the environmentshows what happens with the exemplary parameters inside the code between in-put and output A qualitative study with 6 users has been conducted to find thebest way of displaying the link between the code and the constantly updated eval-uations The result an environment that keeps a traditional editor on the left-handside and an evaluated version of the current code (keeping syntax highlighting andindents) on the right-hand side has been implemented in form of an extension forthe open source editor Brackets Finally a study with 20 users has been conductedto find out if there is an improvement in the debugging speed of experienced pro-grammers Although the user study yielded non-significant results there is evi-dence that there is an improvement in the debugging speed and that significantresults can be obtained

xi

Acknowledgements

First of all I would like to thank everybody who participated in my user studiesYour valuable informations and feedback enlarged my opportunities

Furthermore Jan-Peter Kramer deserves my thanksWithout whom this work would consist of nothing but blanks

I would like to thank all my good friends tooWhose long-standing amity I am looking forward to

Last but not least special thanks to my family particularly my parentsAnd unfortunately this is where my English ends

1

Chapter 1

Introduction

Most of todaysrsquos IDEs make a clear separation betweenediting the code and debugging it To switch from pro- Overhead results

from the separationbetween editing anddebugging

gramming to debugging and vice versa usually theprogrammer has to use another mode in the IDE This waynot only looses the programmer a significant amount oftime through the emerging overhead but the working flowgets interrupted too [Edwards 2004]

Permanently having to switch between modes has anotherdrawback The probability of loosing time because the pro- Wasted development

time throughIgnorance Time andFix Time

grammer continues coding after a bug has been introducedincreases considerably [Saff and Ernst 2004b] reported onthis phenomenon They said that larger Ignorance Times(the time between introducing a bug and becoming awareof it) yield larger Fix Times (the time between becomingaware of a bug and fixing it) Therefor an IDE shouldprovide tools to keep the Ignorance Time as well as theoverhead produced by switching between different modesas low as possible

When Bret Victor gave his speech rdquoInventing on Principlerdquohe was talking about something very similar Direct delay-free feedback To show what he was talking about Victorpresented an editor that continuously executes the code assoon as it has been entered and not only shows the actual

2 1 Introduction

result but also reveals information about its internal behav-ior Figure 11 shows how his interface looks like using theexample of the binary search algorithm

Figure 11 A Live Coding Editor presented by Bret Vic-tor [Victor]

On the left-hand side the code can be changed as usual Atthe top of the right-hand side exemplary parameters canbe entered for which the code then gets executed Furtherdown the values for each variable in each iteration andthe return value are shown For example in the line varBret Victor presented

an editor that givesimmediate

information about aprogramrsquos internal

behavior

high = arraylength - 1 the editor says high =5 This way the programmer does not have to imaginewhat happens between executing the code with exemplaryparameters and getting back the according result but canactually see it As mentioned above this happens as thecode gets edited so there is no delay between a change andseeing its impact on the code Figure 12 shows an exampleof how helpful this can be The key the algorithm searchesfor has been changed to rsquogrsquo which is not in the arraySince the invariant of the loop is 1 the algorithm can onlyterminate when the key has been found in the array In thiscase however the loop iterates endlessly The solution isto replace the invariant by low le high A solution thatis obvious once the programmer sees that in the fourthiteration low gets greater than high and nothing elsechanges anymore

The direct immediate feedback Victor was talking about isthe basic idea of my thesis Since he designed his editor

3

Figure 12 The key parameter has been changed

only to show what he was talking about in his speech hisprototype is very low fidelity and has some basic usabil-ity flaws (most importantly Only 5 iterations can be dis-played and the functionality is very limited) Furthermorehis editor has never been evaluated neither in a qualitativeuser study nor in a quantitative one My thesis will pickup his idea develop it and finally evaluate it To do sotwo iterations of the DIA-cycle (that is Design-Implement-Analyze) have been done We started with a literature re-search (see Chapter 2) to get an overview what research hasalready been done with regards to Live Coding We then Overview of the

thesisimplemented a low fidelity prototype and evaluated it in aqualitative user study to find out which basic interface de-sign provides as much analogy between the code and itsevaluation (see Chapter 3) Afterwards we implemented asoftware-prototype (see Chapter 4) to conduct the secondquantitative user study which should reveal if the interfaceenhances the work flow significantly (see Chapter 5)

5

Chapter 2

Related Work

The idea of maintaining a continuous connection betweencode and output is not not new Already back in 1985 The idea of

continuouslyexecuting code hasalready beenpresented in 1985

[Henderson and Weiser 1985] stated that the classicalEdit-(Compile-)Execute cycle where program editing andexecution are independent activities is not optimal Theystated that the relation between input and output shouldcontinuously be kept consistent

[Pane and Myers 1996] proposed that a system shouldsupport incremental testing and feedback and help detect Incremental testing

and feedbackimprove productivity

diagnose and recover from errors Although they wererecommending heuristics to improve novice programmingsystems these advices also help improving the productiv-ity of experts

This concurs with [Ko et al 2004] who reported on learn-ing barriers in programming systems Among others they 2 learning barriers

UnderstandingBarriers andInformation Barriers

reported on the so called Understanding Barriers and Infor-mation Barriers Understanding Barriers happen when codebehaves differently from what the user expected If forexample there is a compile-time error and the user cannotextract which part of the code is deemed wrong by the com-piler from the error message then this is an UnderstandingBarrier In the user study Ko et al conducted most (34 of38) of these Understanding Barriers were insurmountable

6 2 Related Work

Information Barriers are elements in the environmentthat make the internal behavior of the program hard tounderstand They happen when a user builds a hypothesisabout the internal behavior of the program but cannot use(or find) a tool the environment provides to verify thathypothesis In Ko et alrsquos user study most of these under-standing barriers (10 of 14) were insurmountable To helpthe programmer an IDE should reveal what a programdid or did not do at compile or runtime and in addition itshould be possible to inspect a programrsquos internal behavior

Although the idea of helping the programmer under-stand the internal behavior of a program by example isnot new current environments only offer little help forCurrent

environments offerlittle help for

understanding andadapting examples

understanding and adapting examples Therefore [Brandtet al] have implemented Rehearse an editor that providesimmediate evaluation and infinite undo of executionA screenshot of this editor can be seen in Figure 21Although Rehearse embodies the immediate feedback

Figure 21 rdquoA screenshot of the Rehearse editor (1) Thefunction declaration parameter names and current val-ues (2) a statement that has been executed and (3) the re-sult of execution (4) an undone statement (5) the currentlinerdquo [Brandt et al]

and the maximal illumination we are striving for thereare issues with it First Rehearse disturbs the user inhis usual programming practices For example it is notAdvantages and

disadvantages ofRehearse

possible to edit a function in the usual editor while it isopen in Rehearse This means that there are two ways of

7

editing a function (either the usual way with a standardeditor or interactively with Rehearse) and they are mutualexclusive (this issue has also been reported as a resultof the user study Brandt et al conducted) We think itis best to integrate the interactive functionality into thestandard one so that the user does not again have toswitch between different modes Additionally Rehearseprovides functionality to help the programmer explorealternative paths by undoing arbitrarily many executedlines while the goal of our editor is to show what hap-pens in the code with exemplary parameters to help theprogrammer understand what internal behavior the codehas This way the programmer can work uninterruptedand still get information about the behavior of his program

Similarly [Edwards 2004] stated that it improves the workflow if the programmer does not have to leave the environ-ment to see what result the execution of an example yieldsHe says the best way to understand code is by thinkingof concrete examples To solve the problem that the pro- Examples help

understanding codegrammer has to emulate the behavior of the program in hishead as much code as possible should be illuminated withexamples Edwards built a plug-in for Eclipse for whicha screenshot can be seen in Figure 22 The results on the

Figure 22 A screenshot of the plug-in presented by [Ed-wards 2004]

left-hand side of the interface get updated as soon as thereis a change in the code From a theoretical point of viewthis is exactly what we intend to do The code is continu-

8 2 Related Work

ously executed for exemplary parameters while the user isworking on it From a practical point of view though thisplug-in has several drawbacks First the analogy betweenIdea and drawbacks

of Edwardsrsquo plug-in code and evaluation is not very strong (eg not as strongas the one in Rehearse) so it is not as easy for the userto follow the control flow of the program as it should beThere are further usability issues Skipped code is greyedout This is a good idea since the user can see which partof the code has been executed and which part has not Theproblem is to see for which parameters the code has beenskipped In Figure 22 for example fact has been calledwith the values 1 and 3 and in addition subsequentlywith the parameters 0 and 2 (resulting from the recursivecall of fact in the else statement) For which of theparameters the else statement has been skipped is not asobvious as it could be Furthermore it is suboptimal tocall the function with exemplary parameters in the codeitself This way a programmer cannot write code as usualand just extract the additional information but has to insertrdquoforeignrdquo code to be able to see what happens inside thecode and then later has to remove the additional codeagain because it has nothing to do with the functionalitythe programmer wanted to write in the first place FinallyEdwards presented the plug-in without evaluation of anykind so a remaining task is to find out if or how much thiskind of feedback improves the work flow (see Chapter 5)

In [Boehm 1979] it was reported that modifying softwarebasically consists of three phasesHow modifying

software works

bull Understanding the existing software

bull Modifying it

bull Revalidating the modified version

We believe that our interface helps improve (thereby speedup) this whole process (our second user studyrsquos goal wasto find out whether this is true see Chapter 5) Thissupposition is supported by [Erdogmus et al 2005] whereit is said that instant feedback whether new functionality

9

has been implemented properly helps reducing wasteddevelopment time

Finally our approach concurs with the idea of Opportunis-tic Programming [Brandt et al 2008] [Brandt et al 2009a][Brandt et al 2009b] Brandt et al say that frequent itera-tion is a necessary part of learning and understanding un-familiar code Therefore an opportunistic programmer will Opportunistic

programmers choosetools that speed upiteration

select a tool that speeds up iteration They also state that theEdit-Debug cycles in opportunistic programming are veryshort About 50 of these cycles are shorter than 30 sec-onds and even 80 are shorter than 5 minutes Our toolwould help reducing the resulting overhead since the out-come is instantly visible without having to change modesor compile and execute

11

Chapter 3

First User Study

31 The Prototypes

The main goal of our first user study was to find out whichbasic interface design is the optimal one especially with Goal of the first user

studyregards to the link between the code and the evaluatedversion of the code It should work in a way such that theprogrammer does not get disturbed in his usual work butthat the interface still provides as much information asaccessible as possible

To find out how such an interface would look like we im-plemented low fidelity prototypes for 5 different interfacesin Ruby Basically these prototypes consisted of pictures 5 low fidelity

prototypes for Rubyhave beenimplemented

that show how the interface would look like for some ex-emplary code with exemplary parameters Each interfacehas been illustrated using 5 pieces of code as example (Fig-ures 31-35) Note The font used in all prototypes was sup-posed to emphasize that these prototypes are low fidelityso that the users had not inhibitions telling us fundamentalflaws in the interface design

12 3 First User Study

311 Interface Version 1

Figure 31 shows the first interface with an implementationof the binary search algorithm In this version of theinterface an evaluated version of the code is inserted in anadditional section in the right half of the window At thetop the exemplary parameters can be manipulated Thenin the line high=arraylength-1 the first evaluationtakes place On the right-hand side of the window theright-hand side of the assignment has been replaced bythe according evaluation 5 Afterwards all values are re-Interface variant 1

divides theenvironment into twoThe left-hand side is

a usual editor andthe right-hand side isthe evaluated version

of the code

placed in a similar way That is in simple assignments theright-hand side is substituted completely and in invariantsthe condition has been computed up to the penultimatestep This way eg while(low lt= high) gets evaluatedto while(0 lt= 5) instead of while(true)

EXCURSUSWe also thought about showing what would happen in-side an if statement even if it fails so that the user couldsee what the code could do alternatively We dischargedthis idea however because of two problems Firstly wesuspected that this might quickly become a confusing ex-perience for the user Secondly (and most importantly)the Halting Problem would have to be decidable Sup-pose the following code had to be evaluatedif(xgt0)while(x=0)x--else while(x=0)x++Then the interface would fall into an infinite loop (for anyvalue of x except 0) although the code would not Forx=1 for example the actual code would only make oneiteration in the first loop The interface however wouldtry to calculate the values of the second loop too

Further down in Figure 31(a) there is the following ele-ment

1This element will subsequently be called iterator Theiterator indicates which iteration of a loop is being shownPressing the right arrow increments the shown iteration(Figure 31(b)) pressing the left arrow decrements itaccordingly Furthermore the code for the elsif and theelse statements is not visible on the right-hand side of the

31 The Prototypes 13

window The reason is that the first if statement holdsand therefore the following ones are not executed InFigure 31(b) the first if statement failed (greyed out) thesecond one holds and the third one is hidden again Thisway all code that gets executed is shown (corresponding toJonathan Edwards who said that as much code as possibleshould be rdquoilluminatedrdquo[Edwards 2004])

312 Interface Version 2

Figure 32 shows the second version of the interface withan in place implementation of the selection sort algorithm(note that each interface variant has been implemented Interface variant 2

copies a nothighlightedevaluated version ofthe code to the rightpart of the window

with each algorithm) The differences to the previous in-terface variant are

bull The parameter names are red to signal manipulability

bull No syntax highlighting

bull Evaluated code has no indents and is right aligned

bull No dividing line

EXCURSUSNote that the iterator in this interface looks like follows

i=1This design reveals what variable the iterator refers toThis kind of iterator does not depend on the type of in-terface but on the type of loop In for and upto loopsthis kind of iterator has been used since it is possible tocorrelate the values of one variable and the iterations ofa loop In the later prototypes however we dischargedthis idea again because it only was this easy to correlatea variable and the iterations of one loop because the ex-amples were quite simple In realistic situations this cor-relation will oftentimes not be possible

This version was supposed to take up as little space aspossible while still having a complete (evaluated) versionof the code in a different place

14 3 First User Study

313 Interface Version 3

Figure 33 shows version 3 of the interface with an im-plementation of Bubblesort Small pop-ups indicate theInterface variant 3

uses small pop-upsthat indicate the

value of a statement

value of all statements that are evaluable For examplethe (manipulable and therefore red) value of array is[1423716] and the value of arraylength-2 is 2There are no pop-ups in the code when it is not executedand failed invariants are greyed out The iterator workslike the one in the example for version 2 of the interface butis now inserted into the code This interfacersquos purpose wasto consume as little space as possible while still providingall possible evaluations

314 Interface Version 4

Figure 34 shows interface version 4 with an algorithm thatconverts the fractional part of a decimal number into its bi-Interface variant 4

puts the evaluatedversion of a line

between that line andthe next one

nary representation and outputs the result to the consoleThis interface variant slots the evaluated version of a lineof code right between that line and the next one The nameof manipulable parameters is highlighted in red and thereis no syntax highlighting Failed invariants are greyed outand not executed code is not displayed as usual The goalof this interface version was to find out if the link betweencode and evaluation becomes clearer when the evaluationis very close and very similar to the code it results from

315 Interface Version 5

The last variant of the interface can be seen in Figure 35with an algorithm that checks whether a given number nInterface variant 5

indicates the value ofa statement by

writing it underneaththat statement

without keepingkeywords

is a prime number or not It is similar to the previous onebut only shows rdquorelevantrdquo information This includes right-hand sides of equations and for conditions the actual in-variant but excludes left-hand sides of equations and key-words like else or while Apart from this version 4 and5 are exactly the same (parameters in red no syntax high-lighting not executed code is not shown failed invariants

32 Course of the User Study 15

are greyed out) We developed this variant because in ver-sion 4 the user might confuse actual code with evaluatedcode We wanted to check if a variant without replicatingcontrol structures is better (less confusing)

32 Course of the User Study

The first user study was a qualitative user study It wassupposed to reveal what basic interface design is preferredby users and where there might be flaws in the interfacesDifferent techniques for qualitative user studies have beenused in particular the Model Extraction method and theThink Aloud method The basic structure was the following

bull First the experimenter introduced himself and theproject He assured the participant that he would Introduction to the

user studystay anonymous and that he could ask questions ormake comments whenever he likes To make sure theModel Extraction worked fine the part concerningwhat the thesis and the user study where about wasvery firm The solution we have come up with wasto explain the general idea of Live Coding (that weare designing a kind of editor that takes exemplaryparameters for a method the programmer works onand then evaluates it continuously while changes arebeing made) without revealing anything further likerdquoand then there is going to be a button with whichyou can flick from iteration to iterationrdquo

bull Afterwards one version of the interface was pre-sented to the user together with the following ques- Model Extractiontions rdquoWhat do you think you see hererdquo and rdquoWhatdo you think you can do with itrdquo

bull After the user told everything he or she knew or if theuser had a gulf of understanding [Norman 2002] the Explanation of the

interfaceparts of the interface which the user did not under-stand were explained and we went through all itera-tions once to make clear how the interface works

bull The next goal was to find out if the user had prob- Suggestions forimprovementlems in understanding elements of the interface if he

16 3 First User Study

thinks that there are things that can be done better etcetera

bull Lastly the other interface variants were shown to theuser with the same piece of code and it was explainedGoing through the

remaining interfacevariants

how they worked how they differ from the previousones and we went through all iterations once so thatthe user could get a feeling for the interface Aftereach variant the user was asked what he thought ofthe interface (and why) and what he thought was bet-ter or worse compared to the previous ones

Whenever the user had a question or there was an inter-esting topic that lead to more interesting information thispoint was discussed in more depth

33 Results

The user study was conducted with 6 different users All ofthem studied Computer Science were in the sixth semesterand had between 4 and 10 years of programming experi-ence Only one of them had worked with Ruby before Allusers took part voluntarily and were not rewarded in anyway The conditions were randomized and looked like fol-lows

User ID Algorithm Order of versions1 Bubblesort 312452 Selection sort 453213 Fractional part10 rarr Fractional part2 253414 Prime number test 135425 Binary search 524316 Binary search 34152

This means that for example user 1 saw interface version3 first and then versions 1245 in that order In all variantshe worked with the Bubblesort algorithm

The user study yielded two major results

33 Results 17

bull Most users liked version 1 better than the other ones

bull Not a single user understood what the iterator is goodfor or what it might do

In more detail Three users (users 1 2 and 4) stated clearlythat they liked version 1 the most User 3 was not quite 3 users liked version

1 the most 1 userliked version 4 themost 1 user couldnot decide betweenversion 1 and 2 and1 user could notdecide betweenversion 1 and 3

sure whether he liked version 1 or version 2 more Hestated that a mix between the two variants where thereis no syntax highlighting would be the best because hegot confused the first time he saw version 1 and did notdirectly see that the code on the right-hand side of thewindow is not actually code but the evaluation User 6 wasnot sure if he preferred version 1 or version 3 He statedthat if there is enough space on the screen version 1 isbetter but with all the other elements of a typical IDE aline of code might oftentimes be too long so that there is noplace to show the line itself as well as its evaluated version(a concern that user 3 communicated as well) Withoutthis problem though user 6 said that version 3 is worsebecause the interface might get messy and distractingquickly because of all the pop-ups right next to the codeIn addition he said with more complex code it might beproblematic to see which pop-up refers to which part ofthe code Finally user 5 stated that he likes version 4 themost The reason was that he did not have to look to theright if he wanted to see the value of a variable and thensearch it there but that in version 5 the information wasexactly at its source He only said that this version mightbe problematic to work with since the reading flow is influ-enced negatively and he feared that he would loose muchspace on the screen and would therefore have to scroll a lot

Each user stated that the main benefit of version 1 and2 was that the actual code remained unaffected Thiscorresponds to what [DeLine et al 2006] reported on theperceptual model a programmer builds of the code When Perceptual model of

the code shouldremain unaffected

navigating through source code a programmer uses aperceptual model to do so This way the programmer doesnot need to read the code (cognitive) to know which part heis seeing but can do so by looking at its rdquovisual landmarksrdquo(perceptual) While version 3 affects this perceptual modelnegatively but does not break it version 4 and 5 have the

18 3 First User Study

disadvantage to do so

To most of the users the basic structure of the interfaceswas clear after only a few seconds Only two of themconfused actual code with the evaluation for a short timeUser 5 when he saw version 5 the first time and user 4when he saw version 1 the first time However neither ofthem needed help or much time to figure how the interfaceworked after they started to read more precisely Eachuserrsquos first question or comment concerned the font andhow awful they (every one) thought it was After a briefexplanation why we chose this particular font each usergot used to it quickly

No user understood the iterator None of them got theconnection to the while upto or for next to the iteratorIterator is not self

explaining After the explanation what the iterator does some userssaid they thought the left and right buttons were rdquoltrdquo andrdquogtrdquo signs

The following tables summarize the pros and contras foreach interface

Version 1Pros Contrasmiddot The control structure (in-dents) is still presentmiddot Because of the indentsand the syntax highlightingthere is a strong analogy be-tween the code and its eval-uated versionmiddot There is a clear separationbetween code and evalua-tionmiddot The editor remains un-changed

middot Needs much spacemiddot User has to look from theleft to the right and viceversa to see two pieces thatare correlated In otherwords a line of code and itsevaluated version are too faraway from each other

33 Results 19

Version 2Pros Contrasmiddot The editor remains un-changedmiddot Code and evaluation do notget confused because they donot look the samemiddot Needs less space than ver-sion 1

middot Less analogy than in ver-sion 1middot User needs to look fromthe left to the right and viceversa This is being compli-cated by the fact that with-out indents and syntax high-lighting it is even more diffi-cult to find the point of inter-estmiddot Needs a lot of space

Version 3Pros ContrasmiddotNeeds minimalistic amountof spacemiddot The code structure itself re-mains untouched

middot Oftentimes not clear whichpop-up refers to which partof the codemiddot Can get messy quickly

Version 4Pros Contrasmiddot Strong analogymiddot Better for lines of code thatare too long for version 1 and2middot Evaluations right next totheir source

middot Code and evaluation getconfused quickly especiallyusual reading might be prob-lematic since one has to skipevery second line dependingon what he wants to readmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

20 3 First User Study

Version 5Pros Contrasmiddot Evaluations right next totheir sourcemiddot Better for lines of code thatare too long for version 1 and2

middot Oftentimes it is not clearwhat a value refers tomiddot Still distracting during codereadingmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

331 Further Comments and Suggestions for Im-provement

One of the users pointed out that in version 3 the iteratorhas been inserted into the code and therefore it now isInterface version 3

can producesyntactically wrong

code

syntactically wrong (for i in 14 becomes for i=2in 14 for the second iteration for example)

Two users suggested that for version 4 and 5 curly bracescould indicate to which part of the code the evaluationrefers

Some users suggested to make the number on the iteratordirectly manipulable so that if one would want to go toSuggestions for

improving the iterator iteration 42 one would not have to press 41 times a buttonbut could just enter 42 In addition some users said theywanted a button for jumping to the first and a button forjumping to the last iteration Two users said it would begood if there is something that indicates directly that theiterator actually shows iterations like the term rdquoIteration rdquoor anything similar

We noticed that the manipulable parameters and theevaluation results have to be distinguishable more easily

33 Results 21

One user proposed that if the cursor moves over somevariable in the evaluation part a small pop-up couldindicate to which variable a value belongs

One user said that it might be helpful that for biggerboolean expressions that fail it would be visible which partof the expression is responsible for the failure

332 Conclusions

With large screens (24rdquo and more) becoming more andmore common in modern work environments of program- Version 1 is the

favored onemers we can override the concerns about loosing too muchspace with version 1 which is the reason why it is the clearfavorite Figure 36 shows the basic interface after the firstuser study

The iterator has been improved in several ways

bull New buttons for jumping to the first and the last iter-ation have been added

bull There is a text field with which the iteration can bemanipulated directly

bull A tooltip now indicates the variable to which avalue belongs when hovering over the value with themouse

bull The label Iteration has been added at the top of theiterator

The way of manipulating the parameters has been changedThe function header now also is copied to the right side A new way of

manipulating theparameters

but the names of the parameters are replaced with a textfield in which the according value can be entered This wayboth the analogy to the code and the separation betweenparameters and evaluation are being enhanced

22 3 First User Study

(a) First iteration

(b) Second iteration

Figure 31 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

33 Results 23

Figure 32 Version 2 The evaluated version of the code ison the right-hand side but is ranged right

Figure 33 Version 3 Small pop-ups indicate the value of avariable

24 3 First User Study

Figure 34 Version 4 The evaluated version of a line isinserted right behind the line itself

33 Results 25

Figure 35 Version 5 Only the values of the variables arein between the lines

26 3 First User Study

Figure 36 Interface after the first user study

27

Chapter 4

Implementation

After the first user study had been finished and the resultshad been analyzed a software prototype with sufficient A software prototype

had to beimplemented

functionality had to be implemented to conduct the seconduser study Hence we wrote an extension for Brackets1 anopen source editor that has recently been released officially

The language we wrote the extension for was JavaScriptWhat we needed was an extension that could calculate thevalues of the variables that had to been substituted

41 First Approach (discharged)

The initial idea was to write an extension that connects viaV8-Node2 (an extension that maintains a connection to a Initial idea Use

V8-Node an Nodejsto get the locals

Nodejs3 V8 debugger) to an external debugger and lets itevaluate the code Our extension was supposed to supplythe actual code to the debugger and set a breakpoint in thefirst line of actual code Then it read the locals (that is thelocal values for all active variables) and stepped over to the

1httpsgithubcomadobebrackets2httpsgithubcomDennisKehrigbrackets-v8-node-

livetreenodeChildProcess3httpnodejsorg

28 4 Implementation

next line After all locals had been collected the extensionshould substitute all necessary values There were prob-lems with this approach Firstly the extension is not stableThis extension was

not usable for a userstudy

enough to be usable in a user study Secondly it is too slowIt needs more than half a second for one evaluation Thisis way too much since we wanted something that couldevaluate the code as the user edits it without any furtherdelay (hence the name rdquoLive Codingrdquo) With this extensionit would only have been possible to update the evaluationusing a special shortcut or to attach this function to anothershortcut like the Ctrl+S shortcut for saving Since thiswas by far not the optimal way we decided to start all overagain

42 Basic Evaluation Technique

In the second approach we did not try to externalize thescript execution but instead made vast use of JavaScriptrsquosbuilt-in eval function The eval function takes a StringSecond approach

using eval toevaluate dynamicallygenerated code thatreturns the value of

interest

as parameter If this String contains valid JavaScript codeeval will evaluate the code and return the last executedstatement eval(var i=1i=i+5) will return 6 forexample The extension basically works as follows Aloop that runs over all DOM-elements in the highlightedcode searches for substitutable variables While searchingfor these elements the loop captures in which contextthe actual DOM-element is The three basic contexts aresimple substitutions substitutions within one loop andsubstitutions within a nested loop (why this is will beexplained below) But there are also other things the loopchecks for For simple assignments like i=x+y x andy have to be substituted while i does not In lines wherea loop begins the iterator has to be appended invariantshave to be treated differently et cetera

Note that right-hand sides of assignments are handledslightly different from what has been presented in theprototypes of the first user study They do not get evalu-ated to the last step and invariants do not get evaluated

43 How Values Are Being Calculated 29

to the penultimate step anymore Both of them get onlyevaluated one step For i=1 and z=2 the right-hand sideof x=i+z does not get evaluated to 3 but to 1+2 Wedecided to do it this way because some users in the firstuser study stated that they did not only want to see whatthe result of an operation is but also how it got there

When a substitutable variable has been found its value willbe computed For that the eval function is used (see sec-tion 43 for details) and the value will be inserted Thiswhole process takes place each time there is a change in thecode or if one of the parameters gets changed If another it- How the extension

reacts to changeseration of a loop that is being shown is selected all valuesinside that loop will be calculated again

43 How Values Are Being Calculated

Depending on the context of the variable a piece ofcode will be generated (and then evaluated) that whenevaluated with eval returns the value of the variable ofinterest For simple substitutions outside of any loops thisworks like follows

Suppose the following code had to be evaluated for x=0

function foo(x)var i=0var j=x+1

Then x in the third line would have to be evaluated To do How the values arecalculated for simplesubstitutions

so the following String would be generatedvar x=0var i=0xSo for simple substitutions the whole code preceding thevariable that has to be substituted is copied and each pa-rameter is initialized as var with its exemplary value at the

30 4 Implementation

beginning

For simple loops this method has to be extended First ithas to be found out how often the loop is executed ThisSubstitutions within

one loop serves two purposes First the iterator now shows thenumber of iterations of the loop Second we have to checkif the user wants to see an iteration that does not exist Thisis being done by introducing a counter that is incrementedin each iteration of the loop and then returned when theloop terminated The following example shows how thevalue of a variable within a loop is being calculated

function test()var i=0var j=0while (ilt5)

i++j=j+2

Suppose we want to obtain the value for j (highlightedin red) in the second iteration of the loop Then theevaluation basically works as follows

var myCounter=0var myResultvar i=0var j=0while (ilt5)

myCounter++i++if(myCounter==2)myResult=j breakj=j+2

myResult

A counter gets inserted to check which iteration the loop is

44 Problems 31

in If it is in the right iteration the assignment myResult=jtakes place so that eval can return the according valueFor a loop within a loop two counters have to be inserted Other casesand both of them have to be checked accordingly Thereare further variants of contexts like the different parts of afor(firstPartsecondPartthirdPart) statementThe first part gets only executed once at the beginningwhile the second part gets evaluated one time more thanthe rest of the loop (the third part does not get evaluated)For all cases the dynamic generation of the code that re-turns the value of interest works similar to the techniquespresented

Obviously this is a brute-force approach For each valuethat has to be calculated all preceding code has to be eval-uated We used this approach anyway since we spent quite Summary Very

inefficient limitedfunctionality stillcompletely sufficientfor the user study

some time developing the first version of the extension (us-ing V8-Node) and were under considerable time pressureThe technique we chose is simple and fast to develop Fur-thermore the prototype only had to work for the seconduser study which is the reason why the limited function-ality (loop nesting depth of 2 for example) was sufficientAnd finally since the tasks in the second user study werenot very expensive with regards to calculation costs eventhis very inefficient version of the extension ran live with-out any delays Some tests showed that it would even havebeen quick enough to operate with hundreds of iterationsmore without any delay and with some few thousand iter-ations more without dispatching the interface

44 Problems

The biggest problem we encountered were infinite loopsBecause the interface runs completely live and tries to The Halting problemevaluate the whole code while the user is editing it infiniteloops may occur while the user is typing Additionallyit might of course be possible that there actually is aninfinite loop even though the user thinks the code iscorrect Without modification the interface crashed in

32 4 Implementation

either of these cases To prevent this from happeningtwo minor modifications had to be done Firstly if wewant to calculate the number of iterations a loop doeswe check whether the counter we use for that calculationis still below a certain limit (we used 999 here becauseit proved to be sufficient and efficient enough to runlive) Now the Halting Problem for this particular loop isapproximated The problem of the infinite loop still existsthough Suppose we have the following code

var x=0var i=0while(x=10)

i++x=i

We now know that the loop does more then 999 iterationsso we do not try to calculate the value for x in the invariantSubstitutions after

infinite loops or for i within the loop for any further iterations The onlyproblem that remains is that we still try to get the value fori in the last line When we try to calculate this value wedo not yet know that there is an infinite loop somewhereabove To find this out a new global counter gets insertedand limited to 5000 iterations

var globalCounter=0var x=0var i=0while(x=10)

if(globalCounter++ gt= 5000)breaki++

i

This way we could keep the incrementation and the checkwhether there are more then 5000 iterations within onestatement (two statements are too many since after a

44 Problems 33

while without curly braces only the first statement wouldbe executed) The incrementation and test get inserted intoevery loop ahead of the position we are checking for Itwould of course have been possible to set a flag if there isan infinite loop somewhere ahead but this way we alsohad the possibility to limit the overall number of iterationsand still get a value The second user study showed that Flaw in the

implementationthis was the wrong approach A user introduced an infiniteloop in a sorting algorithm The condition for the loop wasnever violated although the array was sorted With thetechnique of assigning a value to a variable although thelimitation of iterations has been hit the sorted array wasshown in the return statement Since the user only paidattention to the output he did not see that the number ofiterations of the loop was 999 and thought that the code iscorrect

Another minor problem was that of course while the useris typing the extension tries to evaluate code that is syntac- Suppress errors so

that user does notget disturbed

tically not correct The solution is to set windowonerrorto a function that always returns true This way errorsget just suppressed and the execution stops If there is anew change in the editor the execution will start again andwork exactly the same way so that working code always isexecuted and not working code is not

The last problem concerned hiding code that does notget evaluated For loops this could easily be done If Hiding code within

failed if statementsthe loop is in its very last iteration (the invariant fails)the rest of the loop has to be hidden otherwise it doesnot Finding out if an if statement fails (and hence theaccording code has to be hidden) however was moredifficult because we had not enough information to be ableto do so (remember we only compute the first step whichincludes nothing but substituting all variables) To find outif an if statement holds or not we do the following allvariables in the invariant get substituted and then we leteval evaluate just the invariant For if(xlty) with x=5and y=6 we substitute all values if(5lt6) and then calleval((5lt6))

34 4 Implementation

Figure 41 Screenshot showing Brackets with our extension and a dummy script

Finally Figure 41 shows how the prototype for our seconduser study looks like

35

Chapter 5

Second User Study

51 Course of the User Study

To find out whether our extension brings significantimprovement a second quantitative user study with 20 Hypothesis

Compared to usualdebugging methodsour technique bringssignificantimprovement

participants has been conducted The hypothesis was thatthe debugging speed (in this case the time between the userreading the code the first time and fixing it) compared toa usual debugging engine could significantly be reducedwith our extension If this hypothesis would be supportedby the results of the user study this would indicate that theunderstanding speed of foreign code would be increasedby our method

No user had worked with Brackets or JavaScript before(notwithstanding that all of them had worked with Javabefore) Users were Computer Science students with 4-9(average of 58) years of programming experience Onlythree of them had professional experience (2-4 years) Allusers took part voluntarily and were not rewarded in anyway

To test the hypothesis two searching algorithms (animplementation of Shakersort and an implementation ofBubblesort) have been implemented and a small bug has

36 5 Second User Study

been introduced in either of them (see Appendix A) Thetask of the user was to find both bugs - one time with2 algorithms 1 bug

each 2 tasks debugone time with

Bracketsrsquo tools andone time with our

extension

our extension and one time using the traditional way(that is using the console breakpoints variable viewerweb kit inspector and the step over functionality Bracketsprovides) Each user received an intro to the function-ality of both our extension and the Brackets debuggingengine For both a small dummy script has been shownwith which it has been explained how each part of thefunctionality works until the user confirmed that he or sheunderstood every part of the functionality The task forthe user was to find the bug in the code (users have beentold that there is exactly one bug) fix it and give noticewhen he or she thought that the code is correct (task tocondition assignment and task order were counterbalancedto eliminate side effects)

After the two debugging tasks had been accomplishedeach user filled out two versions of the System UsabilitySystem Usability

Scale to find outwhether users liked

working with thesystem

Scale [Brooke 1996] (SUS) one for our extension and onefor the standard debugging Brackets provides The first10 questions are identical to the initial version that hasbeen presented in 1996 except that the word rdquocumbersomerdquoin point 8 has been changed to rdquoawkwardrdquo as suggestedby [Finstad 2006] For the extension the following ques-tions have been added to the questionnaire

bull I found the additional information very helpful formy task

bull I was distracted by the additional information

bull I found it very easy to follow the control flow of theprogram

bull I found complex control structures hard to under-stand

For the debugging using Brackets the following two ques-tions have been added

bull I found it very easy to follow the control flow of theprogram

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 8: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

xi

Acknowledgements

First of all I would like to thank everybody who participated in my user studiesYour valuable informations and feedback enlarged my opportunities

Furthermore Jan-Peter Kramer deserves my thanksWithout whom this work would consist of nothing but blanks

I would like to thank all my good friends tooWhose long-standing amity I am looking forward to

Last but not least special thanks to my family particularly my parentsAnd unfortunately this is where my English ends

1

Chapter 1

Introduction

Most of todaysrsquos IDEs make a clear separation betweenediting the code and debugging it To switch from pro- Overhead results

from the separationbetween editing anddebugging

gramming to debugging and vice versa usually theprogrammer has to use another mode in the IDE This waynot only looses the programmer a significant amount oftime through the emerging overhead but the working flowgets interrupted too [Edwards 2004]

Permanently having to switch between modes has anotherdrawback The probability of loosing time because the pro- Wasted development

time throughIgnorance Time andFix Time

grammer continues coding after a bug has been introducedincreases considerably [Saff and Ernst 2004b] reported onthis phenomenon They said that larger Ignorance Times(the time between introducing a bug and becoming awareof it) yield larger Fix Times (the time between becomingaware of a bug and fixing it) Therefor an IDE shouldprovide tools to keep the Ignorance Time as well as theoverhead produced by switching between different modesas low as possible

When Bret Victor gave his speech rdquoInventing on Principlerdquohe was talking about something very similar Direct delay-free feedback To show what he was talking about Victorpresented an editor that continuously executes the code assoon as it has been entered and not only shows the actual

2 1 Introduction

result but also reveals information about its internal behav-ior Figure 11 shows how his interface looks like using theexample of the binary search algorithm

Figure 11 A Live Coding Editor presented by Bret Vic-tor [Victor]

On the left-hand side the code can be changed as usual Atthe top of the right-hand side exemplary parameters canbe entered for which the code then gets executed Furtherdown the values for each variable in each iteration andthe return value are shown For example in the line varBret Victor presented

an editor that givesimmediate

information about aprogramrsquos internal

behavior

high = arraylength - 1 the editor says high =5 This way the programmer does not have to imaginewhat happens between executing the code with exemplaryparameters and getting back the according result but canactually see it As mentioned above this happens as thecode gets edited so there is no delay between a change andseeing its impact on the code Figure 12 shows an exampleof how helpful this can be The key the algorithm searchesfor has been changed to rsquogrsquo which is not in the arraySince the invariant of the loop is 1 the algorithm can onlyterminate when the key has been found in the array In thiscase however the loop iterates endlessly The solution isto replace the invariant by low le high A solution thatis obvious once the programmer sees that in the fourthiteration low gets greater than high and nothing elsechanges anymore

The direct immediate feedback Victor was talking about isthe basic idea of my thesis Since he designed his editor

3

Figure 12 The key parameter has been changed

only to show what he was talking about in his speech hisprototype is very low fidelity and has some basic usabil-ity flaws (most importantly Only 5 iterations can be dis-played and the functionality is very limited) Furthermorehis editor has never been evaluated neither in a qualitativeuser study nor in a quantitative one My thesis will pickup his idea develop it and finally evaluate it To do sotwo iterations of the DIA-cycle (that is Design-Implement-Analyze) have been done We started with a literature re-search (see Chapter 2) to get an overview what research hasalready been done with regards to Live Coding We then Overview of the

thesisimplemented a low fidelity prototype and evaluated it in aqualitative user study to find out which basic interface de-sign provides as much analogy between the code and itsevaluation (see Chapter 3) Afterwards we implemented asoftware-prototype (see Chapter 4) to conduct the secondquantitative user study which should reveal if the interfaceenhances the work flow significantly (see Chapter 5)

5

Chapter 2

Related Work

The idea of maintaining a continuous connection betweencode and output is not not new Already back in 1985 The idea of

continuouslyexecuting code hasalready beenpresented in 1985

[Henderson and Weiser 1985] stated that the classicalEdit-(Compile-)Execute cycle where program editing andexecution are independent activities is not optimal Theystated that the relation between input and output shouldcontinuously be kept consistent

[Pane and Myers 1996] proposed that a system shouldsupport incremental testing and feedback and help detect Incremental testing

and feedbackimprove productivity

diagnose and recover from errors Although they wererecommending heuristics to improve novice programmingsystems these advices also help improving the productiv-ity of experts

This concurs with [Ko et al 2004] who reported on learn-ing barriers in programming systems Among others they 2 learning barriers

UnderstandingBarriers andInformation Barriers

reported on the so called Understanding Barriers and Infor-mation Barriers Understanding Barriers happen when codebehaves differently from what the user expected If forexample there is a compile-time error and the user cannotextract which part of the code is deemed wrong by the com-piler from the error message then this is an UnderstandingBarrier In the user study Ko et al conducted most (34 of38) of these Understanding Barriers were insurmountable

6 2 Related Work

Information Barriers are elements in the environmentthat make the internal behavior of the program hard tounderstand They happen when a user builds a hypothesisabout the internal behavior of the program but cannot use(or find) a tool the environment provides to verify thathypothesis In Ko et alrsquos user study most of these under-standing barriers (10 of 14) were insurmountable To helpthe programmer an IDE should reveal what a programdid or did not do at compile or runtime and in addition itshould be possible to inspect a programrsquos internal behavior

Although the idea of helping the programmer under-stand the internal behavior of a program by example isnot new current environments only offer little help forCurrent

environments offerlittle help for

understanding andadapting examples

understanding and adapting examples Therefore [Brandtet al] have implemented Rehearse an editor that providesimmediate evaluation and infinite undo of executionA screenshot of this editor can be seen in Figure 21Although Rehearse embodies the immediate feedback

Figure 21 rdquoA screenshot of the Rehearse editor (1) Thefunction declaration parameter names and current val-ues (2) a statement that has been executed and (3) the re-sult of execution (4) an undone statement (5) the currentlinerdquo [Brandt et al]

and the maximal illumination we are striving for thereare issues with it First Rehearse disturbs the user inhis usual programming practices For example it is notAdvantages and

disadvantages ofRehearse

possible to edit a function in the usual editor while it isopen in Rehearse This means that there are two ways of

7

editing a function (either the usual way with a standardeditor or interactively with Rehearse) and they are mutualexclusive (this issue has also been reported as a resultof the user study Brandt et al conducted) We think itis best to integrate the interactive functionality into thestandard one so that the user does not again have toswitch between different modes Additionally Rehearseprovides functionality to help the programmer explorealternative paths by undoing arbitrarily many executedlines while the goal of our editor is to show what hap-pens in the code with exemplary parameters to help theprogrammer understand what internal behavior the codehas This way the programmer can work uninterruptedand still get information about the behavior of his program

Similarly [Edwards 2004] stated that it improves the workflow if the programmer does not have to leave the environ-ment to see what result the execution of an example yieldsHe says the best way to understand code is by thinkingof concrete examples To solve the problem that the pro- Examples help

understanding codegrammer has to emulate the behavior of the program in hishead as much code as possible should be illuminated withexamples Edwards built a plug-in for Eclipse for whicha screenshot can be seen in Figure 22 The results on the

Figure 22 A screenshot of the plug-in presented by [Ed-wards 2004]

left-hand side of the interface get updated as soon as thereis a change in the code From a theoretical point of viewthis is exactly what we intend to do The code is continu-

8 2 Related Work

ously executed for exemplary parameters while the user isworking on it From a practical point of view though thisplug-in has several drawbacks First the analogy betweenIdea and drawbacks

of Edwardsrsquo plug-in code and evaluation is not very strong (eg not as strongas the one in Rehearse) so it is not as easy for the userto follow the control flow of the program as it should beThere are further usability issues Skipped code is greyedout This is a good idea since the user can see which partof the code has been executed and which part has not Theproblem is to see for which parameters the code has beenskipped In Figure 22 for example fact has been calledwith the values 1 and 3 and in addition subsequentlywith the parameters 0 and 2 (resulting from the recursivecall of fact in the else statement) For which of theparameters the else statement has been skipped is not asobvious as it could be Furthermore it is suboptimal tocall the function with exemplary parameters in the codeitself This way a programmer cannot write code as usualand just extract the additional information but has to insertrdquoforeignrdquo code to be able to see what happens inside thecode and then later has to remove the additional codeagain because it has nothing to do with the functionalitythe programmer wanted to write in the first place FinallyEdwards presented the plug-in without evaluation of anykind so a remaining task is to find out if or how much thiskind of feedback improves the work flow (see Chapter 5)

In [Boehm 1979] it was reported that modifying softwarebasically consists of three phasesHow modifying

software works

bull Understanding the existing software

bull Modifying it

bull Revalidating the modified version

We believe that our interface helps improve (thereby speedup) this whole process (our second user studyrsquos goal wasto find out whether this is true see Chapter 5) Thissupposition is supported by [Erdogmus et al 2005] whereit is said that instant feedback whether new functionality

9

has been implemented properly helps reducing wasteddevelopment time

Finally our approach concurs with the idea of Opportunis-tic Programming [Brandt et al 2008] [Brandt et al 2009a][Brandt et al 2009b] Brandt et al say that frequent itera-tion is a necessary part of learning and understanding un-familiar code Therefore an opportunistic programmer will Opportunistic

programmers choosetools that speed upiteration

select a tool that speeds up iteration They also state that theEdit-Debug cycles in opportunistic programming are veryshort About 50 of these cycles are shorter than 30 sec-onds and even 80 are shorter than 5 minutes Our toolwould help reducing the resulting overhead since the out-come is instantly visible without having to change modesor compile and execute

11

Chapter 3

First User Study

31 The Prototypes

The main goal of our first user study was to find out whichbasic interface design is the optimal one especially with Goal of the first user

studyregards to the link between the code and the evaluatedversion of the code It should work in a way such that theprogrammer does not get disturbed in his usual work butthat the interface still provides as much information asaccessible as possible

To find out how such an interface would look like we im-plemented low fidelity prototypes for 5 different interfacesin Ruby Basically these prototypes consisted of pictures 5 low fidelity

prototypes for Rubyhave beenimplemented

that show how the interface would look like for some ex-emplary code with exemplary parameters Each interfacehas been illustrated using 5 pieces of code as example (Fig-ures 31-35) Note The font used in all prototypes was sup-posed to emphasize that these prototypes are low fidelityso that the users had not inhibitions telling us fundamentalflaws in the interface design

12 3 First User Study

311 Interface Version 1

Figure 31 shows the first interface with an implementationof the binary search algorithm In this version of theinterface an evaluated version of the code is inserted in anadditional section in the right half of the window At thetop the exemplary parameters can be manipulated Thenin the line high=arraylength-1 the first evaluationtakes place On the right-hand side of the window theright-hand side of the assignment has been replaced bythe according evaluation 5 Afterwards all values are re-Interface variant 1

divides theenvironment into twoThe left-hand side is

a usual editor andthe right-hand side isthe evaluated version

of the code

placed in a similar way That is in simple assignments theright-hand side is substituted completely and in invariantsthe condition has been computed up to the penultimatestep This way eg while(low lt= high) gets evaluatedto while(0 lt= 5) instead of while(true)

EXCURSUSWe also thought about showing what would happen in-side an if statement even if it fails so that the user couldsee what the code could do alternatively We dischargedthis idea however because of two problems Firstly wesuspected that this might quickly become a confusing ex-perience for the user Secondly (and most importantly)the Halting Problem would have to be decidable Sup-pose the following code had to be evaluatedif(xgt0)while(x=0)x--else while(x=0)x++Then the interface would fall into an infinite loop (for anyvalue of x except 0) although the code would not Forx=1 for example the actual code would only make oneiteration in the first loop The interface however wouldtry to calculate the values of the second loop too

Further down in Figure 31(a) there is the following ele-ment

1This element will subsequently be called iterator Theiterator indicates which iteration of a loop is being shownPressing the right arrow increments the shown iteration(Figure 31(b)) pressing the left arrow decrements itaccordingly Furthermore the code for the elsif and theelse statements is not visible on the right-hand side of the

31 The Prototypes 13

window The reason is that the first if statement holdsand therefore the following ones are not executed InFigure 31(b) the first if statement failed (greyed out) thesecond one holds and the third one is hidden again Thisway all code that gets executed is shown (corresponding toJonathan Edwards who said that as much code as possibleshould be rdquoilluminatedrdquo[Edwards 2004])

312 Interface Version 2

Figure 32 shows the second version of the interface withan in place implementation of the selection sort algorithm(note that each interface variant has been implemented Interface variant 2

copies a nothighlightedevaluated version ofthe code to the rightpart of the window

with each algorithm) The differences to the previous in-terface variant are

bull The parameter names are red to signal manipulability

bull No syntax highlighting

bull Evaluated code has no indents and is right aligned

bull No dividing line

EXCURSUSNote that the iterator in this interface looks like follows

i=1This design reveals what variable the iterator refers toThis kind of iterator does not depend on the type of in-terface but on the type of loop In for and upto loopsthis kind of iterator has been used since it is possible tocorrelate the values of one variable and the iterations ofa loop In the later prototypes however we dischargedthis idea again because it only was this easy to correlatea variable and the iterations of one loop because the ex-amples were quite simple In realistic situations this cor-relation will oftentimes not be possible

This version was supposed to take up as little space aspossible while still having a complete (evaluated) versionof the code in a different place

14 3 First User Study

313 Interface Version 3

Figure 33 shows version 3 of the interface with an im-plementation of Bubblesort Small pop-ups indicate theInterface variant 3

uses small pop-upsthat indicate the

value of a statement

value of all statements that are evaluable For examplethe (manipulable and therefore red) value of array is[1423716] and the value of arraylength-2 is 2There are no pop-ups in the code when it is not executedand failed invariants are greyed out The iterator workslike the one in the example for version 2 of the interface butis now inserted into the code This interfacersquos purpose wasto consume as little space as possible while still providingall possible evaluations

314 Interface Version 4

Figure 34 shows interface version 4 with an algorithm thatconverts the fractional part of a decimal number into its bi-Interface variant 4

puts the evaluatedversion of a line

between that line andthe next one

nary representation and outputs the result to the consoleThis interface variant slots the evaluated version of a lineof code right between that line and the next one The nameof manipulable parameters is highlighted in red and thereis no syntax highlighting Failed invariants are greyed outand not executed code is not displayed as usual The goalof this interface version was to find out if the link betweencode and evaluation becomes clearer when the evaluationis very close and very similar to the code it results from

315 Interface Version 5

The last variant of the interface can be seen in Figure 35with an algorithm that checks whether a given number nInterface variant 5

indicates the value ofa statement by

writing it underneaththat statement

without keepingkeywords

is a prime number or not It is similar to the previous onebut only shows rdquorelevantrdquo information This includes right-hand sides of equations and for conditions the actual in-variant but excludes left-hand sides of equations and key-words like else or while Apart from this version 4 and5 are exactly the same (parameters in red no syntax high-lighting not executed code is not shown failed invariants

32 Course of the User Study 15

are greyed out) We developed this variant because in ver-sion 4 the user might confuse actual code with evaluatedcode We wanted to check if a variant without replicatingcontrol structures is better (less confusing)

32 Course of the User Study

The first user study was a qualitative user study It wassupposed to reveal what basic interface design is preferredby users and where there might be flaws in the interfacesDifferent techniques for qualitative user studies have beenused in particular the Model Extraction method and theThink Aloud method The basic structure was the following

bull First the experimenter introduced himself and theproject He assured the participant that he would Introduction to the

user studystay anonymous and that he could ask questions ormake comments whenever he likes To make sure theModel Extraction worked fine the part concerningwhat the thesis and the user study where about wasvery firm The solution we have come up with wasto explain the general idea of Live Coding (that weare designing a kind of editor that takes exemplaryparameters for a method the programmer works onand then evaluates it continuously while changes arebeing made) without revealing anything further likerdquoand then there is going to be a button with whichyou can flick from iteration to iterationrdquo

bull Afterwards one version of the interface was pre-sented to the user together with the following ques- Model Extractiontions rdquoWhat do you think you see hererdquo and rdquoWhatdo you think you can do with itrdquo

bull After the user told everything he or she knew or if theuser had a gulf of understanding [Norman 2002] the Explanation of the

interfaceparts of the interface which the user did not under-stand were explained and we went through all itera-tions once to make clear how the interface works

bull The next goal was to find out if the user had prob- Suggestions forimprovementlems in understanding elements of the interface if he

16 3 First User Study

thinks that there are things that can be done better etcetera

bull Lastly the other interface variants were shown to theuser with the same piece of code and it was explainedGoing through the

remaining interfacevariants

how they worked how they differ from the previousones and we went through all iterations once so thatthe user could get a feeling for the interface Aftereach variant the user was asked what he thought ofthe interface (and why) and what he thought was bet-ter or worse compared to the previous ones

Whenever the user had a question or there was an inter-esting topic that lead to more interesting information thispoint was discussed in more depth

33 Results

The user study was conducted with 6 different users All ofthem studied Computer Science were in the sixth semesterand had between 4 and 10 years of programming experi-ence Only one of them had worked with Ruby before Allusers took part voluntarily and were not rewarded in anyway The conditions were randomized and looked like fol-lows

User ID Algorithm Order of versions1 Bubblesort 312452 Selection sort 453213 Fractional part10 rarr Fractional part2 253414 Prime number test 135425 Binary search 524316 Binary search 34152

This means that for example user 1 saw interface version3 first and then versions 1245 in that order In all variantshe worked with the Bubblesort algorithm

The user study yielded two major results

33 Results 17

bull Most users liked version 1 better than the other ones

bull Not a single user understood what the iterator is goodfor or what it might do

In more detail Three users (users 1 2 and 4) stated clearlythat they liked version 1 the most User 3 was not quite 3 users liked version

1 the most 1 userliked version 4 themost 1 user couldnot decide betweenversion 1 and 2 and1 user could notdecide betweenversion 1 and 3

sure whether he liked version 1 or version 2 more Hestated that a mix between the two variants where thereis no syntax highlighting would be the best because hegot confused the first time he saw version 1 and did notdirectly see that the code on the right-hand side of thewindow is not actually code but the evaluation User 6 wasnot sure if he preferred version 1 or version 3 He statedthat if there is enough space on the screen version 1 isbetter but with all the other elements of a typical IDE aline of code might oftentimes be too long so that there is noplace to show the line itself as well as its evaluated version(a concern that user 3 communicated as well) Withoutthis problem though user 6 said that version 3 is worsebecause the interface might get messy and distractingquickly because of all the pop-ups right next to the codeIn addition he said with more complex code it might beproblematic to see which pop-up refers to which part ofthe code Finally user 5 stated that he likes version 4 themost The reason was that he did not have to look to theright if he wanted to see the value of a variable and thensearch it there but that in version 5 the information wasexactly at its source He only said that this version mightbe problematic to work with since the reading flow is influ-enced negatively and he feared that he would loose muchspace on the screen and would therefore have to scroll a lot

Each user stated that the main benefit of version 1 and2 was that the actual code remained unaffected Thiscorresponds to what [DeLine et al 2006] reported on theperceptual model a programmer builds of the code When Perceptual model of

the code shouldremain unaffected

navigating through source code a programmer uses aperceptual model to do so This way the programmer doesnot need to read the code (cognitive) to know which part heis seeing but can do so by looking at its rdquovisual landmarksrdquo(perceptual) While version 3 affects this perceptual modelnegatively but does not break it version 4 and 5 have the

18 3 First User Study

disadvantage to do so

To most of the users the basic structure of the interfaceswas clear after only a few seconds Only two of themconfused actual code with the evaluation for a short timeUser 5 when he saw version 5 the first time and user 4when he saw version 1 the first time However neither ofthem needed help or much time to figure how the interfaceworked after they started to read more precisely Eachuserrsquos first question or comment concerned the font andhow awful they (every one) thought it was After a briefexplanation why we chose this particular font each usergot used to it quickly

No user understood the iterator None of them got theconnection to the while upto or for next to the iteratorIterator is not self

explaining After the explanation what the iterator does some userssaid they thought the left and right buttons were rdquoltrdquo andrdquogtrdquo signs

The following tables summarize the pros and contras foreach interface

Version 1Pros Contrasmiddot The control structure (in-dents) is still presentmiddot Because of the indentsand the syntax highlightingthere is a strong analogy be-tween the code and its eval-uated versionmiddot There is a clear separationbetween code and evalua-tionmiddot The editor remains un-changed

middot Needs much spacemiddot User has to look from theleft to the right and viceversa to see two pieces thatare correlated In otherwords a line of code and itsevaluated version are too faraway from each other

33 Results 19

Version 2Pros Contrasmiddot The editor remains un-changedmiddot Code and evaluation do notget confused because they donot look the samemiddot Needs less space than ver-sion 1

middot Less analogy than in ver-sion 1middot User needs to look fromthe left to the right and viceversa This is being compli-cated by the fact that with-out indents and syntax high-lighting it is even more diffi-cult to find the point of inter-estmiddot Needs a lot of space

Version 3Pros ContrasmiddotNeeds minimalistic amountof spacemiddot The code structure itself re-mains untouched

middot Oftentimes not clear whichpop-up refers to which partof the codemiddot Can get messy quickly

Version 4Pros Contrasmiddot Strong analogymiddot Better for lines of code thatare too long for version 1 and2middot Evaluations right next totheir source

middot Code and evaluation getconfused quickly especiallyusual reading might be prob-lematic since one has to skipevery second line dependingon what he wants to readmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

20 3 First User Study

Version 5Pros Contrasmiddot Evaluations right next totheir sourcemiddot Better for lines of code thatare too long for version 1 and2

middot Oftentimes it is not clearwhat a value refers tomiddot Still distracting during codereadingmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

331 Further Comments and Suggestions for Im-provement

One of the users pointed out that in version 3 the iteratorhas been inserted into the code and therefore it now isInterface version 3

can producesyntactically wrong

code

syntactically wrong (for i in 14 becomes for i=2in 14 for the second iteration for example)

Two users suggested that for version 4 and 5 curly bracescould indicate to which part of the code the evaluationrefers

Some users suggested to make the number on the iteratordirectly manipulable so that if one would want to go toSuggestions for

improving the iterator iteration 42 one would not have to press 41 times a buttonbut could just enter 42 In addition some users said theywanted a button for jumping to the first and a button forjumping to the last iteration Two users said it would begood if there is something that indicates directly that theiterator actually shows iterations like the term rdquoIteration rdquoor anything similar

We noticed that the manipulable parameters and theevaluation results have to be distinguishable more easily

33 Results 21

One user proposed that if the cursor moves over somevariable in the evaluation part a small pop-up couldindicate to which variable a value belongs

One user said that it might be helpful that for biggerboolean expressions that fail it would be visible which partof the expression is responsible for the failure

332 Conclusions

With large screens (24rdquo and more) becoming more andmore common in modern work environments of program- Version 1 is the

favored onemers we can override the concerns about loosing too muchspace with version 1 which is the reason why it is the clearfavorite Figure 36 shows the basic interface after the firstuser study

The iterator has been improved in several ways

bull New buttons for jumping to the first and the last iter-ation have been added

bull There is a text field with which the iteration can bemanipulated directly

bull A tooltip now indicates the variable to which avalue belongs when hovering over the value with themouse

bull The label Iteration has been added at the top of theiterator

The way of manipulating the parameters has been changedThe function header now also is copied to the right side A new way of

manipulating theparameters

but the names of the parameters are replaced with a textfield in which the according value can be entered This wayboth the analogy to the code and the separation betweenparameters and evaluation are being enhanced

22 3 First User Study

(a) First iteration

(b) Second iteration

Figure 31 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

33 Results 23

Figure 32 Version 2 The evaluated version of the code ison the right-hand side but is ranged right

Figure 33 Version 3 Small pop-ups indicate the value of avariable

24 3 First User Study

Figure 34 Version 4 The evaluated version of a line isinserted right behind the line itself

33 Results 25

Figure 35 Version 5 Only the values of the variables arein between the lines

26 3 First User Study

Figure 36 Interface after the first user study

27

Chapter 4

Implementation

After the first user study had been finished and the resultshad been analyzed a software prototype with sufficient A software prototype

had to beimplemented

functionality had to be implemented to conduct the seconduser study Hence we wrote an extension for Brackets1 anopen source editor that has recently been released officially

The language we wrote the extension for was JavaScriptWhat we needed was an extension that could calculate thevalues of the variables that had to been substituted

41 First Approach (discharged)

The initial idea was to write an extension that connects viaV8-Node2 (an extension that maintains a connection to a Initial idea Use

V8-Node an Nodejsto get the locals

Nodejs3 V8 debugger) to an external debugger and lets itevaluate the code Our extension was supposed to supplythe actual code to the debugger and set a breakpoint in thefirst line of actual code Then it read the locals (that is thelocal values for all active variables) and stepped over to the

1httpsgithubcomadobebrackets2httpsgithubcomDennisKehrigbrackets-v8-node-

livetreenodeChildProcess3httpnodejsorg

28 4 Implementation

next line After all locals had been collected the extensionshould substitute all necessary values There were prob-lems with this approach Firstly the extension is not stableThis extension was

not usable for a userstudy

enough to be usable in a user study Secondly it is too slowIt needs more than half a second for one evaluation Thisis way too much since we wanted something that couldevaluate the code as the user edits it without any furtherdelay (hence the name rdquoLive Codingrdquo) With this extensionit would only have been possible to update the evaluationusing a special shortcut or to attach this function to anothershortcut like the Ctrl+S shortcut for saving Since thiswas by far not the optimal way we decided to start all overagain

42 Basic Evaluation Technique

In the second approach we did not try to externalize thescript execution but instead made vast use of JavaScriptrsquosbuilt-in eval function The eval function takes a StringSecond approach

using eval toevaluate dynamicallygenerated code thatreturns the value of

interest

as parameter If this String contains valid JavaScript codeeval will evaluate the code and return the last executedstatement eval(var i=1i=i+5) will return 6 forexample The extension basically works as follows Aloop that runs over all DOM-elements in the highlightedcode searches for substitutable variables While searchingfor these elements the loop captures in which contextthe actual DOM-element is The three basic contexts aresimple substitutions substitutions within one loop andsubstitutions within a nested loop (why this is will beexplained below) But there are also other things the loopchecks for For simple assignments like i=x+y x andy have to be substituted while i does not In lines wherea loop begins the iterator has to be appended invariantshave to be treated differently et cetera

Note that right-hand sides of assignments are handledslightly different from what has been presented in theprototypes of the first user study They do not get evalu-ated to the last step and invariants do not get evaluated

43 How Values Are Being Calculated 29

to the penultimate step anymore Both of them get onlyevaluated one step For i=1 and z=2 the right-hand sideof x=i+z does not get evaluated to 3 but to 1+2 Wedecided to do it this way because some users in the firstuser study stated that they did not only want to see whatthe result of an operation is but also how it got there

When a substitutable variable has been found its value willbe computed For that the eval function is used (see sec-tion 43 for details) and the value will be inserted Thiswhole process takes place each time there is a change in thecode or if one of the parameters gets changed If another it- How the extension

reacts to changeseration of a loop that is being shown is selected all valuesinside that loop will be calculated again

43 How Values Are Being Calculated

Depending on the context of the variable a piece ofcode will be generated (and then evaluated) that whenevaluated with eval returns the value of the variable ofinterest For simple substitutions outside of any loops thisworks like follows

Suppose the following code had to be evaluated for x=0

function foo(x)var i=0var j=x+1

Then x in the third line would have to be evaluated To do How the values arecalculated for simplesubstitutions

so the following String would be generatedvar x=0var i=0xSo for simple substitutions the whole code preceding thevariable that has to be substituted is copied and each pa-rameter is initialized as var with its exemplary value at the

30 4 Implementation

beginning

For simple loops this method has to be extended First ithas to be found out how often the loop is executed ThisSubstitutions within

one loop serves two purposes First the iterator now shows thenumber of iterations of the loop Second we have to checkif the user wants to see an iteration that does not exist Thisis being done by introducing a counter that is incrementedin each iteration of the loop and then returned when theloop terminated The following example shows how thevalue of a variable within a loop is being calculated

function test()var i=0var j=0while (ilt5)

i++j=j+2

Suppose we want to obtain the value for j (highlightedin red) in the second iteration of the loop Then theevaluation basically works as follows

var myCounter=0var myResultvar i=0var j=0while (ilt5)

myCounter++i++if(myCounter==2)myResult=j breakj=j+2

myResult

A counter gets inserted to check which iteration the loop is

44 Problems 31

in If it is in the right iteration the assignment myResult=jtakes place so that eval can return the according valueFor a loop within a loop two counters have to be inserted Other casesand both of them have to be checked accordingly Thereare further variants of contexts like the different parts of afor(firstPartsecondPartthirdPart) statementThe first part gets only executed once at the beginningwhile the second part gets evaluated one time more thanthe rest of the loop (the third part does not get evaluated)For all cases the dynamic generation of the code that re-turns the value of interest works similar to the techniquespresented

Obviously this is a brute-force approach For each valuethat has to be calculated all preceding code has to be eval-uated We used this approach anyway since we spent quite Summary Very

inefficient limitedfunctionality stillcompletely sufficientfor the user study

some time developing the first version of the extension (us-ing V8-Node) and were under considerable time pressureThe technique we chose is simple and fast to develop Fur-thermore the prototype only had to work for the seconduser study which is the reason why the limited function-ality (loop nesting depth of 2 for example) was sufficientAnd finally since the tasks in the second user study werenot very expensive with regards to calculation costs eventhis very inefficient version of the extension ran live with-out any delays Some tests showed that it would even havebeen quick enough to operate with hundreds of iterationsmore without any delay and with some few thousand iter-ations more without dispatching the interface

44 Problems

The biggest problem we encountered were infinite loopsBecause the interface runs completely live and tries to The Halting problemevaluate the whole code while the user is editing it infiniteloops may occur while the user is typing Additionallyit might of course be possible that there actually is aninfinite loop even though the user thinks the code iscorrect Without modification the interface crashed in

32 4 Implementation

either of these cases To prevent this from happeningtwo minor modifications had to be done Firstly if wewant to calculate the number of iterations a loop doeswe check whether the counter we use for that calculationis still below a certain limit (we used 999 here becauseit proved to be sufficient and efficient enough to runlive) Now the Halting Problem for this particular loop isapproximated The problem of the infinite loop still existsthough Suppose we have the following code

var x=0var i=0while(x=10)

i++x=i

We now know that the loop does more then 999 iterationsso we do not try to calculate the value for x in the invariantSubstitutions after

infinite loops or for i within the loop for any further iterations The onlyproblem that remains is that we still try to get the value fori in the last line When we try to calculate this value wedo not yet know that there is an infinite loop somewhereabove To find this out a new global counter gets insertedand limited to 5000 iterations

var globalCounter=0var x=0var i=0while(x=10)

if(globalCounter++ gt= 5000)breaki++

i

This way we could keep the incrementation and the checkwhether there are more then 5000 iterations within onestatement (two statements are too many since after a

44 Problems 33

while without curly braces only the first statement wouldbe executed) The incrementation and test get inserted intoevery loop ahead of the position we are checking for Itwould of course have been possible to set a flag if there isan infinite loop somewhere ahead but this way we alsohad the possibility to limit the overall number of iterationsand still get a value The second user study showed that Flaw in the

implementationthis was the wrong approach A user introduced an infiniteloop in a sorting algorithm The condition for the loop wasnever violated although the array was sorted With thetechnique of assigning a value to a variable although thelimitation of iterations has been hit the sorted array wasshown in the return statement Since the user only paidattention to the output he did not see that the number ofiterations of the loop was 999 and thought that the code iscorrect

Another minor problem was that of course while the useris typing the extension tries to evaluate code that is syntac- Suppress errors so

that user does notget disturbed

tically not correct The solution is to set windowonerrorto a function that always returns true This way errorsget just suppressed and the execution stops If there is anew change in the editor the execution will start again andwork exactly the same way so that working code always isexecuted and not working code is not

The last problem concerned hiding code that does notget evaluated For loops this could easily be done If Hiding code within

failed if statementsthe loop is in its very last iteration (the invariant fails)the rest of the loop has to be hidden otherwise it doesnot Finding out if an if statement fails (and hence theaccording code has to be hidden) however was moredifficult because we had not enough information to be ableto do so (remember we only compute the first step whichincludes nothing but substituting all variables) To find outif an if statement holds or not we do the following allvariables in the invariant get substituted and then we leteval evaluate just the invariant For if(xlty) with x=5and y=6 we substitute all values if(5lt6) and then calleval((5lt6))

34 4 Implementation

Figure 41 Screenshot showing Brackets with our extension and a dummy script

Finally Figure 41 shows how the prototype for our seconduser study looks like

35

Chapter 5

Second User Study

51 Course of the User Study

To find out whether our extension brings significantimprovement a second quantitative user study with 20 Hypothesis

Compared to usualdebugging methodsour technique bringssignificantimprovement

participants has been conducted The hypothesis was thatthe debugging speed (in this case the time between the userreading the code the first time and fixing it) compared toa usual debugging engine could significantly be reducedwith our extension If this hypothesis would be supportedby the results of the user study this would indicate that theunderstanding speed of foreign code would be increasedby our method

No user had worked with Brackets or JavaScript before(notwithstanding that all of them had worked with Javabefore) Users were Computer Science students with 4-9(average of 58) years of programming experience Onlythree of them had professional experience (2-4 years) Allusers took part voluntarily and were not rewarded in anyway

To test the hypothesis two searching algorithms (animplementation of Shakersort and an implementation ofBubblesort) have been implemented and a small bug has

36 5 Second User Study

been introduced in either of them (see Appendix A) Thetask of the user was to find both bugs - one time with2 algorithms 1 bug

each 2 tasks debugone time with

Bracketsrsquo tools andone time with our

extension

our extension and one time using the traditional way(that is using the console breakpoints variable viewerweb kit inspector and the step over functionality Bracketsprovides) Each user received an intro to the function-ality of both our extension and the Brackets debuggingengine For both a small dummy script has been shownwith which it has been explained how each part of thefunctionality works until the user confirmed that he or sheunderstood every part of the functionality The task forthe user was to find the bug in the code (users have beentold that there is exactly one bug) fix it and give noticewhen he or she thought that the code is correct (task tocondition assignment and task order were counterbalancedto eliminate side effects)

After the two debugging tasks had been accomplishedeach user filled out two versions of the System UsabilitySystem Usability

Scale to find outwhether users liked

working with thesystem

Scale [Brooke 1996] (SUS) one for our extension and onefor the standard debugging Brackets provides The first10 questions are identical to the initial version that hasbeen presented in 1996 except that the word rdquocumbersomerdquoin point 8 has been changed to rdquoawkwardrdquo as suggestedby [Finstad 2006] For the extension the following ques-tions have been added to the questionnaire

bull I found the additional information very helpful formy task

bull I was distracted by the additional information

bull I found it very easy to follow the control flow of theprogram

bull I found complex control structures hard to under-stand

For the debugging using Brackets the following two ques-tions have been added

bull I found it very easy to follow the control flow of theprogram

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 9: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

1

Chapter 1

Introduction

Most of todaysrsquos IDEs make a clear separation betweenediting the code and debugging it To switch from pro- Overhead results

from the separationbetween editing anddebugging

gramming to debugging and vice versa usually theprogrammer has to use another mode in the IDE This waynot only looses the programmer a significant amount oftime through the emerging overhead but the working flowgets interrupted too [Edwards 2004]

Permanently having to switch between modes has anotherdrawback The probability of loosing time because the pro- Wasted development

time throughIgnorance Time andFix Time

grammer continues coding after a bug has been introducedincreases considerably [Saff and Ernst 2004b] reported onthis phenomenon They said that larger Ignorance Times(the time between introducing a bug and becoming awareof it) yield larger Fix Times (the time between becomingaware of a bug and fixing it) Therefor an IDE shouldprovide tools to keep the Ignorance Time as well as theoverhead produced by switching between different modesas low as possible

When Bret Victor gave his speech rdquoInventing on Principlerdquohe was talking about something very similar Direct delay-free feedback To show what he was talking about Victorpresented an editor that continuously executes the code assoon as it has been entered and not only shows the actual

2 1 Introduction

result but also reveals information about its internal behav-ior Figure 11 shows how his interface looks like using theexample of the binary search algorithm

Figure 11 A Live Coding Editor presented by Bret Vic-tor [Victor]

On the left-hand side the code can be changed as usual Atthe top of the right-hand side exemplary parameters canbe entered for which the code then gets executed Furtherdown the values for each variable in each iteration andthe return value are shown For example in the line varBret Victor presented

an editor that givesimmediate

information about aprogramrsquos internal

behavior

high = arraylength - 1 the editor says high =5 This way the programmer does not have to imaginewhat happens between executing the code with exemplaryparameters and getting back the according result but canactually see it As mentioned above this happens as thecode gets edited so there is no delay between a change andseeing its impact on the code Figure 12 shows an exampleof how helpful this can be The key the algorithm searchesfor has been changed to rsquogrsquo which is not in the arraySince the invariant of the loop is 1 the algorithm can onlyterminate when the key has been found in the array In thiscase however the loop iterates endlessly The solution isto replace the invariant by low le high A solution thatis obvious once the programmer sees that in the fourthiteration low gets greater than high and nothing elsechanges anymore

The direct immediate feedback Victor was talking about isthe basic idea of my thesis Since he designed his editor

3

Figure 12 The key parameter has been changed

only to show what he was talking about in his speech hisprototype is very low fidelity and has some basic usabil-ity flaws (most importantly Only 5 iterations can be dis-played and the functionality is very limited) Furthermorehis editor has never been evaluated neither in a qualitativeuser study nor in a quantitative one My thesis will pickup his idea develop it and finally evaluate it To do sotwo iterations of the DIA-cycle (that is Design-Implement-Analyze) have been done We started with a literature re-search (see Chapter 2) to get an overview what research hasalready been done with regards to Live Coding We then Overview of the

thesisimplemented a low fidelity prototype and evaluated it in aqualitative user study to find out which basic interface de-sign provides as much analogy between the code and itsevaluation (see Chapter 3) Afterwards we implemented asoftware-prototype (see Chapter 4) to conduct the secondquantitative user study which should reveal if the interfaceenhances the work flow significantly (see Chapter 5)

5

Chapter 2

Related Work

The idea of maintaining a continuous connection betweencode and output is not not new Already back in 1985 The idea of

continuouslyexecuting code hasalready beenpresented in 1985

[Henderson and Weiser 1985] stated that the classicalEdit-(Compile-)Execute cycle where program editing andexecution are independent activities is not optimal Theystated that the relation between input and output shouldcontinuously be kept consistent

[Pane and Myers 1996] proposed that a system shouldsupport incremental testing and feedback and help detect Incremental testing

and feedbackimprove productivity

diagnose and recover from errors Although they wererecommending heuristics to improve novice programmingsystems these advices also help improving the productiv-ity of experts

This concurs with [Ko et al 2004] who reported on learn-ing barriers in programming systems Among others they 2 learning barriers

UnderstandingBarriers andInformation Barriers

reported on the so called Understanding Barriers and Infor-mation Barriers Understanding Barriers happen when codebehaves differently from what the user expected If forexample there is a compile-time error and the user cannotextract which part of the code is deemed wrong by the com-piler from the error message then this is an UnderstandingBarrier In the user study Ko et al conducted most (34 of38) of these Understanding Barriers were insurmountable

6 2 Related Work

Information Barriers are elements in the environmentthat make the internal behavior of the program hard tounderstand They happen when a user builds a hypothesisabout the internal behavior of the program but cannot use(or find) a tool the environment provides to verify thathypothesis In Ko et alrsquos user study most of these under-standing barriers (10 of 14) were insurmountable To helpthe programmer an IDE should reveal what a programdid or did not do at compile or runtime and in addition itshould be possible to inspect a programrsquos internal behavior

Although the idea of helping the programmer under-stand the internal behavior of a program by example isnot new current environments only offer little help forCurrent

environments offerlittle help for

understanding andadapting examples

understanding and adapting examples Therefore [Brandtet al] have implemented Rehearse an editor that providesimmediate evaluation and infinite undo of executionA screenshot of this editor can be seen in Figure 21Although Rehearse embodies the immediate feedback

Figure 21 rdquoA screenshot of the Rehearse editor (1) Thefunction declaration parameter names and current val-ues (2) a statement that has been executed and (3) the re-sult of execution (4) an undone statement (5) the currentlinerdquo [Brandt et al]

and the maximal illumination we are striving for thereare issues with it First Rehearse disturbs the user inhis usual programming practices For example it is notAdvantages and

disadvantages ofRehearse

possible to edit a function in the usual editor while it isopen in Rehearse This means that there are two ways of

7

editing a function (either the usual way with a standardeditor or interactively with Rehearse) and they are mutualexclusive (this issue has also been reported as a resultof the user study Brandt et al conducted) We think itis best to integrate the interactive functionality into thestandard one so that the user does not again have toswitch between different modes Additionally Rehearseprovides functionality to help the programmer explorealternative paths by undoing arbitrarily many executedlines while the goal of our editor is to show what hap-pens in the code with exemplary parameters to help theprogrammer understand what internal behavior the codehas This way the programmer can work uninterruptedand still get information about the behavior of his program

Similarly [Edwards 2004] stated that it improves the workflow if the programmer does not have to leave the environ-ment to see what result the execution of an example yieldsHe says the best way to understand code is by thinkingof concrete examples To solve the problem that the pro- Examples help

understanding codegrammer has to emulate the behavior of the program in hishead as much code as possible should be illuminated withexamples Edwards built a plug-in for Eclipse for whicha screenshot can be seen in Figure 22 The results on the

Figure 22 A screenshot of the plug-in presented by [Ed-wards 2004]

left-hand side of the interface get updated as soon as thereis a change in the code From a theoretical point of viewthis is exactly what we intend to do The code is continu-

8 2 Related Work

ously executed for exemplary parameters while the user isworking on it From a practical point of view though thisplug-in has several drawbacks First the analogy betweenIdea and drawbacks

of Edwardsrsquo plug-in code and evaluation is not very strong (eg not as strongas the one in Rehearse) so it is not as easy for the userto follow the control flow of the program as it should beThere are further usability issues Skipped code is greyedout This is a good idea since the user can see which partof the code has been executed and which part has not Theproblem is to see for which parameters the code has beenskipped In Figure 22 for example fact has been calledwith the values 1 and 3 and in addition subsequentlywith the parameters 0 and 2 (resulting from the recursivecall of fact in the else statement) For which of theparameters the else statement has been skipped is not asobvious as it could be Furthermore it is suboptimal tocall the function with exemplary parameters in the codeitself This way a programmer cannot write code as usualand just extract the additional information but has to insertrdquoforeignrdquo code to be able to see what happens inside thecode and then later has to remove the additional codeagain because it has nothing to do with the functionalitythe programmer wanted to write in the first place FinallyEdwards presented the plug-in without evaluation of anykind so a remaining task is to find out if or how much thiskind of feedback improves the work flow (see Chapter 5)

In [Boehm 1979] it was reported that modifying softwarebasically consists of three phasesHow modifying

software works

bull Understanding the existing software

bull Modifying it

bull Revalidating the modified version

We believe that our interface helps improve (thereby speedup) this whole process (our second user studyrsquos goal wasto find out whether this is true see Chapter 5) Thissupposition is supported by [Erdogmus et al 2005] whereit is said that instant feedback whether new functionality

9

has been implemented properly helps reducing wasteddevelopment time

Finally our approach concurs with the idea of Opportunis-tic Programming [Brandt et al 2008] [Brandt et al 2009a][Brandt et al 2009b] Brandt et al say that frequent itera-tion is a necessary part of learning and understanding un-familiar code Therefore an opportunistic programmer will Opportunistic

programmers choosetools that speed upiteration

select a tool that speeds up iteration They also state that theEdit-Debug cycles in opportunistic programming are veryshort About 50 of these cycles are shorter than 30 sec-onds and even 80 are shorter than 5 minutes Our toolwould help reducing the resulting overhead since the out-come is instantly visible without having to change modesor compile and execute

11

Chapter 3

First User Study

31 The Prototypes

The main goal of our first user study was to find out whichbasic interface design is the optimal one especially with Goal of the first user

studyregards to the link between the code and the evaluatedversion of the code It should work in a way such that theprogrammer does not get disturbed in his usual work butthat the interface still provides as much information asaccessible as possible

To find out how such an interface would look like we im-plemented low fidelity prototypes for 5 different interfacesin Ruby Basically these prototypes consisted of pictures 5 low fidelity

prototypes for Rubyhave beenimplemented

that show how the interface would look like for some ex-emplary code with exemplary parameters Each interfacehas been illustrated using 5 pieces of code as example (Fig-ures 31-35) Note The font used in all prototypes was sup-posed to emphasize that these prototypes are low fidelityso that the users had not inhibitions telling us fundamentalflaws in the interface design

12 3 First User Study

311 Interface Version 1

Figure 31 shows the first interface with an implementationof the binary search algorithm In this version of theinterface an evaluated version of the code is inserted in anadditional section in the right half of the window At thetop the exemplary parameters can be manipulated Thenin the line high=arraylength-1 the first evaluationtakes place On the right-hand side of the window theright-hand side of the assignment has been replaced bythe according evaluation 5 Afterwards all values are re-Interface variant 1

divides theenvironment into twoThe left-hand side is

a usual editor andthe right-hand side isthe evaluated version

of the code

placed in a similar way That is in simple assignments theright-hand side is substituted completely and in invariantsthe condition has been computed up to the penultimatestep This way eg while(low lt= high) gets evaluatedto while(0 lt= 5) instead of while(true)

EXCURSUSWe also thought about showing what would happen in-side an if statement even if it fails so that the user couldsee what the code could do alternatively We dischargedthis idea however because of two problems Firstly wesuspected that this might quickly become a confusing ex-perience for the user Secondly (and most importantly)the Halting Problem would have to be decidable Sup-pose the following code had to be evaluatedif(xgt0)while(x=0)x--else while(x=0)x++Then the interface would fall into an infinite loop (for anyvalue of x except 0) although the code would not Forx=1 for example the actual code would only make oneiteration in the first loop The interface however wouldtry to calculate the values of the second loop too

Further down in Figure 31(a) there is the following ele-ment

1This element will subsequently be called iterator Theiterator indicates which iteration of a loop is being shownPressing the right arrow increments the shown iteration(Figure 31(b)) pressing the left arrow decrements itaccordingly Furthermore the code for the elsif and theelse statements is not visible on the right-hand side of the

31 The Prototypes 13

window The reason is that the first if statement holdsand therefore the following ones are not executed InFigure 31(b) the first if statement failed (greyed out) thesecond one holds and the third one is hidden again Thisway all code that gets executed is shown (corresponding toJonathan Edwards who said that as much code as possibleshould be rdquoilluminatedrdquo[Edwards 2004])

312 Interface Version 2

Figure 32 shows the second version of the interface withan in place implementation of the selection sort algorithm(note that each interface variant has been implemented Interface variant 2

copies a nothighlightedevaluated version ofthe code to the rightpart of the window

with each algorithm) The differences to the previous in-terface variant are

bull The parameter names are red to signal manipulability

bull No syntax highlighting

bull Evaluated code has no indents and is right aligned

bull No dividing line

EXCURSUSNote that the iterator in this interface looks like follows

i=1This design reveals what variable the iterator refers toThis kind of iterator does not depend on the type of in-terface but on the type of loop In for and upto loopsthis kind of iterator has been used since it is possible tocorrelate the values of one variable and the iterations ofa loop In the later prototypes however we dischargedthis idea again because it only was this easy to correlatea variable and the iterations of one loop because the ex-amples were quite simple In realistic situations this cor-relation will oftentimes not be possible

This version was supposed to take up as little space aspossible while still having a complete (evaluated) versionof the code in a different place

14 3 First User Study

313 Interface Version 3

Figure 33 shows version 3 of the interface with an im-plementation of Bubblesort Small pop-ups indicate theInterface variant 3

uses small pop-upsthat indicate the

value of a statement

value of all statements that are evaluable For examplethe (manipulable and therefore red) value of array is[1423716] and the value of arraylength-2 is 2There are no pop-ups in the code when it is not executedand failed invariants are greyed out The iterator workslike the one in the example for version 2 of the interface butis now inserted into the code This interfacersquos purpose wasto consume as little space as possible while still providingall possible evaluations

314 Interface Version 4

Figure 34 shows interface version 4 with an algorithm thatconverts the fractional part of a decimal number into its bi-Interface variant 4

puts the evaluatedversion of a line

between that line andthe next one

nary representation and outputs the result to the consoleThis interface variant slots the evaluated version of a lineof code right between that line and the next one The nameof manipulable parameters is highlighted in red and thereis no syntax highlighting Failed invariants are greyed outand not executed code is not displayed as usual The goalof this interface version was to find out if the link betweencode and evaluation becomes clearer when the evaluationis very close and very similar to the code it results from

315 Interface Version 5

The last variant of the interface can be seen in Figure 35with an algorithm that checks whether a given number nInterface variant 5

indicates the value ofa statement by

writing it underneaththat statement

without keepingkeywords

is a prime number or not It is similar to the previous onebut only shows rdquorelevantrdquo information This includes right-hand sides of equations and for conditions the actual in-variant but excludes left-hand sides of equations and key-words like else or while Apart from this version 4 and5 are exactly the same (parameters in red no syntax high-lighting not executed code is not shown failed invariants

32 Course of the User Study 15

are greyed out) We developed this variant because in ver-sion 4 the user might confuse actual code with evaluatedcode We wanted to check if a variant without replicatingcontrol structures is better (less confusing)

32 Course of the User Study

The first user study was a qualitative user study It wassupposed to reveal what basic interface design is preferredby users and where there might be flaws in the interfacesDifferent techniques for qualitative user studies have beenused in particular the Model Extraction method and theThink Aloud method The basic structure was the following

bull First the experimenter introduced himself and theproject He assured the participant that he would Introduction to the

user studystay anonymous and that he could ask questions ormake comments whenever he likes To make sure theModel Extraction worked fine the part concerningwhat the thesis and the user study where about wasvery firm The solution we have come up with wasto explain the general idea of Live Coding (that weare designing a kind of editor that takes exemplaryparameters for a method the programmer works onand then evaluates it continuously while changes arebeing made) without revealing anything further likerdquoand then there is going to be a button with whichyou can flick from iteration to iterationrdquo

bull Afterwards one version of the interface was pre-sented to the user together with the following ques- Model Extractiontions rdquoWhat do you think you see hererdquo and rdquoWhatdo you think you can do with itrdquo

bull After the user told everything he or she knew or if theuser had a gulf of understanding [Norman 2002] the Explanation of the

interfaceparts of the interface which the user did not under-stand were explained and we went through all itera-tions once to make clear how the interface works

bull The next goal was to find out if the user had prob- Suggestions forimprovementlems in understanding elements of the interface if he

16 3 First User Study

thinks that there are things that can be done better etcetera

bull Lastly the other interface variants were shown to theuser with the same piece of code and it was explainedGoing through the

remaining interfacevariants

how they worked how they differ from the previousones and we went through all iterations once so thatthe user could get a feeling for the interface Aftereach variant the user was asked what he thought ofthe interface (and why) and what he thought was bet-ter or worse compared to the previous ones

Whenever the user had a question or there was an inter-esting topic that lead to more interesting information thispoint was discussed in more depth

33 Results

The user study was conducted with 6 different users All ofthem studied Computer Science were in the sixth semesterand had between 4 and 10 years of programming experi-ence Only one of them had worked with Ruby before Allusers took part voluntarily and were not rewarded in anyway The conditions were randomized and looked like fol-lows

User ID Algorithm Order of versions1 Bubblesort 312452 Selection sort 453213 Fractional part10 rarr Fractional part2 253414 Prime number test 135425 Binary search 524316 Binary search 34152

This means that for example user 1 saw interface version3 first and then versions 1245 in that order In all variantshe worked with the Bubblesort algorithm

The user study yielded two major results

33 Results 17

bull Most users liked version 1 better than the other ones

bull Not a single user understood what the iterator is goodfor or what it might do

In more detail Three users (users 1 2 and 4) stated clearlythat they liked version 1 the most User 3 was not quite 3 users liked version

1 the most 1 userliked version 4 themost 1 user couldnot decide betweenversion 1 and 2 and1 user could notdecide betweenversion 1 and 3

sure whether he liked version 1 or version 2 more Hestated that a mix between the two variants where thereis no syntax highlighting would be the best because hegot confused the first time he saw version 1 and did notdirectly see that the code on the right-hand side of thewindow is not actually code but the evaluation User 6 wasnot sure if he preferred version 1 or version 3 He statedthat if there is enough space on the screen version 1 isbetter but with all the other elements of a typical IDE aline of code might oftentimes be too long so that there is noplace to show the line itself as well as its evaluated version(a concern that user 3 communicated as well) Withoutthis problem though user 6 said that version 3 is worsebecause the interface might get messy and distractingquickly because of all the pop-ups right next to the codeIn addition he said with more complex code it might beproblematic to see which pop-up refers to which part ofthe code Finally user 5 stated that he likes version 4 themost The reason was that he did not have to look to theright if he wanted to see the value of a variable and thensearch it there but that in version 5 the information wasexactly at its source He only said that this version mightbe problematic to work with since the reading flow is influ-enced negatively and he feared that he would loose muchspace on the screen and would therefore have to scroll a lot

Each user stated that the main benefit of version 1 and2 was that the actual code remained unaffected Thiscorresponds to what [DeLine et al 2006] reported on theperceptual model a programmer builds of the code When Perceptual model of

the code shouldremain unaffected

navigating through source code a programmer uses aperceptual model to do so This way the programmer doesnot need to read the code (cognitive) to know which part heis seeing but can do so by looking at its rdquovisual landmarksrdquo(perceptual) While version 3 affects this perceptual modelnegatively but does not break it version 4 and 5 have the

18 3 First User Study

disadvantage to do so

To most of the users the basic structure of the interfaceswas clear after only a few seconds Only two of themconfused actual code with the evaluation for a short timeUser 5 when he saw version 5 the first time and user 4when he saw version 1 the first time However neither ofthem needed help or much time to figure how the interfaceworked after they started to read more precisely Eachuserrsquos first question or comment concerned the font andhow awful they (every one) thought it was After a briefexplanation why we chose this particular font each usergot used to it quickly

No user understood the iterator None of them got theconnection to the while upto or for next to the iteratorIterator is not self

explaining After the explanation what the iterator does some userssaid they thought the left and right buttons were rdquoltrdquo andrdquogtrdquo signs

The following tables summarize the pros and contras foreach interface

Version 1Pros Contrasmiddot The control structure (in-dents) is still presentmiddot Because of the indentsand the syntax highlightingthere is a strong analogy be-tween the code and its eval-uated versionmiddot There is a clear separationbetween code and evalua-tionmiddot The editor remains un-changed

middot Needs much spacemiddot User has to look from theleft to the right and viceversa to see two pieces thatare correlated In otherwords a line of code and itsevaluated version are too faraway from each other

33 Results 19

Version 2Pros Contrasmiddot The editor remains un-changedmiddot Code and evaluation do notget confused because they donot look the samemiddot Needs less space than ver-sion 1

middot Less analogy than in ver-sion 1middot User needs to look fromthe left to the right and viceversa This is being compli-cated by the fact that with-out indents and syntax high-lighting it is even more diffi-cult to find the point of inter-estmiddot Needs a lot of space

Version 3Pros ContrasmiddotNeeds minimalistic amountof spacemiddot The code structure itself re-mains untouched

middot Oftentimes not clear whichpop-up refers to which partof the codemiddot Can get messy quickly

Version 4Pros Contrasmiddot Strong analogymiddot Better for lines of code thatare too long for version 1 and2middot Evaluations right next totheir source

middot Code and evaluation getconfused quickly especiallyusual reading might be prob-lematic since one has to skipevery second line dependingon what he wants to readmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

20 3 First User Study

Version 5Pros Contrasmiddot Evaluations right next totheir sourcemiddot Better for lines of code thatare too long for version 1 and2

middot Oftentimes it is not clearwhat a value refers tomiddot Still distracting during codereadingmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

331 Further Comments and Suggestions for Im-provement

One of the users pointed out that in version 3 the iteratorhas been inserted into the code and therefore it now isInterface version 3

can producesyntactically wrong

code

syntactically wrong (for i in 14 becomes for i=2in 14 for the second iteration for example)

Two users suggested that for version 4 and 5 curly bracescould indicate to which part of the code the evaluationrefers

Some users suggested to make the number on the iteratordirectly manipulable so that if one would want to go toSuggestions for

improving the iterator iteration 42 one would not have to press 41 times a buttonbut could just enter 42 In addition some users said theywanted a button for jumping to the first and a button forjumping to the last iteration Two users said it would begood if there is something that indicates directly that theiterator actually shows iterations like the term rdquoIteration rdquoor anything similar

We noticed that the manipulable parameters and theevaluation results have to be distinguishable more easily

33 Results 21

One user proposed that if the cursor moves over somevariable in the evaluation part a small pop-up couldindicate to which variable a value belongs

One user said that it might be helpful that for biggerboolean expressions that fail it would be visible which partof the expression is responsible for the failure

332 Conclusions

With large screens (24rdquo and more) becoming more andmore common in modern work environments of program- Version 1 is the

favored onemers we can override the concerns about loosing too muchspace with version 1 which is the reason why it is the clearfavorite Figure 36 shows the basic interface after the firstuser study

The iterator has been improved in several ways

bull New buttons for jumping to the first and the last iter-ation have been added

bull There is a text field with which the iteration can bemanipulated directly

bull A tooltip now indicates the variable to which avalue belongs when hovering over the value with themouse

bull The label Iteration has been added at the top of theiterator

The way of manipulating the parameters has been changedThe function header now also is copied to the right side A new way of

manipulating theparameters

but the names of the parameters are replaced with a textfield in which the according value can be entered This wayboth the analogy to the code and the separation betweenparameters and evaluation are being enhanced

22 3 First User Study

(a) First iteration

(b) Second iteration

Figure 31 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

33 Results 23

Figure 32 Version 2 The evaluated version of the code ison the right-hand side but is ranged right

Figure 33 Version 3 Small pop-ups indicate the value of avariable

24 3 First User Study

Figure 34 Version 4 The evaluated version of a line isinserted right behind the line itself

33 Results 25

Figure 35 Version 5 Only the values of the variables arein between the lines

26 3 First User Study

Figure 36 Interface after the first user study

27

Chapter 4

Implementation

After the first user study had been finished and the resultshad been analyzed a software prototype with sufficient A software prototype

had to beimplemented

functionality had to be implemented to conduct the seconduser study Hence we wrote an extension for Brackets1 anopen source editor that has recently been released officially

The language we wrote the extension for was JavaScriptWhat we needed was an extension that could calculate thevalues of the variables that had to been substituted

41 First Approach (discharged)

The initial idea was to write an extension that connects viaV8-Node2 (an extension that maintains a connection to a Initial idea Use

V8-Node an Nodejsto get the locals

Nodejs3 V8 debugger) to an external debugger and lets itevaluate the code Our extension was supposed to supplythe actual code to the debugger and set a breakpoint in thefirst line of actual code Then it read the locals (that is thelocal values for all active variables) and stepped over to the

1httpsgithubcomadobebrackets2httpsgithubcomDennisKehrigbrackets-v8-node-

livetreenodeChildProcess3httpnodejsorg

28 4 Implementation

next line After all locals had been collected the extensionshould substitute all necessary values There were prob-lems with this approach Firstly the extension is not stableThis extension was

not usable for a userstudy

enough to be usable in a user study Secondly it is too slowIt needs more than half a second for one evaluation Thisis way too much since we wanted something that couldevaluate the code as the user edits it without any furtherdelay (hence the name rdquoLive Codingrdquo) With this extensionit would only have been possible to update the evaluationusing a special shortcut or to attach this function to anothershortcut like the Ctrl+S shortcut for saving Since thiswas by far not the optimal way we decided to start all overagain

42 Basic Evaluation Technique

In the second approach we did not try to externalize thescript execution but instead made vast use of JavaScriptrsquosbuilt-in eval function The eval function takes a StringSecond approach

using eval toevaluate dynamicallygenerated code thatreturns the value of

interest

as parameter If this String contains valid JavaScript codeeval will evaluate the code and return the last executedstatement eval(var i=1i=i+5) will return 6 forexample The extension basically works as follows Aloop that runs over all DOM-elements in the highlightedcode searches for substitutable variables While searchingfor these elements the loop captures in which contextthe actual DOM-element is The three basic contexts aresimple substitutions substitutions within one loop andsubstitutions within a nested loop (why this is will beexplained below) But there are also other things the loopchecks for For simple assignments like i=x+y x andy have to be substituted while i does not In lines wherea loop begins the iterator has to be appended invariantshave to be treated differently et cetera

Note that right-hand sides of assignments are handledslightly different from what has been presented in theprototypes of the first user study They do not get evalu-ated to the last step and invariants do not get evaluated

43 How Values Are Being Calculated 29

to the penultimate step anymore Both of them get onlyevaluated one step For i=1 and z=2 the right-hand sideof x=i+z does not get evaluated to 3 but to 1+2 Wedecided to do it this way because some users in the firstuser study stated that they did not only want to see whatthe result of an operation is but also how it got there

When a substitutable variable has been found its value willbe computed For that the eval function is used (see sec-tion 43 for details) and the value will be inserted Thiswhole process takes place each time there is a change in thecode or if one of the parameters gets changed If another it- How the extension

reacts to changeseration of a loop that is being shown is selected all valuesinside that loop will be calculated again

43 How Values Are Being Calculated

Depending on the context of the variable a piece ofcode will be generated (and then evaluated) that whenevaluated with eval returns the value of the variable ofinterest For simple substitutions outside of any loops thisworks like follows

Suppose the following code had to be evaluated for x=0

function foo(x)var i=0var j=x+1

Then x in the third line would have to be evaluated To do How the values arecalculated for simplesubstitutions

so the following String would be generatedvar x=0var i=0xSo for simple substitutions the whole code preceding thevariable that has to be substituted is copied and each pa-rameter is initialized as var with its exemplary value at the

30 4 Implementation

beginning

For simple loops this method has to be extended First ithas to be found out how often the loop is executed ThisSubstitutions within

one loop serves two purposes First the iterator now shows thenumber of iterations of the loop Second we have to checkif the user wants to see an iteration that does not exist Thisis being done by introducing a counter that is incrementedin each iteration of the loop and then returned when theloop terminated The following example shows how thevalue of a variable within a loop is being calculated

function test()var i=0var j=0while (ilt5)

i++j=j+2

Suppose we want to obtain the value for j (highlightedin red) in the second iteration of the loop Then theevaluation basically works as follows

var myCounter=0var myResultvar i=0var j=0while (ilt5)

myCounter++i++if(myCounter==2)myResult=j breakj=j+2

myResult

A counter gets inserted to check which iteration the loop is

44 Problems 31

in If it is in the right iteration the assignment myResult=jtakes place so that eval can return the according valueFor a loop within a loop two counters have to be inserted Other casesand both of them have to be checked accordingly Thereare further variants of contexts like the different parts of afor(firstPartsecondPartthirdPart) statementThe first part gets only executed once at the beginningwhile the second part gets evaluated one time more thanthe rest of the loop (the third part does not get evaluated)For all cases the dynamic generation of the code that re-turns the value of interest works similar to the techniquespresented

Obviously this is a brute-force approach For each valuethat has to be calculated all preceding code has to be eval-uated We used this approach anyway since we spent quite Summary Very

inefficient limitedfunctionality stillcompletely sufficientfor the user study

some time developing the first version of the extension (us-ing V8-Node) and were under considerable time pressureThe technique we chose is simple and fast to develop Fur-thermore the prototype only had to work for the seconduser study which is the reason why the limited function-ality (loop nesting depth of 2 for example) was sufficientAnd finally since the tasks in the second user study werenot very expensive with regards to calculation costs eventhis very inefficient version of the extension ran live with-out any delays Some tests showed that it would even havebeen quick enough to operate with hundreds of iterationsmore without any delay and with some few thousand iter-ations more without dispatching the interface

44 Problems

The biggest problem we encountered were infinite loopsBecause the interface runs completely live and tries to The Halting problemevaluate the whole code while the user is editing it infiniteloops may occur while the user is typing Additionallyit might of course be possible that there actually is aninfinite loop even though the user thinks the code iscorrect Without modification the interface crashed in

32 4 Implementation

either of these cases To prevent this from happeningtwo minor modifications had to be done Firstly if wewant to calculate the number of iterations a loop doeswe check whether the counter we use for that calculationis still below a certain limit (we used 999 here becauseit proved to be sufficient and efficient enough to runlive) Now the Halting Problem for this particular loop isapproximated The problem of the infinite loop still existsthough Suppose we have the following code

var x=0var i=0while(x=10)

i++x=i

We now know that the loop does more then 999 iterationsso we do not try to calculate the value for x in the invariantSubstitutions after

infinite loops or for i within the loop for any further iterations The onlyproblem that remains is that we still try to get the value fori in the last line When we try to calculate this value wedo not yet know that there is an infinite loop somewhereabove To find this out a new global counter gets insertedand limited to 5000 iterations

var globalCounter=0var x=0var i=0while(x=10)

if(globalCounter++ gt= 5000)breaki++

i

This way we could keep the incrementation and the checkwhether there are more then 5000 iterations within onestatement (two statements are too many since after a

44 Problems 33

while without curly braces only the first statement wouldbe executed) The incrementation and test get inserted intoevery loop ahead of the position we are checking for Itwould of course have been possible to set a flag if there isan infinite loop somewhere ahead but this way we alsohad the possibility to limit the overall number of iterationsand still get a value The second user study showed that Flaw in the

implementationthis was the wrong approach A user introduced an infiniteloop in a sorting algorithm The condition for the loop wasnever violated although the array was sorted With thetechnique of assigning a value to a variable although thelimitation of iterations has been hit the sorted array wasshown in the return statement Since the user only paidattention to the output he did not see that the number ofiterations of the loop was 999 and thought that the code iscorrect

Another minor problem was that of course while the useris typing the extension tries to evaluate code that is syntac- Suppress errors so

that user does notget disturbed

tically not correct The solution is to set windowonerrorto a function that always returns true This way errorsget just suppressed and the execution stops If there is anew change in the editor the execution will start again andwork exactly the same way so that working code always isexecuted and not working code is not

The last problem concerned hiding code that does notget evaluated For loops this could easily be done If Hiding code within

failed if statementsthe loop is in its very last iteration (the invariant fails)the rest of the loop has to be hidden otherwise it doesnot Finding out if an if statement fails (and hence theaccording code has to be hidden) however was moredifficult because we had not enough information to be ableto do so (remember we only compute the first step whichincludes nothing but substituting all variables) To find outif an if statement holds or not we do the following allvariables in the invariant get substituted and then we leteval evaluate just the invariant For if(xlty) with x=5and y=6 we substitute all values if(5lt6) and then calleval((5lt6))

34 4 Implementation

Figure 41 Screenshot showing Brackets with our extension and a dummy script

Finally Figure 41 shows how the prototype for our seconduser study looks like

35

Chapter 5

Second User Study

51 Course of the User Study

To find out whether our extension brings significantimprovement a second quantitative user study with 20 Hypothesis

Compared to usualdebugging methodsour technique bringssignificantimprovement

participants has been conducted The hypothesis was thatthe debugging speed (in this case the time between the userreading the code the first time and fixing it) compared toa usual debugging engine could significantly be reducedwith our extension If this hypothesis would be supportedby the results of the user study this would indicate that theunderstanding speed of foreign code would be increasedby our method

No user had worked with Brackets or JavaScript before(notwithstanding that all of them had worked with Javabefore) Users were Computer Science students with 4-9(average of 58) years of programming experience Onlythree of them had professional experience (2-4 years) Allusers took part voluntarily and were not rewarded in anyway

To test the hypothesis two searching algorithms (animplementation of Shakersort and an implementation ofBubblesort) have been implemented and a small bug has

36 5 Second User Study

been introduced in either of them (see Appendix A) Thetask of the user was to find both bugs - one time with2 algorithms 1 bug

each 2 tasks debugone time with

Bracketsrsquo tools andone time with our

extension

our extension and one time using the traditional way(that is using the console breakpoints variable viewerweb kit inspector and the step over functionality Bracketsprovides) Each user received an intro to the function-ality of both our extension and the Brackets debuggingengine For both a small dummy script has been shownwith which it has been explained how each part of thefunctionality works until the user confirmed that he or sheunderstood every part of the functionality The task forthe user was to find the bug in the code (users have beentold that there is exactly one bug) fix it and give noticewhen he or she thought that the code is correct (task tocondition assignment and task order were counterbalancedto eliminate side effects)

After the two debugging tasks had been accomplishedeach user filled out two versions of the System UsabilitySystem Usability

Scale to find outwhether users liked

working with thesystem

Scale [Brooke 1996] (SUS) one for our extension and onefor the standard debugging Brackets provides The first10 questions are identical to the initial version that hasbeen presented in 1996 except that the word rdquocumbersomerdquoin point 8 has been changed to rdquoawkwardrdquo as suggestedby [Finstad 2006] For the extension the following ques-tions have been added to the questionnaire

bull I found the additional information very helpful formy task

bull I was distracted by the additional information

bull I found it very easy to follow the control flow of theprogram

bull I found complex control structures hard to under-stand

For the debugging using Brackets the following two ques-tions have been added

bull I found it very easy to follow the control flow of theprogram

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 10: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

2 1 Introduction

result but also reveals information about its internal behav-ior Figure 11 shows how his interface looks like using theexample of the binary search algorithm

Figure 11 A Live Coding Editor presented by Bret Vic-tor [Victor]

On the left-hand side the code can be changed as usual Atthe top of the right-hand side exemplary parameters canbe entered for which the code then gets executed Furtherdown the values for each variable in each iteration andthe return value are shown For example in the line varBret Victor presented

an editor that givesimmediate

information about aprogramrsquos internal

behavior

high = arraylength - 1 the editor says high =5 This way the programmer does not have to imaginewhat happens between executing the code with exemplaryparameters and getting back the according result but canactually see it As mentioned above this happens as thecode gets edited so there is no delay between a change andseeing its impact on the code Figure 12 shows an exampleof how helpful this can be The key the algorithm searchesfor has been changed to rsquogrsquo which is not in the arraySince the invariant of the loop is 1 the algorithm can onlyterminate when the key has been found in the array In thiscase however the loop iterates endlessly The solution isto replace the invariant by low le high A solution thatis obvious once the programmer sees that in the fourthiteration low gets greater than high and nothing elsechanges anymore

The direct immediate feedback Victor was talking about isthe basic idea of my thesis Since he designed his editor

3

Figure 12 The key parameter has been changed

only to show what he was talking about in his speech hisprototype is very low fidelity and has some basic usabil-ity flaws (most importantly Only 5 iterations can be dis-played and the functionality is very limited) Furthermorehis editor has never been evaluated neither in a qualitativeuser study nor in a quantitative one My thesis will pickup his idea develop it and finally evaluate it To do sotwo iterations of the DIA-cycle (that is Design-Implement-Analyze) have been done We started with a literature re-search (see Chapter 2) to get an overview what research hasalready been done with regards to Live Coding We then Overview of the

thesisimplemented a low fidelity prototype and evaluated it in aqualitative user study to find out which basic interface de-sign provides as much analogy between the code and itsevaluation (see Chapter 3) Afterwards we implemented asoftware-prototype (see Chapter 4) to conduct the secondquantitative user study which should reveal if the interfaceenhances the work flow significantly (see Chapter 5)

5

Chapter 2

Related Work

The idea of maintaining a continuous connection betweencode and output is not not new Already back in 1985 The idea of

continuouslyexecuting code hasalready beenpresented in 1985

[Henderson and Weiser 1985] stated that the classicalEdit-(Compile-)Execute cycle where program editing andexecution are independent activities is not optimal Theystated that the relation between input and output shouldcontinuously be kept consistent

[Pane and Myers 1996] proposed that a system shouldsupport incremental testing and feedback and help detect Incremental testing

and feedbackimprove productivity

diagnose and recover from errors Although they wererecommending heuristics to improve novice programmingsystems these advices also help improving the productiv-ity of experts

This concurs with [Ko et al 2004] who reported on learn-ing barriers in programming systems Among others they 2 learning barriers

UnderstandingBarriers andInformation Barriers

reported on the so called Understanding Barriers and Infor-mation Barriers Understanding Barriers happen when codebehaves differently from what the user expected If forexample there is a compile-time error and the user cannotextract which part of the code is deemed wrong by the com-piler from the error message then this is an UnderstandingBarrier In the user study Ko et al conducted most (34 of38) of these Understanding Barriers were insurmountable

6 2 Related Work

Information Barriers are elements in the environmentthat make the internal behavior of the program hard tounderstand They happen when a user builds a hypothesisabout the internal behavior of the program but cannot use(or find) a tool the environment provides to verify thathypothesis In Ko et alrsquos user study most of these under-standing barriers (10 of 14) were insurmountable To helpthe programmer an IDE should reveal what a programdid or did not do at compile or runtime and in addition itshould be possible to inspect a programrsquos internal behavior

Although the idea of helping the programmer under-stand the internal behavior of a program by example isnot new current environments only offer little help forCurrent

environments offerlittle help for

understanding andadapting examples

understanding and adapting examples Therefore [Brandtet al] have implemented Rehearse an editor that providesimmediate evaluation and infinite undo of executionA screenshot of this editor can be seen in Figure 21Although Rehearse embodies the immediate feedback

Figure 21 rdquoA screenshot of the Rehearse editor (1) Thefunction declaration parameter names and current val-ues (2) a statement that has been executed and (3) the re-sult of execution (4) an undone statement (5) the currentlinerdquo [Brandt et al]

and the maximal illumination we are striving for thereare issues with it First Rehearse disturbs the user inhis usual programming practices For example it is notAdvantages and

disadvantages ofRehearse

possible to edit a function in the usual editor while it isopen in Rehearse This means that there are two ways of

7

editing a function (either the usual way with a standardeditor or interactively with Rehearse) and they are mutualexclusive (this issue has also been reported as a resultof the user study Brandt et al conducted) We think itis best to integrate the interactive functionality into thestandard one so that the user does not again have toswitch between different modes Additionally Rehearseprovides functionality to help the programmer explorealternative paths by undoing arbitrarily many executedlines while the goal of our editor is to show what hap-pens in the code with exemplary parameters to help theprogrammer understand what internal behavior the codehas This way the programmer can work uninterruptedand still get information about the behavior of his program

Similarly [Edwards 2004] stated that it improves the workflow if the programmer does not have to leave the environ-ment to see what result the execution of an example yieldsHe says the best way to understand code is by thinkingof concrete examples To solve the problem that the pro- Examples help

understanding codegrammer has to emulate the behavior of the program in hishead as much code as possible should be illuminated withexamples Edwards built a plug-in for Eclipse for whicha screenshot can be seen in Figure 22 The results on the

Figure 22 A screenshot of the plug-in presented by [Ed-wards 2004]

left-hand side of the interface get updated as soon as thereis a change in the code From a theoretical point of viewthis is exactly what we intend to do The code is continu-

8 2 Related Work

ously executed for exemplary parameters while the user isworking on it From a practical point of view though thisplug-in has several drawbacks First the analogy betweenIdea and drawbacks

of Edwardsrsquo plug-in code and evaluation is not very strong (eg not as strongas the one in Rehearse) so it is not as easy for the userto follow the control flow of the program as it should beThere are further usability issues Skipped code is greyedout This is a good idea since the user can see which partof the code has been executed and which part has not Theproblem is to see for which parameters the code has beenskipped In Figure 22 for example fact has been calledwith the values 1 and 3 and in addition subsequentlywith the parameters 0 and 2 (resulting from the recursivecall of fact in the else statement) For which of theparameters the else statement has been skipped is not asobvious as it could be Furthermore it is suboptimal tocall the function with exemplary parameters in the codeitself This way a programmer cannot write code as usualand just extract the additional information but has to insertrdquoforeignrdquo code to be able to see what happens inside thecode and then later has to remove the additional codeagain because it has nothing to do with the functionalitythe programmer wanted to write in the first place FinallyEdwards presented the plug-in without evaluation of anykind so a remaining task is to find out if or how much thiskind of feedback improves the work flow (see Chapter 5)

In [Boehm 1979] it was reported that modifying softwarebasically consists of three phasesHow modifying

software works

bull Understanding the existing software

bull Modifying it

bull Revalidating the modified version

We believe that our interface helps improve (thereby speedup) this whole process (our second user studyrsquos goal wasto find out whether this is true see Chapter 5) Thissupposition is supported by [Erdogmus et al 2005] whereit is said that instant feedback whether new functionality

9

has been implemented properly helps reducing wasteddevelopment time

Finally our approach concurs with the idea of Opportunis-tic Programming [Brandt et al 2008] [Brandt et al 2009a][Brandt et al 2009b] Brandt et al say that frequent itera-tion is a necessary part of learning and understanding un-familiar code Therefore an opportunistic programmer will Opportunistic

programmers choosetools that speed upiteration

select a tool that speeds up iteration They also state that theEdit-Debug cycles in opportunistic programming are veryshort About 50 of these cycles are shorter than 30 sec-onds and even 80 are shorter than 5 minutes Our toolwould help reducing the resulting overhead since the out-come is instantly visible without having to change modesor compile and execute

11

Chapter 3

First User Study

31 The Prototypes

The main goal of our first user study was to find out whichbasic interface design is the optimal one especially with Goal of the first user

studyregards to the link between the code and the evaluatedversion of the code It should work in a way such that theprogrammer does not get disturbed in his usual work butthat the interface still provides as much information asaccessible as possible

To find out how such an interface would look like we im-plemented low fidelity prototypes for 5 different interfacesin Ruby Basically these prototypes consisted of pictures 5 low fidelity

prototypes for Rubyhave beenimplemented

that show how the interface would look like for some ex-emplary code with exemplary parameters Each interfacehas been illustrated using 5 pieces of code as example (Fig-ures 31-35) Note The font used in all prototypes was sup-posed to emphasize that these prototypes are low fidelityso that the users had not inhibitions telling us fundamentalflaws in the interface design

12 3 First User Study

311 Interface Version 1

Figure 31 shows the first interface with an implementationof the binary search algorithm In this version of theinterface an evaluated version of the code is inserted in anadditional section in the right half of the window At thetop the exemplary parameters can be manipulated Thenin the line high=arraylength-1 the first evaluationtakes place On the right-hand side of the window theright-hand side of the assignment has been replaced bythe according evaluation 5 Afterwards all values are re-Interface variant 1

divides theenvironment into twoThe left-hand side is

a usual editor andthe right-hand side isthe evaluated version

of the code

placed in a similar way That is in simple assignments theright-hand side is substituted completely and in invariantsthe condition has been computed up to the penultimatestep This way eg while(low lt= high) gets evaluatedto while(0 lt= 5) instead of while(true)

EXCURSUSWe also thought about showing what would happen in-side an if statement even if it fails so that the user couldsee what the code could do alternatively We dischargedthis idea however because of two problems Firstly wesuspected that this might quickly become a confusing ex-perience for the user Secondly (and most importantly)the Halting Problem would have to be decidable Sup-pose the following code had to be evaluatedif(xgt0)while(x=0)x--else while(x=0)x++Then the interface would fall into an infinite loop (for anyvalue of x except 0) although the code would not Forx=1 for example the actual code would only make oneiteration in the first loop The interface however wouldtry to calculate the values of the second loop too

Further down in Figure 31(a) there is the following ele-ment

1This element will subsequently be called iterator Theiterator indicates which iteration of a loop is being shownPressing the right arrow increments the shown iteration(Figure 31(b)) pressing the left arrow decrements itaccordingly Furthermore the code for the elsif and theelse statements is not visible on the right-hand side of the

31 The Prototypes 13

window The reason is that the first if statement holdsand therefore the following ones are not executed InFigure 31(b) the first if statement failed (greyed out) thesecond one holds and the third one is hidden again Thisway all code that gets executed is shown (corresponding toJonathan Edwards who said that as much code as possibleshould be rdquoilluminatedrdquo[Edwards 2004])

312 Interface Version 2

Figure 32 shows the second version of the interface withan in place implementation of the selection sort algorithm(note that each interface variant has been implemented Interface variant 2

copies a nothighlightedevaluated version ofthe code to the rightpart of the window

with each algorithm) The differences to the previous in-terface variant are

bull The parameter names are red to signal manipulability

bull No syntax highlighting

bull Evaluated code has no indents and is right aligned

bull No dividing line

EXCURSUSNote that the iterator in this interface looks like follows

i=1This design reveals what variable the iterator refers toThis kind of iterator does not depend on the type of in-terface but on the type of loop In for and upto loopsthis kind of iterator has been used since it is possible tocorrelate the values of one variable and the iterations ofa loop In the later prototypes however we dischargedthis idea again because it only was this easy to correlatea variable and the iterations of one loop because the ex-amples were quite simple In realistic situations this cor-relation will oftentimes not be possible

This version was supposed to take up as little space aspossible while still having a complete (evaluated) versionof the code in a different place

14 3 First User Study

313 Interface Version 3

Figure 33 shows version 3 of the interface with an im-plementation of Bubblesort Small pop-ups indicate theInterface variant 3

uses small pop-upsthat indicate the

value of a statement

value of all statements that are evaluable For examplethe (manipulable and therefore red) value of array is[1423716] and the value of arraylength-2 is 2There are no pop-ups in the code when it is not executedand failed invariants are greyed out The iterator workslike the one in the example for version 2 of the interface butis now inserted into the code This interfacersquos purpose wasto consume as little space as possible while still providingall possible evaluations

314 Interface Version 4

Figure 34 shows interface version 4 with an algorithm thatconverts the fractional part of a decimal number into its bi-Interface variant 4

puts the evaluatedversion of a line

between that line andthe next one

nary representation and outputs the result to the consoleThis interface variant slots the evaluated version of a lineof code right between that line and the next one The nameof manipulable parameters is highlighted in red and thereis no syntax highlighting Failed invariants are greyed outand not executed code is not displayed as usual The goalof this interface version was to find out if the link betweencode and evaluation becomes clearer when the evaluationis very close and very similar to the code it results from

315 Interface Version 5

The last variant of the interface can be seen in Figure 35with an algorithm that checks whether a given number nInterface variant 5

indicates the value ofa statement by

writing it underneaththat statement

without keepingkeywords

is a prime number or not It is similar to the previous onebut only shows rdquorelevantrdquo information This includes right-hand sides of equations and for conditions the actual in-variant but excludes left-hand sides of equations and key-words like else or while Apart from this version 4 and5 are exactly the same (parameters in red no syntax high-lighting not executed code is not shown failed invariants

32 Course of the User Study 15

are greyed out) We developed this variant because in ver-sion 4 the user might confuse actual code with evaluatedcode We wanted to check if a variant without replicatingcontrol structures is better (less confusing)

32 Course of the User Study

The first user study was a qualitative user study It wassupposed to reveal what basic interface design is preferredby users and where there might be flaws in the interfacesDifferent techniques for qualitative user studies have beenused in particular the Model Extraction method and theThink Aloud method The basic structure was the following

bull First the experimenter introduced himself and theproject He assured the participant that he would Introduction to the

user studystay anonymous and that he could ask questions ormake comments whenever he likes To make sure theModel Extraction worked fine the part concerningwhat the thesis and the user study where about wasvery firm The solution we have come up with wasto explain the general idea of Live Coding (that weare designing a kind of editor that takes exemplaryparameters for a method the programmer works onand then evaluates it continuously while changes arebeing made) without revealing anything further likerdquoand then there is going to be a button with whichyou can flick from iteration to iterationrdquo

bull Afterwards one version of the interface was pre-sented to the user together with the following ques- Model Extractiontions rdquoWhat do you think you see hererdquo and rdquoWhatdo you think you can do with itrdquo

bull After the user told everything he or she knew or if theuser had a gulf of understanding [Norman 2002] the Explanation of the

interfaceparts of the interface which the user did not under-stand were explained and we went through all itera-tions once to make clear how the interface works

bull The next goal was to find out if the user had prob- Suggestions forimprovementlems in understanding elements of the interface if he

16 3 First User Study

thinks that there are things that can be done better etcetera

bull Lastly the other interface variants were shown to theuser with the same piece of code and it was explainedGoing through the

remaining interfacevariants

how they worked how they differ from the previousones and we went through all iterations once so thatthe user could get a feeling for the interface Aftereach variant the user was asked what he thought ofthe interface (and why) and what he thought was bet-ter or worse compared to the previous ones

Whenever the user had a question or there was an inter-esting topic that lead to more interesting information thispoint was discussed in more depth

33 Results

The user study was conducted with 6 different users All ofthem studied Computer Science were in the sixth semesterand had between 4 and 10 years of programming experi-ence Only one of them had worked with Ruby before Allusers took part voluntarily and were not rewarded in anyway The conditions were randomized and looked like fol-lows

User ID Algorithm Order of versions1 Bubblesort 312452 Selection sort 453213 Fractional part10 rarr Fractional part2 253414 Prime number test 135425 Binary search 524316 Binary search 34152

This means that for example user 1 saw interface version3 first and then versions 1245 in that order In all variantshe worked with the Bubblesort algorithm

The user study yielded two major results

33 Results 17

bull Most users liked version 1 better than the other ones

bull Not a single user understood what the iterator is goodfor or what it might do

In more detail Three users (users 1 2 and 4) stated clearlythat they liked version 1 the most User 3 was not quite 3 users liked version

1 the most 1 userliked version 4 themost 1 user couldnot decide betweenversion 1 and 2 and1 user could notdecide betweenversion 1 and 3

sure whether he liked version 1 or version 2 more Hestated that a mix between the two variants where thereis no syntax highlighting would be the best because hegot confused the first time he saw version 1 and did notdirectly see that the code on the right-hand side of thewindow is not actually code but the evaluation User 6 wasnot sure if he preferred version 1 or version 3 He statedthat if there is enough space on the screen version 1 isbetter but with all the other elements of a typical IDE aline of code might oftentimes be too long so that there is noplace to show the line itself as well as its evaluated version(a concern that user 3 communicated as well) Withoutthis problem though user 6 said that version 3 is worsebecause the interface might get messy and distractingquickly because of all the pop-ups right next to the codeIn addition he said with more complex code it might beproblematic to see which pop-up refers to which part ofthe code Finally user 5 stated that he likes version 4 themost The reason was that he did not have to look to theright if he wanted to see the value of a variable and thensearch it there but that in version 5 the information wasexactly at its source He only said that this version mightbe problematic to work with since the reading flow is influ-enced negatively and he feared that he would loose muchspace on the screen and would therefore have to scroll a lot

Each user stated that the main benefit of version 1 and2 was that the actual code remained unaffected Thiscorresponds to what [DeLine et al 2006] reported on theperceptual model a programmer builds of the code When Perceptual model of

the code shouldremain unaffected

navigating through source code a programmer uses aperceptual model to do so This way the programmer doesnot need to read the code (cognitive) to know which part heis seeing but can do so by looking at its rdquovisual landmarksrdquo(perceptual) While version 3 affects this perceptual modelnegatively but does not break it version 4 and 5 have the

18 3 First User Study

disadvantage to do so

To most of the users the basic structure of the interfaceswas clear after only a few seconds Only two of themconfused actual code with the evaluation for a short timeUser 5 when he saw version 5 the first time and user 4when he saw version 1 the first time However neither ofthem needed help or much time to figure how the interfaceworked after they started to read more precisely Eachuserrsquos first question or comment concerned the font andhow awful they (every one) thought it was After a briefexplanation why we chose this particular font each usergot used to it quickly

No user understood the iterator None of them got theconnection to the while upto or for next to the iteratorIterator is not self

explaining After the explanation what the iterator does some userssaid they thought the left and right buttons were rdquoltrdquo andrdquogtrdquo signs

The following tables summarize the pros and contras foreach interface

Version 1Pros Contrasmiddot The control structure (in-dents) is still presentmiddot Because of the indentsand the syntax highlightingthere is a strong analogy be-tween the code and its eval-uated versionmiddot There is a clear separationbetween code and evalua-tionmiddot The editor remains un-changed

middot Needs much spacemiddot User has to look from theleft to the right and viceversa to see two pieces thatare correlated In otherwords a line of code and itsevaluated version are too faraway from each other

33 Results 19

Version 2Pros Contrasmiddot The editor remains un-changedmiddot Code and evaluation do notget confused because they donot look the samemiddot Needs less space than ver-sion 1

middot Less analogy than in ver-sion 1middot User needs to look fromthe left to the right and viceversa This is being compli-cated by the fact that with-out indents and syntax high-lighting it is even more diffi-cult to find the point of inter-estmiddot Needs a lot of space

Version 3Pros ContrasmiddotNeeds minimalistic amountof spacemiddot The code structure itself re-mains untouched

middot Oftentimes not clear whichpop-up refers to which partof the codemiddot Can get messy quickly

Version 4Pros Contrasmiddot Strong analogymiddot Better for lines of code thatare too long for version 1 and2middot Evaluations right next totheir source

middot Code and evaluation getconfused quickly especiallyusual reading might be prob-lematic since one has to skipevery second line dependingon what he wants to readmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

20 3 First User Study

Version 5Pros Contrasmiddot Evaluations right next totheir sourcemiddot Better for lines of code thatare too long for version 1 and2

middot Oftentimes it is not clearwhat a value refers tomiddot Still distracting during codereadingmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

331 Further Comments and Suggestions for Im-provement

One of the users pointed out that in version 3 the iteratorhas been inserted into the code and therefore it now isInterface version 3

can producesyntactically wrong

code

syntactically wrong (for i in 14 becomes for i=2in 14 for the second iteration for example)

Two users suggested that for version 4 and 5 curly bracescould indicate to which part of the code the evaluationrefers

Some users suggested to make the number on the iteratordirectly manipulable so that if one would want to go toSuggestions for

improving the iterator iteration 42 one would not have to press 41 times a buttonbut could just enter 42 In addition some users said theywanted a button for jumping to the first and a button forjumping to the last iteration Two users said it would begood if there is something that indicates directly that theiterator actually shows iterations like the term rdquoIteration rdquoor anything similar

We noticed that the manipulable parameters and theevaluation results have to be distinguishable more easily

33 Results 21

One user proposed that if the cursor moves over somevariable in the evaluation part a small pop-up couldindicate to which variable a value belongs

One user said that it might be helpful that for biggerboolean expressions that fail it would be visible which partof the expression is responsible for the failure

332 Conclusions

With large screens (24rdquo and more) becoming more andmore common in modern work environments of program- Version 1 is the

favored onemers we can override the concerns about loosing too muchspace with version 1 which is the reason why it is the clearfavorite Figure 36 shows the basic interface after the firstuser study

The iterator has been improved in several ways

bull New buttons for jumping to the first and the last iter-ation have been added

bull There is a text field with which the iteration can bemanipulated directly

bull A tooltip now indicates the variable to which avalue belongs when hovering over the value with themouse

bull The label Iteration has been added at the top of theiterator

The way of manipulating the parameters has been changedThe function header now also is copied to the right side A new way of

manipulating theparameters

but the names of the parameters are replaced with a textfield in which the according value can be entered This wayboth the analogy to the code and the separation betweenparameters and evaluation are being enhanced

22 3 First User Study

(a) First iteration

(b) Second iteration

Figure 31 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

33 Results 23

Figure 32 Version 2 The evaluated version of the code ison the right-hand side but is ranged right

Figure 33 Version 3 Small pop-ups indicate the value of avariable

24 3 First User Study

Figure 34 Version 4 The evaluated version of a line isinserted right behind the line itself

33 Results 25

Figure 35 Version 5 Only the values of the variables arein between the lines

26 3 First User Study

Figure 36 Interface after the first user study

27

Chapter 4

Implementation

After the first user study had been finished and the resultshad been analyzed a software prototype with sufficient A software prototype

had to beimplemented

functionality had to be implemented to conduct the seconduser study Hence we wrote an extension for Brackets1 anopen source editor that has recently been released officially

The language we wrote the extension for was JavaScriptWhat we needed was an extension that could calculate thevalues of the variables that had to been substituted

41 First Approach (discharged)

The initial idea was to write an extension that connects viaV8-Node2 (an extension that maintains a connection to a Initial idea Use

V8-Node an Nodejsto get the locals

Nodejs3 V8 debugger) to an external debugger and lets itevaluate the code Our extension was supposed to supplythe actual code to the debugger and set a breakpoint in thefirst line of actual code Then it read the locals (that is thelocal values for all active variables) and stepped over to the

1httpsgithubcomadobebrackets2httpsgithubcomDennisKehrigbrackets-v8-node-

livetreenodeChildProcess3httpnodejsorg

28 4 Implementation

next line After all locals had been collected the extensionshould substitute all necessary values There were prob-lems with this approach Firstly the extension is not stableThis extension was

not usable for a userstudy

enough to be usable in a user study Secondly it is too slowIt needs more than half a second for one evaluation Thisis way too much since we wanted something that couldevaluate the code as the user edits it without any furtherdelay (hence the name rdquoLive Codingrdquo) With this extensionit would only have been possible to update the evaluationusing a special shortcut or to attach this function to anothershortcut like the Ctrl+S shortcut for saving Since thiswas by far not the optimal way we decided to start all overagain

42 Basic Evaluation Technique

In the second approach we did not try to externalize thescript execution but instead made vast use of JavaScriptrsquosbuilt-in eval function The eval function takes a StringSecond approach

using eval toevaluate dynamicallygenerated code thatreturns the value of

interest

as parameter If this String contains valid JavaScript codeeval will evaluate the code and return the last executedstatement eval(var i=1i=i+5) will return 6 forexample The extension basically works as follows Aloop that runs over all DOM-elements in the highlightedcode searches for substitutable variables While searchingfor these elements the loop captures in which contextthe actual DOM-element is The three basic contexts aresimple substitutions substitutions within one loop andsubstitutions within a nested loop (why this is will beexplained below) But there are also other things the loopchecks for For simple assignments like i=x+y x andy have to be substituted while i does not In lines wherea loop begins the iterator has to be appended invariantshave to be treated differently et cetera

Note that right-hand sides of assignments are handledslightly different from what has been presented in theprototypes of the first user study They do not get evalu-ated to the last step and invariants do not get evaluated

43 How Values Are Being Calculated 29

to the penultimate step anymore Both of them get onlyevaluated one step For i=1 and z=2 the right-hand sideof x=i+z does not get evaluated to 3 but to 1+2 Wedecided to do it this way because some users in the firstuser study stated that they did not only want to see whatthe result of an operation is but also how it got there

When a substitutable variable has been found its value willbe computed For that the eval function is used (see sec-tion 43 for details) and the value will be inserted Thiswhole process takes place each time there is a change in thecode or if one of the parameters gets changed If another it- How the extension

reacts to changeseration of a loop that is being shown is selected all valuesinside that loop will be calculated again

43 How Values Are Being Calculated

Depending on the context of the variable a piece ofcode will be generated (and then evaluated) that whenevaluated with eval returns the value of the variable ofinterest For simple substitutions outside of any loops thisworks like follows

Suppose the following code had to be evaluated for x=0

function foo(x)var i=0var j=x+1

Then x in the third line would have to be evaluated To do How the values arecalculated for simplesubstitutions

so the following String would be generatedvar x=0var i=0xSo for simple substitutions the whole code preceding thevariable that has to be substituted is copied and each pa-rameter is initialized as var with its exemplary value at the

30 4 Implementation

beginning

For simple loops this method has to be extended First ithas to be found out how often the loop is executed ThisSubstitutions within

one loop serves two purposes First the iterator now shows thenumber of iterations of the loop Second we have to checkif the user wants to see an iteration that does not exist Thisis being done by introducing a counter that is incrementedin each iteration of the loop and then returned when theloop terminated The following example shows how thevalue of a variable within a loop is being calculated

function test()var i=0var j=0while (ilt5)

i++j=j+2

Suppose we want to obtain the value for j (highlightedin red) in the second iteration of the loop Then theevaluation basically works as follows

var myCounter=0var myResultvar i=0var j=0while (ilt5)

myCounter++i++if(myCounter==2)myResult=j breakj=j+2

myResult

A counter gets inserted to check which iteration the loop is

44 Problems 31

in If it is in the right iteration the assignment myResult=jtakes place so that eval can return the according valueFor a loop within a loop two counters have to be inserted Other casesand both of them have to be checked accordingly Thereare further variants of contexts like the different parts of afor(firstPartsecondPartthirdPart) statementThe first part gets only executed once at the beginningwhile the second part gets evaluated one time more thanthe rest of the loop (the third part does not get evaluated)For all cases the dynamic generation of the code that re-turns the value of interest works similar to the techniquespresented

Obviously this is a brute-force approach For each valuethat has to be calculated all preceding code has to be eval-uated We used this approach anyway since we spent quite Summary Very

inefficient limitedfunctionality stillcompletely sufficientfor the user study

some time developing the first version of the extension (us-ing V8-Node) and were under considerable time pressureThe technique we chose is simple and fast to develop Fur-thermore the prototype only had to work for the seconduser study which is the reason why the limited function-ality (loop nesting depth of 2 for example) was sufficientAnd finally since the tasks in the second user study werenot very expensive with regards to calculation costs eventhis very inefficient version of the extension ran live with-out any delays Some tests showed that it would even havebeen quick enough to operate with hundreds of iterationsmore without any delay and with some few thousand iter-ations more without dispatching the interface

44 Problems

The biggest problem we encountered were infinite loopsBecause the interface runs completely live and tries to The Halting problemevaluate the whole code while the user is editing it infiniteloops may occur while the user is typing Additionallyit might of course be possible that there actually is aninfinite loop even though the user thinks the code iscorrect Without modification the interface crashed in

32 4 Implementation

either of these cases To prevent this from happeningtwo minor modifications had to be done Firstly if wewant to calculate the number of iterations a loop doeswe check whether the counter we use for that calculationis still below a certain limit (we used 999 here becauseit proved to be sufficient and efficient enough to runlive) Now the Halting Problem for this particular loop isapproximated The problem of the infinite loop still existsthough Suppose we have the following code

var x=0var i=0while(x=10)

i++x=i

We now know that the loop does more then 999 iterationsso we do not try to calculate the value for x in the invariantSubstitutions after

infinite loops or for i within the loop for any further iterations The onlyproblem that remains is that we still try to get the value fori in the last line When we try to calculate this value wedo not yet know that there is an infinite loop somewhereabove To find this out a new global counter gets insertedand limited to 5000 iterations

var globalCounter=0var x=0var i=0while(x=10)

if(globalCounter++ gt= 5000)breaki++

i

This way we could keep the incrementation and the checkwhether there are more then 5000 iterations within onestatement (two statements are too many since after a

44 Problems 33

while without curly braces only the first statement wouldbe executed) The incrementation and test get inserted intoevery loop ahead of the position we are checking for Itwould of course have been possible to set a flag if there isan infinite loop somewhere ahead but this way we alsohad the possibility to limit the overall number of iterationsand still get a value The second user study showed that Flaw in the

implementationthis was the wrong approach A user introduced an infiniteloop in a sorting algorithm The condition for the loop wasnever violated although the array was sorted With thetechnique of assigning a value to a variable although thelimitation of iterations has been hit the sorted array wasshown in the return statement Since the user only paidattention to the output he did not see that the number ofiterations of the loop was 999 and thought that the code iscorrect

Another minor problem was that of course while the useris typing the extension tries to evaluate code that is syntac- Suppress errors so

that user does notget disturbed

tically not correct The solution is to set windowonerrorto a function that always returns true This way errorsget just suppressed and the execution stops If there is anew change in the editor the execution will start again andwork exactly the same way so that working code always isexecuted and not working code is not

The last problem concerned hiding code that does notget evaluated For loops this could easily be done If Hiding code within

failed if statementsthe loop is in its very last iteration (the invariant fails)the rest of the loop has to be hidden otherwise it doesnot Finding out if an if statement fails (and hence theaccording code has to be hidden) however was moredifficult because we had not enough information to be ableto do so (remember we only compute the first step whichincludes nothing but substituting all variables) To find outif an if statement holds or not we do the following allvariables in the invariant get substituted and then we leteval evaluate just the invariant For if(xlty) with x=5and y=6 we substitute all values if(5lt6) and then calleval((5lt6))

34 4 Implementation

Figure 41 Screenshot showing Brackets with our extension and a dummy script

Finally Figure 41 shows how the prototype for our seconduser study looks like

35

Chapter 5

Second User Study

51 Course of the User Study

To find out whether our extension brings significantimprovement a second quantitative user study with 20 Hypothesis

Compared to usualdebugging methodsour technique bringssignificantimprovement

participants has been conducted The hypothesis was thatthe debugging speed (in this case the time between the userreading the code the first time and fixing it) compared toa usual debugging engine could significantly be reducedwith our extension If this hypothesis would be supportedby the results of the user study this would indicate that theunderstanding speed of foreign code would be increasedby our method

No user had worked with Brackets or JavaScript before(notwithstanding that all of them had worked with Javabefore) Users were Computer Science students with 4-9(average of 58) years of programming experience Onlythree of them had professional experience (2-4 years) Allusers took part voluntarily and were not rewarded in anyway

To test the hypothesis two searching algorithms (animplementation of Shakersort and an implementation ofBubblesort) have been implemented and a small bug has

36 5 Second User Study

been introduced in either of them (see Appendix A) Thetask of the user was to find both bugs - one time with2 algorithms 1 bug

each 2 tasks debugone time with

Bracketsrsquo tools andone time with our

extension

our extension and one time using the traditional way(that is using the console breakpoints variable viewerweb kit inspector and the step over functionality Bracketsprovides) Each user received an intro to the function-ality of both our extension and the Brackets debuggingengine For both a small dummy script has been shownwith which it has been explained how each part of thefunctionality works until the user confirmed that he or sheunderstood every part of the functionality The task forthe user was to find the bug in the code (users have beentold that there is exactly one bug) fix it and give noticewhen he or she thought that the code is correct (task tocondition assignment and task order were counterbalancedto eliminate side effects)

After the two debugging tasks had been accomplishedeach user filled out two versions of the System UsabilitySystem Usability

Scale to find outwhether users liked

working with thesystem

Scale [Brooke 1996] (SUS) one for our extension and onefor the standard debugging Brackets provides The first10 questions are identical to the initial version that hasbeen presented in 1996 except that the word rdquocumbersomerdquoin point 8 has been changed to rdquoawkwardrdquo as suggestedby [Finstad 2006] For the extension the following ques-tions have been added to the questionnaire

bull I found the additional information very helpful formy task

bull I was distracted by the additional information

bull I found it very easy to follow the control flow of theprogram

bull I found complex control structures hard to under-stand

For the debugging using Brackets the following two ques-tions have been added

bull I found it very easy to follow the control flow of theprogram

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 11: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

3

Figure 12 The key parameter has been changed

only to show what he was talking about in his speech hisprototype is very low fidelity and has some basic usabil-ity flaws (most importantly Only 5 iterations can be dis-played and the functionality is very limited) Furthermorehis editor has never been evaluated neither in a qualitativeuser study nor in a quantitative one My thesis will pickup his idea develop it and finally evaluate it To do sotwo iterations of the DIA-cycle (that is Design-Implement-Analyze) have been done We started with a literature re-search (see Chapter 2) to get an overview what research hasalready been done with regards to Live Coding We then Overview of the

thesisimplemented a low fidelity prototype and evaluated it in aqualitative user study to find out which basic interface de-sign provides as much analogy between the code and itsevaluation (see Chapter 3) Afterwards we implemented asoftware-prototype (see Chapter 4) to conduct the secondquantitative user study which should reveal if the interfaceenhances the work flow significantly (see Chapter 5)

5

Chapter 2

Related Work

The idea of maintaining a continuous connection betweencode and output is not not new Already back in 1985 The idea of

continuouslyexecuting code hasalready beenpresented in 1985

[Henderson and Weiser 1985] stated that the classicalEdit-(Compile-)Execute cycle where program editing andexecution are independent activities is not optimal Theystated that the relation between input and output shouldcontinuously be kept consistent

[Pane and Myers 1996] proposed that a system shouldsupport incremental testing and feedback and help detect Incremental testing

and feedbackimprove productivity

diagnose and recover from errors Although they wererecommending heuristics to improve novice programmingsystems these advices also help improving the productiv-ity of experts

This concurs with [Ko et al 2004] who reported on learn-ing barriers in programming systems Among others they 2 learning barriers

UnderstandingBarriers andInformation Barriers

reported on the so called Understanding Barriers and Infor-mation Barriers Understanding Barriers happen when codebehaves differently from what the user expected If forexample there is a compile-time error and the user cannotextract which part of the code is deemed wrong by the com-piler from the error message then this is an UnderstandingBarrier In the user study Ko et al conducted most (34 of38) of these Understanding Barriers were insurmountable

6 2 Related Work

Information Barriers are elements in the environmentthat make the internal behavior of the program hard tounderstand They happen when a user builds a hypothesisabout the internal behavior of the program but cannot use(or find) a tool the environment provides to verify thathypothesis In Ko et alrsquos user study most of these under-standing barriers (10 of 14) were insurmountable To helpthe programmer an IDE should reveal what a programdid or did not do at compile or runtime and in addition itshould be possible to inspect a programrsquos internal behavior

Although the idea of helping the programmer under-stand the internal behavior of a program by example isnot new current environments only offer little help forCurrent

environments offerlittle help for

understanding andadapting examples

understanding and adapting examples Therefore [Brandtet al] have implemented Rehearse an editor that providesimmediate evaluation and infinite undo of executionA screenshot of this editor can be seen in Figure 21Although Rehearse embodies the immediate feedback

Figure 21 rdquoA screenshot of the Rehearse editor (1) Thefunction declaration parameter names and current val-ues (2) a statement that has been executed and (3) the re-sult of execution (4) an undone statement (5) the currentlinerdquo [Brandt et al]

and the maximal illumination we are striving for thereare issues with it First Rehearse disturbs the user inhis usual programming practices For example it is notAdvantages and

disadvantages ofRehearse

possible to edit a function in the usual editor while it isopen in Rehearse This means that there are two ways of

7

editing a function (either the usual way with a standardeditor or interactively with Rehearse) and they are mutualexclusive (this issue has also been reported as a resultof the user study Brandt et al conducted) We think itis best to integrate the interactive functionality into thestandard one so that the user does not again have toswitch between different modes Additionally Rehearseprovides functionality to help the programmer explorealternative paths by undoing arbitrarily many executedlines while the goal of our editor is to show what hap-pens in the code with exemplary parameters to help theprogrammer understand what internal behavior the codehas This way the programmer can work uninterruptedand still get information about the behavior of his program

Similarly [Edwards 2004] stated that it improves the workflow if the programmer does not have to leave the environ-ment to see what result the execution of an example yieldsHe says the best way to understand code is by thinkingof concrete examples To solve the problem that the pro- Examples help

understanding codegrammer has to emulate the behavior of the program in hishead as much code as possible should be illuminated withexamples Edwards built a plug-in for Eclipse for whicha screenshot can be seen in Figure 22 The results on the

Figure 22 A screenshot of the plug-in presented by [Ed-wards 2004]

left-hand side of the interface get updated as soon as thereis a change in the code From a theoretical point of viewthis is exactly what we intend to do The code is continu-

8 2 Related Work

ously executed for exemplary parameters while the user isworking on it From a practical point of view though thisplug-in has several drawbacks First the analogy betweenIdea and drawbacks

of Edwardsrsquo plug-in code and evaluation is not very strong (eg not as strongas the one in Rehearse) so it is not as easy for the userto follow the control flow of the program as it should beThere are further usability issues Skipped code is greyedout This is a good idea since the user can see which partof the code has been executed and which part has not Theproblem is to see for which parameters the code has beenskipped In Figure 22 for example fact has been calledwith the values 1 and 3 and in addition subsequentlywith the parameters 0 and 2 (resulting from the recursivecall of fact in the else statement) For which of theparameters the else statement has been skipped is not asobvious as it could be Furthermore it is suboptimal tocall the function with exemplary parameters in the codeitself This way a programmer cannot write code as usualand just extract the additional information but has to insertrdquoforeignrdquo code to be able to see what happens inside thecode and then later has to remove the additional codeagain because it has nothing to do with the functionalitythe programmer wanted to write in the first place FinallyEdwards presented the plug-in without evaluation of anykind so a remaining task is to find out if or how much thiskind of feedback improves the work flow (see Chapter 5)

In [Boehm 1979] it was reported that modifying softwarebasically consists of three phasesHow modifying

software works

bull Understanding the existing software

bull Modifying it

bull Revalidating the modified version

We believe that our interface helps improve (thereby speedup) this whole process (our second user studyrsquos goal wasto find out whether this is true see Chapter 5) Thissupposition is supported by [Erdogmus et al 2005] whereit is said that instant feedback whether new functionality

9

has been implemented properly helps reducing wasteddevelopment time

Finally our approach concurs with the idea of Opportunis-tic Programming [Brandt et al 2008] [Brandt et al 2009a][Brandt et al 2009b] Brandt et al say that frequent itera-tion is a necessary part of learning and understanding un-familiar code Therefore an opportunistic programmer will Opportunistic

programmers choosetools that speed upiteration

select a tool that speeds up iteration They also state that theEdit-Debug cycles in opportunistic programming are veryshort About 50 of these cycles are shorter than 30 sec-onds and even 80 are shorter than 5 minutes Our toolwould help reducing the resulting overhead since the out-come is instantly visible without having to change modesor compile and execute

11

Chapter 3

First User Study

31 The Prototypes

The main goal of our first user study was to find out whichbasic interface design is the optimal one especially with Goal of the first user

studyregards to the link between the code and the evaluatedversion of the code It should work in a way such that theprogrammer does not get disturbed in his usual work butthat the interface still provides as much information asaccessible as possible

To find out how such an interface would look like we im-plemented low fidelity prototypes for 5 different interfacesin Ruby Basically these prototypes consisted of pictures 5 low fidelity

prototypes for Rubyhave beenimplemented

that show how the interface would look like for some ex-emplary code with exemplary parameters Each interfacehas been illustrated using 5 pieces of code as example (Fig-ures 31-35) Note The font used in all prototypes was sup-posed to emphasize that these prototypes are low fidelityso that the users had not inhibitions telling us fundamentalflaws in the interface design

12 3 First User Study

311 Interface Version 1

Figure 31 shows the first interface with an implementationof the binary search algorithm In this version of theinterface an evaluated version of the code is inserted in anadditional section in the right half of the window At thetop the exemplary parameters can be manipulated Thenin the line high=arraylength-1 the first evaluationtakes place On the right-hand side of the window theright-hand side of the assignment has been replaced bythe according evaluation 5 Afterwards all values are re-Interface variant 1

divides theenvironment into twoThe left-hand side is

a usual editor andthe right-hand side isthe evaluated version

of the code

placed in a similar way That is in simple assignments theright-hand side is substituted completely and in invariantsthe condition has been computed up to the penultimatestep This way eg while(low lt= high) gets evaluatedto while(0 lt= 5) instead of while(true)

EXCURSUSWe also thought about showing what would happen in-side an if statement even if it fails so that the user couldsee what the code could do alternatively We dischargedthis idea however because of two problems Firstly wesuspected that this might quickly become a confusing ex-perience for the user Secondly (and most importantly)the Halting Problem would have to be decidable Sup-pose the following code had to be evaluatedif(xgt0)while(x=0)x--else while(x=0)x++Then the interface would fall into an infinite loop (for anyvalue of x except 0) although the code would not Forx=1 for example the actual code would only make oneiteration in the first loop The interface however wouldtry to calculate the values of the second loop too

Further down in Figure 31(a) there is the following ele-ment

1This element will subsequently be called iterator Theiterator indicates which iteration of a loop is being shownPressing the right arrow increments the shown iteration(Figure 31(b)) pressing the left arrow decrements itaccordingly Furthermore the code for the elsif and theelse statements is not visible on the right-hand side of the

31 The Prototypes 13

window The reason is that the first if statement holdsand therefore the following ones are not executed InFigure 31(b) the first if statement failed (greyed out) thesecond one holds and the third one is hidden again Thisway all code that gets executed is shown (corresponding toJonathan Edwards who said that as much code as possibleshould be rdquoilluminatedrdquo[Edwards 2004])

312 Interface Version 2

Figure 32 shows the second version of the interface withan in place implementation of the selection sort algorithm(note that each interface variant has been implemented Interface variant 2

copies a nothighlightedevaluated version ofthe code to the rightpart of the window

with each algorithm) The differences to the previous in-terface variant are

bull The parameter names are red to signal manipulability

bull No syntax highlighting

bull Evaluated code has no indents and is right aligned

bull No dividing line

EXCURSUSNote that the iterator in this interface looks like follows

i=1This design reveals what variable the iterator refers toThis kind of iterator does not depend on the type of in-terface but on the type of loop In for and upto loopsthis kind of iterator has been used since it is possible tocorrelate the values of one variable and the iterations ofa loop In the later prototypes however we dischargedthis idea again because it only was this easy to correlatea variable and the iterations of one loop because the ex-amples were quite simple In realistic situations this cor-relation will oftentimes not be possible

This version was supposed to take up as little space aspossible while still having a complete (evaluated) versionof the code in a different place

14 3 First User Study

313 Interface Version 3

Figure 33 shows version 3 of the interface with an im-plementation of Bubblesort Small pop-ups indicate theInterface variant 3

uses small pop-upsthat indicate the

value of a statement

value of all statements that are evaluable For examplethe (manipulable and therefore red) value of array is[1423716] and the value of arraylength-2 is 2There are no pop-ups in the code when it is not executedand failed invariants are greyed out The iterator workslike the one in the example for version 2 of the interface butis now inserted into the code This interfacersquos purpose wasto consume as little space as possible while still providingall possible evaluations

314 Interface Version 4

Figure 34 shows interface version 4 with an algorithm thatconverts the fractional part of a decimal number into its bi-Interface variant 4

puts the evaluatedversion of a line

between that line andthe next one

nary representation and outputs the result to the consoleThis interface variant slots the evaluated version of a lineof code right between that line and the next one The nameof manipulable parameters is highlighted in red and thereis no syntax highlighting Failed invariants are greyed outand not executed code is not displayed as usual The goalof this interface version was to find out if the link betweencode and evaluation becomes clearer when the evaluationis very close and very similar to the code it results from

315 Interface Version 5

The last variant of the interface can be seen in Figure 35with an algorithm that checks whether a given number nInterface variant 5

indicates the value ofa statement by

writing it underneaththat statement

without keepingkeywords

is a prime number or not It is similar to the previous onebut only shows rdquorelevantrdquo information This includes right-hand sides of equations and for conditions the actual in-variant but excludes left-hand sides of equations and key-words like else or while Apart from this version 4 and5 are exactly the same (parameters in red no syntax high-lighting not executed code is not shown failed invariants

32 Course of the User Study 15

are greyed out) We developed this variant because in ver-sion 4 the user might confuse actual code with evaluatedcode We wanted to check if a variant without replicatingcontrol structures is better (less confusing)

32 Course of the User Study

The first user study was a qualitative user study It wassupposed to reveal what basic interface design is preferredby users and where there might be flaws in the interfacesDifferent techniques for qualitative user studies have beenused in particular the Model Extraction method and theThink Aloud method The basic structure was the following

bull First the experimenter introduced himself and theproject He assured the participant that he would Introduction to the

user studystay anonymous and that he could ask questions ormake comments whenever he likes To make sure theModel Extraction worked fine the part concerningwhat the thesis and the user study where about wasvery firm The solution we have come up with wasto explain the general idea of Live Coding (that weare designing a kind of editor that takes exemplaryparameters for a method the programmer works onand then evaluates it continuously while changes arebeing made) without revealing anything further likerdquoand then there is going to be a button with whichyou can flick from iteration to iterationrdquo

bull Afterwards one version of the interface was pre-sented to the user together with the following ques- Model Extractiontions rdquoWhat do you think you see hererdquo and rdquoWhatdo you think you can do with itrdquo

bull After the user told everything he or she knew or if theuser had a gulf of understanding [Norman 2002] the Explanation of the

interfaceparts of the interface which the user did not under-stand were explained and we went through all itera-tions once to make clear how the interface works

bull The next goal was to find out if the user had prob- Suggestions forimprovementlems in understanding elements of the interface if he

16 3 First User Study

thinks that there are things that can be done better etcetera

bull Lastly the other interface variants were shown to theuser with the same piece of code and it was explainedGoing through the

remaining interfacevariants

how they worked how they differ from the previousones and we went through all iterations once so thatthe user could get a feeling for the interface Aftereach variant the user was asked what he thought ofthe interface (and why) and what he thought was bet-ter or worse compared to the previous ones

Whenever the user had a question or there was an inter-esting topic that lead to more interesting information thispoint was discussed in more depth

33 Results

The user study was conducted with 6 different users All ofthem studied Computer Science were in the sixth semesterand had between 4 and 10 years of programming experi-ence Only one of them had worked with Ruby before Allusers took part voluntarily and were not rewarded in anyway The conditions were randomized and looked like fol-lows

User ID Algorithm Order of versions1 Bubblesort 312452 Selection sort 453213 Fractional part10 rarr Fractional part2 253414 Prime number test 135425 Binary search 524316 Binary search 34152

This means that for example user 1 saw interface version3 first and then versions 1245 in that order In all variantshe worked with the Bubblesort algorithm

The user study yielded two major results

33 Results 17

bull Most users liked version 1 better than the other ones

bull Not a single user understood what the iterator is goodfor or what it might do

In more detail Three users (users 1 2 and 4) stated clearlythat they liked version 1 the most User 3 was not quite 3 users liked version

1 the most 1 userliked version 4 themost 1 user couldnot decide betweenversion 1 and 2 and1 user could notdecide betweenversion 1 and 3

sure whether he liked version 1 or version 2 more Hestated that a mix between the two variants where thereis no syntax highlighting would be the best because hegot confused the first time he saw version 1 and did notdirectly see that the code on the right-hand side of thewindow is not actually code but the evaluation User 6 wasnot sure if he preferred version 1 or version 3 He statedthat if there is enough space on the screen version 1 isbetter but with all the other elements of a typical IDE aline of code might oftentimes be too long so that there is noplace to show the line itself as well as its evaluated version(a concern that user 3 communicated as well) Withoutthis problem though user 6 said that version 3 is worsebecause the interface might get messy and distractingquickly because of all the pop-ups right next to the codeIn addition he said with more complex code it might beproblematic to see which pop-up refers to which part ofthe code Finally user 5 stated that he likes version 4 themost The reason was that he did not have to look to theright if he wanted to see the value of a variable and thensearch it there but that in version 5 the information wasexactly at its source He only said that this version mightbe problematic to work with since the reading flow is influ-enced negatively and he feared that he would loose muchspace on the screen and would therefore have to scroll a lot

Each user stated that the main benefit of version 1 and2 was that the actual code remained unaffected Thiscorresponds to what [DeLine et al 2006] reported on theperceptual model a programmer builds of the code When Perceptual model of

the code shouldremain unaffected

navigating through source code a programmer uses aperceptual model to do so This way the programmer doesnot need to read the code (cognitive) to know which part heis seeing but can do so by looking at its rdquovisual landmarksrdquo(perceptual) While version 3 affects this perceptual modelnegatively but does not break it version 4 and 5 have the

18 3 First User Study

disadvantage to do so

To most of the users the basic structure of the interfaceswas clear after only a few seconds Only two of themconfused actual code with the evaluation for a short timeUser 5 when he saw version 5 the first time and user 4when he saw version 1 the first time However neither ofthem needed help or much time to figure how the interfaceworked after they started to read more precisely Eachuserrsquos first question or comment concerned the font andhow awful they (every one) thought it was After a briefexplanation why we chose this particular font each usergot used to it quickly

No user understood the iterator None of them got theconnection to the while upto or for next to the iteratorIterator is not self

explaining After the explanation what the iterator does some userssaid they thought the left and right buttons were rdquoltrdquo andrdquogtrdquo signs

The following tables summarize the pros and contras foreach interface

Version 1Pros Contrasmiddot The control structure (in-dents) is still presentmiddot Because of the indentsand the syntax highlightingthere is a strong analogy be-tween the code and its eval-uated versionmiddot There is a clear separationbetween code and evalua-tionmiddot The editor remains un-changed

middot Needs much spacemiddot User has to look from theleft to the right and viceversa to see two pieces thatare correlated In otherwords a line of code and itsevaluated version are too faraway from each other

33 Results 19

Version 2Pros Contrasmiddot The editor remains un-changedmiddot Code and evaluation do notget confused because they donot look the samemiddot Needs less space than ver-sion 1

middot Less analogy than in ver-sion 1middot User needs to look fromthe left to the right and viceversa This is being compli-cated by the fact that with-out indents and syntax high-lighting it is even more diffi-cult to find the point of inter-estmiddot Needs a lot of space

Version 3Pros ContrasmiddotNeeds minimalistic amountof spacemiddot The code structure itself re-mains untouched

middot Oftentimes not clear whichpop-up refers to which partof the codemiddot Can get messy quickly

Version 4Pros Contrasmiddot Strong analogymiddot Better for lines of code thatare too long for version 1 and2middot Evaluations right next totheir source

middot Code and evaluation getconfused quickly especiallyusual reading might be prob-lematic since one has to skipevery second line dependingon what he wants to readmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

20 3 First User Study

Version 5Pros Contrasmiddot Evaluations right next totheir sourcemiddot Better for lines of code thatare too long for version 1 and2

middot Oftentimes it is not clearwhat a value refers tomiddot Still distracting during codereadingmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

331 Further Comments and Suggestions for Im-provement

One of the users pointed out that in version 3 the iteratorhas been inserted into the code and therefore it now isInterface version 3

can producesyntactically wrong

code

syntactically wrong (for i in 14 becomes for i=2in 14 for the second iteration for example)

Two users suggested that for version 4 and 5 curly bracescould indicate to which part of the code the evaluationrefers

Some users suggested to make the number on the iteratordirectly manipulable so that if one would want to go toSuggestions for

improving the iterator iteration 42 one would not have to press 41 times a buttonbut could just enter 42 In addition some users said theywanted a button for jumping to the first and a button forjumping to the last iteration Two users said it would begood if there is something that indicates directly that theiterator actually shows iterations like the term rdquoIteration rdquoor anything similar

We noticed that the manipulable parameters and theevaluation results have to be distinguishable more easily

33 Results 21

One user proposed that if the cursor moves over somevariable in the evaluation part a small pop-up couldindicate to which variable a value belongs

One user said that it might be helpful that for biggerboolean expressions that fail it would be visible which partof the expression is responsible for the failure

332 Conclusions

With large screens (24rdquo and more) becoming more andmore common in modern work environments of program- Version 1 is the

favored onemers we can override the concerns about loosing too muchspace with version 1 which is the reason why it is the clearfavorite Figure 36 shows the basic interface after the firstuser study

The iterator has been improved in several ways

bull New buttons for jumping to the first and the last iter-ation have been added

bull There is a text field with which the iteration can bemanipulated directly

bull A tooltip now indicates the variable to which avalue belongs when hovering over the value with themouse

bull The label Iteration has been added at the top of theiterator

The way of manipulating the parameters has been changedThe function header now also is copied to the right side A new way of

manipulating theparameters

but the names of the parameters are replaced with a textfield in which the according value can be entered This wayboth the analogy to the code and the separation betweenparameters and evaluation are being enhanced

22 3 First User Study

(a) First iteration

(b) Second iteration

Figure 31 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

33 Results 23

Figure 32 Version 2 The evaluated version of the code ison the right-hand side but is ranged right

Figure 33 Version 3 Small pop-ups indicate the value of avariable

24 3 First User Study

Figure 34 Version 4 The evaluated version of a line isinserted right behind the line itself

33 Results 25

Figure 35 Version 5 Only the values of the variables arein between the lines

26 3 First User Study

Figure 36 Interface after the first user study

27

Chapter 4

Implementation

After the first user study had been finished and the resultshad been analyzed a software prototype with sufficient A software prototype

had to beimplemented

functionality had to be implemented to conduct the seconduser study Hence we wrote an extension for Brackets1 anopen source editor that has recently been released officially

The language we wrote the extension for was JavaScriptWhat we needed was an extension that could calculate thevalues of the variables that had to been substituted

41 First Approach (discharged)

The initial idea was to write an extension that connects viaV8-Node2 (an extension that maintains a connection to a Initial idea Use

V8-Node an Nodejsto get the locals

Nodejs3 V8 debugger) to an external debugger and lets itevaluate the code Our extension was supposed to supplythe actual code to the debugger and set a breakpoint in thefirst line of actual code Then it read the locals (that is thelocal values for all active variables) and stepped over to the

1httpsgithubcomadobebrackets2httpsgithubcomDennisKehrigbrackets-v8-node-

livetreenodeChildProcess3httpnodejsorg

28 4 Implementation

next line After all locals had been collected the extensionshould substitute all necessary values There were prob-lems with this approach Firstly the extension is not stableThis extension was

not usable for a userstudy

enough to be usable in a user study Secondly it is too slowIt needs more than half a second for one evaluation Thisis way too much since we wanted something that couldevaluate the code as the user edits it without any furtherdelay (hence the name rdquoLive Codingrdquo) With this extensionit would only have been possible to update the evaluationusing a special shortcut or to attach this function to anothershortcut like the Ctrl+S shortcut for saving Since thiswas by far not the optimal way we decided to start all overagain

42 Basic Evaluation Technique

In the second approach we did not try to externalize thescript execution but instead made vast use of JavaScriptrsquosbuilt-in eval function The eval function takes a StringSecond approach

using eval toevaluate dynamicallygenerated code thatreturns the value of

interest

as parameter If this String contains valid JavaScript codeeval will evaluate the code and return the last executedstatement eval(var i=1i=i+5) will return 6 forexample The extension basically works as follows Aloop that runs over all DOM-elements in the highlightedcode searches for substitutable variables While searchingfor these elements the loop captures in which contextthe actual DOM-element is The three basic contexts aresimple substitutions substitutions within one loop andsubstitutions within a nested loop (why this is will beexplained below) But there are also other things the loopchecks for For simple assignments like i=x+y x andy have to be substituted while i does not In lines wherea loop begins the iterator has to be appended invariantshave to be treated differently et cetera

Note that right-hand sides of assignments are handledslightly different from what has been presented in theprototypes of the first user study They do not get evalu-ated to the last step and invariants do not get evaluated

43 How Values Are Being Calculated 29

to the penultimate step anymore Both of them get onlyevaluated one step For i=1 and z=2 the right-hand sideof x=i+z does not get evaluated to 3 but to 1+2 Wedecided to do it this way because some users in the firstuser study stated that they did not only want to see whatthe result of an operation is but also how it got there

When a substitutable variable has been found its value willbe computed For that the eval function is used (see sec-tion 43 for details) and the value will be inserted Thiswhole process takes place each time there is a change in thecode or if one of the parameters gets changed If another it- How the extension

reacts to changeseration of a loop that is being shown is selected all valuesinside that loop will be calculated again

43 How Values Are Being Calculated

Depending on the context of the variable a piece ofcode will be generated (and then evaluated) that whenevaluated with eval returns the value of the variable ofinterest For simple substitutions outside of any loops thisworks like follows

Suppose the following code had to be evaluated for x=0

function foo(x)var i=0var j=x+1

Then x in the third line would have to be evaluated To do How the values arecalculated for simplesubstitutions

so the following String would be generatedvar x=0var i=0xSo for simple substitutions the whole code preceding thevariable that has to be substituted is copied and each pa-rameter is initialized as var with its exemplary value at the

30 4 Implementation

beginning

For simple loops this method has to be extended First ithas to be found out how often the loop is executed ThisSubstitutions within

one loop serves two purposes First the iterator now shows thenumber of iterations of the loop Second we have to checkif the user wants to see an iteration that does not exist Thisis being done by introducing a counter that is incrementedin each iteration of the loop and then returned when theloop terminated The following example shows how thevalue of a variable within a loop is being calculated

function test()var i=0var j=0while (ilt5)

i++j=j+2

Suppose we want to obtain the value for j (highlightedin red) in the second iteration of the loop Then theevaluation basically works as follows

var myCounter=0var myResultvar i=0var j=0while (ilt5)

myCounter++i++if(myCounter==2)myResult=j breakj=j+2

myResult

A counter gets inserted to check which iteration the loop is

44 Problems 31

in If it is in the right iteration the assignment myResult=jtakes place so that eval can return the according valueFor a loop within a loop two counters have to be inserted Other casesand both of them have to be checked accordingly Thereare further variants of contexts like the different parts of afor(firstPartsecondPartthirdPart) statementThe first part gets only executed once at the beginningwhile the second part gets evaluated one time more thanthe rest of the loop (the third part does not get evaluated)For all cases the dynamic generation of the code that re-turns the value of interest works similar to the techniquespresented

Obviously this is a brute-force approach For each valuethat has to be calculated all preceding code has to be eval-uated We used this approach anyway since we spent quite Summary Very

inefficient limitedfunctionality stillcompletely sufficientfor the user study

some time developing the first version of the extension (us-ing V8-Node) and were under considerable time pressureThe technique we chose is simple and fast to develop Fur-thermore the prototype only had to work for the seconduser study which is the reason why the limited function-ality (loop nesting depth of 2 for example) was sufficientAnd finally since the tasks in the second user study werenot very expensive with regards to calculation costs eventhis very inefficient version of the extension ran live with-out any delays Some tests showed that it would even havebeen quick enough to operate with hundreds of iterationsmore without any delay and with some few thousand iter-ations more without dispatching the interface

44 Problems

The biggest problem we encountered were infinite loopsBecause the interface runs completely live and tries to The Halting problemevaluate the whole code while the user is editing it infiniteloops may occur while the user is typing Additionallyit might of course be possible that there actually is aninfinite loop even though the user thinks the code iscorrect Without modification the interface crashed in

32 4 Implementation

either of these cases To prevent this from happeningtwo minor modifications had to be done Firstly if wewant to calculate the number of iterations a loop doeswe check whether the counter we use for that calculationis still below a certain limit (we used 999 here becauseit proved to be sufficient and efficient enough to runlive) Now the Halting Problem for this particular loop isapproximated The problem of the infinite loop still existsthough Suppose we have the following code

var x=0var i=0while(x=10)

i++x=i

We now know that the loop does more then 999 iterationsso we do not try to calculate the value for x in the invariantSubstitutions after

infinite loops or for i within the loop for any further iterations The onlyproblem that remains is that we still try to get the value fori in the last line When we try to calculate this value wedo not yet know that there is an infinite loop somewhereabove To find this out a new global counter gets insertedand limited to 5000 iterations

var globalCounter=0var x=0var i=0while(x=10)

if(globalCounter++ gt= 5000)breaki++

i

This way we could keep the incrementation and the checkwhether there are more then 5000 iterations within onestatement (two statements are too many since after a

44 Problems 33

while without curly braces only the first statement wouldbe executed) The incrementation and test get inserted intoevery loop ahead of the position we are checking for Itwould of course have been possible to set a flag if there isan infinite loop somewhere ahead but this way we alsohad the possibility to limit the overall number of iterationsand still get a value The second user study showed that Flaw in the

implementationthis was the wrong approach A user introduced an infiniteloop in a sorting algorithm The condition for the loop wasnever violated although the array was sorted With thetechnique of assigning a value to a variable although thelimitation of iterations has been hit the sorted array wasshown in the return statement Since the user only paidattention to the output he did not see that the number ofiterations of the loop was 999 and thought that the code iscorrect

Another minor problem was that of course while the useris typing the extension tries to evaluate code that is syntac- Suppress errors so

that user does notget disturbed

tically not correct The solution is to set windowonerrorto a function that always returns true This way errorsget just suppressed and the execution stops If there is anew change in the editor the execution will start again andwork exactly the same way so that working code always isexecuted and not working code is not

The last problem concerned hiding code that does notget evaluated For loops this could easily be done If Hiding code within

failed if statementsthe loop is in its very last iteration (the invariant fails)the rest of the loop has to be hidden otherwise it doesnot Finding out if an if statement fails (and hence theaccording code has to be hidden) however was moredifficult because we had not enough information to be ableto do so (remember we only compute the first step whichincludes nothing but substituting all variables) To find outif an if statement holds or not we do the following allvariables in the invariant get substituted and then we leteval evaluate just the invariant For if(xlty) with x=5and y=6 we substitute all values if(5lt6) and then calleval((5lt6))

34 4 Implementation

Figure 41 Screenshot showing Brackets with our extension and a dummy script

Finally Figure 41 shows how the prototype for our seconduser study looks like

35

Chapter 5

Second User Study

51 Course of the User Study

To find out whether our extension brings significantimprovement a second quantitative user study with 20 Hypothesis

Compared to usualdebugging methodsour technique bringssignificantimprovement

participants has been conducted The hypothesis was thatthe debugging speed (in this case the time between the userreading the code the first time and fixing it) compared toa usual debugging engine could significantly be reducedwith our extension If this hypothesis would be supportedby the results of the user study this would indicate that theunderstanding speed of foreign code would be increasedby our method

No user had worked with Brackets or JavaScript before(notwithstanding that all of them had worked with Javabefore) Users were Computer Science students with 4-9(average of 58) years of programming experience Onlythree of them had professional experience (2-4 years) Allusers took part voluntarily and were not rewarded in anyway

To test the hypothesis two searching algorithms (animplementation of Shakersort and an implementation ofBubblesort) have been implemented and a small bug has

36 5 Second User Study

been introduced in either of them (see Appendix A) Thetask of the user was to find both bugs - one time with2 algorithms 1 bug

each 2 tasks debugone time with

Bracketsrsquo tools andone time with our

extension

our extension and one time using the traditional way(that is using the console breakpoints variable viewerweb kit inspector and the step over functionality Bracketsprovides) Each user received an intro to the function-ality of both our extension and the Brackets debuggingengine For both a small dummy script has been shownwith which it has been explained how each part of thefunctionality works until the user confirmed that he or sheunderstood every part of the functionality The task forthe user was to find the bug in the code (users have beentold that there is exactly one bug) fix it and give noticewhen he or she thought that the code is correct (task tocondition assignment and task order were counterbalancedto eliminate side effects)

After the two debugging tasks had been accomplishedeach user filled out two versions of the System UsabilitySystem Usability

Scale to find outwhether users liked

working with thesystem

Scale [Brooke 1996] (SUS) one for our extension and onefor the standard debugging Brackets provides The first10 questions are identical to the initial version that hasbeen presented in 1996 except that the word rdquocumbersomerdquoin point 8 has been changed to rdquoawkwardrdquo as suggestedby [Finstad 2006] For the extension the following ques-tions have been added to the questionnaire

bull I found the additional information very helpful formy task

bull I was distracted by the additional information

bull I found it very easy to follow the control flow of theprogram

bull I found complex control structures hard to under-stand

For the debugging using Brackets the following two ques-tions have been added

bull I found it very easy to follow the control flow of theprogram

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 12: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

5

Chapter 2

Related Work

The idea of maintaining a continuous connection betweencode and output is not not new Already back in 1985 The idea of

continuouslyexecuting code hasalready beenpresented in 1985

[Henderson and Weiser 1985] stated that the classicalEdit-(Compile-)Execute cycle where program editing andexecution are independent activities is not optimal Theystated that the relation between input and output shouldcontinuously be kept consistent

[Pane and Myers 1996] proposed that a system shouldsupport incremental testing and feedback and help detect Incremental testing

and feedbackimprove productivity

diagnose and recover from errors Although they wererecommending heuristics to improve novice programmingsystems these advices also help improving the productiv-ity of experts

This concurs with [Ko et al 2004] who reported on learn-ing barriers in programming systems Among others they 2 learning barriers

UnderstandingBarriers andInformation Barriers

reported on the so called Understanding Barriers and Infor-mation Barriers Understanding Barriers happen when codebehaves differently from what the user expected If forexample there is a compile-time error and the user cannotextract which part of the code is deemed wrong by the com-piler from the error message then this is an UnderstandingBarrier In the user study Ko et al conducted most (34 of38) of these Understanding Barriers were insurmountable

6 2 Related Work

Information Barriers are elements in the environmentthat make the internal behavior of the program hard tounderstand They happen when a user builds a hypothesisabout the internal behavior of the program but cannot use(or find) a tool the environment provides to verify thathypothesis In Ko et alrsquos user study most of these under-standing barriers (10 of 14) were insurmountable To helpthe programmer an IDE should reveal what a programdid or did not do at compile or runtime and in addition itshould be possible to inspect a programrsquos internal behavior

Although the idea of helping the programmer under-stand the internal behavior of a program by example isnot new current environments only offer little help forCurrent

environments offerlittle help for

understanding andadapting examples

understanding and adapting examples Therefore [Brandtet al] have implemented Rehearse an editor that providesimmediate evaluation and infinite undo of executionA screenshot of this editor can be seen in Figure 21Although Rehearse embodies the immediate feedback

Figure 21 rdquoA screenshot of the Rehearse editor (1) Thefunction declaration parameter names and current val-ues (2) a statement that has been executed and (3) the re-sult of execution (4) an undone statement (5) the currentlinerdquo [Brandt et al]

and the maximal illumination we are striving for thereare issues with it First Rehearse disturbs the user inhis usual programming practices For example it is notAdvantages and

disadvantages ofRehearse

possible to edit a function in the usual editor while it isopen in Rehearse This means that there are two ways of

7

editing a function (either the usual way with a standardeditor or interactively with Rehearse) and they are mutualexclusive (this issue has also been reported as a resultof the user study Brandt et al conducted) We think itis best to integrate the interactive functionality into thestandard one so that the user does not again have toswitch between different modes Additionally Rehearseprovides functionality to help the programmer explorealternative paths by undoing arbitrarily many executedlines while the goal of our editor is to show what hap-pens in the code with exemplary parameters to help theprogrammer understand what internal behavior the codehas This way the programmer can work uninterruptedand still get information about the behavior of his program

Similarly [Edwards 2004] stated that it improves the workflow if the programmer does not have to leave the environ-ment to see what result the execution of an example yieldsHe says the best way to understand code is by thinkingof concrete examples To solve the problem that the pro- Examples help

understanding codegrammer has to emulate the behavior of the program in hishead as much code as possible should be illuminated withexamples Edwards built a plug-in for Eclipse for whicha screenshot can be seen in Figure 22 The results on the

Figure 22 A screenshot of the plug-in presented by [Ed-wards 2004]

left-hand side of the interface get updated as soon as thereis a change in the code From a theoretical point of viewthis is exactly what we intend to do The code is continu-

8 2 Related Work

ously executed for exemplary parameters while the user isworking on it From a practical point of view though thisplug-in has several drawbacks First the analogy betweenIdea and drawbacks

of Edwardsrsquo plug-in code and evaluation is not very strong (eg not as strongas the one in Rehearse) so it is not as easy for the userto follow the control flow of the program as it should beThere are further usability issues Skipped code is greyedout This is a good idea since the user can see which partof the code has been executed and which part has not Theproblem is to see for which parameters the code has beenskipped In Figure 22 for example fact has been calledwith the values 1 and 3 and in addition subsequentlywith the parameters 0 and 2 (resulting from the recursivecall of fact in the else statement) For which of theparameters the else statement has been skipped is not asobvious as it could be Furthermore it is suboptimal tocall the function with exemplary parameters in the codeitself This way a programmer cannot write code as usualand just extract the additional information but has to insertrdquoforeignrdquo code to be able to see what happens inside thecode and then later has to remove the additional codeagain because it has nothing to do with the functionalitythe programmer wanted to write in the first place FinallyEdwards presented the plug-in without evaluation of anykind so a remaining task is to find out if or how much thiskind of feedback improves the work flow (see Chapter 5)

In [Boehm 1979] it was reported that modifying softwarebasically consists of three phasesHow modifying

software works

bull Understanding the existing software

bull Modifying it

bull Revalidating the modified version

We believe that our interface helps improve (thereby speedup) this whole process (our second user studyrsquos goal wasto find out whether this is true see Chapter 5) Thissupposition is supported by [Erdogmus et al 2005] whereit is said that instant feedback whether new functionality

9

has been implemented properly helps reducing wasteddevelopment time

Finally our approach concurs with the idea of Opportunis-tic Programming [Brandt et al 2008] [Brandt et al 2009a][Brandt et al 2009b] Brandt et al say that frequent itera-tion is a necessary part of learning and understanding un-familiar code Therefore an opportunistic programmer will Opportunistic

programmers choosetools that speed upiteration

select a tool that speeds up iteration They also state that theEdit-Debug cycles in opportunistic programming are veryshort About 50 of these cycles are shorter than 30 sec-onds and even 80 are shorter than 5 minutes Our toolwould help reducing the resulting overhead since the out-come is instantly visible without having to change modesor compile and execute

11

Chapter 3

First User Study

31 The Prototypes

The main goal of our first user study was to find out whichbasic interface design is the optimal one especially with Goal of the first user

studyregards to the link between the code and the evaluatedversion of the code It should work in a way such that theprogrammer does not get disturbed in his usual work butthat the interface still provides as much information asaccessible as possible

To find out how such an interface would look like we im-plemented low fidelity prototypes for 5 different interfacesin Ruby Basically these prototypes consisted of pictures 5 low fidelity

prototypes for Rubyhave beenimplemented

that show how the interface would look like for some ex-emplary code with exemplary parameters Each interfacehas been illustrated using 5 pieces of code as example (Fig-ures 31-35) Note The font used in all prototypes was sup-posed to emphasize that these prototypes are low fidelityso that the users had not inhibitions telling us fundamentalflaws in the interface design

12 3 First User Study

311 Interface Version 1

Figure 31 shows the first interface with an implementationof the binary search algorithm In this version of theinterface an evaluated version of the code is inserted in anadditional section in the right half of the window At thetop the exemplary parameters can be manipulated Thenin the line high=arraylength-1 the first evaluationtakes place On the right-hand side of the window theright-hand side of the assignment has been replaced bythe according evaluation 5 Afterwards all values are re-Interface variant 1

divides theenvironment into twoThe left-hand side is

a usual editor andthe right-hand side isthe evaluated version

of the code

placed in a similar way That is in simple assignments theright-hand side is substituted completely and in invariantsthe condition has been computed up to the penultimatestep This way eg while(low lt= high) gets evaluatedto while(0 lt= 5) instead of while(true)

EXCURSUSWe also thought about showing what would happen in-side an if statement even if it fails so that the user couldsee what the code could do alternatively We dischargedthis idea however because of two problems Firstly wesuspected that this might quickly become a confusing ex-perience for the user Secondly (and most importantly)the Halting Problem would have to be decidable Sup-pose the following code had to be evaluatedif(xgt0)while(x=0)x--else while(x=0)x++Then the interface would fall into an infinite loop (for anyvalue of x except 0) although the code would not Forx=1 for example the actual code would only make oneiteration in the first loop The interface however wouldtry to calculate the values of the second loop too

Further down in Figure 31(a) there is the following ele-ment

1This element will subsequently be called iterator Theiterator indicates which iteration of a loop is being shownPressing the right arrow increments the shown iteration(Figure 31(b)) pressing the left arrow decrements itaccordingly Furthermore the code for the elsif and theelse statements is not visible on the right-hand side of the

31 The Prototypes 13

window The reason is that the first if statement holdsand therefore the following ones are not executed InFigure 31(b) the first if statement failed (greyed out) thesecond one holds and the third one is hidden again Thisway all code that gets executed is shown (corresponding toJonathan Edwards who said that as much code as possibleshould be rdquoilluminatedrdquo[Edwards 2004])

312 Interface Version 2

Figure 32 shows the second version of the interface withan in place implementation of the selection sort algorithm(note that each interface variant has been implemented Interface variant 2

copies a nothighlightedevaluated version ofthe code to the rightpart of the window

with each algorithm) The differences to the previous in-terface variant are

bull The parameter names are red to signal manipulability

bull No syntax highlighting

bull Evaluated code has no indents and is right aligned

bull No dividing line

EXCURSUSNote that the iterator in this interface looks like follows

i=1This design reveals what variable the iterator refers toThis kind of iterator does not depend on the type of in-terface but on the type of loop In for and upto loopsthis kind of iterator has been used since it is possible tocorrelate the values of one variable and the iterations ofa loop In the later prototypes however we dischargedthis idea again because it only was this easy to correlatea variable and the iterations of one loop because the ex-amples were quite simple In realistic situations this cor-relation will oftentimes not be possible

This version was supposed to take up as little space aspossible while still having a complete (evaluated) versionof the code in a different place

14 3 First User Study

313 Interface Version 3

Figure 33 shows version 3 of the interface with an im-plementation of Bubblesort Small pop-ups indicate theInterface variant 3

uses small pop-upsthat indicate the

value of a statement

value of all statements that are evaluable For examplethe (manipulable and therefore red) value of array is[1423716] and the value of arraylength-2 is 2There are no pop-ups in the code when it is not executedand failed invariants are greyed out The iterator workslike the one in the example for version 2 of the interface butis now inserted into the code This interfacersquos purpose wasto consume as little space as possible while still providingall possible evaluations

314 Interface Version 4

Figure 34 shows interface version 4 with an algorithm thatconverts the fractional part of a decimal number into its bi-Interface variant 4

puts the evaluatedversion of a line

between that line andthe next one

nary representation and outputs the result to the consoleThis interface variant slots the evaluated version of a lineof code right between that line and the next one The nameof manipulable parameters is highlighted in red and thereis no syntax highlighting Failed invariants are greyed outand not executed code is not displayed as usual The goalof this interface version was to find out if the link betweencode and evaluation becomes clearer when the evaluationis very close and very similar to the code it results from

315 Interface Version 5

The last variant of the interface can be seen in Figure 35with an algorithm that checks whether a given number nInterface variant 5

indicates the value ofa statement by

writing it underneaththat statement

without keepingkeywords

is a prime number or not It is similar to the previous onebut only shows rdquorelevantrdquo information This includes right-hand sides of equations and for conditions the actual in-variant but excludes left-hand sides of equations and key-words like else or while Apart from this version 4 and5 are exactly the same (parameters in red no syntax high-lighting not executed code is not shown failed invariants

32 Course of the User Study 15

are greyed out) We developed this variant because in ver-sion 4 the user might confuse actual code with evaluatedcode We wanted to check if a variant without replicatingcontrol structures is better (less confusing)

32 Course of the User Study

The first user study was a qualitative user study It wassupposed to reveal what basic interface design is preferredby users and where there might be flaws in the interfacesDifferent techniques for qualitative user studies have beenused in particular the Model Extraction method and theThink Aloud method The basic structure was the following

bull First the experimenter introduced himself and theproject He assured the participant that he would Introduction to the

user studystay anonymous and that he could ask questions ormake comments whenever he likes To make sure theModel Extraction worked fine the part concerningwhat the thesis and the user study where about wasvery firm The solution we have come up with wasto explain the general idea of Live Coding (that weare designing a kind of editor that takes exemplaryparameters for a method the programmer works onand then evaluates it continuously while changes arebeing made) without revealing anything further likerdquoand then there is going to be a button with whichyou can flick from iteration to iterationrdquo

bull Afterwards one version of the interface was pre-sented to the user together with the following ques- Model Extractiontions rdquoWhat do you think you see hererdquo and rdquoWhatdo you think you can do with itrdquo

bull After the user told everything he or she knew or if theuser had a gulf of understanding [Norman 2002] the Explanation of the

interfaceparts of the interface which the user did not under-stand were explained and we went through all itera-tions once to make clear how the interface works

bull The next goal was to find out if the user had prob- Suggestions forimprovementlems in understanding elements of the interface if he

16 3 First User Study

thinks that there are things that can be done better etcetera

bull Lastly the other interface variants were shown to theuser with the same piece of code and it was explainedGoing through the

remaining interfacevariants

how they worked how they differ from the previousones and we went through all iterations once so thatthe user could get a feeling for the interface Aftereach variant the user was asked what he thought ofthe interface (and why) and what he thought was bet-ter or worse compared to the previous ones

Whenever the user had a question or there was an inter-esting topic that lead to more interesting information thispoint was discussed in more depth

33 Results

The user study was conducted with 6 different users All ofthem studied Computer Science were in the sixth semesterand had between 4 and 10 years of programming experi-ence Only one of them had worked with Ruby before Allusers took part voluntarily and were not rewarded in anyway The conditions were randomized and looked like fol-lows

User ID Algorithm Order of versions1 Bubblesort 312452 Selection sort 453213 Fractional part10 rarr Fractional part2 253414 Prime number test 135425 Binary search 524316 Binary search 34152

This means that for example user 1 saw interface version3 first and then versions 1245 in that order In all variantshe worked with the Bubblesort algorithm

The user study yielded two major results

33 Results 17

bull Most users liked version 1 better than the other ones

bull Not a single user understood what the iterator is goodfor or what it might do

In more detail Three users (users 1 2 and 4) stated clearlythat they liked version 1 the most User 3 was not quite 3 users liked version

1 the most 1 userliked version 4 themost 1 user couldnot decide betweenversion 1 and 2 and1 user could notdecide betweenversion 1 and 3

sure whether he liked version 1 or version 2 more Hestated that a mix between the two variants where thereis no syntax highlighting would be the best because hegot confused the first time he saw version 1 and did notdirectly see that the code on the right-hand side of thewindow is not actually code but the evaluation User 6 wasnot sure if he preferred version 1 or version 3 He statedthat if there is enough space on the screen version 1 isbetter but with all the other elements of a typical IDE aline of code might oftentimes be too long so that there is noplace to show the line itself as well as its evaluated version(a concern that user 3 communicated as well) Withoutthis problem though user 6 said that version 3 is worsebecause the interface might get messy and distractingquickly because of all the pop-ups right next to the codeIn addition he said with more complex code it might beproblematic to see which pop-up refers to which part ofthe code Finally user 5 stated that he likes version 4 themost The reason was that he did not have to look to theright if he wanted to see the value of a variable and thensearch it there but that in version 5 the information wasexactly at its source He only said that this version mightbe problematic to work with since the reading flow is influ-enced negatively and he feared that he would loose muchspace on the screen and would therefore have to scroll a lot

Each user stated that the main benefit of version 1 and2 was that the actual code remained unaffected Thiscorresponds to what [DeLine et al 2006] reported on theperceptual model a programmer builds of the code When Perceptual model of

the code shouldremain unaffected

navigating through source code a programmer uses aperceptual model to do so This way the programmer doesnot need to read the code (cognitive) to know which part heis seeing but can do so by looking at its rdquovisual landmarksrdquo(perceptual) While version 3 affects this perceptual modelnegatively but does not break it version 4 and 5 have the

18 3 First User Study

disadvantage to do so

To most of the users the basic structure of the interfaceswas clear after only a few seconds Only two of themconfused actual code with the evaluation for a short timeUser 5 when he saw version 5 the first time and user 4when he saw version 1 the first time However neither ofthem needed help or much time to figure how the interfaceworked after they started to read more precisely Eachuserrsquos first question or comment concerned the font andhow awful they (every one) thought it was After a briefexplanation why we chose this particular font each usergot used to it quickly

No user understood the iterator None of them got theconnection to the while upto or for next to the iteratorIterator is not self

explaining After the explanation what the iterator does some userssaid they thought the left and right buttons were rdquoltrdquo andrdquogtrdquo signs

The following tables summarize the pros and contras foreach interface

Version 1Pros Contrasmiddot The control structure (in-dents) is still presentmiddot Because of the indentsand the syntax highlightingthere is a strong analogy be-tween the code and its eval-uated versionmiddot There is a clear separationbetween code and evalua-tionmiddot The editor remains un-changed

middot Needs much spacemiddot User has to look from theleft to the right and viceversa to see two pieces thatare correlated In otherwords a line of code and itsevaluated version are too faraway from each other

33 Results 19

Version 2Pros Contrasmiddot The editor remains un-changedmiddot Code and evaluation do notget confused because they donot look the samemiddot Needs less space than ver-sion 1

middot Less analogy than in ver-sion 1middot User needs to look fromthe left to the right and viceversa This is being compli-cated by the fact that with-out indents and syntax high-lighting it is even more diffi-cult to find the point of inter-estmiddot Needs a lot of space

Version 3Pros ContrasmiddotNeeds minimalistic amountof spacemiddot The code structure itself re-mains untouched

middot Oftentimes not clear whichpop-up refers to which partof the codemiddot Can get messy quickly

Version 4Pros Contrasmiddot Strong analogymiddot Better for lines of code thatare too long for version 1 and2middot Evaluations right next totheir source

middot Code and evaluation getconfused quickly especiallyusual reading might be prob-lematic since one has to skipevery second line dependingon what he wants to readmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

20 3 First User Study

Version 5Pros Contrasmiddot Evaluations right next totheir sourcemiddot Better for lines of code thatare too long for version 1 and2

middot Oftentimes it is not clearwhat a value refers tomiddot Still distracting during codereadingmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

331 Further Comments and Suggestions for Im-provement

One of the users pointed out that in version 3 the iteratorhas been inserted into the code and therefore it now isInterface version 3

can producesyntactically wrong

code

syntactically wrong (for i in 14 becomes for i=2in 14 for the second iteration for example)

Two users suggested that for version 4 and 5 curly bracescould indicate to which part of the code the evaluationrefers

Some users suggested to make the number on the iteratordirectly manipulable so that if one would want to go toSuggestions for

improving the iterator iteration 42 one would not have to press 41 times a buttonbut could just enter 42 In addition some users said theywanted a button for jumping to the first and a button forjumping to the last iteration Two users said it would begood if there is something that indicates directly that theiterator actually shows iterations like the term rdquoIteration rdquoor anything similar

We noticed that the manipulable parameters and theevaluation results have to be distinguishable more easily

33 Results 21

One user proposed that if the cursor moves over somevariable in the evaluation part a small pop-up couldindicate to which variable a value belongs

One user said that it might be helpful that for biggerboolean expressions that fail it would be visible which partof the expression is responsible for the failure

332 Conclusions

With large screens (24rdquo and more) becoming more andmore common in modern work environments of program- Version 1 is the

favored onemers we can override the concerns about loosing too muchspace with version 1 which is the reason why it is the clearfavorite Figure 36 shows the basic interface after the firstuser study

The iterator has been improved in several ways

bull New buttons for jumping to the first and the last iter-ation have been added

bull There is a text field with which the iteration can bemanipulated directly

bull A tooltip now indicates the variable to which avalue belongs when hovering over the value with themouse

bull The label Iteration has been added at the top of theiterator

The way of manipulating the parameters has been changedThe function header now also is copied to the right side A new way of

manipulating theparameters

but the names of the parameters are replaced with a textfield in which the according value can be entered This wayboth the analogy to the code and the separation betweenparameters and evaluation are being enhanced

22 3 First User Study

(a) First iteration

(b) Second iteration

Figure 31 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

33 Results 23

Figure 32 Version 2 The evaluated version of the code ison the right-hand side but is ranged right

Figure 33 Version 3 Small pop-ups indicate the value of avariable

24 3 First User Study

Figure 34 Version 4 The evaluated version of a line isinserted right behind the line itself

33 Results 25

Figure 35 Version 5 Only the values of the variables arein between the lines

26 3 First User Study

Figure 36 Interface after the first user study

27

Chapter 4

Implementation

After the first user study had been finished and the resultshad been analyzed a software prototype with sufficient A software prototype

had to beimplemented

functionality had to be implemented to conduct the seconduser study Hence we wrote an extension for Brackets1 anopen source editor that has recently been released officially

The language we wrote the extension for was JavaScriptWhat we needed was an extension that could calculate thevalues of the variables that had to been substituted

41 First Approach (discharged)

The initial idea was to write an extension that connects viaV8-Node2 (an extension that maintains a connection to a Initial idea Use

V8-Node an Nodejsto get the locals

Nodejs3 V8 debugger) to an external debugger and lets itevaluate the code Our extension was supposed to supplythe actual code to the debugger and set a breakpoint in thefirst line of actual code Then it read the locals (that is thelocal values for all active variables) and stepped over to the

1httpsgithubcomadobebrackets2httpsgithubcomDennisKehrigbrackets-v8-node-

livetreenodeChildProcess3httpnodejsorg

28 4 Implementation

next line After all locals had been collected the extensionshould substitute all necessary values There were prob-lems with this approach Firstly the extension is not stableThis extension was

not usable for a userstudy

enough to be usable in a user study Secondly it is too slowIt needs more than half a second for one evaluation Thisis way too much since we wanted something that couldevaluate the code as the user edits it without any furtherdelay (hence the name rdquoLive Codingrdquo) With this extensionit would only have been possible to update the evaluationusing a special shortcut or to attach this function to anothershortcut like the Ctrl+S shortcut for saving Since thiswas by far not the optimal way we decided to start all overagain

42 Basic Evaluation Technique

In the second approach we did not try to externalize thescript execution but instead made vast use of JavaScriptrsquosbuilt-in eval function The eval function takes a StringSecond approach

using eval toevaluate dynamicallygenerated code thatreturns the value of

interest

as parameter If this String contains valid JavaScript codeeval will evaluate the code and return the last executedstatement eval(var i=1i=i+5) will return 6 forexample The extension basically works as follows Aloop that runs over all DOM-elements in the highlightedcode searches for substitutable variables While searchingfor these elements the loop captures in which contextthe actual DOM-element is The three basic contexts aresimple substitutions substitutions within one loop andsubstitutions within a nested loop (why this is will beexplained below) But there are also other things the loopchecks for For simple assignments like i=x+y x andy have to be substituted while i does not In lines wherea loop begins the iterator has to be appended invariantshave to be treated differently et cetera

Note that right-hand sides of assignments are handledslightly different from what has been presented in theprototypes of the first user study They do not get evalu-ated to the last step and invariants do not get evaluated

43 How Values Are Being Calculated 29

to the penultimate step anymore Both of them get onlyevaluated one step For i=1 and z=2 the right-hand sideof x=i+z does not get evaluated to 3 but to 1+2 Wedecided to do it this way because some users in the firstuser study stated that they did not only want to see whatthe result of an operation is but also how it got there

When a substitutable variable has been found its value willbe computed For that the eval function is used (see sec-tion 43 for details) and the value will be inserted Thiswhole process takes place each time there is a change in thecode or if one of the parameters gets changed If another it- How the extension

reacts to changeseration of a loop that is being shown is selected all valuesinside that loop will be calculated again

43 How Values Are Being Calculated

Depending on the context of the variable a piece ofcode will be generated (and then evaluated) that whenevaluated with eval returns the value of the variable ofinterest For simple substitutions outside of any loops thisworks like follows

Suppose the following code had to be evaluated for x=0

function foo(x)var i=0var j=x+1

Then x in the third line would have to be evaluated To do How the values arecalculated for simplesubstitutions

so the following String would be generatedvar x=0var i=0xSo for simple substitutions the whole code preceding thevariable that has to be substituted is copied and each pa-rameter is initialized as var with its exemplary value at the

30 4 Implementation

beginning

For simple loops this method has to be extended First ithas to be found out how often the loop is executed ThisSubstitutions within

one loop serves two purposes First the iterator now shows thenumber of iterations of the loop Second we have to checkif the user wants to see an iteration that does not exist Thisis being done by introducing a counter that is incrementedin each iteration of the loop and then returned when theloop terminated The following example shows how thevalue of a variable within a loop is being calculated

function test()var i=0var j=0while (ilt5)

i++j=j+2

Suppose we want to obtain the value for j (highlightedin red) in the second iteration of the loop Then theevaluation basically works as follows

var myCounter=0var myResultvar i=0var j=0while (ilt5)

myCounter++i++if(myCounter==2)myResult=j breakj=j+2

myResult

A counter gets inserted to check which iteration the loop is

44 Problems 31

in If it is in the right iteration the assignment myResult=jtakes place so that eval can return the according valueFor a loop within a loop two counters have to be inserted Other casesand both of them have to be checked accordingly Thereare further variants of contexts like the different parts of afor(firstPartsecondPartthirdPart) statementThe first part gets only executed once at the beginningwhile the second part gets evaluated one time more thanthe rest of the loop (the third part does not get evaluated)For all cases the dynamic generation of the code that re-turns the value of interest works similar to the techniquespresented

Obviously this is a brute-force approach For each valuethat has to be calculated all preceding code has to be eval-uated We used this approach anyway since we spent quite Summary Very

inefficient limitedfunctionality stillcompletely sufficientfor the user study

some time developing the first version of the extension (us-ing V8-Node) and were under considerable time pressureThe technique we chose is simple and fast to develop Fur-thermore the prototype only had to work for the seconduser study which is the reason why the limited function-ality (loop nesting depth of 2 for example) was sufficientAnd finally since the tasks in the second user study werenot very expensive with regards to calculation costs eventhis very inefficient version of the extension ran live with-out any delays Some tests showed that it would even havebeen quick enough to operate with hundreds of iterationsmore without any delay and with some few thousand iter-ations more without dispatching the interface

44 Problems

The biggest problem we encountered were infinite loopsBecause the interface runs completely live and tries to The Halting problemevaluate the whole code while the user is editing it infiniteloops may occur while the user is typing Additionallyit might of course be possible that there actually is aninfinite loop even though the user thinks the code iscorrect Without modification the interface crashed in

32 4 Implementation

either of these cases To prevent this from happeningtwo minor modifications had to be done Firstly if wewant to calculate the number of iterations a loop doeswe check whether the counter we use for that calculationis still below a certain limit (we used 999 here becauseit proved to be sufficient and efficient enough to runlive) Now the Halting Problem for this particular loop isapproximated The problem of the infinite loop still existsthough Suppose we have the following code

var x=0var i=0while(x=10)

i++x=i

We now know that the loop does more then 999 iterationsso we do not try to calculate the value for x in the invariantSubstitutions after

infinite loops or for i within the loop for any further iterations The onlyproblem that remains is that we still try to get the value fori in the last line When we try to calculate this value wedo not yet know that there is an infinite loop somewhereabove To find this out a new global counter gets insertedand limited to 5000 iterations

var globalCounter=0var x=0var i=0while(x=10)

if(globalCounter++ gt= 5000)breaki++

i

This way we could keep the incrementation and the checkwhether there are more then 5000 iterations within onestatement (two statements are too many since after a

44 Problems 33

while without curly braces only the first statement wouldbe executed) The incrementation and test get inserted intoevery loop ahead of the position we are checking for Itwould of course have been possible to set a flag if there isan infinite loop somewhere ahead but this way we alsohad the possibility to limit the overall number of iterationsand still get a value The second user study showed that Flaw in the

implementationthis was the wrong approach A user introduced an infiniteloop in a sorting algorithm The condition for the loop wasnever violated although the array was sorted With thetechnique of assigning a value to a variable although thelimitation of iterations has been hit the sorted array wasshown in the return statement Since the user only paidattention to the output he did not see that the number ofiterations of the loop was 999 and thought that the code iscorrect

Another minor problem was that of course while the useris typing the extension tries to evaluate code that is syntac- Suppress errors so

that user does notget disturbed

tically not correct The solution is to set windowonerrorto a function that always returns true This way errorsget just suppressed and the execution stops If there is anew change in the editor the execution will start again andwork exactly the same way so that working code always isexecuted and not working code is not

The last problem concerned hiding code that does notget evaluated For loops this could easily be done If Hiding code within

failed if statementsthe loop is in its very last iteration (the invariant fails)the rest of the loop has to be hidden otherwise it doesnot Finding out if an if statement fails (and hence theaccording code has to be hidden) however was moredifficult because we had not enough information to be ableto do so (remember we only compute the first step whichincludes nothing but substituting all variables) To find outif an if statement holds or not we do the following allvariables in the invariant get substituted and then we leteval evaluate just the invariant For if(xlty) with x=5and y=6 we substitute all values if(5lt6) and then calleval((5lt6))

34 4 Implementation

Figure 41 Screenshot showing Brackets with our extension and a dummy script

Finally Figure 41 shows how the prototype for our seconduser study looks like

35

Chapter 5

Second User Study

51 Course of the User Study

To find out whether our extension brings significantimprovement a second quantitative user study with 20 Hypothesis

Compared to usualdebugging methodsour technique bringssignificantimprovement

participants has been conducted The hypothesis was thatthe debugging speed (in this case the time between the userreading the code the first time and fixing it) compared toa usual debugging engine could significantly be reducedwith our extension If this hypothesis would be supportedby the results of the user study this would indicate that theunderstanding speed of foreign code would be increasedby our method

No user had worked with Brackets or JavaScript before(notwithstanding that all of them had worked with Javabefore) Users were Computer Science students with 4-9(average of 58) years of programming experience Onlythree of them had professional experience (2-4 years) Allusers took part voluntarily and were not rewarded in anyway

To test the hypothesis two searching algorithms (animplementation of Shakersort and an implementation ofBubblesort) have been implemented and a small bug has

36 5 Second User Study

been introduced in either of them (see Appendix A) Thetask of the user was to find both bugs - one time with2 algorithms 1 bug

each 2 tasks debugone time with

Bracketsrsquo tools andone time with our

extension

our extension and one time using the traditional way(that is using the console breakpoints variable viewerweb kit inspector and the step over functionality Bracketsprovides) Each user received an intro to the function-ality of both our extension and the Brackets debuggingengine For both a small dummy script has been shownwith which it has been explained how each part of thefunctionality works until the user confirmed that he or sheunderstood every part of the functionality The task forthe user was to find the bug in the code (users have beentold that there is exactly one bug) fix it and give noticewhen he or she thought that the code is correct (task tocondition assignment and task order were counterbalancedto eliminate side effects)

After the two debugging tasks had been accomplishedeach user filled out two versions of the System UsabilitySystem Usability

Scale to find outwhether users liked

working with thesystem

Scale [Brooke 1996] (SUS) one for our extension and onefor the standard debugging Brackets provides The first10 questions are identical to the initial version that hasbeen presented in 1996 except that the word rdquocumbersomerdquoin point 8 has been changed to rdquoawkwardrdquo as suggestedby [Finstad 2006] For the extension the following ques-tions have been added to the questionnaire

bull I found the additional information very helpful formy task

bull I was distracted by the additional information

bull I found it very easy to follow the control flow of theprogram

bull I found complex control structures hard to under-stand

For the debugging using Brackets the following two ques-tions have been added

bull I found it very easy to follow the control flow of theprogram

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 13: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

6 2 Related Work

Information Barriers are elements in the environmentthat make the internal behavior of the program hard tounderstand They happen when a user builds a hypothesisabout the internal behavior of the program but cannot use(or find) a tool the environment provides to verify thathypothesis In Ko et alrsquos user study most of these under-standing barriers (10 of 14) were insurmountable To helpthe programmer an IDE should reveal what a programdid or did not do at compile or runtime and in addition itshould be possible to inspect a programrsquos internal behavior

Although the idea of helping the programmer under-stand the internal behavior of a program by example isnot new current environments only offer little help forCurrent

environments offerlittle help for

understanding andadapting examples

understanding and adapting examples Therefore [Brandtet al] have implemented Rehearse an editor that providesimmediate evaluation and infinite undo of executionA screenshot of this editor can be seen in Figure 21Although Rehearse embodies the immediate feedback

Figure 21 rdquoA screenshot of the Rehearse editor (1) Thefunction declaration parameter names and current val-ues (2) a statement that has been executed and (3) the re-sult of execution (4) an undone statement (5) the currentlinerdquo [Brandt et al]

and the maximal illumination we are striving for thereare issues with it First Rehearse disturbs the user inhis usual programming practices For example it is notAdvantages and

disadvantages ofRehearse

possible to edit a function in the usual editor while it isopen in Rehearse This means that there are two ways of

7

editing a function (either the usual way with a standardeditor or interactively with Rehearse) and they are mutualexclusive (this issue has also been reported as a resultof the user study Brandt et al conducted) We think itis best to integrate the interactive functionality into thestandard one so that the user does not again have toswitch between different modes Additionally Rehearseprovides functionality to help the programmer explorealternative paths by undoing arbitrarily many executedlines while the goal of our editor is to show what hap-pens in the code with exemplary parameters to help theprogrammer understand what internal behavior the codehas This way the programmer can work uninterruptedand still get information about the behavior of his program

Similarly [Edwards 2004] stated that it improves the workflow if the programmer does not have to leave the environ-ment to see what result the execution of an example yieldsHe says the best way to understand code is by thinkingof concrete examples To solve the problem that the pro- Examples help

understanding codegrammer has to emulate the behavior of the program in hishead as much code as possible should be illuminated withexamples Edwards built a plug-in for Eclipse for whicha screenshot can be seen in Figure 22 The results on the

Figure 22 A screenshot of the plug-in presented by [Ed-wards 2004]

left-hand side of the interface get updated as soon as thereis a change in the code From a theoretical point of viewthis is exactly what we intend to do The code is continu-

8 2 Related Work

ously executed for exemplary parameters while the user isworking on it From a practical point of view though thisplug-in has several drawbacks First the analogy betweenIdea and drawbacks

of Edwardsrsquo plug-in code and evaluation is not very strong (eg not as strongas the one in Rehearse) so it is not as easy for the userto follow the control flow of the program as it should beThere are further usability issues Skipped code is greyedout This is a good idea since the user can see which partof the code has been executed and which part has not Theproblem is to see for which parameters the code has beenskipped In Figure 22 for example fact has been calledwith the values 1 and 3 and in addition subsequentlywith the parameters 0 and 2 (resulting from the recursivecall of fact in the else statement) For which of theparameters the else statement has been skipped is not asobvious as it could be Furthermore it is suboptimal tocall the function with exemplary parameters in the codeitself This way a programmer cannot write code as usualand just extract the additional information but has to insertrdquoforeignrdquo code to be able to see what happens inside thecode and then later has to remove the additional codeagain because it has nothing to do with the functionalitythe programmer wanted to write in the first place FinallyEdwards presented the plug-in without evaluation of anykind so a remaining task is to find out if or how much thiskind of feedback improves the work flow (see Chapter 5)

In [Boehm 1979] it was reported that modifying softwarebasically consists of three phasesHow modifying

software works

bull Understanding the existing software

bull Modifying it

bull Revalidating the modified version

We believe that our interface helps improve (thereby speedup) this whole process (our second user studyrsquos goal wasto find out whether this is true see Chapter 5) Thissupposition is supported by [Erdogmus et al 2005] whereit is said that instant feedback whether new functionality

9

has been implemented properly helps reducing wasteddevelopment time

Finally our approach concurs with the idea of Opportunis-tic Programming [Brandt et al 2008] [Brandt et al 2009a][Brandt et al 2009b] Brandt et al say that frequent itera-tion is a necessary part of learning and understanding un-familiar code Therefore an opportunistic programmer will Opportunistic

programmers choosetools that speed upiteration

select a tool that speeds up iteration They also state that theEdit-Debug cycles in opportunistic programming are veryshort About 50 of these cycles are shorter than 30 sec-onds and even 80 are shorter than 5 minutes Our toolwould help reducing the resulting overhead since the out-come is instantly visible without having to change modesor compile and execute

11

Chapter 3

First User Study

31 The Prototypes

The main goal of our first user study was to find out whichbasic interface design is the optimal one especially with Goal of the first user

studyregards to the link between the code and the evaluatedversion of the code It should work in a way such that theprogrammer does not get disturbed in his usual work butthat the interface still provides as much information asaccessible as possible

To find out how such an interface would look like we im-plemented low fidelity prototypes for 5 different interfacesin Ruby Basically these prototypes consisted of pictures 5 low fidelity

prototypes for Rubyhave beenimplemented

that show how the interface would look like for some ex-emplary code with exemplary parameters Each interfacehas been illustrated using 5 pieces of code as example (Fig-ures 31-35) Note The font used in all prototypes was sup-posed to emphasize that these prototypes are low fidelityso that the users had not inhibitions telling us fundamentalflaws in the interface design

12 3 First User Study

311 Interface Version 1

Figure 31 shows the first interface with an implementationof the binary search algorithm In this version of theinterface an evaluated version of the code is inserted in anadditional section in the right half of the window At thetop the exemplary parameters can be manipulated Thenin the line high=arraylength-1 the first evaluationtakes place On the right-hand side of the window theright-hand side of the assignment has been replaced bythe according evaluation 5 Afterwards all values are re-Interface variant 1

divides theenvironment into twoThe left-hand side is

a usual editor andthe right-hand side isthe evaluated version

of the code

placed in a similar way That is in simple assignments theright-hand side is substituted completely and in invariantsthe condition has been computed up to the penultimatestep This way eg while(low lt= high) gets evaluatedto while(0 lt= 5) instead of while(true)

EXCURSUSWe also thought about showing what would happen in-side an if statement even if it fails so that the user couldsee what the code could do alternatively We dischargedthis idea however because of two problems Firstly wesuspected that this might quickly become a confusing ex-perience for the user Secondly (and most importantly)the Halting Problem would have to be decidable Sup-pose the following code had to be evaluatedif(xgt0)while(x=0)x--else while(x=0)x++Then the interface would fall into an infinite loop (for anyvalue of x except 0) although the code would not Forx=1 for example the actual code would only make oneiteration in the first loop The interface however wouldtry to calculate the values of the second loop too

Further down in Figure 31(a) there is the following ele-ment

1This element will subsequently be called iterator Theiterator indicates which iteration of a loop is being shownPressing the right arrow increments the shown iteration(Figure 31(b)) pressing the left arrow decrements itaccordingly Furthermore the code for the elsif and theelse statements is not visible on the right-hand side of the

31 The Prototypes 13

window The reason is that the first if statement holdsand therefore the following ones are not executed InFigure 31(b) the first if statement failed (greyed out) thesecond one holds and the third one is hidden again Thisway all code that gets executed is shown (corresponding toJonathan Edwards who said that as much code as possibleshould be rdquoilluminatedrdquo[Edwards 2004])

312 Interface Version 2

Figure 32 shows the second version of the interface withan in place implementation of the selection sort algorithm(note that each interface variant has been implemented Interface variant 2

copies a nothighlightedevaluated version ofthe code to the rightpart of the window

with each algorithm) The differences to the previous in-terface variant are

bull The parameter names are red to signal manipulability

bull No syntax highlighting

bull Evaluated code has no indents and is right aligned

bull No dividing line

EXCURSUSNote that the iterator in this interface looks like follows

i=1This design reveals what variable the iterator refers toThis kind of iterator does not depend on the type of in-terface but on the type of loop In for and upto loopsthis kind of iterator has been used since it is possible tocorrelate the values of one variable and the iterations ofa loop In the later prototypes however we dischargedthis idea again because it only was this easy to correlatea variable and the iterations of one loop because the ex-amples were quite simple In realistic situations this cor-relation will oftentimes not be possible

This version was supposed to take up as little space aspossible while still having a complete (evaluated) versionof the code in a different place

14 3 First User Study

313 Interface Version 3

Figure 33 shows version 3 of the interface with an im-plementation of Bubblesort Small pop-ups indicate theInterface variant 3

uses small pop-upsthat indicate the

value of a statement

value of all statements that are evaluable For examplethe (manipulable and therefore red) value of array is[1423716] and the value of arraylength-2 is 2There are no pop-ups in the code when it is not executedand failed invariants are greyed out The iterator workslike the one in the example for version 2 of the interface butis now inserted into the code This interfacersquos purpose wasto consume as little space as possible while still providingall possible evaluations

314 Interface Version 4

Figure 34 shows interface version 4 with an algorithm thatconverts the fractional part of a decimal number into its bi-Interface variant 4

puts the evaluatedversion of a line

between that line andthe next one

nary representation and outputs the result to the consoleThis interface variant slots the evaluated version of a lineof code right between that line and the next one The nameof manipulable parameters is highlighted in red and thereis no syntax highlighting Failed invariants are greyed outand not executed code is not displayed as usual The goalof this interface version was to find out if the link betweencode and evaluation becomes clearer when the evaluationis very close and very similar to the code it results from

315 Interface Version 5

The last variant of the interface can be seen in Figure 35with an algorithm that checks whether a given number nInterface variant 5

indicates the value ofa statement by

writing it underneaththat statement

without keepingkeywords

is a prime number or not It is similar to the previous onebut only shows rdquorelevantrdquo information This includes right-hand sides of equations and for conditions the actual in-variant but excludes left-hand sides of equations and key-words like else or while Apart from this version 4 and5 are exactly the same (parameters in red no syntax high-lighting not executed code is not shown failed invariants

32 Course of the User Study 15

are greyed out) We developed this variant because in ver-sion 4 the user might confuse actual code with evaluatedcode We wanted to check if a variant without replicatingcontrol structures is better (less confusing)

32 Course of the User Study

The first user study was a qualitative user study It wassupposed to reveal what basic interface design is preferredby users and where there might be flaws in the interfacesDifferent techniques for qualitative user studies have beenused in particular the Model Extraction method and theThink Aloud method The basic structure was the following

bull First the experimenter introduced himself and theproject He assured the participant that he would Introduction to the

user studystay anonymous and that he could ask questions ormake comments whenever he likes To make sure theModel Extraction worked fine the part concerningwhat the thesis and the user study where about wasvery firm The solution we have come up with wasto explain the general idea of Live Coding (that weare designing a kind of editor that takes exemplaryparameters for a method the programmer works onand then evaluates it continuously while changes arebeing made) without revealing anything further likerdquoand then there is going to be a button with whichyou can flick from iteration to iterationrdquo

bull Afterwards one version of the interface was pre-sented to the user together with the following ques- Model Extractiontions rdquoWhat do you think you see hererdquo and rdquoWhatdo you think you can do with itrdquo

bull After the user told everything he or she knew or if theuser had a gulf of understanding [Norman 2002] the Explanation of the

interfaceparts of the interface which the user did not under-stand were explained and we went through all itera-tions once to make clear how the interface works

bull The next goal was to find out if the user had prob- Suggestions forimprovementlems in understanding elements of the interface if he

16 3 First User Study

thinks that there are things that can be done better etcetera

bull Lastly the other interface variants were shown to theuser with the same piece of code and it was explainedGoing through the

remaining interfacevariants

how they worked how they differ from the previousones and we went through all iterations once so thatthe user could get a feeling for the interface Aftereach variant the user was asked what he thought ofthe interface (and why) and what he thought was bet-ter or worse compared to the previous ones

Whenever the user had a question or there was an inter-esting topic that lead to more interesting information thispoint was discussed in more depth

33 Results

The user study was conducted with 6 different users All ofthem studied Computer Science were in the sixth semesterand had between 4 and 10 years of programming experi-ence Only one of them had worked with Ruby before Allusers took part voluntarily and were not rewarded in anyway The conditions were randomized and looked like fol-lows

User ID Algorithm Order of versions1 Bubblesort 312452 Selection sort 453213 Fractional part10 rarr Fractional part2 253414 Prime number test 135425 Binary search 524316 Binary search 34152

This means that for example user 1 saw interface version3 first and then versions 1245 in that order In all variantshe worked with the Bubblesort algorithm

The user study yielded two major results

33 Results 17

bull Most users liked version 1 better than the other ones

bull Not a single user understood what the iterator is goodfor or what it might do

In more detail Three users (users 1 2 and 4) stated clearlythat they liked version 1 the most User 3 was not quite 3 users liked version

1 the most 1 userliked version 4 themost 1 user couldnot decide betweenversion 1 and 2 and1 user could notdecide betweenversion 1 and 3

sure whether he liked version 1 or version 2 more Hestated that a mix between the two variants where thereis no syntax highlighting would be the best because hegot confused the first time he saw version 1 and did notdirectly see that the code on the right-hand side of thewindow is not actually code but the evaluation User 6 wasnot sure if he preferred version 1 or version 3 He statedthat if there is enough space on the screen version 1 isbetter but with all the other elements of a typical IDE aline of code might oftentimes be too long so that there is noplace to show the line itself as well as its evaluated version(a concern that user 3 communicated as well) Withoutthis problem though user 6 said that version 3 is worsebecause the interface might get messy and distractingquickly because of all the pop-ups right next to the codeIn addition he said with more complex code it might beproblematic to see which pop-up refers to which part ofthe code Finally user 5 stated that he likes version 4 themost The reason was that he did not have to look to theright if he wanted to see the value of a variable and thensearch it there but that in version 5 the information wasexactly at its source He only said that this version mightbe problematic to work with since the reading flow is influ-enced negatively and he feared that he would loose muchspace on the screen and would therefore have to scroll a lot

Each user stated that the main benefit of version 1 and2 was that the actual code remained unaffected Thiscorresponds to what [DeLine et al 2006] reported on theperceptual model a programmer builds of the code When Perceptual model of

the code shouldremain unaffected

navigating through source code a programmer uses aperceptual model to do so This way the programmer doesnot need to read the code (cognitive) to know which part heis seeing but can do so by looking at its rdquovisual landmarksrdquo(perceptual) While version 3 affects this perceptual modelnegatively but does not break it version 4 and 5 have the

18 3 First User Study

disadvantage to do so

To most of the users the basic structure of the interfaceswas clear after only a few seconds Only two of themconfused actual code with the evaluation for a short timeUser 5 when he saw version 5 the first time and user 4when he saw version 1 the first time However neither ofthem needed help or much time to figure how the interfaceworked after they started to read more precisely Eachuserrsquos first question or comment concerned the font andhow awful they (every one) thought it was After a briefexplanation why we chose this particular font each usergot used to it quickly

No user understood the iterator None of them got theconnection to the while upto or for next to the iteratorIterator is not self

explaining After the explanation what the iterator does some userssaid they thought the left and right buttons were rdquoltrdquo andrdquogtrdquo signs

The following tables summarize the pros and contras foreach interface

Version 1Pros Contrasmiddot The control structure (in-dents) is still presentmiddot Because of the indentsand the syntax highlightingthere is a strong analogy be-tween the code and its eval-uated versionmiddot There is a clear separationbetween code and evalua-tionmiddot The editor remains un-changed

middot Needs much spacemiddot User has to look from theleft to the right and viceversa to see two pieces thatare correlated In otherwords a line of code and itsevaluated version are too faraway from each other

33 Results 19

Version 2Pros Contrasmiddot The editor remains un-changedmiddot Code and evaluation do notget confused because they donot look the samemiddot Needs less space than ver-sion 1

middot Less analogy than in ver-sion 1middot User needs to look fromthe left to the right and viceversa This is being compli-cated by the fact that with-out indents and syntax high-lighting it is even more diffi-cult to find the point of inter-estmiddot Needs a lot of space

Version 3Pros ContrasmiddotNeeds minimalistic amountof spacemiddot The code structure itself re-mains untouched

middot Oftentimes not clear whichpop-up refers to which partof the codemiddot Can get messy quickly

Version 4Pros Contrasmiddot Strong analogymiddot Better for lines of code thatare too long for version 1 and2middot Evaluations right next totheir source

middot Code and evaluation getconfused quickly especiallyusual reading might be prob-lematic since one has to skipevery second line dependingon what he wants to readmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

20 3 First User Study

Version 5Pros Contrasmiddot Evaluations right next totheir sourcemiddot Better for lines of code thatare too long for version 1 and2

middot Oftentimes it is not clearwhat a value refers tomiddot Still distracting during codereadingmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

331 Further Comments and Suggestions for Im-provement

One of the users pointed out that in version 3 the iteratorhas been inserted into the code and therefore it now isInterface version 3

can producesyntactically wrong

code

syntactically wrong (for i in 14 becomes for i=2in 14 for the second iteration for example)

Two users suggested that for version 4 and 5 curly bracescould indicate to which part of the code the evaluationrefers

Some users suggested to make the number on the iteratordirectly manipulable so that if one would want to go toSuggestions for

improving the iterator iteration 42 one would not have to press 41 times a buttonbut could just enter 42 In addition some users said theywanted a button for jumping to the first and a button forjumping to the last iteration Two users said it would begood if there is something that indicates directly that theiterator actually shows iterations like the term rdquoIteration rdquoor anything similar

We noticed that the manipulable parameters and theevaluation results have to be distinguishable more easily

33 Results 21

One user proposed that if the cursor moves over somevariable in the evaluation part a small pop-up couldindicate to which variable a value belongs

One user said that it might be helpful that for biggerboolean expressions that fail it would be visible which partof the expression is responsible for the failure

332 Conclusions

With large screens (24rdquo and more) becoming more andmore common in modern work environments of program- Version 1 is the

favored onemers we can override the concerns about loosing too muchspace with version 1 which is the reason why it is the clearfavorite Figure 36 shows the basic interface after the firstuser study

The iterator has been improved in several ways

bull New buttons for jumping to the first and the last iter-ation have been added

bull There is a text field with which the iteration can bemanipulated directly

bull A tooltip now indicates the variable to which avalue belongs when hovering over the value with themouse

bull The label Iteration has been added at the top of theiterator

The way of manipulating the parameters has been changedThe function header now also is copied to the right side A new way of

manipulating theparameters

but the names of the parameters are replaced with a textfield in which the according value can be entered This wayboth the analogy to the code and the separation betweenparameters and evaluation are being enhanced

22 3 First User Study

(a) First iteration

(b) Second iteration

Figure 31 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

33 Results 23

Figure 32 Version 2 The evaluated version of the code ison the right-hand side but is ranged right

Figure 33 Version 3 Small pop-ups indicate the value of avariable

24 3 First User Study

Figure 34 Version 4 The evaluated version of a line isinserted right behind the line itself

33 Results 25

Figure 35 Version 5 Only the values of the variables arein between the lines

26 3 First User Study

Figure 36 Interface after the first user study

27

Chapter 4

Implementation

After the first user study had been finished and the resultshad been analyzed a software prototype with sufficient A software prototype

had to beimplemented

functionality had to be implemented to conduct the seconduser study Hence we wrote an extension for Brackets1 anopen source editor that has recently been released officially

The language we wrote the extension for was JavaScriptWhat we needed was an extension that could calculate thevalues of the variables that had to been substituted

41 First Approach (discharged)

The initial idea was to write an extension that connects viaV8-Node2 (an extension that maintains a connection to a Initial idea Use

V8-Node an Nodejsto get the locals

Nodejs3 V8 debugger) to an external debugger and lets itevaluate the code Our extension was supposed to supplythe actual code to the debugger and set a breakpoint in thefirst line of actual code Then it read the locals (that is thelocal values for all active variables) and stepped over to the

1httpsgithubcomadobebrackets2httpsgithubcomDennisKehrigbrackets-v8-node-

livetreenodeChildProcess3httpnodejsorg

28 4 Implementation

next line After all locals had been collected the extensionshould substitute all necessary values There were prob-lems with this approach Firstly the extension is not stableThis extension was

not usable for a userstudy

enough to be usable in a user study Secondly it is too slowIt needs more than half a second for one evaluation Thisis way too much since we wanted something that couldevaluate the code as the user edits it without any furtherdelay (hence the name rdquoLive Codingrdquo) With this extensionit would only have been possible to update the evaluationusing a special shortcut or to attach this function to anothershortcut like the Ctrl+S shortcut for saving Since thiswas by far not the optimal way we decided to start all overagain

42 Basic Evaluation Technique

In the second approach we did not try to externalize thescript execution but instead made vast use of JavaScriptrsquosbuilt-in eval function The eval function takes a StringSecond approach

using eval toevaluate dynamicallygenerated code thatreturns the value of

interest

as parameter If this String contains valid JavaScript codeeval will evaluate the code and return the last executedstatement eval(var i=1i=i+5) will return 6 forexample The extension basically works as follows Aloop that runs over all DOM-elements in the highlightedcode searches for substitutable variables While searchingfor these elements the loop captures in which contextthe actual DOM-element is The three basic contexts aresimple substitutions substitutions within one loop andsubstitutions within a nested loop (why this is will beexplained below) But there are also other things the loopchecks for For simple assignments like i=x+y x andy have to be substituted while i does not In lines wherea loop begins the iterator has to be appended invariantshave to be treated differently et cetera

Note that right-hand sides of assignments are handledslightly different from what has been presented in theprototypes of the first user study They do not get evalu-ated to the last step and invariants do not get evaluated

43 How Values Are Being Calculated 29

to the penultimate step anymore Both of them get onlyevaluated one step For i=1 and z=2 the right-hand sideof x=i+z does not get evaluated to 3 but to 1+2 Wedecided to do it this way because some users in the firstuser study stated that they did not only want to see whatthe result of an operation is but also how it got there

When a substitutable variable has been found its value willbe computed For that the eval function is used (see sec-tion 43 for details) and the value will be inserted Thiswhole process takes place each time there is a change in thecode or if one of the parameters gets changed If another it- How the extension

reacts to changeseration of a loop that is being shown is selected all valuesinside that loop will be calculated again

43 How Values Are Being Calculated

Depending on the context of the variable a piece ofcode will be generated (and then evaluated) that whenevaluated with eval returns the value of the variable ofinterest For simple substitutions outside of any loops thisworks like follows

Suppose the following code had to be evaluated for x=0

function foo(x)var i=0var j=x+1

Then x in the third line would have to be evaluated To do How the values arecalculated for simplesubstitutions

so the following String would be generatedvar x=0var i=0xSo for simple substitutions the whole code preceding thevariable that has to be substituted is copied and each pa-rameter is initialized as var with its exemplary value at the

30 4 Implementation

beginning

For simple loops this method has to be extended First ithas to be found out how often the loop is executed ThisSubstitutions within

one loop serves two purposes First the iterator now shows thenumber of iterations of the loop Second we have to checkif the user wants to see an iteration that does not exist Thisis being done by introducing a counter that is incrementedin each iteration of the loop and then returned when theloop terminated The following example shows how thevalue of a variable within a loop is being calculated

function test()var i=0var j=0while (ilt5)

i++j=j+2

Suppose we want to obtain the value for j (highlightedin red) in the second iteration of the loop Then theevaluation basically works as follows

var myCounter=0var myResultvar i=0var j=0while (ilt5)

myCounter++i++if(myCounter==2)myResult=j breakj=j+2

myResult

A counter gets inserted to check which iteration the loop is

44 Problems 31

in If it is in the right iteration the assignment myResult=jtakes place so that eval can return the according valueFor a loop within a loop two counters have to be inserted Other casesand both of them have to be checked accordingly Thereare further variants of contexts like the different parts of afor(firstPartsecondPartthirdPart) statementThe first part gets only executed once at the beginningwhile the second part gets evaluated one time more thanthe rest of the loop (the third part does not get evaluated)For all cases the dynamic generation of the code that re-turns the value of interest works similar to the techniquespresented

Obviously this is a brute-force approach For each valuethat has to be calculated all preceding code has to be eval-uated We used this approach anyway since we spent quite Summary Very

inefficient limitedfunctionality stillcompletely sufficientfor the user study

some time developing the first version of the extension (us-ing V8-Node) and were under considerable time pressureThe technique we chose is simple and fast to develop Fur-thermore the prototype only had to work for the seconduser study which is the reason why the limited function-ality (loop nesting depth of 2 for example) was sufficientAnd finally since the tasks in the second user study werenot very expensive with regards to calculation costs eventhis very inefficient version of the extension ran live with-out any delays Some tests showed that it would even havebeen quick enough to operate with hundreds of iterationsmore without any delay and with some few thousand iter-ations more without dispatching the interface

44 Problems

The biggest problem we encountered were infinite loopsBecause the interface runs completely live and tries to The Halting problemevaluate the whole code while the user is editing it infiniteloops may occur while the user is typing Additionallyit might of course be possible that there actually is aninfinite loop even though the user thinks the code iscorrect Without modification the interface crashed in

32 4 Implementation

either of these cases To prevent this from happeningtwo minor modifications had to be done Firstly if wewant to calculate the number of iterations a loop doeswe check whether the counter we use for that calculationis still below a certain limit (we used 999 here becauseit proved to be sufficient and efficient enough to runlive) Now the Halting Problem for this particular loop isapproximated The problem of the infinite loop still existsthough Suppose we have the following code

var x=0var i=0while(x=10)

i++x=i

We now know that the loop does more then 999 iterationsso we do not try to calculate the value for x in the invariantSubstitutions after

infinite loops or for i within the loop for any further iterations The onlyproblem that remains is that we still try to get the value fori in the last line When we try to calculate this value wedo not yet know that there is an infinite loop somewhereabove To find this out a new global counter gets insertedand limited to 5000 iterations

var globalCounter=0var x=0var i=0while(x=10)

if(globalCounter++ gt= 5000)breaki++

i

This way we could keep the incrementation and the checkwhether there are more then 5000 iterations within onestatement (two statements are too many since after a

44 Problems 33

while without curly braces only the first statement wouldbe executed) The incrementation and test get inserted intoevery loop ahead of the position we are checking for Itwould of course have been possible to set a flag if there isan infinite loop somewhere ahead but this way we alsohad the possibility to limit the overall number of iterationsand still get a value The second user study showed that Flaw in the

implementationthis was the wrong approach A user introduced an infiniteloop in a sorting algorithm The condition for the loop wasnever violated although the array was sorted With thetechnique of assigning a value to a variable although thelimitation of iterations has been hit the sorted array wasshown in the return statement Since the user only paidattention to the output he did not see that the number ofiterations of the loop was 999 and thought that the code iscorrect

Another minor problem was that of course while the useris typing the extension tries to evaluate code that is syntac- Suppress errors so

that user does notget disturbed

tically not correct The solution is to set windowonerrorto a function that always returns true This way errorsget just suppressed and the execution stops If there is anew change in the editor the execution will start again andwork exactly the same way so that working code always isexecuted and not working code is not

The last problem concerned hiding code that does notget evaluated For loops this could easily be done If Hiding code within

failed if statementsthe loop is in its very last iteration (the invariant fails)the rest of the loop has to be hidden otherwise it doesnot Finding out if an if statement fails (and hence theaccording code has to be hidden) however was moredifficult because we had not enough information to be ableto do so (remember we only compute the first step whichincludes nothing but substituting all variables) To find outif an if statement holds or not we do the following allvariables in the invariant get substituted and then we leteval evaluate just the invariant For if(xlty) with x=5and y=6 we substitute all values if(5lt6) and then calleval((5lt6))

34 4 Implementation

Figure 41 Screenshot showing Brackets with our extension and a dummy script

Finally Figure 41 shows how the prototype for our seconduser study looks like

35

Chapter 5

Second User Study

51 Course of the User Study

To find out whether our extension brings significantimprovement a second quantitative user study with 20 Hypothesis

Compared to usualdebugging methodsour technique bringssignificantimprovement

participants has been conducted The hypothesis was thatthe debugging speed (in this case the time between the userreading the code the first time and fixing it) compared toa usual debugging engine could significantly be reducedwith our extension If this hypothesis would be supportedby the results of the user study this would indicate that theunderstanding speed of foreign code would be increasedby our method

No user had worked with Brackets or JavaScript before(notwithstanding that all of them had worked with Javabefore) Users were Computer Science students with 4-9(average of 58) years of programming experience Onlythree of them had professional experience (2-4 years) Allusers took part voluntarily and were not rewarded in anyway

To test the hypothesis two searching algorithms (animplementation of Shakersort and an implementation ofBubblesort) have been implemented and a small bug has

36 5 Second User Study

been introduced in either of them (see Appendix A) Thetask of the user was to find both bugs - one time with2 algorithms 1 bug

each 2 tasks debugone time with

Bracketsrsquo tools andone time with our

extension

our extension and one time using the traditional way(that is using the console breakpoints variable viewerweb kit inspector and the step over functionality Bracketsprovides) Each user received an intro to the function-ality of both our extension and the Brackets debuggingengine For both a small dummy script has been shownwith which it has been explained how each part of thefunctionality works until the user confirmed that he or sheunderstood every part of the functionality The task forthe user was to find the bug in the code (users have beentold that there is exactly one bug) fix it and give noticewhen he or she thought that the code is correct (task tocondition assignment and task order were counterbalancedto eliminate side effects)

After the two debugging tasks had been accomplishedeach user filled out two versions of the System UsabilitySystem Usability

Scale to find outwhether users liked

working with thesystem

Scale [Brooke 1996] (SUS) one for our extension and onefor the standard debugging Brackets provides The first10 questions are identical to the initial version that hasbeen presented in 1996 except that the word rdquocumbersomerdquoin point 8 has been changed to rdquoawkwardrdquo as suggestedby [Finstad 2006] For the extension the following ques-tions have been added to the questionnaire

bull I found the additional information very helpful formy task

bull I was distracted by the additional information

bull I found it very easy to follow the control flow of theprogram

bull I found complex control structures hard to under-stand

For the debugging using Brackets the following two ques-tions have been added

bull I found it very easy to follow the control flow of theprogram

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 14: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

7

editing a function (either the usual way with a standardeditor or interactively with Rehearse) and they are mutualexclusive (this issue has also been reported as a resultof the user study Brandt et al conducted) We think itis best to integrate the interactive functionality into thestandard one so that the user does not again have toswitch between different modes Additionally Rehearseprovides functionality to help the programmer explorealternative paths by undoing arbitrarily many executedlines while the goal of our editor is to show what hap-pens in the code with exemplary parameters to help theprogrammer understand what internal behavior the codehas This way the programmer can work uninterruptedand still get information about the behavior of his program

Similarly [Edwards 2004] stated that it improves the workflow if the programmer does not have to leave the environ-ment to see what result the execution of an example yieldsHe says the best way to understand code is by thinkingof concrete examples To solve the problem that the pro- Examples help

understanding codegrammer has to emulate the behavior of the program in hishead as much code as possible should be illuminated withexamples Edwards built a plug-in for Eclipse for whicha screenshot can be seen in Figure 22 The results on the

Figure 22 A screenshot of the plug-in presented by [Ed-wards 2004]

left-hand side of the interface get updated as soon as thereis a change in the code From a theoretical point of viewthis is exactly what we intend to do The code is continu-

8 2 Related Work

ously executed for exemplary parameters while the user isworking on it From a practical point of view though thisplug-in has several drawbacks First the analogy betweenIdea and drawbacks

of Edwardsrsquo plug-in code and evaluation is not very strong (eg not as strongas the one in Rehearse) so it is not as easy for the userto follow the control flow of the program as it should beThere are further usability issues Skipped code is greyedout This is a good idea since the user can see which partof the code has been executed and which part has not Theproblem is to see for which parameters the code has beenskipped In Figure 22 for example fact has been calledwith the values 1 and 3 and in addition subsequentlywith the parameters 0 and 2 (resulting from the recursivecall of fact in the else statement) For which of theparameters the else statement has been skipped is not asobvious as it could be Furthermore it is suboptimal tocall the function with exemplary parameters in the codeitself This way a programmer cannot write code as usualand just extract the additional information but has to insertrdquoforeignrdquo code to be able to see what happens inside thecode and then later has to remove the additional codeagain because it has nothing to do with the functionalitythe programmer wanted to write in the first place FinallyEdwards presented the plug-in without evaluation of anykind so a remaining task is to find out if or how much thiskind of feedback improves the work flow (see Chapter 5)

In [Boehm 1979] it was reported that modifying softwarebasically consists of three phasesHow modifying

software works

bull Understanding the existing software

bull Modifying it

bull Revalidating the modified version

We believe that our interface helps improve (thereby speedup) this whole process (our second user studyrsquos goal wasto find out whether this is true see Chapter 5) Thissupposition is supported by [Erdogmus et al 2005] whereit is said that instant feedback whether new functionality

9

has been implemented properly helps reducing wasteddevelopment time

Finally our approach concurs with the idea of Opportunis-tic Programming [Brandt et al 2008] [Brandt et al 2009a][Brandt et al 2009b] Brandt et al say that frequent itera-tion is a necessary part of learning and understanding un-familiar code Therefore an opportunistic programmer will Opportunistic

programmers choosetools that speed upiteration

select a tool that speeds up iteration They also state that theEdit-Debug cycles in opportunistic programming are veryshort About 50 of these cycles are shorter than 30 sec-onds and even 80 are shorter than 5 minutes Our toolwould help reducing the resulting overhead since the out-come is instantly visible without having to change modesor compile and execute

11

Chapter 3

First User Study

31 The Prototypes

The main goal of our first user study was to find out whichbasic interface design is the optimal one especially with Goal of the first user

studyregards to the link between the code and the evaluatedversion of the code It should work in a way such that theprogrammer does not get disturbed in his usual work butthat the interface still provides as much information asaccessible as possible

To find out how such an interface would look like we im-plemented low fidelity prototypes for 5 different interfacesin Ruby Basically these prototypes consisted of pictures 5 low fidelity

prototypes for Rubyhave beenimplemented

that show how the interface would look like for some ex-emplary code with exemplary parameters Each interfacehas been illustrated using 5 pieces of code as example (Fig-ures 31-35) Note The font used in all prototypes was sup-posed to emphasize that these prototypes are low fidelityso that the users had not inhibitions telling us fundamentalflaws in the interface design

12 3 First User Study

311 Interface Version 1

Figure 31 shows the first interface with an implementationof the binary search algorithm In this version of theinterface an evaluated version of the code is inserted in anadditional section in the right half of the window At thetop the exemplary parameters can be manipulated Thenin the line high=arraylength-1 the first evaluationtakes place On the right-hand side of the window theright-hand side of the assignment has been replaced bythe according evaluation 5 Afterwards all values are re-Interface variant 1

divides theenvironment into twoThe left-hand side is

a usual editor andthe right-hand side isthe evaluated version

of the code

placed in a similar way That is in simple assignments theright-hand side is substituted completely and in invariantsthe condition has been computed up to the penultimatestep This way eg while(low lt= high) gets evaluatedto while(0 lt= 5) instead of while(true)

EXCURSUSWe also thought about showing what would happen in-side an if statement even if it fails so that the user couldsee what the code could do alternatively We dischargedthis idea however because of two problems Firstly wesuspected that this might quickly become a confusing ex-perience for the user Secondly (and most importantly)the Halting Problem would have to be decidable Sup-pose the following code had to be evaluatedif(xgt0)while(x=0)x--else while(x=0)x++Then the interface would fall into an infinite loop (for anyvalue of x except 0) although the code would not Forx=1 for example the actual code would only make oneiteration in the first loop The interface however wouldtry to calculate the values of the second loop too

Further down in Figure 31(a) there is the following ele-ment

1This element will subsequently be called iterator Theiterator indicates which iteration of a loop is being shownPressing the right arrow increments the shown iteration(Figure 31(b)) pressing the left arrow decrements itaccordingly Furthermore the code for the elsif and theelse statements is not visible on the right-hand side of the

31 The Prototypes 13

window The reason is that the first if statement holdsand therefore the following ones are not executed InFigure 31(b) the first if statement failed (greyed out) thesecond one holds and the third one is hidden again Thisway all code that gets executed is shown (corresponding toJonathan Edwards who said that as much code as possibleshould be rdquoilluminatedrdquo[Edwards 2004])

312 Interface Version 2

Figure 32 shows the second version of the interface withan in place implementation of the selection sort algorithm(note that each interface variant has been implemented Interface variant 2

copies a nothighlightedevaluated version ofthe code to the rightpart of the window

with each algorithm) The differences to the previous in-terface variant are

bull The parameter names are red to signal manipulability

bull No syntax highlighting

bull Evaluated code has no indents and is right aligned

bull No dividing line

EXCURSUSNote that the iterator in this interface looks like follows

i=1This design reveals what variable the iterator refers toThis kind of iterator does not depend on the type of in-terface but on the type of loop In for and upto loopsthis kind of iterator has been used since it is possible tocorrelate the values of one variable and the iterations ofa loop In the later prototypes however we dischargedthis idea again because it only was this easy to correlatea variable and the iterations of one loop because the ex-amples were quite simple In realistic situations this cor-relation will oftentimes not be possible

This version was supposed to take up as little space aspossible while still having a complete (evaluated) versionof the code in a different place

14 3 First User Study

313 Interface Version 3

Figure 33 shows version 3 of the interface with an im-plementation of Bubblesort Small pop-ups indicate theInterface variant 3

uses small pop-upsthat indicate the

value of a statement

value of all statements that are evaluable For examplethe (manipulable and therefore red) value of array is[1423716] and the value of arraylength-2 is 2There are no pop-ups in the code when it is not executedand failed invariants are greyed out The iterator workslike the one in the example for version 2 of the interface butis now inserted into the code This interfacersquos purpose wasto consume as little space as possible while still providingall possible evaluations

314 Interface Version 4

Figure 34 shows interface version 4 with an algorithm thatconverts the fractional part of a decimal number into its bi-Interface variant 4

puts the evaluatedversion of a line

between that line andthe next one

nary representation and outputs the result to the consoleThis interface variant slots the evaluated version of a lineof code right between that line and the next one The nameof manipulable parameters is highlighted in red and thereis no syntax highlighting Failed invariants are greyed outand not executed code is not displayed as usual The goalof this interface version was to find out if the link betweencode and evaluation becomes clearer when the evaluationis very close and very similar to the code it results from

315 Interface Version 5

The last variant of the interface can be seen in Figure 35with an algorithm that checks whether a given number nInterface variant 5

indicates the value ofa statement by

writing it underneaththat statement

without keepingkeywords

is a prime number or not It is similar to the previous onebut only shows rdquorelevantrdquo information This includes right-hand sides of equations and for conditions the actual in-variant but excludes left-hand sides of equations and key-words like else or while Apart from this version 4 and5 are exactly the same (parameters in red no syntax high-lighting not executed code is not shown failed invariants

32 Course of the User Study 15

are greyed out) We developed this variant because in ver-sion 4 the user might confuse actual code with evaluatedcode We wanted to check if a variant without replicatingcontrol structures is better (less confusing)

32 Course of the User Study

The first user study was a qualitative user study It wassupposed to reveal what basic interface design is preferredby users and where there might be flaws in the interfacesDifferent techniques for qualitative user studies have beenused in particular the Model Extraction method and theThink Aloud method The basic structure was the following

bull First the experimenter introduced himself and theproject He assured the participant that he would Introduction to the

user studystay anonymous and that he could ask questions ormake comments whenever he likes To make sure theModel Extraction worked fine the part concerningwhat the thesis and the user study where about wasvery firm The solution we have come up with wasto explain the general idea of Live Coding (that weare designing a kind of editor that takes exemplaryparameters for a method the programmer works onand then evaluates it continuously while changes arebeing made) without revealing anything further likerdquoand then there is going to be a button with whichyou can flick from iteration to iterationrdquo

bull Afterwards one version of the interface was pre-sented to the user together with the following ques- Model Extractiontions rdquoWhat do you think you see hererdquo and rdquoWhatdo you think you can do with itrdquo

bull After the user told everything he or she knew or if theuser had a gulf of understanding [Norman 2002] the Explanation of the

interfaceparts of the interface which the user did not under-stand were explained and we went through all itera-tions once to make clear how the interface works

bull The next goal was to find out if the user had prob- Suggestions forimprovementlems in understanding elements of the interface if he

16 3 First User Study

thinks that there are things that can be done better etcetera

bull Lastly the other interface variants were shown to theuser with the same piece of code and it was explainedGoing through the

remaining interfacevariants

how they worked how they differ from the previousones and we went through all iterations once so thatthe user could get a feeling for the interface Aftereach variant the user was asked what he thought ofthe interface (and why) and what he thought was bet-ter or worse compared to the previous ones

Whenever the user had a question or there was an inter-esting topic that lead to more interesting information thispoint was discussed in more depth

33 Results

The user study was conducted with 6 different users All ofthem studied Computer Science were in the sixth semesterand had between 4 and 10 years of programming experi-ence Only one of them had worked with Ruby before Allusers took part voluntarily and were not rewarded in anyway The conditions were randomized and looked like fol-lows

User ID Algorithm Order of versions1 Bubblesort 312452 Selection sort 453213 Fractional part10 rarr Fractional part2 253414 Prime number test 135425 Binary search 524316 Binary search 34152

This means that for example user 1 saw interface version3 first and then versions 1245 in that order In all variantshe worked with the Bubblesort algorithm

The user study yielded two major results

33 Results 17

bull Most users liked version 1 better than the other ones

bull Not a single user understood what the iterator is goodfor or what it might do

In more detail Three users (users 1 2 and 4) stated clearlythat they liked version 1 the most User 3 was not quite 3 users liked version

1 the most 1 userliked version 4 themost 1 user couldnot decide betweenversion 1 and 2 and1 user could notdecide betweenversion 1 and 3

sure whether he liked version 1 or version 2 more Hestated that a mix between the two variants where thereis no syntax highlighting would be the best because hegot confused the first time he saw version 1 and did notdirectly see that the code on the right-hand side of thewindow is not actually code but the evaluation User 6 wasnot sure if he preferred version 1 or version 3 He statedthat if there is enough space on the screen version 1 isbetter but with all the other elements of a typical IDE aline of code might oftentimes be too long so that there is noplace to show the line itself as well as its evaluated version(a concern that user 3 communicated as well) Withoutthis problem though user 6 said that version 3 is worsebecause the interface might get messy and distractingquickly because of all the pop-ups right next to the codeIn addition he said with more complex code it might beproblematic to see which pop-up refers to which part ofthe code Finally user 5 stated that he likes version 4 themost The reason was that he did not have to look to theright if he wanted to see the value of a variable and thensearch it there but that in version 5 the information wasexactly at its source He only said that this version mightbe problematic to work with since the reading flow is influ-enced negatively and he feared that he would loose muchspace on the screen and would therefore have to scroll a lot

Each user stated that the main benefit of version 1 and2 was that the actual code remained unaffected Thiscorresponds to what [DeLine et al 2006] reported on theperceptual model a programmer builds of the code When Perceptual model of

the code shouldremain unaffected

navigating through source code a programmer uses aperceptual model to do so This way the programmer doesnot need to read the code (cognitive) to know which part heis seeing but can do so by looking at its rdquovisual landmarksrdquo(perceptual) While version 3 affects this perceptual modelnegatively but does not break it version 4 and 5 have the

18 3 First User Study

disadvantage to do so

To most of the users the basic structure of the interfaceswas clear after only a few seconds Only two of themconfused actual code with the evaluation for a short timeUser 5 when he saw version 5 the first time and user 4when he saw version 1 the first time However neither ofthem needed help or much time to figure how the interfaceworked after they started to read more precisely Eachuserrsquos first question or comment concerned the font andhow awful they (every one) thought it was After a briefexplanation why we chose this particular font each usergot used to it quickly

No user understood the iterator None of them got theconnection to the while upto or for next to the iteratorIterator is not self

explaining After the explanation what the iterator does some userssaid they thought the left and right buttons were rdquoltrdquo andrdquogtrdquo signs

The following tables summarize the pros and contras foreach interface

Version 1Pros Contrasmiddot The control structure (in-dents) is still presentmiddot Because of the indentsand the syntax highlightingthere is a strong analogy be-tween the code and its eval-uated versionmiddot There is a clear separationbetween code and evalua-tionmiddot The editor remains un-changed

middot Needs much spacemiddot User has to look from theleft to the right and viceversa to see two pieces thatare correlated In otherwords a line of code and itsevaluated version are too faraway from each other

33 Results 19

Version 2Pros Contrasmiddot The editor remains un-changedmiddot Code and evaluation do notget confused because they donot look the samemiddot Needs less space than ver-sion 1

middot Less analogy than in ver-sion 1middot User needs to look fromthe left to the right and viceversa This is being compli-cated by the fact that with-out indents and syntax high-lighting it is even more diffi-cult to find the point of inter-estmiddot Needs a lot of space

Version 3Pros ContrasmiddotNeeds minimalistic amountof spacemiddot The code structure itself re-mains untouched

middot Oftentimes not clear whichpop-up refers to which partof the codemiddot Can get messy quickly

Version 4Pros Contrasmiddot Strong analogymiddot Better for lines of code thatare too long for version 1 and2middot Evaluations right next totheir source

middot Code and evaluation getconfused quickly especiallyusual reading might be prob-lematic since one has to skipevery second line dependingon what he wants to readmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

20 3 First User Study

Version 5Pros Contrasmiddot Evaluations right next totheir sourcemiddot Better for lines of code thatare too long for version 1 and2

middot Oftentimes it is not clearwhat a value refers tomiddot Still distracting during codereadingmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

331 Further Comments and Suggestions for Im-provement

One of the users pointed out that in version 3 the iteratorhas been inserted into the code and therefore it now isInterface version 3

can producesyntactically wrong

code

syntactically wrong (for i in 14 becomes for i=2in 14 for the second iteration for example)

Two users suggested that for version 4 and 5 curly bracescould indicate to which part of the code the evaluationrefers

Some users suggested to make the number on the iteratordirectly manipulable so that if one would want to go toSuggestions for

improving the iterator iteration 42 one would not have to press 41 times a buttonbut could just enter 42 In addition some users said theywanted a button for jumping to the first and a button forjumping to the last iteration Two users said it would begood if there is something that indicates directly that theiterator actually shows iterations like the term rdquoIteration rdquoor anything similar

We noticed that the manipulable parameters and theevaluation results have to be distinguishable more easily

33 Results 21

One user proposed that if the cursor moves over somevariable in the evaluation part a small pop-up couldindicate to which variable a value belongs

One user said that it might be helpful that for biggerboolean expressions that fail it would be visible which partof the expression is responsible for the failure

332 Conclusions

With large screens (24rdquo and more) becoming more andmore common in modern work environments of program- Version 1 is the

favored onemers we can override the concerns about loosing too muchspace with version 1 which is the reason why it is the clearfavorite Figure 36 shows the basic interface after the firstuser study

The iterator has been improved in several ways

bull New buttons for jumping to the first and the last iter-ation have been added

bull There is a text field with which the iteration can bemanipulated directly

bull A tooltip now indicates the variable to which avalue belongs when hovering over the value with themouse

bull The label Iteration has been added at the top of theiterator

The way of manipulating the parameters has been changedThe function header now also is copied to the right side A new way of

manipulating theparameters

but the names of the parameters are replaced with a textfield in which the according value can be entered This wayboth the analogy to the code and the separation betweenparameters and evaluation are being enhanced

22 3 First User Study

(a) First iteration

(b) Second iteration

Figure 31 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

33 Results 23

Figure 32 Version 2 The evaluated version of the code ison the right-hand side but is ranged right

Figure 33 Version 3 Small pop-ups indicate the value of avariable

24 3 First User Study

Figure 34 Version 4 The evaluated version of a line isinserted right behind the line itself

33 Results 25

Figure 35 Version 5 Only the values of the variables arein between the lines

26 3 First User Study

Figure 36 Interface after the first user study

27

Chapter 4

Implementation

After the first user study had been finished and the resultshad been analyzed a software prototype with sufficient A software prototype

had to beimplemented

functionality had to be implemented to conduct the seconduser study Hence we wrote an extension for Brackets1 anopen source editor that has recently been released officially

The language we wrote the extension for was JavaScriptWhat we needed was an extension that could calculate thevalues of the variables that had to been substituted

41 First Approach (discharged)

The initial idea was to write an extension that connects viaV8-Node2 (an extension that maintains a connection to a Initial idea Use

V8-Node an Nodejsto get the locals

Nodejs3 V8 debugger) to an external debugger and lets itevaluate the code Our extension was supposed to supplythe actual code to the debugger and set a breakpoint in thefirst line of actual code Then it read the locals (that is thelocal values for all active variables) and stepped over to the

1httpsgithubcomadobebrackets2httpsgithubcomDennisKehrigbrackets-v8-node-

livetreenodeChildProcess3httpnodejsorg

28 4 Implementation

next line After all locals had been collected the extensionshould substitute all necessary values There were prob-lems with this approach Firstly the extension is not stableThis extension was

not usable for a userstudy

enough to be usable in a user study Secondly it is too slowIt needs more than half a second for one evaluation Thisis way too much since we wanted something that couldevaluate the code as the user edits it without any furtherdelay (hence the name rdquoLive Codingrdquo) With this extensionit would only have been possible to update the evaluationusing a special shortcut or to attach this function to anothershortcut like the Ctrl+S shortcut for saving Since thiswas by far not the optimal way we decided to start all overagain

42 Basic Evaluation Technique

In the second approach we did not try to externalize thescript execution but instead made vast use of JavaScriptrsquosbuilt-in eval function The eval function takes a StringSecond approach

using eval toevaluate dynamicallygenerated code thatreturns the value of

interest

as parameter If this String contains valid JavaScript codeeval will evaluate the code and return the last executedstatement eval(var i=1i=i+5) will return 6 forexample The extension basically works as follows Aloop that runs over all DOM-elements in the highlightedcode searches for substitutable variables While searchingfor these elements the loop captures in which contextthe actual DOM-element is The three basic contexts aresimple substitutions substitutions within one loop andsubstitutions within a nested loop (why this is will beexplained below) But there are also other things the loopchecks for For simple assignments like i=x+y x andy have to be substituted while i does not In lines wherea loop begins the iterator has to be appended invariantshave to be treated differently et cetera

Note that right-hand sides of assignments are handledslightly different from what has been presented in theprototypes of the first user study They do not get evalu-ated to the last step and invariants do not get evaluated

43 How Values Are Being Calculated 29

to the penultimate step anymore Both of them get onlyevaluated one step For i=1 and z=2 the right-hand sideof x=i+z does not get evaluated to 3 but to 1+2 Wedecided to do it this way because some users in the firstuser study stated that they did not only want to see whatthe result of an operation is but also how it got there

When a substitutable variable has been found its value willbe computed For that the eval function is used (see sec-tion 43 for details) and the value will be inserted Thiswhole process takes place each time there is a change in thecode or if one of the parameters gets changed If another it- How the extension

reacts to changeseration of a loop that is being shown is selected all valuesinside that loop will be calculated again

43 How Values Are Being Calculated

Depending on the context of the variable a piece ofcode will be generated (and then evaluated) that whenevaluated with eval returns the value of the variable ofinterest For simple substitutions outside of any loops thisworks like follows

Suppose the following code had to be evaluated for x=0

function foo(x)var i=0var j=x+1

Then x in the third line would have to be evaluated To do How the values arecalculated for simplesubstitutions

so the following String would be generatedvar x=0var i=0xSo for simple substitutions the whole code preceding thevariable that has to be substituted is copied and each pa-rameter is initialized as var with its exemplary value at the

30 4 Implementation

beginning

For simple loops this method has to be extended First ithas to be found out how often the loop is executed ThisSubstitutions within

one loop serves two purposes First the iterator now shows thenumber of iterations of the loop Second we have to checkif the user wants to see an iteration that does not exist Thisis being done by introducing a counter that is incrementedin each iteration of the loop and then returned when theloop terminated The following example shows how thevalue of a variable within a loop is being calculated

function test()var i=0var j=0while (ilt5)

i++j=j+2

Suppose we want to obtain the value for j (highlightedin red) in the second iteration of the loop Then theevaluation basically works as follows

var myCounter=0var myResultvar i=0var j=0while (ilt5)

myCounter++i++if(myCounter==2)myResult=j breakj=j+2

myResult

A counter gets inserted to check which iteration the loop is

44 Problems 31

in If it is in the right iteration the assignment myResult=jtakes place so that eval can return the according valueFor a loop within a loop two counters have to be inserted Other casesand both of them have to be checked accordingly Thereare further variants of contexts like the different parts of afor(firstPartsecondPartthirdPart) statementThe first part gets only executed once at the beginningwhile the second part gets evaluated one time more thanthe rest of the loop (the third part does not get evaluated)For all cases the dynamic generation of the code that re-turns the value of interest works similar to the techniquespresented

Obviously this is a brute-force approach For each valuethat has to be calculated all preceding code has to be eval-uated We used this approach anyway since we spent quite Summary Very

inefficient limitedfunctionality stillcompletely sufficientfor the user study

some time developing the first version of the extension (us-ing V8-Node) and were under considerable time pressureThe technique we chose is simple and fast to develop Fur-thermore the prototype only had to work for the seconduser study which is the reason why the limited function-ality (loop nesting depth of 2 for example) was sufficientAnd finally since the tasks in the second user study werenot very expensive with regards to calculation costs eventhis very inefficient version of the extension ran live with-out any delays Some tests showed that it would even havebeen quick enough to operate with hundreds of iterationsmore without any delay and with some few thousand iter-ations more without dispatching the interface

44 Problems

The biggest problem we encountered were infinite loopsBecause the interface runs completely live and tries to The Halting problemevaluate the whole code while the user is editing it infiniteloops may occur while the user is typing Additionallyit might of course be possible that there actually is aninfinite loop even though the user thinks the code iscorrect Without modification the interface crashed in

32 4 Implementation

either of these cases To prevent this from happeningtwo minor modifications had to be done Firstly if wewant to calculate the number of iterations a loop doeswe check whether the counter we use for that calculationis still below a certain limit (we used 999 here becauseit proved to be sufficient and efficient enough to runlive) Now the Halting Problem for this particular loop isapproximated The problem of the infinite loop still existsthough Suppose we have the following code

var x=0var i=0while(x=10)

i++x=i

We now know that the loop does more then 999 iterationsso we do not try to calculate the value for x in the invariantSubstitutions after

infinite loops or for i within the loop for any further iterations The onlyproblem that remains is that we still try to get the value fori in the last line When we try to calculate this value wedo not yet know that there is an infinite loop somewhereabove To find this out a new global counter gets insertedand limited to 5000 iterations

var globalCounter=0var x=0var i=0while(x=10)

if(globalCounter++ gt= 5000)breaki++

i

This way we could keep the incrementation and the checkwhether there are more then 5000 iterations within onestatement (two statements are too many since after a

44 Problems 33

while without curly braces only the first statement wouldbe executed) The incrementation and test get inserted intoevery loop ahead of the position we are checking for Itwould of course have been possible to set a flag if there isan infinite loop somewhere ahead but this way we alsohad the possibility to limit the overall number of iterationsand still get a value The second user study showed that Flaw in the

implementationthis was the wrong approach A user introduced an infiniteloop in a sorting algorithm The condition for the loop wasnever violated although the array was sorted With thetechnique of assigning a value to a variable although thelimitation of iterations has been hit the sorted array wasshown in the return statement Since the user only paidattention to the output he did not see that the number ofiterations of the loop was 999 and thought that the code iscorrect

Another minor problem was that of course while the useris typing the extension tries to evaluate code that is syntac- Suppress errors so

that user does notget disturbed

tically not correct The solution is to set windowonerrorto a function that always returns true This way errorsget just suppressed and the execution stops If there is anew change in the editor the execution will start again andwork exactly the same way so that working code always isexecuted and not working code is not

The last problem concerned hiding code that does notget evaluated For loops this could easily be done If Hiding code within

failed if statementsthe loop is in its very last iteration (the invariant fails)the rest of the loop has to be hidden otherwise it doesnot Finding out if an if statement fails (and hence theaccording code has to be hidden) however was moredifficult because we had not enough information to be ableto do so (remember we only compute the first step whichincludes nothing but substituting all variables) To find outif an if statement holds or not we do the following allvariables in the invariant get substituted and then we leteval evaluate just the invariant For if(xlty) with x=5and y=6 we substitute all values if(5lt6) and then calleval((5lt6))

34 4 Implementation

Figure 41 Screenshot showing Brackets with our extension and a dummy script

Finally Figure 41 shows how the prototype for our seconduser study looks like

35

Chapter 5

Second User Study

51 Course of the User Study

To find out whether our extension brings significantimprovement a second quantitative user study with 20 Hypothesis

Compared to usualdebugging methodsour technique bringssignificantimprovement

participants has been conducted The hypothesis was thatthe debugging speed (in this case the time between the userreading the code the first time and fixing it) compared toa usual debugging engine could significantly be reducedwith our extension If this hypothesis would be supportedby the results of the user study this would indicate that theunderstanding speed of foreign code would be increasedby our method

No user had worked with Brackets or JavaScript before(notwithstanding that all of them had worked with Javabefore) Users were Computer Science students with 4-9(average of 58) years of programming experience Onlythree of them had professional experience (2-4 years) Allusers took part voluntarily and were not rewarded in anyway

To test the hypothesis two searching algorithms (animplementation of Shakersort and an implementation ofBubblesort) have been implemented and a small bug has

36 5 Second User Study

been introduced in either of them (see Appendix A) Thetask of the user was to find both bugs - one time with2 algorithms 1 bug

each 2 tasks debugone time with

Bracketsrsquo tools andone time with our

extension

our extension and one time using the traditional way(that is using the console breakpoints variable viewerweb kit inspector and the step over functionality Bracketsprovides) Each user received an intro to the function-ality of both our extension and the Brackets debuggingengine For both a small dummy script has been shownwith which it has been explained how each part of thefunctionality works until the user confirmed that he or sheunderstood every part of the functionality The task forthe user was to find the bug in the code (users have beentold that there is exactly one bug) fix it and give noticewhen he or she thought that the code is correct (task tocondition assignment and task order were counterbalancedto eliminate side effects)

After the two debugging tasks had been accomplishedeach user filled out two versions of the System UsabilitySystem Usability

Scale to find outwhether users liked

working with thesystem

Scale [Brooke 1996] (SUS) one for our extension and onefor the standard debugging Brackets provides The first10 questions are identical to the initial version that hasbeen presented in 1996 except that the word rdquocumbersomerdquoin point 8 has been changed to rdquoawkwardrdquo as suggestedby [Finstad 2006] For the extension the following ques-tions have been added to the questionnaire

bull I found the additional information very helpful formy task

bull I was distracted by the additional information

bull I found it very easy to follow the control flow of theprogram

bull I found complex control structures hard to under-stand

For the debugging using Brackets the following two ques-tions have been added

bull I found it very easy to follow the control flow of theprogram

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 15: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

8 2 Related Work

ously executed for exemplary parameters while the user isworking on it From a practical point of view though thisplug-in has several drawbacks First the analogy betweenIdea and drawbacks

of Edwardsrsquo plug-in code and evaluation is not very strong (eg not as strongas the one in Rehearse) so it is not as easy for the userto follow the control flow of the program as it should beThere are further usability issues Skipped code is greyedout This is a good idea since the user can see which partof the code has been executed and which part has not Theproblem is to see for which parameters the code has beenskipped In Figure 22 for example fact has been calledwith the values 1 and 3 and in addition subsequentlywith the parameters 0 and 2 (resulting from the recursivecall of fact in the else statement) For which of theparameters the else statement has been skipped is not asobvious as it could be Furthermore it is suboptimal tocall the function with exemplary parameters in the codeitself This way a programmer cannot write code as usualand just extract the additional information but has to insertrdquoforeignrdquo code to be able to see what happens inside thecode and then later has to remove the additional codeagain because it has nothing to do with the functionalitythe programmer wanted to write in the first place FinallyEdwards presented the plug-in without evaluation of anykind so a remaining task is to find out if or how much thiskind of feedback improves the work flow (see Chapter 5)

In [Boehm 1979] it was reported that modifying softwarebasically consists of three phasesHow modifying

software works

bull Understanding the existing software

bull Modifying it

bull Revalidating the modified version

We believe that our interface helps improve (thereby speedup) this whole process (our second user studyrsquos goal wasto find out whether this is true see Chapter 5) Thissupposition is supported by [Erdogmus et al 2005] whereit is said that instant feedback whether new functionality

9

has been implemented properly helps reducing wasteddevelopment time

Finally our approach concurs with the idea of Opportunis-tic Programming [Brandt et al 2008] [Brandt et al 2009a][Brandt et al 2009b] Brandt et al say that frequent itera-tion is a necessary part of learning and understanding un-familiar code Therefore an opportunistic programmer will Opportunistic

programmers choosetools that speed upiteration

select a tool that speeds up iteration They also state that theEdit-Debug cycles in opportunistic programming are veryshort About 50 of these cycles are shorter than 30 sec-onds and even 80 are shorter than 5 minutes Our toolwould help reducing the resulting overhead since the out-come is instantly visible without having to change modesor compile and execute

11

Chapter 3

First User Study

31 The Prototypes

The main goal of our first user study was to find out whichbasic interface design is the optimal one especially with Goal of the first user

studyregards to the link between the code and the evaluatedversion of the code It should work in a way such that theprogrammer does not get disturbed in his usual work butthat the interface still provides as much information asaccessible as possible

To find out how such an interface would look like we im-plemented low fidelity prototypes for 5 different interfacesin Ruby Basically these prototypes consisted of pictures 5 low fidelity

prototypes for Rubyhave beenimplemented

that show how the interface would look like for some ex-emplary code with exemplary parameters Each interfacehas been illustrated using 5 pieces of code as example (Fig-ures 31-35) Note The font used in all prototypes was sup-posed to emphasize that these prototypes are low fidelityso that the users had not inhibitions telling us fundamentalflaws in the interface design

12 3 First User Study

311 Interface Version 1

Figure 31 shows the first interface with an implementationof the binary search algorithm In this version of theinterface an evaluated version of the code is inserted in anadditional section in the right half of the window At thetop the exemplary parameters can be manipulated Thenin the line high=arraylength-1 the first evaluationtakes place On the right-hand side of the window theright-hand side of the assignment has been replaced bythe according evaluation 5 Afterwards all values are re-Interface variant 1

divides theenvironment into twoThe left-hand side is

a usual editor andthe right-hand side isthe evaluated version

of the code

placed in a similar way That is in simple assignments theright-hand side is substituted completely and in invariantsthe condition has been computed up to the penultimatestep This way eg while(low lt= high) gets evaluatedto while(0 lt= 5) instead of while(true)

EXCURSUSWe also thought about showing what would happen in-side an if statement even if it fails so that the user couldsee what the code could do alternatively We dischargedthis idea however because of two problems Firstly wesuspected that this might quickly become a confusing ex-perience for the user Secondly (and most importantly)the Halting Problem would have to be decidable Sup-pose the following code had to be evaluatedif(xgt0)while(x=0)x--else while(x=0)x++Then the interface would fall into an infinite loop (for anyvalue of x except 0) although the code would not Forx=1 for example the actual code would only make oneiteration in the first loop The interface however wouldtry to calculate the values of the second loop too

Further down in Figure 31(a) there is the following ele-ment

1This element will subsequently be called iterator Theiterator indicates which iteration of a loop is being shownPressing the right arrow increments the shown iteration(Figure 31(b)) pressing the left arrow decrements itaccordingly Furthermore the code for the elsif and theelse statements is not visible on the right-hand side of the

31 The Prototypes 13

window The reason is that the first if statement holdsand therefore the following ones are not executed InFigure 31(b) the first if statement failed (greyed out) thesecond one holds and the third one is hidden again Thisway all code that gets executed is shown (corresponding toJonathan Edwards who said that as much code as possibleshould be rdquoilluminatedrdquo[Edwards 2004])

312 Interface Version 2

Figure 32 shows the second version of the interface withan in place implementation of the selection sort algorithm(note that each interface variant has been implemented Interface variant 2

copies a nothighlightedevaluated version ofthe code to the rightpart of the window

with each algorithm) The differences to the previous in-terface variant are

bull The parameter names are red to signal manipulability

bull No syntax highlighting

bull Evaluated code has no indents and is right aligned

bull No dividing line

EXCURSUSNote that the iterator in this interface looks like follows

i=1This design reveals what variable the iterator refers toThis kind of iterator does not depend on the type of in-terface but on the type of loop In for and upto loopsthis kind of iterator has been used since it is possible tocorrelate the values of one variable and the iterations ofa loop In the later prototypes however we dischargedthis idea again because it only was this easy to correlatea variable and the iterations of one loop because the ex-amples were quite simple In realistic situations this cor-relation will oftentimes not be possible

This version was supposed to take up as little space aspossible while still having a complete (evaluated) versionof the code in a different place

14 3 First User Study

313 Interface Version 3

Figure 33 shows version 3 of the interface with an im-plementation of Bubblesort Small pop-ups indicate theInterface variant 3

uses small pop-upsthat indicate the

value of a statement

value of all statements that are evaluable For examplethe (manipulable and therefore red) value of array is[1423716] and the value of arraylength-2 is 2There are no pop-ups in the code when it is not executedand failed invariants are greyed out The iterator workslike the one in the example for version 2 of the interface butis now inserted into the code This interfacersquos purpose wasto consume as little space as possible while still providingall possible evaluations

314 Interface Version 4

Figure 34 shows interface version 4 with an algorithm thatconverts the fractional part of a decimal number into its bi-Interface variant 4

puts the evaluatedversion of a line

between that line andthe next one

nary representation and outputs the result to the consoleThis interface variant slots the evaluated version of a lineof code right between that line and the next one The nameof manipulable parameters is highlighted in red and thereis no syntax highlighting Failed invariants are greyed outand not executed code is not displayed as usual The goalof this interface version was to find out if the link betweencode and evaluation becomes clearer when the evaluationis very close and very similar to the code it results from

315 Interface Version 5

The last variant of the interface can be seen in Figure 35with an algorithm that checks whether a given number nInterface variant 5

indicates the value ofa statement by

writing it underneaththat statement

without keepingkeywords

is a prime number or not It is similar to the previous onebut only shows rdquorelevantrdquo information This includes right-hand sides of equations and for conditions the actual in-variant but excludes left-hand sides of equations and key-words like else or while Apart from this version 4 and5 are exactly the same (parameters in red no syntax high-lighting not executed code is not shown failed invariants

32 Course of the User Study 15

are greyed out) We developed this variant because in ver-sion 4 the user might confuse actual code with evaluatedcode We wanted to check if a variant without replicatingcontrol structures is better (less confusing)

32 Course of the User Study

The first user study was a qualitative user study It wassupposed to reveal what basic interface design is preferredby users and where there might be flaws in the interfacesDifferent techniques for qualitative user studies have beenused in particular the Model Extraction method and theThink Aloud method The basic structure was the following

bull First the experimenter introduced himself and theproject He assured the participant that he would Introduction to the

user studystay anonymous and that he could ask questions ormake comments whenever he likes To make sure theModel Extraction worked fine the part concerningwhat the thesis and the user study where about wasvery firm The solution we have come up with wasto explain the general idea of Live Coding (that weare designing a kind of editor that takes exemplaryparameters for a method the programmer works onand then evaluates it continuously while changes arebeing made) without revealing anything further likerdquoand then there is going to be a button with whichyou can flick from iteration to iterationrdquo

bull Afterwards one version of the interface was pre-sented to the user together with the following ques- Model Extractiontions rdquoWhat do you think you see hererdquo and rdquoWhatdo you think you can do with itrdquo

bull After the user told everything he or she knew or if theuser had a gulf of understanding [Norman 2002] the Explanation of the

interfaceparts of the interface which the user did not under-stand were explained and we went through all itera-tions once to make clear how the interface works

bull The next goal was to find out if the user had prob- Suggestions forimprovementlems in understanding elements of the interface if he

16 3 First User Study

thinks that there are things that can be done better etcetera

bull Lastly the other interface variants were shown to theuser with the same piece of code and it was explainedGoing through the

remaining interfacevariants

how they worked how they differ from the previousones and we went through all iterations once so thatthe user could get a feeling for the interface Aftereach variant the user was asked what he thought ofthe interface (and why) and what he thought was bet-ter or worse compared to the previous ones

Whenever the user had a question or there was an inter-esting topic that lead to more interesting information thispoint was discussed in more depth

33 Results

The user study was conducted with 6 different users All ofthem studied Computer Science were in the sixth semesterand had between 4 and 10 years of programming experi-ence Only one of them had worked with Ruby before Allusers took part voluntarily and were not rewarded in anyway The conditions were randomized and looked like fol-lows

User ID Algorithm Order of versions1 Bubblesort 312452 Selection sort 453213 Fractional part10 rarr Fractional part2 253414 Prime number test 135425 Binary search 524316 Binary search 34152

This means that for example user 1 saw interface version3 first and then versions 1245 in that order In all variantshe worked with the Bubblesort algorithm

The user study yielded two major results

33 Results 17

bull Most users liked version 1 better than the other ones

bull Not a single user understood what the iterator is goodfor or what it might do

In more detail Three users (users 1 2 and 4) stated clearlythat they liked version 1 the most User 3 was not quite 3 users liked version

1 the most 1 userliked version 4 themost 1 user couldnot decide betweenversion 1 and 2 and1 user could notdecide betweenversion 1 and 3

sure whether he liked version 1 or version 2 more Hestated that a mix between the two variants where thereis no syntax highlighting would be the best because hegot confused the first time he saw version 1 and did notdirectly see that the code on the right-hand side of thewindow is not actually code but the evaluation User 6 wasnot sure if he preferred version 1 or version 3 He statedthat if there is enough space on the screen version 1 isbetter but with all the other elements of a typical IDE aline of code might oftentimes be too long so that there is noplace to show the line itself as well as its evaluated version(a concern that user 3 communicated as well) Withoutthis problem though user 6 said that version 3 is worsebecause the interface might get messy and distractingquickly because of all the pop-ups right next to the codeIn addition he said with more complex code it might beproblematic to see which pop-up refers to which part ofthe code Finally user 5 stated that he likes version 4 themost The reason was that he did not have to look to theright if he wanted to see the value of a variable and thensearch it there but that in version 5 the information wasexactly at its source He only said that this version mightbe problematic to work with since the reading flow is influ-enced negatively and he feared that he would loose muchspace on the screen and would therefore have to scroll a lot

Each user stated that the main benefit of version 1 and2 was that the actual code remained unaffected Thiscorresponds to what [DeLine et al 2006] reported on theperceptual model a programmer builds of the code When Perceptual model of

the code shouldremain unaffected

navigating through source code a programmer uses aperceptual model to do so This way the programmer doesnot need to read the code (cognitive) to know which part heis seeing but can do so by looking at its rdquovisual landmarksrdquo(perceptual) While version 3 affects this perceptual modelnegatively but does not break it version 4 and 5 have the

18 3 First User Study

disadvantage to do so

To most of the users the basic structure of the interfaceswas clear after only a few seconds Only two of themconfused actual code with the evaluation for a short timeUser 5 when he saw version 5 the first time and user 4when he saw version 1 the first time However neither ofthem needed help or much time to figure how the interfaceworked after they started to read more precisely Eachuserrsquos first question or comment concerned the font andhow awful they (every one) thought it was After a briefexplanation why we chose this particular font each usergot used to it quickly

No user understood the iterator None of them got theconnection to the while upto or for next to the iteratorIterator is not self

explaining After the explanation what the iterator does some userssaid they thought the left and right buttons were rdquoltrdquo andrdquogtrdquo signs

The following tables summarize the pros and contras foreach interface

Version 1Pros Contrasmiddot The control structure (in-dents) is still presentmiddot Because of the indentsand the syntax highlightingthere is a strong analogy be-tween the code and its eval-uated versionmiddot There is a clear separationbetween code and evalua-tionmiddot The editor remains un-changed

middot Needs much spacemiddot User has to look from theleft to the right and viceversa to see two pieces thatare correlated In otherwords a line of code and itsevaluated version are too faraway from each other

33 Results 19

Version 2Pros Contrasmiddot The editor remains un-changedmiddot Code and evaluation do notget confused because they donot look the samemiddot Needs less space than ver-sion 1

middot Less analogy than in ver-sion 1middot User needs to look fromthe left to the right and viceversa This is being compli-cated by the fact that with-out indents and syntax high-lighting it is even more diffi-cult to find the point of inter-estmiddot Needs a lot of space

Version 3Pros ContrasmiddotNeeds minimalistic amountof spacemiddot The code structure itself re-mains untouched

middot Oftentimes not clear whichpop-up refers to which partof the codemiddot Can get messy quickly

Version 4Pros Contrasmiddot Strong analogymiddot Better for lines of code thatare too long for version 1 and2middot Evaluations right next totheir source

middot Code and evaluation getconfused quickly especiallyusual reading might be prob-lematic since one has to skipevery second line dependingon what he wants to readmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

20 3 First User Study

Version 5Pros Contrasmiddot Evaluations right next totheir sourcemiddot Better for lines of code thatare too long for version 1 and2

middot Oftentimes it is not clearwhat a value refers tomiddot Still distracting during codereadingmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

331 Further Comments and Suggestions for Im-provement

One of the users pointed out that in version 3 the iteratorhas been inserted into the code and therefore it now isInterface version 3

can producesyntactically wrong

code

syntactically wrong (for i in 14 becomes for i=2in 14 for the second iteration for example)

Two users suggested that for version 4 and 5 curly bracescould indicate to which part of the code the evaluationrefers

Some users suggested to make the number on the iteratordirectly manipulable so that if one would want to go toSuggestions for

improving the iterator iteration 42 one would not have to press 41 times a buttonbut could just enter 42 In addition some users said theywanted a button for jumping to the first and a button forjumping to the last iteration Two users said it would begood if there is something that indicates directly that theiterator actually shows iterations like the term rdquoIteration rdquoor anything similar

We noticed that the manipulable parameters and theevaluation results have to be distinguishable more easily

33 Results 21

One user proposed that if the cursor moves over somevariable in the evaluation part a small pop-up couldindicate to which variable a value belongs

One user said that it might be helpful that for biggerboolean expressions that fail it would be visible which partof the expression is responsible for the failure

332 Conclusions

With large screens (24rdquo and more) becoming more andmore common in modern work environments of program- Version 1 is the

favored onemers we can override the concerns about loosing too muchspace with version 1 which is the reason why it is the clearfavorite Figure 36 shows the basic interface after the firstuser study

The iterator has been improved in several ways

bull New buttons for jumping to the first and the last iter-ation have been added

bull There is a text field with which the iteration can bemanipulated directly

bull A tooltip now indicates the variable to which avalue belongs when hovering over the value with themouse

bull The label Iteration has been added at the top of theiterator

The way of manipulating the parameters has been changedThe function header now also is copied to the right side A new way of

manipulating theparameters

but the names of the parameters are replaced with a textfield in which the according value can be entered This wayboth the analogy to the code and the separation betweenparameters and evaluation are being enhanced

22 3 First User Study

(a) First iteration

(b) Second iteration

Figure 31 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

33 Results 23

Figure 32 Version 2 The evaluated version of the code ison the right-hand side but is ranged right

Figure 33 Version 3 Small pop-ups indicate the value of avariable

24 3 First User Study

Figure 34 Version 4 The evaluated version of a line isinserted right behind the line itself

33 Results 25

Figure 35 Version 5 Only the values of the variables arein between the lines

26 3 First User Study

Figure 36 Interface after the first user study

27

Chapter 4

Implementation

After the first user study had been finished and the resultshad been analyzed a software prototype with sufficient A software prototype

had to beimplemented

functionality had to be implemented to conduct the seconduser study Hence we wrote an extension for Brackets1 anopen source editor that has recently been released officially

The language we wrote the extension for was JavaScriptWhat we needed was an extension that could calculate thevalues of the variables that had to been substituted

41 First Approach (discharged)

The initial idea was to write an extension that connects viaV8-Node2 (an extension that maintains a connection to a Initial idea Use

V8-Node an Nodejsto get the locals

Nodejs3 V8 debugger) to an external debugger and lets itevaluate the code Our extension was supposed to supplythe actual code to the debugger and set a breakpoint in thefirst line of actual code Then it read the locals (that is thelocal values for all active variables) and stepped over to the

1httpsgithubcomadobebrackets2httpsgithubcomDennisKehrigbrackets-v8-node-

livetreenodeChildProcess3httpnodejsorg

28 4 Implementation

next line After all locals had been collected the extensionshould substitute all necessary values There were prob-lems with this approach Firstly the extension is not stableThis extension was

not usable for a userstudy

enough to be usable in a user study Secondly it is too slowIt needs more than half a second for one evaluation Thisis way too much since we wanted something that couldevaluate the code as the user edits it without any furtherdelay (hence the name rdquoLive Codingrdquo) With this extensionit would only have been possible to update the evaluationusing a special shortcut or to attach this function to anothershortcut like the Ctrl+S shortcut for saving Since thiswas by far not the optimal way we decided to start all overagain

42 Basic Evaluation Technique

In the second approach we did not try to externalize thescript execution but instead made vast use of JavaScriptrsquosbuilt-in eval function The eval function takes a StringSecond approach

using eval toevaluate dynamicallygenerated code thatreturns the value of

interest

as parameter If this String contains valid JavaScript codeeval will evaluate the code and return the last executedstatement eval(var i=1i=i+5) will return 6 forexample The extension basically works as follows Aloop that runs over all DOM-elements in the highlightedcode searches for substitutable variables While searchingfor these elements the loop captures in which contextthe actual DOM-element is The three basic contexts aresimple substitutions substitutions within one loop andsubstitutions within a nested loop (why this is will beexplained below) But there are also other things the loopchecks for For simple assignments like i=x+y x andy have to be substituted while i does not In lines wherea loop begins the iterator has to be appended invariantshave to be treated differently et cetera

Note that right-hand sides of assignments are handledslightly different from what has been presented in theprototypes of the first user study They do not get evalu-ated to the last step and invariants do not get evaluated

43 How Values Are Being Calculated 29

to the penultimate step anymore Both of them get onlyevaluated one step For i=1 and z=2 the right-hand sideof x=i+z does not get evaluated to 3 but to 1+2 Wedecided to do it this way because some users in the firstuser study stated that they did not only want to see whatthe result of an operation is but also how it got there

When a substitutable variable has been found its value willbe computed For that the eval function is used (see sec-tion 43 for details) and the value will be inserted Thiswhole process takes place each time there is a change in thecode or if one of the parameters gets changed If another it- How the extension

reacts to changeseration of a loop that is being shown is selected all valuesinside that loop will be calculated again

43 How Values Are Being Calculated

Depending on the context of the variable a piece ofcode will be generated (and then evaluated) that whenevaluated with eval returns the value of the variable ofinterest For simple substitutions outside of any loops thisworks like follows

Suppose the following code had to be evaluated for x=0

function foo(x)var i=0var j=x+1

Then x in the third line would have to be evaluated To do How the values arecalculated for simplesubstitutions

so the following String would be generatedvar x=0var i=0xSo for simple substitutions the whole code preceding thevariable that has to be substituted is copied and each pa-rameter is initialized as var with its exemplary value at the

30 4 Implementation

beginning

For simple loops this method has to be extended First ithas to be found out how often the loop is executed ThisSubstitutions within

one loop serves two purposes First the iterator now shows thenumber of iterations of the loop Second we have to checkif the user wants to see an iteration that does not exist Thisis being done by introducing a counter that is incrementedin each iteration of the loop and then returned when theloop terminated The following example shows how thevalue of a variable within a loop is being calculated

function test()var i=0var j=0while (ilt5)

i++j=j+2

Suppose we want to obtain the value for j (highlightedin red) in the second iteration of the loop Then theevaluation basically works as follows

var myCounter=0var myResultvar i=0var j=0while (ilt5)

myCounter++i++if(myCounter==2)myResult=j breakj=j+2

myResult

A counter gets inserted to check which iteration the loop is

44 Problems 31

in If it is in the right iteration the assignment myResult=jtakes place so that eval can return the according valueFor a loop within a loop two counters have to be inserted Other casesand both of them have to be checked accordingly Thereare further variants of contexts like the different parts of afor(firstPartsecondPartthirdPart) statementThe first part gets only executed once at the beginningwhile the second part gets evaluated one time more thanthe rest of the loop (the third part does not get evaluated)For all cases the dynamic generation of the code that re-turns the value of interest works similar to the techniquespresented

Obviously this is a brute-force approach For each valuethat has to be calculated all preceding code has to be eval-uated We used this approach anyway since we spent quite Summary Very

inefficient limitedfunctionality stillcompletely sufficientfor the user study

some time developing the first version of the extension (us-ing V8-Node) and were under considerable time pressureThe technique we chose is simple and fast to develop Fur-thermore the prototype only had to work for the seconduser study which is the reason why the limited function-ality (loop nesting depth of 2 for example) was sufficientAnd finally since the tasks in the second user study werenot very expensive with regards to calculation costs eventhis very inefficient version of the extension ran live with-out any delays Some tests showed that it would even havebeen quick enough to operate with hundreds of iterationsmore without any delay and with some few thousand iter-ations more without dispatching the interface

44 Problems

The biggest problem we encountered were infinite loopsBecause the interface runs completely live and tries to The Halting problemevaluate the whole code while the user is editing it infiniteloops may occur while the user is typing Additionallyit might of course be possible that there actually is aninfinite loop even though the user thinks the code iscorrect Without modification the interface crashed in

32 4 Implementation

either of these cases To prevent this from happeningtwo minor modifications had to be done Firstly if wewant to calculate the number of iterations a loop doeswe check whether the counter we use for that calculationis still below a certain limit (we used 999 here becauseit proved to be sufficient and efficient enough to runlive) Now the Halting Problem for this particular loop isapproximated The problem of the infinite loop still existsthough Suppose we have the following code

var x=0var i=0while(x=10)

i++x=i

We now know that the loop does more then 999 iterationsso we do not try to calculate the value for x in the invariantSubstitutions after

infinite loops or for i within the loop for any further iterations The onlyproblem that remains is that we still try to get the value fori in the last line When we try to calculate this value wedo not yet know that there is an infinite loop somewhereabove To find this out a new global counter gets insertedand limited to 5000 iterations

var globalCounter=0var x=0var i=0while(x=10)

if(globalCounter++ gt= 5000)breaki++

i

This way we could keep the incrementation and the checkwhether there are more then 5000 iterations within onestatement (two statements are too many since after a

44 Problems 33

while without curly braces only the first statement wouldbe executed) The incrementation and test get inserted intoevery loop ahead of the position we are checking for Itwould of course have been possible to set a flag if there isan infinite loop somewhere ahead but this way we alsohad the possibility to limit the overall number of iterationsand still get a value The second user study showed that Flaw in the

implementationthis was the wrong approach A user introduced an infiniteloop in a sorting algorithm The condition for the loop wasnever violated although the array was sorted With thetechnique of assigning a value to a variable although thelimitation of iterations has been hit the sorted array wasshown in the return statement Since the user only paidattention to the output he did not see that the number ofiterations of the loop was 999 and thought that the code iscorrect

Another minor problem was that of course while the useris typing the extension tries to evaluate code that is syntac- Suppress errors so

that user does notget disturbed

tically not correct The solution is to set windowonerrorto a function that always returns true This way errorsget just suppressed and the execution stops If there is anew change in the editor the execution will start again andwork exactly the same way so that working code always isexecuted and not working code is not

The last problem concerned hiding code that does notget evaluated For loops this could easily be done If Hiding code within

failed if statementsthe loop is in its very last iteration (the invariant fails)the rest of the loop has to be hidden otherwise it doesnot Finding out if an if statement fails (and hence theaccording code has to be hidden) however was moredifficult because we had not enough information to be ableto do so (remember we only compute the first step whichincludes nothing but substituting all variables) To find outif an if statement holds or not we do the following allvariables in the invariant get substituted and then we leteval evaluate just the invariant For if(xlty) with x=5and y=6 we substitute all values if(5lt6) and then calleval((5lt6))

34 4 Implementation

Figure 41 Screenshot showing Brackets with our extension and a dummy script

Finally Figure 41 shows how the prototype for our seconduser study looks like

35

Chapter 5

Second User Study

51 Course of the User Study

To find out whether our extension brings significantimprovement a second quantitative user study with 20 Hypothesis

Compared to usualdebugging methodsour technique bringssignificantimprovement

participants has been conducted The hypothesis was thatthe debugging speed (in this case the time between the userreading the code the first time and fixing it) compared toa usual debugging engine could significantly be reducedwith our extension If this hypothesis would be supportedby the results of the user study this would indicate that theunderstanding speed of foreign code would be increasedby our method

No user had worked with Brackets or JavaScript before(notwithstanding that all of them had worked with Javabefore) Users were Computer Science students with 4-9(average of 58) years of programming experience Onlythree of them had professional experience (2-4 years) Allusers took part voluntarily and were not rewarded in anyway

To test the hypothesis two searching algorithms (animplementation of Shakersort and an implementation ofBubblesort) have been implemented and a small bug has

36 5 Second User Study

been introduced in either of them (see Appendix A) Thetask of the user was to find both bugs - one time with2 algorithms 1 bug

each 2 tasks debugone time with

Bracketsrsquo tools andone time with our

extension

our extension and one time using the traditional way(that is using the console breakpoints variable viewerweb kit inspector and the step over functionality Bracketsprovides) Each user received an intro to the function-ality of both our extension and the Brackets debuggingengine For both a small dummy script has been shownwith which it has been explained how each part of thefunctionality works until the user confirmed that he or sheunderstood every part of the functionality The task forthe user was to find the bug in the code (users have beentold that there is exactly one bug) fix it and give noticewhen he or she thought that the code is correct (task tocondition assignment and task order were counterbalancedto eliminate side effects)

After the two debugging tasks had been accomplishedeach user filled out two versions of the System UsabilitySystem Usability

Scale to find outwhether users liked

working with thesystem

Scale [Brooke 1996] (SUS) one for our extension and onefor the standard debugging Brackets provides The first10 questions are identical to the initial version that hasbeen presented in 1996 except that the word rdquocumbersomerdquoin point 8 has been changed to rdquoawkwardrdquo as suggestedby [Finstad 2006] For the extension the following ques-tions have been added to the questionnaire

bull I found the additional information very helpful formy task

bull I was distracted by the additional information

bull I found it very easy to follow the control flow of theprogram

bull I found complex control structures hard to under-stand

For the debugging using Brackets the following two ques-tions have been added

bull I found it very easy to follow the control flow of theprogram

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 16: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

9

has been implemented properly helps reducing wasteddevelopment time

Finally our approach concurs with the idea of Opportunis-tic Programming [Brandt et al 2008] [Brandt et al 2009a][Brandt et al 2009b] Brandt et al say that frequent itera-tion is a necessary part of learning and understanding un-familiar code Therefore an opportunistic programmer will Opportunistic

programmers choosetools that speed upiteration

select a tool that speeds up iteration They also state that theEdit-Debug cycles in opportunistic programming are veryshort About 50 of these cycles are shorter than 30 sec-onds and even 80 are shorter than 5 minutes Our toolwould help reducing the resulting overhead since the out-come is instantly visible without having to change modesor compile and execute

11

Chapter 3

First User Study

31 The Prototypes

The main goal of our first user study was to find out whichbasic interface design is the optimal one especially with Goal of the first user

studyregards to the link between the code and the evaluatedversion of the code It should work in a way such that theprogrammer does not get disturbed in his usual work butthat the interface still provides as much information asaccessible as possible

To find out how such an interface would look like we im-plemented low fidelity prototypes for 5 different interfacesin Ruby Basically these prototypes consisted of pictures 5 low fidelity

prototypes for Rubyhave beenimplemented

that show how the interface would look like for some ex-emplary code with exemplary parameters Each interfacehas been illustrated using 5 pieces of code as example (Fig-ures 31-35) Note The font used in all prototypes was sup-posed to emphasize that these prototypes are low fidelityso that the users had not inhibitions telling us fundamentalflaws in the interface design

12 3 First User Study

311 Interface Version 1

Figure 31 shows the first interface with an implementationof the binary search algorithm In this version of theinterface an evaluated version of the code is inserted in anadditional section in the right half of the window At thetop the exemplary parameters can be manipulated Thenin the line high=arraylength-1 the first evaluationtakes place On the right-hand side of the window theright-hand side of the assignment has been replaced bythe according evaluation 5 Afterwards all values are re-Interface variant 1

divides theenvironment into twoThe left-hand side is

a usual editor andthe right-hand side isthe evaluated version

of the code

placed in a similar way That is in simple assignments theright-hand side is substituted completely and in invariantsthe condition has been computed up to the penultimatestep This way eg while(low lt= high) gets evaluatedto while(0 lt= 5) instead of while(true)

EXCURSUSWe also thought about showing what would happen in-side an if statement even if it fails so that the user couldsee what the code could do alternatively We dischargedthis idea however because of two problems Firstly wesuspected that this might quickly become a confusing ex-perience for the user Secondly (and most importantly)the Halting Problem would have to be decidable Sup-pose the following code had to be evaluatedif(xgt0)while(x=0)x--else while(x=0)x++Then the interface would fall into an infinite loop (for anyvalue of x except 0) although the code would not Forx=1 for example the actual code would only make oneiteration in the first loop The interface however wouldtry to calculate the values of the second loop too

Further down in Figure 31(a) there is the following ele-ment

1This element will subsequently be called iterator Theiterator indicates which iteration of a loop is being shownPressing the right arrow increments the shown iteration(Figure 31(b)) pressing the left arrow decrements itaccordingly Furthermore the code for the elsif and theelse statements is not visible on the right-hand side of the

31 The Prototypes 13

window The reason is that the first if statement holdsand therefore the following ones are not executed InFigure 31(b) the first if statement failed (greyed out) thesecond one holds and the third one is hidden again Thisway all code that gets executed is shown (corresponding toJonathan Edwards who said that as much code as possibleshould be rdquoilluminatedrdquo[Edwards 2004])

312 Interface Version 2

Figure 32 shows the second version of the interface withan in place implementation of the selection sort algorithm(note that each interface variant has been implemented Interface variant 2

copies a nothighlightedevaluated version ofthe code to the rightpart of the window

with each algorithm) The differences to the previous in-terface variant are

bull The parameter names are red to signal manipulability

bull No syntax highlighting

bull Evaluated code has no indents and is right aligned

bull No dividing line

EXCURSUSNote that the iterator in this interface looks like follows

i=1This design reveals what variable the iterator refers toThis kind of iterator does not depend on the type of in-terface but on the type of loop In for and upto loopsthis kind of iterator has been used since it is possible tocorrelate the values of one variable and the iterations ofa loop In the later prototypes however we dischargedthis idea again because it only was this easy to correlatea variable and the iterations of one loop because the ex-amples were quite simple In realistic situations this cor-relation will oftentimes not be possible

This version was supposed to take up as little space aspossible while still having a complete (evaluated) versionof the code in a different place

14 3 First User Study

313 Interface Version 3

Figure 33 shows version 3 of the interface with an im-plementation of Bubblesort Small pop-ups indicate theInterface variant 3

uses small pop-upsthat indicate the

value of a statement

value of all statements that are evaluable For examplethe (manipulable and therefore red) value of array is[1423716] and the value of arraylength-2 is 2There are no pop-ups in the code when it is not executedand failed invariants are greyed out The iterator workslike the one in the example for version 2 of the interface butis now inserted into the code This interfacersquos purpose wasto consume as little space as possible while still providingall possible evaluations

314 Interface Version 4

Figure 34 shows interface version 4 with an algorithm thatconverts the fractional part of a decimal number into its bi-Interface variant 4

puts the evaluatedversion of a line

between that line andthe next one

nary representation and outputs the result to the consoleThis interface variant slots the evaluated version of a lineof code right between that line and the next one The nameof manipulable parameters is highlighted in red and thereis no syntax highlighting Failed invariants are greyed outand not executed code is not displayed as usual The goalof this interface version was to find out if the link betweencode and evaluation becomes clearer when the evaluationis very close and very similar to the code it results from

315 Interface Version 5

The last variant of the interface can be seen in Figure 35with an algorithm that checks whether a given number nInterface variant 5

indicates the value ofa statement by

writing it underneaththat statement

without keepingkeywords

is a prime number or not It is similar to the previous onebut only shows rdquorelevantrdquo information This includes right-hand sides of equations and for conditions the actual in-variant but excludes left-hand sides of equations and key-words like else or while Apart from this version 4 and5 are exactly the same (parameters in red no syntax high-lighting not executed code is not shown failed invariants

32 Course of the User Study 15

are greyed out) We developed this variant because in ver-sion 4 the user might confuse actual code with evaluatedcode We wanted to check if a variant without replicatingcontrol structures is better (less confusing)

32 Course of the User Study

The first user study was a qualitative user study It wassupposed to reveal what basic interface design is preferredby users and where there might be flaws in the interfacesDifferent techniques for qualitative user studies have beenused in particular the Model Extraction method and theThink Aloud method The basic structure was the following

bull First the experimenter introduced himself and theproject He assured the participant that he would Introduction to the

user studystay anonymous and that he could ask questions ormake comments whenever he likes To make sure theModel Extraction worked fine the part concerningwhat the thesis and the user study where about wasvery firm The solution we have come up with wasto explain the general idea of Live Coding (that weare designing a kind of editor that takes exemplaryparameters for a method the programmer works onand then evaluates it continuously while changes arebeing made) without revealing anything further likerdquoand then there is going to be a button with whichyou can flick from iteration to iterationrdquo

bull Afterwards one version of the interface was pre-sented to the user together with the following ques- Model Extractiontions rdquoWhat do you think you see hererdquo and rdquoWhatdo you think you can do with itrdquo

bull After the user told everything he or she knew or if theuser had a gulf of understanding [Norman 2002] the Explanation of the

interfaceparts of the interface which the user did not under-stand were explained and we went through all itera-tions once to make clear how the interface works

bull The next goal was to find out if the user had prob- Suggestions forimprovementlems in understanding elements of the interface if he

16 3 First User Study

thinks that there are things that can be done better etcetera

bull Lastly the other interface variants were shown to theuser with the same piece of code and it was explainedGoing through the

remaining interfacevariants

how they worked how they differ from the previousones and we went through all iterations once so thatthe user could get a feeling for the interface Aftereach variant the user was asked what he thought ofthe interface (and why) and what he thought was bet-ter or worse compared to the previous ones

Whenever the user had a question or there was an inter-esting topic that lead to more interesting information thispoint was discussed in more depth

33 Results

The user study was conducted with 6 different users All ofthem studied Computer Science were in the sixth semesterand had between 4 and 10 years of programming experi-ence Only one of them had worked with Ruby before Allusers took part voluntarily and were not rewarded in anyway The conditions were randomized and looked like fol-lows

User ID Algorithm Order of versions1 Bubblesort 312452 Selection sort 453213 Fractional part10 rarr Fractional part2 253414 Prime number test 135425 Binary search 524316 Binary search 34152

This means that for example user 1 saw interface version3 first and then versions 1245 in that order In all variantshe worked with the Bubblesort algorithm

The user study yielded two major results

33 Results 17

bull Most users liked version 1 better than the other ones

bull Not a single user understood what the iterator is goodfor or what it might do

In more detail Three users (users 1 2 and 4) stated clearlythat they liked version 1 the most User 3 was not quite 3 users liked version

1 the most 1 userliked version 4 themost 1 user couldnot decide betweenversion 1 and 2 and1 user could notdecide betweenversion 1 and 3

sure whether he liked version 1 or version 2 more Hestated that a mix between the two variants where thereis no syntax highlighting would be the best because hegot confused the first time he saw version 1 and did notdirectly see that the code on the right-hand side of thewindow is not actually code but the evaluation User 6 wasnot sure if he preferred version 1 or version 3 He statedthat if there is enough space on the screen version 1 isbetter but with all the other elements of a typical IDE aline of code might oftentimes be too long so that there is noplace to show the line itself as well as its evaluated version(a concern that user 3 communicated as well) Withoutthis problem though user 6 said that version 3 is worsebecause the interface might get messy and distractingquickly because of all the pop-ups right next to the codeIn addition he said with more complex code it might beproblematic to see which pop-up refers to which part ofthe code Finally user 5 stated that he likes version 4 themost The reason was that he did not have to look to theright if he wanted to see the value of a variable and thensearch it there but that in version 5 the information wasexactly at its source He only said that this version mightbe problematic to work with since the reading flow is influ-enced negatively and he feared that he would loose muchspace on the screen and would therefore have to scroll a lot

Each user stated that the main benefit of version 1 and2 was that the actual code remained unaffected Thiscorresponds to what [DeLine et al 2006] reported on theperceptual model a programmer builds of the code When Perceptual model of

the code shouldremain unaffected

navigating through source code a programmer uses aperceptual model to do so This way the programmer doesnot need to read the code (cognitive) to know which part heis seeing but can do so by looking at its rdquovisual landmarksrdquo(perceptual) While version 3 affects this perceptual modelnegatively but does not break it version 4 and 5 have the

18 3 First User Study

disadvantage to do so

To most of the users the basic structure of the interfaceswas clear after only a few seconds Only two of themconfused actual code with the evaluation for a short timeUser 5 when he saw version 5 the first time and user 4when he saw version 1 the first time However neither ofthem needed help or much time to figure how the interfaceworked after they started to read more precisely Eachuserrsquos first question or comment concerned the font andhow awful they (every one) thought it was After a briefexplanation why we chose this particular font each usergot used to it quickly

No user understood the iterator None of them got theconnection to the while upto or for next to the iteratorIterator is not self

explaining After the explanation what the iterator does some userssaid they thought the left and right buttons were rdquoltrdquo andrdquogtrdquo signs

The following tables summarize the pros and contras foreach interface

Version 1Pros Contrasmiddot The control structure (in-dents) is still presentmiddot Because of the indentsand the syntax highlightingthere is a strong analogy be-tween the code and its eval-uated versionmiddot There is a clear separationbetween code and evalua-tionmiddot The editor remains un-changed

middot Needs much spacemiddot User has to look from theleft to the right and viceversa to see two pieces thatare correlated In otherwords a line of code and itsevaluated version are too faraway from each other

33 Results 19

Version 2Pros Contrasmiddot The editor remains un-changedmiddot Code and evaluation do notget confused because they donot look the samemiddot Needs less space than ver-sion 1

middot Less analogy than in ver-sion 1middot User needs to look fromthe left to the right and viceversa This is being compli-cated by the fact that with-out indents and syntax high-lighting it is even more diffi-cult to find the point of inter-estmiddot Needs a lot of space

Version 3Pros ContrasmiddotNeeds minimalistic amountof spacemiddot The code structure itself re-mains untouched

middot Oftentimes not clear whichpop-up refers to which partof the codemiddot Can get messy quickly

Version 4Pros Contrasmiddot Strong analogymiddot Better for lines of code thatare too long for version 1 and2middot Evaluations right next totheir source

middot Code and evaluation getconfused quickly especiallyusual reading might be prob-lematic since one has to skipevery second line dependingon what he wants to readmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

20 3 First User Study

Version 5Pros Contrasmiddot Evaluations right next totheir sourcemiddot Better for lines of code thatare too long for version 1 and2

middot Oftentimes it is not clearwhat a value refers tomiddot Still distracting during codereadingmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

331 Further Comments and Suggestions for Im-provement

One of the users pointed out that in version 3 the iteratorhas been inserted into the code and therefore it now isInterface version 3

can producesyntactically wrong

code

syntactically wrong (for i in 14 becomes for i=2in 14 for the second iteration for example)

Two users suggested that for version 4 and 5 curly bracescould indicate to which part of the code the evaluationrefers

Some users suggested to make the number on the iteratordirectly manipulable so that if one would want to go toSuggestions for

improving the iterator iteration 42 one would not have to press 41 times a buttonbut could just enter 42 In addition some users said theywanted a button for jumping to the first and a button forjumping to the last iteration Two users said it would begood if there is something that indicates directly that theiterator actually shows iterations like the term rdquoIteration rdquoor anything similar

We noticed that the manipulable parameters and theevaluation results have to be distinguishable more easily

33 Results 21

One user proposed that if the cursor moves over somevariable in the evaluation part a small pop-up couldindicate to which variable a value belongs

One user said that it might be helpful that for biggerboolean expressions that fail it would be visible which partof the expression is responsible for the failure

332 Conclusions

With large screens (24rdquo and more) becoming more andmore common in modern work environments of program- Version 1 is the

favored onemers we can override the concerns about loosing too muchspace with version 1 which is the reason why it is the clearfavorite Figure 36 shows the basic interface after the firstuser study

The iterator has been improved in several ways

bull New buttons for jumping to the first and the last iter-ation have been added

bull There is a text field with which the iteration can bemanipulated directly

bull A tooltip now indicates the variable to which avalue belongs when hovering over the value with themouse

bull The label Iteration has been added at the top of theiterator

The way of manipulating the parameters has been changedThe function header now also is copied to the right side A new way of

manipulating theparameters

but the names of the parameters are replaced with a textfield in which the according value can be entered This wayboth the analogy to the code and the separation betweenparameters and evaluation are being enhanced

22 3 First User Study

(a) First iteration

(b) Second iteration

Figure 31 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

33 Results 23

Figure 32 Version 2 The evaluated version of the code ison the right-hand side but is ranged right

Figure 33 Version 3 Small pop-ups indicate the value of avariable

24 3 First User Study

Figure 34 Version 4 The evaluated version of a line isinserted right behind the line itself

33 Results 25

Figure 35 Version 5 Only the values of the variables arein between the lines

26 3 First User Study

Figure 36 Interface after the first user study

27

Chapter 4

Implementation

After the first user study had been finished and the resultshad been analyzed a software prototype with sufficient A software prototype

had to beimplemented

functionality had to be implemented to conduct the seconduser study Hence we wrote an extension for Brackets1 anopen source editor that has recently been released officially

The language we wrote the extension for was JavaScriptWhat we needed was an extension that could calculate thevalues of the variables that had to been substituted

41 First Approach (discharged)

The initial idea was to write an extension that connects viaV8-Node2 (an extension that maintains a connection to a Initial idea Use

V8-Node an Nodejsto get the locals

Nodejs3 V8 debugger) to an external debugger and lets itevaluate the code Our extension was supposed to supplythe actual code to the debugger and set a breakpoint in thefirst line of actual code Then it read the locals (that is thelocal values for all active variables) and stepped over to the

1httpsgithubcomadobebrackets2httpsgithubcomDennisKehrigbrackets-v8-node-

livetreenodeChildProcess3httpnodejsorg

28 4 Implementation

next line After all locals had been collected the extensionshould substitute all necessary values There were prob-lems with this approach Firstly the extension is not stableThis extension was

not usable for a userstudy

enough to be usable in a user study Secondly it is too slowIt needs more than half a second for one evaluation Thisis way too much since we wanted something that couldevaluate the code as the user edits it without any furtherdelay (hence the name rdquoLive Codingrdquo) With this extensionit would only have been possible to update the evaluationusing a special shortcut or to attach this function to anothershortcut like the Ctrl+S shortcut for saving Since thiswas by far not the optimal way we decided to start all overagain

42 Basic Evaluation Technique

In the second approach we did not try to externalize thescript execution but instead made vast use of JavaScriptrsquosbuilt-in eval function The eval function takes a StringSecond approach

using eval toevaluate dynamicallygenerated code thatreturns the value of

interest

as parameter If this String contains valid JavaScript codeeval will evaluate the code and return the last executedstatement eval(var i=1i=i+5) will return 6 forexample The extension basically works as follows Aloop that runs over all DOM-elements in the highlightedcode searches for substitutable variables While searchingfor these elements the loop captures in which contextthe actual DOM-element is The three basic contexts aresimple substitutions substitutions within one loop andsubstitutions within a nested loop (why this is will beexplained below) But there are also other things the loopchecks for For simple assignments like i=x+y x andy have to be substituted while i does not In lines wherea loop begins the iterator has to be appended invariantshave to be treated differently et cetera

Note that right-hand sides of assignments are handledslightly different from what has been presented in theprototypes of the first user study They do not get evalu-ated to the last step and invariants do not get evaluated

43 How Values Are Being Calculated 29

to the penultimate step anymore Both of them get onlyevaluated one step For i=1 and z=2 the right-hand sideof x=i+z does not get evaluated to 3 but to 1+2 Wedecided to do it this way because some users in the firstuser study stated that they did not only want to see whatthe result of an operation is but also how it got there

When a substitutable variable has been found its value willbe computed For that the eval function is used (see sec-tion 43 for details) and the value will be inserted Thiswhole process takes place each time there is a change in thecode or if one of the parameters gets changed If another it- How the extension

reacts to changeseration of a loop that is being shown is selected all valuesinside that loop will be calculated again

43 How Values Are Being Calculated

Depending on the context of the variable a piece ofcode will be generated (and then evaluated) that whenevaluated with eval returns the value of the variable ofinterest For simple substitutions outside of any loops thisworks like follows

Suppose the following code had to be evaluated for x=0

function foo(x)var i=0var j=x+1

Then x in the third line would have to be evaluated To do How the values arecalculated for simplesubstitutions

so the following String would be generatedvar x=0var i=0xSo for simple substitutions the whole code preceding thevariable that has to be substituted is copied and each pa-rameter is initialized as var with its exemplary value at the

30 4 Implementation

beginning

For simple loops this method has to be extended First ithas to be found out how often the loop is executed ThisSubstitutions within

one loop serves two purposes First the iterator now shows thenumber of iterations of the loop Second we have to checkif the user wants to see an iteration that does not exist Thisis being done by introducing a counter that is incrementedin each iteration of the loop and then returned when theloop terminated The following example shows how thevalue of a variable within a loop is being calculated

function test()var i=0var j=0while (ilt5)

i++j=j+2

Suppose we want to obtain the value for j (highlightedin red) in the second iteration of the loop Then theevaluation basically works as follows

var myCounter=0var myResultvar i=0var j=0while (ilt5)

myCounter++i++if(myCounter==2)myResult=j breakj=j+2

myResult

A counter gets inserted to check which iteration the loop is

44 Problems 31

in If it is in the right iteration the assignment myResult=jtakes place so that eval can return the according valueFor a loop within a loop two counters have to be inserted Other casesand both of them have to be checked accordingly Thereare further variants of contexts like the different parts of afor(firstPartsecondPartthirdPart) statementThe first part gets only executed once at the beginningwhile the second part gets evaluated one time more thanthe rest of the loop (the third part does not get evaluated)For all cases the dynamic generation of the code that re-turns the value of interest works similar to the techniquespresented

Obviously this is a brute-force approach For each valuethat has to be calculated all preceding code has to be eval-uated We used this approach anyway since we spent quite Summary Very

inefficient limitedfunctionality stillcompletely sufficientfor the user study

some time developing the first version of the extension (us-ing V8-Node) and were under considerable time pressureThe technique we chose is simple and fast to develop Fur-thermore the prototype only had to work for the seconduser study which is the reason why the limited function-ality (loop nesting depth of 2 for example) was sufficientAnd finally since the tasks in the second user study werenot very expensive with regards to calculation costs eventhis very inefficient version of the extension ran live with-out any delays Some tests showed that it would even havebeen quick enough to operate with hundreds of iterationsmore without any delay and with some few thousand iter-ations more without dispatching the interface

44 Problems

The biggest problem we encountered were infinite loopsBecause the interface runs completely live and tries to The Halting problemevaluate the whole code while the user is editing it infiniteloops may occur while the user is typing Additionallyit might of course be possible that there actually is aninfinite loop even though the user thinks the code iscorrect Without modification the interface crashed in

32 4 Implementation

either of these cases To prevent this from happeningtwo minor modifications had to be done Firstly if wewant to calculate the number of iterations a loop doeswe check whether the counter we use for that calculationis still below a certain limit (we used 999 here becauseit proved to be sufficient and efficient enough to runlive) Now the Halting Problem for this particular loop isapproximated The problem of the infinite loop still existsthough Suppose we have the following code

var x=0var i=0while(x=10)

i++x=i

We now know that the loop does more then 999 iterationsso we do not try to calculate the value for x in the invariantSubstitutions after

infinite loops or for i within the loop for any further iterations The onlyproblem that remains is that we still try to get the value fori in the last line When we try to calculate this value wedo not yet know that there is an infinite loop somewhereabove To find this out a new global counter gets insertedand limited to 5000 iterations

var globalCounter=0var x=0var i=0while(x=10)

if(globalCounter++ gt= 5000)breaki++

i

This way we could keep the incrementation and the checkwhether there are more then 5000 iterations within onestatement (two statements are too many since after a

44 Problems 33

while without curly braces only the first statement wouldbe executed) The incrementation and test get inserted intoevery loop ahead of the position we are checking for Itwould of course have been possible to set a flag if there isan infinite loop somewhere ahead but this way we alsohad the possibility to limit the overall number of iterationsand still get a value The second user study showed that Flaw in the

implementationthis was the wrong approach A user introduced an infiniteloop in a sorting algorithm The condition for the loop wasnever violated although the array was sorted With thetechnique of assigning a value to a variable although thelimitation of iterations has been hit the sorted array wasshown in the return statement Since the user only paidattention to the output he did not see that the number ofiterations of the loop was 999 and thought that the code iscorrect

Another minor problem was that of course while the useris typing the extension tries to evaluate code that is syntac- Suppress errors so

that user does notget disturbed

tically not correct The solution is to set windowonerrorto a function that always returns true This way errorsget just suppressed and the execution stops If there is anew change in the editor the execution will start again andwork exactly the same way so that working code always isexecuted and not working code is not

The last problem concerned hiding code that does notget evaluated For loops this could easily be done If Hiding code within

failed if statementsthe loop is in its very last iteration (the invariant fails)the rest of the loop has to be hidden otherwise it doesnot Finding out if an if statement fails (and hence theaccording code has to be hidden) however was moredifficult because we had not enough information to be ableto do so (remember we only compute the first step whichincludes nothing but substituting all variables) To find outif an if statement holds or not we do the following allvariables in the invariant get substituted and then we leteval evaluate just the invariant For if(xlty) with x=5and y=6 we substitute all values if(5lt6) and then calleval((5lt6))

34 4 Implementation

Figure 41 Screenshot showing Brackets with our extension and a dummy script

Finally Figure 41 shows how the prototype for our seconduser study looks like

35

Chapter 5

Second User Study

51 Course of the User Study

To find out whether our extension brings significantimprovement a second quantitative user study with 20 Hypothesis

Compared to usualdebugging methodsour technique bringssignificantimprovement

participants has been conducted The hypothesis was thatthe debugging speed (in this case the time between the userreading the code the first time and fixing it) compared toa usual debugging engine could significantly be reducedwith our extension If this hypothesis would be supportedby the results of the user study this would indicate that theunderstanding speed of foreign code would be increasedby our method

No user had worked with Brackets or JavaScript before(notwithstanding that all of them had worked with Javabefore) Users were Computer Science students with 4-9(average of 58) years of programming experience Onlythree of them had professional experience (2-4 years) Allusers took part voluntarily and were not rewarded in anyway

To test the hypothesis two searching algorithms (animplementation of Shakersort and an implementation ofBubblesort) have been implemented and a small bug has

36 5 Second User Study

been introduced in either of them (see Appendix A) Thetask of the user was to find both bugs - one time with2 algorithms 1 bug

each 2 tasks debugone time with

Bracketsrsquo tools andone time with our

extension

our extension and one time using the traditional way(that is using the console breakpoints variable viewerweb kit inspector and the step over functionality Bracketsprovides) Each user received an intro to the function-ality of both our extension and the Brackets debuggingengine For both a small dummy script has been shownwith which it has been explained how each part of thefunctionality works until the user confirmed that he or sheunderstood every part of the functionality The task forthe user was to find the bug in the code (users have beentold that there is exactly one bug) fix it and give noticewhen he or she thought that the code is correct (task tocondition assignment and task order were counterbalancedto eliminate side effects)

After the two debugging tasks had been accomplishedeach user filled out two versions of the System UsabilitySystem Usability

Scale to find outwhether users liked

working with thesystem

Scale [Brooke 1996] (SUS) one for our extension and onefor the standard debugging Brackets provides The first10 questions are identical to the initial version that hasbeen presented in 1996 except that the word rdquocumbersomerdquoin point 8 has been changed to rdquoawkwardrdquo as suggestedby [Finstad 2006] For the extension the following ques-tions have been added to the questionnaire

bull I found the additional information very helpful formy task

bull I was distracted by the additional information

bull I found it very easy to follow the control flow of theprogram

bull I found complex control structures hard to under-stand

For the debugging using Brackets the following two ques-tions have been added

bull I found it very easy to follow the control flow of theprogram

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 17: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

11

Chapter 3

First User Study

31 The Prototypes

The main goal of our first user study was to find out whichbasic interface design is the optimal one especially with Goal of the first user

studyregards to the link between the code and the evaluatedversion of the code It should work in a way such that theprogrammer does not get disturbed in his usual work butthat the interface still provides as much information asaccessible as possible

To find out how such an interface would look like we im-plemented low fidelity prototypes for 5 different interfacesin Ruby Basically these prototypes consisted of pictures 5 low fidelity

prototypes for Rubyhave beenimplemented

that show how the interface would look like for some ex-emplary code with exemplary parameters Each interfacehas been illustrated using 5 pieces of code as example (Fig-ures 31-35) Note The font used in all prototypes was sup-posed to emphasize that these prototypes are low fidelityso that the users had not inhibitions telling us fundamentalflaws in the interface design

12 3 First User Study

311 Interface Version 1

Figure 31 shows the first interface with an implementationof the binary search algorithm In this version of theinterface an evaluated version of the code is inserted in anadditional section in the right half of the window At thetop the exemplary parameters can be manipulated Thenin the line high=arraylength-1 the first evaluationtakes place On the right-hand side of the window theright-hand side of the assignment has been replaced bythe according evaluation 5 Afterwards all values are re-Interface variant 1

divides theenvironment into twoThe left-hand side is

a usual editor andthe right-hand side isthe evaluated version

of the code

placed in a similar way That is in simple assignments theright-hand side is substituted completely and in invariantsthe condition has been computed up to the penultimatestep This way eg while(low lt= high) gets evaluatedto while(0 lt= 5) instead of while(true)

EXCURSUSWe also thought about showing what would happen in-side an if statement even if it fails so that the user couldsee what the code could do alternatively We dischargedthis idea however because of two problems Firstly wesuspected that this might quickly become a confusing ex-perience for the user Secondly (and most importantly)the Halting Problem would have to be decidable Sup-pose the following code had to be evaluatedif(xgt0)while(x=0)x--else while(x=0)x++Then the interface would fall into an infinite loop (for anyvalue of x except 0) although the code would not Forx=1 for example the actual code would only make oneiteration in the first loop The interface however wouldtry to calculate the values of the second loop too

Further down in Figure 31(a) there is the following ele-ment

1This element will subsequently be called iterator Theiterator indicates which iteration of a loop is being shownPressing the right arrow increments the shown iteration(Figure 31(b)) pressing the left arrow decrements itaccordingly Furthermore the code for the elsif and theelse statements is not visible on the right-hand side of the

31 The Prototypes 13

window The reason is that the first if statement holdsand therefore the following ones are not executed InFigure 31(b) the first if statement failed (greyed out) thesecond one holds and the third one is hidden again Thisway all code that gets executed is shown (corresponding toJonathan Edwards who said that as much code as possibleshould be rdquoilluminatedrdquo[Edwards 2004])

312 Interface Version 2

Figure 32 shows the second version of the interface withan in place implementation of the selection sort algorithm(note that each interface variant has been implemented Interface variant 2

copies a nothighlightedevaluated version ofthe code to the rightpart of the window

with each algorithm) The differences to the previous in-terface variant are

bull The parameter names are red to signal manipulability

bull No syntax highlighting

bull Evaluated code has no indents and is right aligned

bull No dividing line

EXCURSUSNote that the iterator in this interface looks like follows

i=1This design reveals what variable the iterator refers toThis kind of iterator does not depend on the type of in-terface but on the type of loop In for and upto loopsthis kind of iterator has been used since it is possible tocorrelate the values of one variable and the iterations ofa loop In the later prototypes however we dischargedthis idea again because it only was this easy to correlatea variable and the iterations of one loop because the ex-amples were quite simple In realistic situations this cor-relation will oftentimes not be possible

This version was supposed to take up as little space aspossible while still having a complete (evaluated) versionof the code in a different place

14 3 First User Study

313 Interface Version 3

Figure 33 shows version 3 of the interface with an im-plementation of Bubblesort Small pop-ups indicate theInterface variant 3

uses small pop-upsthat indicate the

value of a statement

value of all statements that are evaluable For examplethe (manipulable and therefore red) value of array is[1423716] and the value of arraylength-2 is 2There are no pop-ups in the code when it is not executedand failed invariants are greyed out The iterator workslike the one in the example for version 2 of the interface butis now inserted into the code This interfacersquos purpose wasto consume as little space as possible while still providingall possible evaluations

314 Interface Version 4

Figure 34 shows interface version 4 with an algorithm thatconverts the fractional part of a decimal number into its bi-Interface variant 4

puts the evaluatedversion of a line

between that line andthe next one

nary representation and outputs the result to the consoleThis interface variant slots the evaluated version of a lineof code right between that line and the next one The nameof manipulable parameters is highlighted in red and thereis no syntax highlighting Failed invariants are greyed outand not executed code is not displayed as usual The goalof this interface version was to find out if the link betweencode and evaluation becomes clearer when the evaluationis very close and very similar to the code it results from

315 Interface Version 5

The last variant of the interface can be seen in Figure 35with an algorithm that checks whether a given number nInterface variant 5

indicates the value ofa statement by

writing it underneaththat statement

without keepingkeywords

is a prime number or not It is similar to the previous onebut only shows rdquorelevantrdquo information This includes right-hand sides of equations and for conditions the actual in-variant but excludes left-hand sides of equations and key-words like else or while Apart from this version 4 and5 are exactly the same (parameters in red no syntax high-lighting not executed code is not shown failed invariants

32 Course of the User Study 15

are greyed out) We developed this variant because in ver-sion 4 the user might confuse actual code with evaluatedcode We wanted to check if a variant without replicatingcontrol structures is better (less confusing)

32 Course of the User Study

The first user study was a qualitative user study It wassupposed to reveal what basic interface design is preferredby users and where there might be flaws in the interfacesDifferent techniques for qualitative user studies have beenused in particular the Model Extraction method and theThink Aloud method The basic structure was the following

bull First the experimenter introduced himself and theproject He assured the participant that he would Introduction to the

user studystay anonymous and that he could ask questions ormake comments whenever he likes To make sure theModel Extraction worked fine the part concerningwhat the thesis and the user study where about wasvery firm The solution we have come up with wasto explain the general idea of Live Coding (that weare designing a kind of editor that takes exemplaryparameters for a method the programmer works onand then evaluates it continuously while changes arebeing made) without revealing anything further likerdquoand then there is going to be a button with whichyou can flick from iteration to iterationrdquo

bull Afterwards one version of the interface was pre-sented to the user together with the following ques- Model Extractiontions rdquoWhat do you think you see hererdquo and rdquoWhatdo you think you can do with itrdquo

bull After the user told everything he or she knew or if theuser had a gulf of understanding [Norman 2002] the Explanation of the

interfaceparts of the interface which the user did not under-stand were explained and we went through all itera-tions once to make clear how the interface works

bull The next goal was to find out if the user had prob- Suggestions forimprovementlems in understanding elements of the interface if he

16 3 First User Study

thinks that there are things that can be done better etcetera

bull Lastly the other interface variants were shown to theuser with the same piece of code and it was explainedGoing through the

remaining interfacevariants

how they worked how they differ from the previousones and we went through all iterations once so thatthe user could get a feeling for the interface Aftereach variant the user was asked what he thought ofthe interface (and why) and what he thought was bet-ter or worse compared to the previous ones

Whenever the user had a question or there was an inter-esting topic that lead to more interesting information thispoint was discussed in more depth

33 Results

The user study was conducted with 6 different users All ofthem studied Computer Science were in the sixth semesterand had between 4 and 10 years of programming experi-ence Only one of them had worked with Ruby before Allusers took part voluntarily and were not rewarded in anyway The conditions were randomized and looked like fol-lows

User ID Algorithm Order of versions1 Bubblesort 312452 Selection sort 453213 Fractional part10 rarr Fractional part2 253414 Prime number test 135425 Binary search 524316 Binary search 34152

This means that for example user 1 saw interface version3 first and then versions 1245 in that order In all variantshe worked with the Bubblesort algorithm

The user study yielded two major results

33 Results 17

bull Most users liked version 1 better than the other ones

bull Not a single user understood what the iterator is goodfor or what it might do

In more detail Three users (users 1 2 and 4) stated clearlythat they liked version 1 the most User 3 was not quite 3 users liked version

1 the most 1 userliked version 4 themost 1 user couldnot decide betweenversion 1 and 2 and1 user could notdecide betweenversion 1 and 3

sure whether he liked version 1 or version 2 more Hestated that a mix between the two variants where thereis no syntax highlighting would be the best because hegot confused the first time he saw version 1 and did notdirectly see that the code on the right-hand side of thewindow is not actually code but the evaluation User 6 wasnot sure if he preferred version 1 or version 3 He statedthat if there is enough space on the screen version 1 isbetter but with all the other elements of a typical IDE aline of code might oftentimes be too long so that there is noplace to show the line itself as well as its evaluated version(a concern that user 3 communicated as well) Withoutthis problem though user 6 said that version 3 is worsebecause the interface might get messy and distractingquickly because of all the pop-ups right next to the codeIn addition he said with more complex code it might beproblematic to see which pop-up refers to which part ofthe code Finally user 5 stated that he likes version 4 themost The reason was that he did not have to look to theright if he wanted to see the value of a variable and thensearch it there but that in version 5 the information wasexactly at its source He only said that this version mightbe problematic to work with since the reading flow is influ-enced negatively and he feared that he would loose muchspace on the screen and would therefore have to scroll a lot

Each user stated that the main benefit of version 1 and2 was that the actual code remained unaffected Thiscorresponds to what [DeLine et al 2006] reported on theperceptual model a programmer builds of the code When Perceptual model of

the code shouldremain unaffected

navigating through source code a programmer uses aperceptual model to do so This way the programmer doesnot need to read the code (cognitive) to know which part heis seeing but can do so by looking at its rdquovisual landmarksrdquo(perceptual) While version 3 affects this perceptual modelnegatively but does not break it version 4 and 5 have the

18 3 First User Study

disadvantage to do so

To most of the users the basic structure of the interfaceswas clear after only a few seconds Only two of themconfused actual code with the evaluation for a short timeUser 5 when he saw version 5 the first time and user 4when he saw version 1 the first time However neither ofthem needed help or much time to figure how the interfaceworked after they started to read more precisely Eachuserrsquos first question or comment concerned the font andhow awful they (every one) thought it was After a briefexplanation why we chose this particular font each usergot used to it quickly

No user understood the iterator None of them got theconnection to the while upto or for next to the iteratorIterator is not self

explaining After the explanation what the iterator does some userssaid they thought the left and right buttons were rdquoltrdquo andrdquogtrdquo signs

The following tables summarize the pros and contras foreach interface

Version 1Pros Contrasmiddot The control structure (in-dents) is still presentmiddot Because of the indentsand the syntax highlightingthere is a strong analogy be-tween the code and its eval-uated versionmiddot There is a clear separationbetween code and evalua-tionmiddot The editor remains un-changed

middot Needs much spacemiddot User has to look from theleft to the right and viceversa to see two pieces thatare correlated In otherwords a line of code and itsevaluated version are too faraway from each other

33 Results 19

Version 2Pros Contrasmiddot The editor remains un-changedmiddot Code and evaluation do notget confused because they donot look the samemiddot Needs less space than ver-sion 1

middot Less analogy than in ver-sion 1middot User needs to look fromthe left to the right and viceversa This is being compli-cated by the fact that with-out indents and syntax high-lighting it is even more diffi-cult to find the point of inter-estmiddot Needs a lot of space

Version 3Pros ContrasmiddotNeeds minimalistic amountof spacemiddot The code structure itself re-mains untouched

middot Oftentimes not clear whichpop-up refers to which partof the codemiddot Can get messy quickly

Version 4Pros Contrasmiddot Strong analogymiddot Better for lines of code thatare too long for version 1 and2middot Evaluations right next totheir source

middot Code and evaluation getconfused quickly especiallyusual reading might be prob-lematic since one has to skipevery second line dependingon what he wants to readmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

20 3 First User Study

Version 5Pros Contrasmiddot Evaluations right next totheir sourcemiddot Better for lines of code thatare too long for version 1 and2

middot Oftentimes it is not clearwhat a value refers tomiddot Still distracting during codereadingmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

331 Further Comments and Suggestions for Im-provement

One of the users pointed out that in version 3 the iteratorhas been inserted into the code and therefore it now isInterface version 3

can producesyntactically wrong

code

syntactically wrong (for i in 14 becomes for i=2in 14 for the second iteration for example)

Two users suggested that for version 4 and 5 curly bracescould indicate to which part of the code the evaluationrefers

Some users suggested to make the number on the iteratordirectly manipulable so that if one would want to go toSuggestions for

improving the iterator iteration 42 one would not have to press 41 times a buttonbut could just enter 42 In addition some users said theywanted a button for jumping to the first and a button forjumping to the last iteration Two users said it would begood if there is something that indicates directly that theiterator actually shows iterations like the term rdquoIteration rdquoor anything similar

We noticed that the manipulable parameters and theevaluation results have to be distinguishable more easily

33 Results 21

One user proposed that if the cursor moves over somevariable in the evaluation part a small pop-up couldindicate to which variable a value belongs

One user said that it might be helpful that for biggerboolean expressions that fail it would be visible which partof the expression is responsible for the failure

332 Conclusions

With large screens (24rdquo and more) becoming more andmore common in modern work environments of program- Version 1 is the

favored onemers we can override the concerns about loosing too muchspace with version 1 which is the reason why it is the clearfavorite Figure 36 shows the basic interface after the firstuser study

The iterator has been improved in several ways

bull New buttons for jumping to the first and the last iter-ation have been added

bull There is a text field with which the iteration can bemanipulated directly

bull A tooltip now indicates the variable to which avalue belongs when hovering over the value with themouse

bull The label Iteration has been added at the top of theiterator

The way of manipulating the parameters has been changedThe function header now also is copied to the right side A new way of

manipulating theparameters

but the names of the parameters are replaced with a textfield in which the according value can be entered This wayboth the analogy to the code and the separation betweenparameters and evaluation are being enhanced

22 3 First User Study

(a) First iteration

(b) Second iteration

Figure 31 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

33 Results 23

Figure 32 Version 2 The evaluated version of the code ison the right-hand side but is ranged right

Figure 33 Version 3 Small pop-ups indicate the value of avariable

24 3 First User Study

Figure 34 Version 4 The evaluated version of a line isinserted right behind the line itself

33 Results 25

Figure 35 Version 5 Only the values of the variables arein between the lines

26 3 First User Study

Figure 36 Interface after the first user study

27

Chapter 4

Implementation

After the first user study had been finished and the resultshad been analyzed a software prototype with sufficient A software prototype

had to beimplemented

functionality had to be implemented to conduct the seconduser study Hence we wrote an extension for Brackets1 anopen source editor that has recently been released officially

The language we wrote the extension for was JavaScriptWhat we needed was an extension that could calculate thevalues of the variables that had to been substituted

41 First Approach (discharged)

The initial idea was to write an extension that connects viaV8-Node2 (an extension that maintains a connection to a Initial idea Use

V8-Node an Nodejsto get the locals

Nodejs3 V8 debugger) to an external debugger and lets itevaluate the code Our extension was supposed to supplythe actual code to the debugger and set a breakpoint in thefirst line of actual code Then it read the locals (that is thelocal values for all active variables) and stepped over to the

1httpsgithubcomadobebrackets2httpsgithubcomDennisKehrigbrackets-v8-node-

livetreenodeChildProcess3httpnodejsorg

28 4 Implementation

next line After all locals had been collected the extensionshould substitute all necessary values There were prob-lems with this approach Firstly the extension is not stableThis extension was

not usable for a userstudy

enough to be usable in a user study Secondly it is too slowIt needs more than half a second for one evaluation Thisis way too much since we wanted something that couldevaluate the code as the user edits it without any furtherdelay (hence the name rdquoLive Codingrdquo) With this extensionit would only have been possible to update the evaluationusing a special shortcut or to attach this function to anothershortcut like the Ctrl+S shortcut for saving Since thiswas by far not the optimal way we decided to start all overagain

42 Basic Evaluation Technique

In the second approach we did not try to externalize thescript execution but instead made vast use of JavaScriptrsquosbuilt-in eval function The eval function takes a StringSecond approach

using eval toevaluate dynamicallygenerated code thatreturns the value of

interest

as parameter If this String contains valid JavaScript codeeval will evaluate the code and return the last executedstatement eval(var i=1i=i+5) will return 6 forexample The extension basically works as follows Aloop that runs over all DOM-elements in the highlightedcode searches for substitutable variables While searchingfor these elements the loop captures in which contextthe actual DOM-element is The three basic contexts aresimple substitutions substitutions within one loop andsubstitutions within a nested loop (why this is will beexplained below) But there are also other things the loopchecks for For simple assignments like i=x+y x andy have to be substituted while i does not In lines wherea loop begins the iterator has to be appended invariantshave to be treated differently et cetera

Note that right-hand sides of assignments are handledslightly different from what has been presented in theprototypes of the first user study They do not get evalu-ated to the last step and invariants do not get evaluated

43 How Values Are Being Calculated 29

to the penultimate step anymore Both of them get onlyevaluated one step For i=1 and z=2 the right-hand sideof x=i+z does not get evaluated to 3 but to 1+2 Wedecided to do it this way because some users in the firstuser study stated that they did not only want to see whatthe result of an operation is but also how it got there

When a substitutable variable has been found its value willbe computed For that the eval function is used (see sec-tion 43 for details) and the value will be inserted Thiswhole process takes place each time there is a change in thecode or if one of the parameters gets changed If another it- How the extension

reacts to changeseration of a loop that is being shown is selected all valuesinside that loop will be calculated again

43 How Values Are Being Calculated

Depending on the context of the variable a piece ofcode will be generated (and then evaluated) that whenevaluated with eval returns the value of the variable ofinterest For simple substitutions outside of any loops thisworks like follows

Suppose the following code had to be evaluated for x=0

function foo(x)var i=0var j=x+1

Then x in the third line would have to be evaluated To do How the values arecalculated for simplesubstitutions

so the following String would be generatedvar x=0var i=0xSo for simple substitutions the whole code preceding thevariable that has to be substituted is copied and each pa-rameter is initialized as var with its exemplary value at the

30 4 Implementation

beginning

For simple loops this method has to be extended First ithas to be found out how often the loop is executed ThisSubstitutions within

one loop serves two purposes First the iterator now shows thenumber of iterations of the loop Second we have to checkif the user wants to see an iteration that does not exist Thisis being done by introducing a counter that is incrementedin each iteration of the loop and then returned when theloop terminated The following example shows how thevalue of a variable within a loop is being calculated

function test()var i=0var j=0while (ilt5)

i++j=j+2

Suppose we want to obtain the value for j (highlightedin red) in the second iteration of the loop Then theevaluation basically works as follows

var myCounter=0var myResultvar i=0var j=0while (ilt5)

myCounter++i++if(myCounter==2)myResult=j breakj=j+2

myResult

A counter gets inserted to check which iteration the loop is

44 Problems 31

in If it is in the right iteration the assignment myResult=jtakes place so that eval can return the according valueFor a loop within a loop two counters have to be inserted Other casesand both of them have to be checked accordingly Thereare further variants of contexts like the different parts of afor(firstPartsecondPartthirdPart) statementThe first part gets only executed once at the beginningwhile the second part gets evaluated one time more thanthe rest of the loop (the third part does not get evaluated)For all cases the dynamic generation of the code that re-turns the value of interest works similar to the techniquespresented

Obviously this is a brute-force approach For each valuethat has to be calculated all preceding code has to be eval-uated We used this approach anyway since we spent quite Summary Very

inefficient limitedfunctionality stillcompletely sufficientfor the user study

some time developing the first version of the extension (us-ing V8-Node) and were under considerable time pressureThe technique we chose is simple and fast to develop Fur-thermore the prototype only had to work for the seconduser study which is the reason why the limited function-ality (loop nesting depth of 2 for example) was sufficientAnd finally since the tasks in the second user study werenot very expensive with regards to calculation costs eventhis very inefficient version of the extension ran live with-out any delays Some tests showed that it would even havebeen quick enough to operate with hundreds of iterationsmore without any delay and with some few thousand iter-ations more without dispatching the interface

44 Problems

The biggest problem we encountered were infinite loopsBecause the interface runs completely live and tries to The Halting problemevaluate the whole code while the user is editing it infiniteloops may occur while the user is typing Additionallyit might of course be possible that there actually is aninfinite loop even though the user thinks the code iscorrect Without modification the interface crashed in

32 4 Implementation

either of these cases To prevent this from happeningtwo minor modifications had to be done Firstly if wewant to calculate the number of iterations a loop doeswe check whether the counter we use for that calculationis still below a certain limit (we used 999 here becauseit proved to be sufficient and efficient enough to runlive) Now the Halting Problem for this particular loop isapproximated The problem of the infinite loop still existsthough Suppose we have the following code

var x=0var i=0while(x=10)

i++x=i

We now know that the loop does more then 999 iterationsso we do not try to calculate the value for x in the invariantSubstitutions after

infinite loops or for i within the loop for any further iterations The onlyproblem that remains is that we still try to get the value fori in the last line When we try to calculate this value wedo not yet know that there is an infinite loop somewhereabove To find this out a new global counter gets insertedand limited to 5000 iterations

var globalCounter=0var x=0var i=0while(x=10)

if(globalCounter++ gt= 5000)breaki++

i

This way we could keep the incrementation and the checkwhether there are more then 5000 iterations within onestatement (two statements are too many since after a

44 Problems 33

while without curly braces only the first statement wouldbe executed) The incrementation and test get inserted intoevery loop ahead of the position we are checking for Itwould of course have been possible to set a flag if there isan infinite loop somewhere ahead but this way we alsohad the possibility to limit the overall number of iterationsand still get a value The second user study showed that Flaw in the

implementationthis was the wrong approach A user introduced an infiniteloop in a sorting algorithm The condition for the loop wasnever violated although the array was sorted With thetechnique of assigning a value to a variable although thelimitation of iterations has been hit the sorted array wasshown in the return statement Since the user only paidattention to the output he did not see that the number ofiterations of the loop was 999 and thought that the code iscorrect

Another minor problem was that of course while the useris typing the extension tries to evaluate code that is syntac- Suppress errors so

that user does notget disturbed

tically not correct The solution is to set windowonerrorto a function that always returns true This way errorsget just suppressed and the execution stops If there is anew change in the editor the execution will start again andwork exactly the same way so that working code always isexecuted and not working code is not

The last problem concerned hiding code that does notget evaluated For loops this could easily be done If Hiding code within

failed if statementsthe loop is in its very last iteration (the invariant fails)the rest of the loop has to be hidden otherwise it doesnot Finding out if an if statement fails (and hence theaccording code has to be hidden) however was moredifficult because we had not enough information to be ableto do so (remember we only compute the first step whichincludes nothing but substituting all variables) To find outif an if statement holds or not we do the following allvariables in the invariant get substituted and then we leteval evaluate just the invariant For if(xlty) with x=5and y=6 we substitute all values if(5lt6) and then calleval((5lt6))

34 4 Implementation

Figure 41 Screenshot showing Brackets with our extension and a dummy script

Finally Figure 41 shows how the prototype for our seconduser study looks like

35

Chapter 5

Second User Study

51 Course of the User Study

To find out whether our extension brings significantimprovement a second quantitative user study with 20 Hypothesis

Compared to usualdebugging methodsour technique bringssignificantimprovement

participants has been conducted The hypothesis was thatthe debugging speed (in this case the time between the userreading the code the first time and fixing it) compared toa usual debugging engine could significantly be reducedwith our extension If this hypothesis would be supportedby the results of the user study this would indicate that theunderstanding speed of foreign code would be increasedby our method

No user had worked with Brackets or JavaScript before(notwithstanding that all of them had worked with Javabefore) Users were Computer Science students with 4-9(average of 58) years of programming experience Onlythree of them had professional experience (2-4 years) Allusers took part voluntarily and were not rewarded in anyway

To test the hypothesis two searching algorithms (animplementation of Shakersort and an implementation ofBubblesort) have been implemented and a small bug has

36 5 Second User Study

been introduced in either of them (see Appendix A) Thetask of the user was to find both bugs - one time with2 algorithms 1 bug

each 2 tasks debugone time with

Bracketsrsquo tools andone time with our

extension

our extension and one time using the traditional way(that is using the console breakpoints variable viewerweb kit inspector and the step over functionality Bracketsprovides) Each user received an intro to the function-ality of both our extension and the Brackets debuggingengine For both a small dummy script has been shownwith which it has been explained how each part of thefunctionality works until the user confirmed that he or sheunderstood every part of the functionality The task forthe user was to find the bug in the code (users have beentold that there is exactly one bug) fix it and give noticewhen he or she thought that the code is correct (task tocondition assignment and task order were counterbalancedto eliminate side effects)

After the two debugging tasks had been accomplishedeach user filled out two versions of the System UsabilitySystem Usability

Scale to find outwhether users liked

working with thesystem

Scale [Brooke 1996] (SUS) one for our extension and onefor the standard debugging Brackets provides The first10 questions are identical to the initial version that hasbeen presented in 1996 except that the word rdquocumbersomerdquoin point 8 has been changed to rdquoawkwardrdquo as suggestedby [Finstad 2006] For the extension the following ques-tions have been added to the questionnaire

bull I found the additional information very helpful formy task

bull I was distracted by the additional information

bull I found it very easy to follow the control flow of theprogram

bull I found complex control structures hard to under-stand

For the debugging using Brackets the following two ques-tions have been added

bull I found it very easy to follow the control flow of theprogram

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 18: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

12 3 First User Study

311 Interface Version 1

Figure 31 shows the first interface with an implementationof the binary search algorithm In this version of theinterface an evaluated version of the code is inserted in anadditional section in the right half of the window At thetop the exemplary parameters can be manipulated Thenin the line high=arraylength-1 the first evaluationtakes place On the right-hand side of the window theright-hand side of the assignment has been replaced bythe according evaluation 5 Afterwards all values are re-Interface variant 1

divides theenvironment into twoThe left-hand side is

a usual editor andthe right-hand side isthe evaluated version

of the code

placed in a similar way That is in simple assignments theright-hand side is substituted completely and in invariantsthe condition has been computed up to the penultimatestep This way eg while(low lt= high) gets evaluatedto while(0 lt= 5) instead of while(true)

EXCURSUSWe also thought about showing what would happen in-side an if statement even if it fails so that the user couldsee what the code could do alternatively We dischargedthis idea however because of two problems Firstly wesuspected that this might quickly become a confusing ex-perience for the user Secondly (and most importantly)the Halting Problem would have to be decidable Sup-pose the following code had to be evaluatedif(xgt0)while(x=0)x--else while(x=0)x++Then the interface would fall into an infinite loop (for anyvalue of x except 0) although the code would not Forx=1 for example the actual code would only make oneiteration in the first loop The interface however wouldtry to calculate the values of the second loop too

Further down in Figure 31(a) there is the following ele-ment

1This element will subsequently be called iterator Theiterator indicates which iteration of a loop is being shownPressing the right arrow increments the shown iteration(Figure 31(b)) pressing the left arrow decrements itaccordingly Furthermore the code for the elsif and theelse statements is not visible on the right-hand side of the

31 The Prototypes 13

window The reason is that the first if statement holdsand therefore the following ones are not executed InFigure 31(b) the first if statement failed (greyed out) thesecond one holds and the third one is hidden again Thisway all code that gets executed is shown (corresponding toJonathan Edwards who said that as much code as possibleshould be rdquoilluminatedrdquo[Edwards 2004])

312 Interface Version 2

Figure 32 shows the second version of the interface withan in place implementation of the selection sort algorithm(note that each interface variant has been implemented Interface variant 2

copies a nothighlightedevaluated version ofthe code to the rightpart of the window

with each algorithm) The differences to the previous in-terface variant are

bull The parameter names are red to signal manipulability

bull No syntax highlighting

bull Evaluated code has no indents and is right aligned

bull No dividing line

EXCURSUSNote that the iterator in this interface looks like follows

i=1This design reveals what variable the iterator refers toThis kind of iterator does not depend on the type of in-terface but on the type of loop In for and upto loopsthis kind of iterator has been used since it is possible tocorrelate the values of one variable and the iterations ofa loop In the later prototypes however we dischargedthis idea again because it only was this easy to correlatea variable and the iterations of one loop because the ex-amples were quite simple In realistic situations this cor-relation will oftentimes not be possible

This version was supposed to take up as little space aspossible while still having a complete (evaluated) versionof the code in a different place

14 3 First User Study

313 Interface Version 3

Figure 33 shows version 3 of the interface with an im-plementation of Bubblesort Small pop-ups indicate theInterface variant 3

uses small pop-upsthat indicate the

value of a statement

value of all statements that are evaluable For examplethe (manipulable and therefore red) value of array is[1423716] and the value of arraylength-2 is 2There are no pop-ups in the code when it is not executedand failed invariants are greyed out The iterator workslike the one in the example for version 2 of the interface butis now inserted into the code This interfacersquos purpose wasto consume as little space as possible while still providingall possible evaluations

314 Interface Version 4

Figure 34 shows interface version 4 with an algorithm thatconverts the fractional part of a decimal number into its bi-Interface variant 4

puts the evaluatedversion of a line

between that line andthe next one

nary representation and outputs the result to the consoleThis interface variant slots the evaluated version of a lineof code right between that line and the next one The nameof manipulable parameters is highlighted in red and thereis no syntax highlighting Failed invariants are greyed outand not executed code is not displayed as usual The goalof this interface version was to find out if the link betweencode and evaluation becomes clearer when the evaluationis very close and very similar to the code it results from

315 Interface Version 5

The last variant of the interface can be seen in Figure 35with an algorithm that checks whether a given number nInterface variant 5

indicates the value ofa statement by

writing it underneaththat statement

without keepingkeywords

is a prime number or not It is similar to the previous onebut only shows rdquorelevantrdquo information This includes right-hand sides of equations and for conditions the actual in-variant but excludes left-hand sides of equations and key-words like else or while Apart from this version 4 and5 are exactly the same (parameters in red no syntax high-lighting not executed code is not shown failed invariants

32 Course of the User Study 15

are greyed out) We developed this variant because in ver-sion 4 the user might confuse actual code with evaluatedcode We wanted to check if a variant without replicatingcontrol structures is better (less confusing)

32 Course of the User Study

The first user study was a qualitative user study It wassupposed to reveal what basic interface design is preferredby users and where there might be flaws in the interfacesDifferent techniques for qualitative user studies have beenused in particular the Model Extraction method and theThink Aloud method The basic structure was the following

bull First the experimenter introduced himself and theproject He assured the participant that he would Introduction to the

user studystay anonymous and that he could ask questions ormake comments whenever he likes To make sure theModel Extraction worked fine the part concerningwhat the thesis and the user study where about wasvery firm The solution we have come up with wasto explain the general idea of Live Coding (that weare designing a kind of editor that takes exemplaryparameters for a method the programmer works onand then evaluates it continuously while changes arebeing made) without revealing anything further likerdquoand then there is going to be a button with whichyou can flick from iteration to iterationrdquo

bull Afterwards one version of the interface was pre-sented to the user together with the following ques- Model Extractiontions rdquoWhat do you think you see hererdquo and rdquoWhatdo you think you can do with itrdquo

bull After the user told everything he or she knew or if theuser had a gulf of understanding [Norman 2002] the Explanation of the

interfaceparts of the interface which the user did not under-stand were explained and we went through all itera-tions once to make clear how the interface works

bull The next goal was to find out if the user had prob- Suggestions forimprovementlems in understanding elements of the interface if he

16 3 First User Study

thinks that there are things that can be done better etcetera

bull Lastly the other interface variants were shown to theuser with the same piece of code and it was explainedGoing through the

remaining interfacevariants

how they worked how they differ from the previousones and we went through all iterations once so thatthe user could get a feeling for the interface Aftereach variant the user was asked what he thought ofthe interface (and why) and what he thought was bet-ter or worse compared to the previous ones

Whenever the user had a question or there was an inter-esting topic that lead to more interesting information thispoint was discussed in more depth

33 Results

The user study was conducted with 6 different users All ofthem studied Computer Science were in the sixth semesterand had between 4 and 10 years of programming experi-ence Only one of them had worked with Ruby before Allusers took part voluntarily and were not rewarded in anyway The conditions were randomized and looked like fol-lows

User ID Algorithm Order of versions1 Bubblesort 312452 Selection sort 453213 Fractional part10 rarr Fractional part2 253414 Prime number test 135425 Binary search 524316 Binary search 34152

This means that for example user 1 saw interface version3 first and then versions 1245 in that order In all variantshe worked with the Bubblesort algorithm

The user study yielded two major results

33 Results 17

bull Most users liked version 1 better than the other ones

bull Not a single user understood what the iterator is goodfor or what it might do

In more detail Three users (users 1 2 and 4) stated clearlythat they liked version 1 the most User 3 was not quite 3 users liked version

1 the most 1 userliked version 4 themost 1 user couldnot decide betweenversion 1 and 2 and1 user could notdecide betweenversion 1 and 3

sure whether he liked version 1 or version 2 more Hestated that a mix between the two variants where thereis no syntax highlighting would be the best because hegot confused the first time he saw version 1 and did notdirectly see that the code on the right-hand side of thewindow is not actually code but the evaluation User 6 wasnot sure if he preferred version 1 or version 3 He statedthat if there is enough space on the screen version 1 isbetter but with all the other elements of a typical IDE aline of code might oftentimes be too long so that there is noplace to show the line itself as well as its evaluated version(a concern that user 3 communicated as well) Withoutthis problem though user 6 said that version 3 is worsebecause the interface might get messy and distractingquickly because of all the pop-ups right next to the codeIn addition he said with more complex code it might beproblematic to see which pop-up refers to which part ofthe code Finally user 5 stated that he likes version 4 themost The reason was that he did not have to look to theright if he wanted to see the value of a variable and thensearch it there but that in version 5 the information wasexactly at its source He only said that this version mightbe problematic to work with since the reading flow is influ-enced negatively and he feared that he would loose muchspace on the screen and would therefore have to scroll a lot

Each user stated that the main benefit of version 1 and2 was that the actual code remained unaffected Thiscorresponds to what [DeLine et al 2006] reported on theperceptual model a programmer builds of the code When Perceptual model of

the code shouldremain unaffected

navigating through source code a programmer uses aperceptual model to do so This way the programmer doesnot need to read the code (cognitive) to know which part heis seeing but can do so by looking at its rdquovisual landmarksrdquo(perceptual) While version 3 affects this perceptual modelnegatively but does not break it version 4 and 5 have the

18 3 First User Study

disadvantage to do so

To most of the users the basic structure of the interfaceswas clear after only a few seconds Only two of themconfused actual code with the evaluation for a short timeUser 5 when he saw version 5 the first time and user 4when he saw version 1 the first time However neither ofthem needed help or much time to figure how the interfaceworked after they started to read more precisely Eachuserrsquos first question or comment concerned the font andhow awful they (every one) thought it was After a briefexplanation why we chose this particular font each usergot used to it quickly

No user understood the iterator None of them got theconnection to the while upto or for next to the iteratorIterator is not self

explaining After the explanation what the iterator does some userssaid they thought the left and right buttons were rdquoltrdquo andrdquogtrdquo signs

The following tables summarize the pros and contras foreach interface

Version 1Pros Contrasmiddot The control structure (in-dents) is still presentmiddot Because of the indentsand the syntax highlightingthere is a strong analogy be-tween the code and its eval-uated versionmiddot There is a clear separationbetween code and evalua-tionmiddot The editor remains un-changed

middot Needs much spacemiddot User has to look from theleft to the right and viceversa to see two pieces thatare correlated In otherwords a line of code and itsevaluated version are too faraway from each other

33 Results 19

Version 2Pros Contrasmiddot The editor remains un-changedmiddot Code and evaluation do notget confused because they donot look the samemiddot Needs less space than ver-sion 1

middot Less analogy than in ver-sion 1middot User needs to look fromthe left to the right and viceversa This is being compli-cated by the fact that with-out indents and syntax high-lighting it is even more diffi-cult to find the point of inter-estmiddot Needs a lot of space

Version 3Pros ContrasmiddotNeeds minimalistic amountof spacemiddot The code structure itself re-mains untouched

middot Oftentimes not clear whichpop-up refers to which partof the codemiddot Can get messy quickly

Version 4Pros Contrasmiddot Strong analogymiddot Better for lines of code thatare too long for version 1 and2middot Evaluations right next totheir source

middot Code and evaluation getconfused quickly especiallyusual reading might be prob-lematic since one has to skipevery second line dependingon what he wants to readmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

20 3 First User Study

Version 5Pros Contrasmiddot Evaluations right next totheir sourcemiddot Better for lines of code thatare too long for version 1 and2

middot Oftentimes it is not clearwhat a value refers tomiddot Still distracting during codereadingmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

331 Further Comments and Suggestions for Im-provement

One of the users pointed out that in version 3 the iteratorhas been inserted into the code and therefore it now isInterface version 3

can producesyntactically wrong

code

syntactically wrong (for i in 14 becomes for i=2in 14 for the second iteration for example)

Two users suggested that for version 4 and 5 curly bracescould indicate to which part of the code the evaluationrefers

Some users suggested to make the number on the iteratordirectly manipulable so that if one would want to go toSuggestions for

improving the iterator iteration 42 one would not have to press 41 times a buttonbut could just enter 42 In addition some users said theywanted a button for jumping to the first and a button forjumping to the last iteration Two users said it would begood if there is something that indicates directly that theiterator actually shows iterations like the term rdquoIteration rdquoor anything similar

We noticed that the manipulable parameters and theevaluation results have to be distinguishable more easily

33 Results 21

One user proposed that if the cursor moves over somevariable in the evaluation part a small pop-up couldindicate to which variable a value belongs

One user said that it might be helpful that for biggerboolean expressions that fail it would be visible which partof the expression is responsible for the failure

332 Conclusions

With large screens (24rdquo and more) becoming more andmore common in modern work environments of program- Version 1 is the

favored onemers we can override the concerns about loosing too muchspace with version 1 which is the reason why it is the clearfavorite Figure 36 shows the basic interface after the firstuser study

The iterator has been improved in several ways

bull New buttons for jumping to the first and the last iter-ation have been added

bull There is a text field with which the iteration can bemanipulated directly

bull A tooltip now indicates the variable to which avalue belongs when hovering over the value with themouse

bull The label Iteration has been added at the top of theiterator

The way of manipulating the parameters has been changedThe function header now also is copied to the right side A new way of

manipulating theparameters

but the names of the parameters are replaced with a textfield in which the according value can be entered This wayboth the analogy to the code and the separation betweenparameters and evaluation are being enhanced

22 3 First User Study

(a) First iteration

(b) Second iteration

Figure 31 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

33 Results 23

Figure 32 Version 2 The evaluated version of the code ison the right-hand side but is ranged right

Figure 33 Version 3 Small pop-ups indicate the value of avariable

24 3 First User Study

Figure 34 Version 4 The evaluated version of a line isinserted right behind the line itself

33 Results 25

Figure 35 Version 5 Only the values of the variables arein between the lines

26 3 First User Study

Figure 36 Interface after the first user study

27

Chapter 4

Implementation

After the first user study had been finished and the resultshad been analyzed a software prototype with sufficient A software prototype

had to beimplemented

functionality had to be implemented to conduct the seconduser study Hence we wrote an extension for Brackets1 anopen source editor that has recently been released officially

The language we wrote the extension for was JavaScriptWhat we needed was an extension that could calculate thevalues of the variables that had to been substituted

41 First Approach (discharged)

The initial idea was to write an extension that connects viaV8-Node2 (an extension that maintains a connection to a Initial idea Use

V8-Node an Nodejsto get the locals

Nodejs3 V8 debugger) to an external debugger and lets itevaluate the code Our extension was supposed to supplythe actual code to the debugger and set a breakpoint in thefirst line of actual code Then it read the locals (that is thelocal values for all active variables) and stepped over to the

1httpsgithubcomadobebrackets2httpsgithubcomDennisKehrigbrackets-v8-node-

livetreenodeChildProcess3httpnodejsorg

28 4 Implementation

next line After all locals had been collected the extensionshould substitute all necessary values There were prob-lems with this approach Firstly the extension is not stableThis extension was

not usable for a userstudy

enough to be usable in a user study Secondly it is too slowIt needs more than half a second for one evaluation Thisis way too much since we wanted something that couldevaluate the code as the user edits it without any furtherdelay (hence the name rdquoLive Codingrdquo) With this extensionit would only have been possible to update the evaluationusing a special shortcut or to attach this function to anothershortcut like the Ctrl+S shortcut for saving Since thiswas by far not the optimal way we decided to start all overagain

42 Basic Evaluation Technique

In the second approach we did not try to externalize thescript execution but instead made vast use of JavaScriptrsquosbuilt-in eval function The eval function takes a StringSecond approach

using eval toevaluate dynamicallygenerated code thatreturns the value of

interest

as parameter If this String contains valid JavaScript codeeval will evaluate the code and return the last executedstatement eval(var i=1i=i+5) will return 6 forexample The extension basically works as follows Aloop that runs over all DOM-elements in the highlightedcode searches for substitutable variables While searchingfor these elements the loop captures in which contextthe actual DOM-element is The three basic contexts aresimple substitutions substitutions within one loop andsubstitutions within a nested loop (why this is will beexplained below) But there are also other things the loopchecks for For simple assignments like i=x+y x andy have to be substituted while i does not In lines wherea loop begins the iterator has to be appended invariantshave to be treated differently et cetera

Note that right-hand sides of assignments are handledslightly different from what has been presented in theprototypes of the first user study They do not get evalu-ated to the last step and invariants do not get evaluated

43 How Values Are Being Calculated 29

to the penultimate step anymore Both of them get onlyevaluated one step For i=1 and z=2 the right-hand sideof x=i+z does not get evaluated to 3 but to 1+2 Wedecided to do it this way because some users in the firstuser study stated that they did not only want to see whatthe result of an operation is but also how it got there

When a substitutable variable has been found its value willbe computed For that the eval function is used (see sec-tion 43 for details) and the value will be inserted Thiswhole process takes place each time there is a change in thecode or if one of the parameters gets changed If another it- How the extension

reacts to changeseration of a loop that is being shown is selected all valuesinside that loop will be calculated again

43 How Values Are Being Calculated

Depending on the context of the variable a piece ofcode will be generated (and then evaluated) that whenevaluated with eval returns the value of the variable ofinterest For simple substitutions outside of any loops thisworks like follows

Suppose the following code had to be evaluated for x=0

function foo(x)var i=0var j=x+1

Then x in the third line would have to be evaluated To do How the values arecalculated for simplesubstitutions

so the following String would be generatedvar x=0var i=0xSo for simple substitutions the whole code preceding thevariable that has to be substituted is copied and each pa-rameter is initialized as var with its exemplary value at the

30 4 Implementation

beginning

For simple loops this method has to be extended First ithas to be found out how often the loop is executed ThisSubstitutions within

one loop serves two purposes First the iterator now shows thenumber of iterations of the loop Second we have to checkif the user wants to see an iteration that does not exist Thisis being done by introducing a counter that is incrementedin each iteration of the loop and then returned when theloop terminated The following example shows how thevalue of a variable within a loop is being calculated

function test()var i=0var j=0while (ilt5)

i++j=j+2

Suppose we want to obtain the value for j (highlightedin red) in the second iteration of the loop Then theevaluation basically works as follows

var myCounter=0var myResultvar i=0var j=0while (ilt5)

myCounter++i++if(myCounter==2)myResult=j breakj=j+2

myResult

A counter gets inserted to check which iteration the loop is

44 Problems 31

in If it is in the right iteration the assignment myResult=jtakes place so that eval can return the according valueFor a loop within a loop two counters have to be inserted Other casesand both of them have to be checked accordingly Thereare further variants of contexts like the different parts of afor(firstPartsecondPartthirdPart) statementThe first part gets only executed once at the beginningwhile the second part gets evaluated one time more thanthe rest of the loop (the third part does not get evaluated)For all cases the dynamic generation of the code that re-turns the value of interest works similar to the techniquespresented

Obviously this is a brute-force approach For each valuethat has to be calculated all preceding code has to be eval-uated We used this approach anyway since we spent quite Summary Very

inefficient limitedfunctionality stillcompletely sufficientfor the user study

some time developing the first version of the extension (us-ing V8-Node) and were under considerable time pressureThe technique we chose is simple and fast to develop Fur-thermore the prototype only had to work for the seconduser study which is the reason why the limited function-ality (loop nesting depth of 2 for example) was sufficientAnd finally since the tasks in the second user study werenot very expensive with regards to calculation costs eventhis very inefficient version of the extension ran live with-out any delays Some tests showed that it would even havebeen quick enough to operate with hundreds of iterationsmore without any delay and with some few thousand iter-ations more without dispatching the interface

44 Problems

The biggest problem we encountered were infinite loopsBecause the interface runs completely live and tries to The Halting problemevaluate the whole code while the user is editing it infiniteloops may occur while the user is typing Additionallyit might of course be possible that there actually is aninfinite loop even though the user thinks the code iscorrect Without modification the interface crashed in

32 4 Implementation

either of these cases To prevent this from happeningtwo minor modifications had to be done Firstly if wewant to calculate the number of iterations a loop doeswe check whether the counter we use for that calculationis still below a certain limit (we used 999 here becauseit proved to be sufficient and efficient enough to runlive) Now the Halting Problem for this particular loop isapproximated The problem of the infinite loop still existsthough Suppose we have the following code

var x=0var i=0while(x=10)

i++x=i

We now know that the loop does more then 999 iterationsso we do not try to calculate the value for x in the invariantSubstitutions after

infinite loops or for i within the loop for any further iterations The onlyproblem that remains is that we still try to get the value fori in the last line When we try to calculate this value wedo not yet know that there is an infinite loop somewhereabove To find this out a new global counter gets insertedand limited to 5000 iterations

var globalCounter=0var x=0var i=0while(x=10)

if(globalCounter++ gt= 5000)breaki++

i

This way we could keep the incrementation and the checkwhether there are more then 5000 iterations within onestatement (two statements are too many since after a

44 Problems 33

while without curly braces only the first statement wouldbe executed) The incrementation and test get inserted intoevery loop ahead of the position we are checking for Itwould of course have been possible to set a flag if there isan infinite loop somewhere ahead but this way we alsohad the possibility to limit the overall number of iterationsand still get a value The second user study showed that Flaw in the

implementationthis was the wrong approach A user introduced an infiniteloop in a sorting algorithm The condition for the loop wasnever violated although the array was sorted With thetechnique of assigning a value to a variable although thelimitation of iterations has been hit the sorted array wasshown in the return statement Since the user only paidattention to the output he did not see that the number ofiterations of the loop was 999 and thought that the code iscorrect

Another minor problem was that of course while the useris typing the extension tries to evaluate code that is syntac- Suppress errors so

that user does notget disturbed

tically not correct The solution is to set windowonerrorto a function that always returns true This way errorsget just suppressed and the execution stops If there is anew change in the editor the execution will start again andwork exactly the same way so that working code always isexecuted and not working code is not

The last problem concerned hiding code that does notget evaluated For loops this could easily be done If Hiding code within

failed if statementsthe loop is in its very last iteration (the invariant fails)the rest of the loop has to be hidden otherwise it doesnot Finding out if an if statement fails (and hence theaccording code has to be hidden) however was moredifficult because we had not enough information to be ableto do so (remember we only compute the first step whichincludes nothing but substituting all variables) To find outif an if statement holds or not we do the following allvariables in the invariant get substituted and then we leteval evaluate just the invariant For if(xlty) with x=5and y=6 we substitute all values if(5lt6) and then calleval((5lt6))

34 4 Implementation

Figure 41 Screenshot showing Brackets with our extension and a dummy script

Finally Figure 41 shows how the prototype for our seconduser study looks like

35

Chapter 5

Second User Study

51 Course of the User Study

To find out whether our extension brings significantimprovement a second quantitative user study with 20 Hypothesis

Compared to usualdebugging methodsour technique bringssignificantimprovement

participants has been conducted The hypothesis was thatthe debugging speed (in this case the time between the userreading the code the first time and fixing it) compared toa usual debugging engine could significantly be reducedwith our extension If this hypothesis would be supportedby the results of the user study this would indicate that theunderstanding speed of foreign code would be increasedby our method

No user had worked with Brackets or JavaScript before(notwithstanding that all of them had worked with Javabefore) Users were Computer Science students with 4-9(average of 58) years of programming experience Onlythree of them had professional experience (2-4 years) Allusers took part voluntarily and were not rewarded in anyway

To test the hypothesis two searching algorithms (animplementation of Shakersort and an implementation ofBubblesort) have been implemented and a small bug has

36 5 Second User Study

been introduced in either of them (see Appendix A) Thetask of the user was to find both bugs - one time with2 algorithms 1 bug

each 2 tasks debugone time with

Bracketsrsquo tools andone time with our

extension

our extension and one time using the traditional way(that is using the console breakpoints variable viewerweb kit inspector and the step over functionality Bracketsprovides) Each user received an intro to the function-ality of both our extension and the Brackets debuggingengine For both a small dummy script has been shownwith which it has been explained how each part of thefunctionality works until the user confirmed that he or sheunderstood every part of the functionality The task forthe user was to find the bug in the code (users have beentold that there is exactly one bug) fix it and give noticewhen he or she thought that the code is correct (task tocondition assignment and task order were counterbalancedto eliminate side effects)

After the two debugging tasks had been accomplishedeach user filled out two versions of the System UsabilitySystem Usability

Scale to find outwhether users liked

working with thesystem

Scale [Brooke 1996] (SUS) one for our extension and onefor the standard debugging Brackets provides The first10 questions are identical to the initial version that hasbeen presented in 1996 except that the word rdquocumbersomerdquoin point 8 has been changed to rdquoawkwardrdquo as suggestedby [Finstad 2006] For the extension the following ques-tions have been added to the questionnaire

bull I found the additional information very helpful formy task

bull I was distracted by the additional information

bull I found it very easy to follow the control flow of theprogram

bull I found complex control structures hard to under-stand

For the debugging using Brackets the following two ques-tions have been added

bull I found it very easy to follow the control flow of theprogram

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 19: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

31 The Prototypes 13

window The reason is that the first if statement holdsand therefore the following ones are not executed InFigure 31(b) the first if statement failed (greyed out) thesecond one holds and the third one is hidden again Thisway all code that gets executed is shown (corresponding toJonathan Edwards who said that as much code as possibleshould be rdquoilluminatedrdquo[Edwards 2004])

312 Interface Version 2

Figure 32 shows the second version of the interface withan in place implementation of the selection sort algorithm(note that each interface variant has been implemented Interface variant 2

copies a nothighlightedevaluated version ofthe code to the rightpart of the window

with each algorithm) The differences to the previous in-terface variant are

bull The parameter names are red to signal manipulability

bull No syntax highlighting

bull Evaluated code has no indents and is right aligned

bull No dividing line

EXCURSUSNote that the iterator in this interface looks like follows

i=1This design reveals what variable the iterator refers toThis kind of iterator does not depend on the type of in-terface but on the type of loop In for and upto loopsthis kind of iterator has been used since it is possible tocorrelate the values of one variable and the iterations ofa loop In the later prototypes however we dischargedthis idea again because it only was this easy to correlatea variable and the iterations of one loop because the ex-amples were quite simple In realistic situations this cor-relation will oftentimes not be possible

This version was supposed to take up as little space aspossible while still having a complete (evaluated) versionof the code in a different place

14 3 First User Study

313 Interface Version 3

Figure 33 shows version 3 of the interface with an im-plementation of Bubblesort Small pop-ups indicate theInterface variant 3

uses small pop-upsthat indicate the

value of a statement

value of all statements that are evaluable For examplethe (manipulable and therefore red) value of array is[1423716] and the value of arraylength-2 is 2There are no pop-ups in the code when it is not executedand failed invariants are greyed out The iterator workslike the one in the example for version 2 of the interface butis now inserted into the code This interfacersquos purpose wasto consume as little space as possible while still providingall possible evaluations

314 Interface Version 4

Figure 34 shows interface version 4 with an algorithm thatconverts the fractional part of a decimal number into its bi-Interface variant 4

puts the evaluatedversion of a line

between that line andthe next one

nary representation and outputs the result to the consoleThis interface variant slots the evaluated version of a lineof code right between that line and the next one The nameof manipulable parameters is highlighted in red and thereis no syntax highlighting Failed invariants are greyed outand not executed code is not displayed as usual The goalof this interface version was to find out if the link betweencode and evaluation becomes clearer when the evaluationis very close and very similar to the code it results from

315 Interface Version 5

The last variant of the interface can be seen in Figure 35with an algorithm that checks whether a given number nInterface variant 5

indicates the value ofa statement by

writing it underneaththat statement

without keepingkeywords

is a prime number or not It is similar to the previous onebut only shows rdquorelevantrdquo information This includes right-hand sides of equations and for conditions the actual in-variant but excludes left-hand sides of equations and key-words like else or while Apart from this version 4 and5 are exactly the same (parameters in red no syntax high-lighting not executed code is not shown failed invariants

32 Course of the User Study 15

are greyed out) We developed this variant because in ver-sion 4 the user might confuse actual code with evaluatedcode We wanted to check if a variant without replicatingcontrol structures is better (less confusing)

32 Course of the User Study

The first user study was a qualitative user study It wassupposed to reveal what basic interface design is preferredby users and where there might be flaws in the interfacesDifferent techniques for qualitative user studies have beenused in particular the Model Extraction method and theThink Aloud method The basic structure was the following

bull First the experimenter introduced himself and theproject He assured the participant that he would Introduction to the

user studystay anonymous and that he could ask questions ormake comments whenever he likes To make sure theModel Extraction worked fine the part concerningwhat the thesis and the user study where about wasvery firm The solution we have come up with wasto explain the general idea of Live Coding (that weare designing a kind of editor that takes exemplaryparameters for a method the programmer works onand then evaluates it continuously while changes arebeing made) without revealing anything further likerdquoand then there is going to be a button with whichyou can flick from iteration to iterationrdquo

bull Afterwards one version of the interface was pre-sented to the user together with the following ques- Model Extractiontions rdquoWhat do you think you see hererdquo and rdquoWhatdo you think you can do with itrdquo

bull After the user told everything he or she knew or if theuser had a gulf of understanding [Norman 2002] the Explanation of the

interfaceparts of the interface which the user did not under-stand were explained and we went through all itera-tions once to make clear how the interface works

bull The next goal was to find out if the user had prob- Suggestions forimprovementlems in understanding elements of the interface if he

16 3 First User Study

thinks that there are things that can be done better etcetera

bull Lastly the other interface variants were shown to theuser with the same piece of code and it was explainedGoing through the

remaining interfacevariants

how they worked how they differ from the previousones and we went through all iterations once so thatthe user could get a feeling for the interface Aftereach variant the user was asked what he thought ofthe interface (and why) and what he thought was bet-ter or worse compared to the previous ones

Whenever the user had a question or there was an inter-esting topic that lead to more interesting information thispoint was discussed in more depth

33 Results

The user study was conducted with 6 different users All ofthem studied Computer Science were in the sixth semesterand had between 4 and 10 years of programming experi-ence Only one of them had worked with Ruby before Allusers took part voluntarily and were not rewarded in anyway The conditions were randomized and looked like fol-lows

User ID Algorithm Order of versions1 Bubblesort 312452 Selection sort 453213 Fractional part10 rarr Fractional part2 253414 Prime number test 135425 Binary search 524316 Binary search 34152

This means that for example user 1 saw interface version3 first and then versions 1245 in that order In all variantshe worked with the Bubblesort algorithm

The user study yielded two major results

33 Results 17

bull Most users liked version 1 better than the other ones

bull Not a single user understood what the iterator is goodfor or what it might do

In more detail Three users (users 1 2 and 4) stated clearlythat they liked version 1 the most User 3 was not quite 3 users liked version

1 the most 1 userliked version 4 themost 1 user couldnot decide betweenversion 1 and 2 and1 user could notdecide betweenversion 1 and 3

sure whether he liked version 1 or version 2 more Hestated that a mix between the two variants where thereis no syntax highlighting would be the best because hegot confused the first time he saw version 1 and did notdirectly see that the code on the right-hand side of thewindow is not actually code but the evaluation User 6 wasnot sure if he preferred version 1 or version 3 He statedthat if there is enough space on the screen version 1 isbetter but with all the other elements of a typical IDE aline of code might oftentimes be too long so that there is noplace to show the line itself as well as its evaluated version(a concern that user 3 communicated as well) Withoutthis problem though user 6 said that version 3 is worsebecause the interface might get messy and distractingquickly because of all the pop-ups right next to the codeIn addition he said with more complex code it might beproblematic to see which pop-up refers to which part ofthe code Finally user 5 stated that he likes version 4 themost The reason was that he did not have to look to theright if he wanted to see the value of a variable and thensearch it there but that in version 5 the information wasexactly at its source He only said that this version mightbe problematic to work with since the reading flow is influ-enced negatively and he feared that he would loose muchspace on the screen and would therefore have to scroll a lot

Each user stated that the main benefit of version 1 and2 was that the actual code remained unaffected Thiscorresponds to what [DeLine et al 2006] reported on theperceptual model a programmer builds of the code When Perceptual model of

the code shouldremain unaffected

navigating through source code a programmer uses aperceptual model to do so This way the programmer doesnot need to read the code (cognitive) to know which part heis seeing but can do so by looking at its rdquovisual landmarksrdquo(perceptual) While version 3 affects this perceptual modelnegatively but does not break it version 4 and 5 have the

18 3 First User Study

disadvantage to do so

To most of the users the basic structure of the interfaceswas clear after only a few seconds Only two of themconfused actual code with the evaluation for a short timeUser 5 when he saw version 5 the first time and user 4when he saw version 1 the first time However neither ofthem needed help or much time to figure how the interfaceworked after they started to read more precisely Eachuserrsquos first question or comment concerned the font andhow awful they (every one) thought it was After a briefexplanation why we chose this particular font each usergot used to it quickly

No user understood the iterator None of them got theconnection to the while upto or for next to the iteratorIterator is not self

explaining After the explanation what the iterator does some userssaid they thought the left and right buttons were rdquoltrdquo andrdquogtrdquo signs

The following tables summarize the pros and contras foreach interface

Version 1Pros Contrasmiddot The control structure (in-dents) is still presentmiddot Because of the indentsand the syntax highlightingthere is a strong analogy be-tween the code and its eval-uated versionmiddot There is a clear separationbetween code and evalua-tionmiddot The editor remains un-changed

middot Needs much spacemiddot User has to look from theleft to the right and viceversa to see two pieces thatare correlated In otherwords a line of code and itsevaluated version are too faraway from each other

33 Results 19

Version 2Pros Contrasmiddot The editor remains un-changedmiddot Code and evaluation do notget confused because they donot look the samemiddot Needs less space than ver-sion 1

middot Less analogy than in ver-sion 1middot User needs to look fromthe left to the right and viceversa This is being compli-cated by the fact that with-out indents and syntax high-lighting it is even more diffi-cult to find the point of inter-estmiddot Needs a lot of space

Version 3Pros ContrasmiddotNeeds minimalistic amountof spacemiddot The code structure itself re-mains untouched

middot Oftentimes not clear whichpop-up refers to which partof the codemiddot Can get messy quickly

Version 4Pros Contrasmiddot Strong analogymiddot Better for lines of code thatare too long for version 1 and2middot Evaluations right next totheir source

middot Code and evaluation getconfused quickly especiallyusual reading might be prob-lematic since one has to skipevery second line dependingon what he wants to readmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

20 3 First User Study

Version 5Pros Contrasmiddot Evaluations right next totheir sourcemiddot Better for lines of code thatare too long for version 1 and2

middot Oftentimes it is not clearwhat a value refers tomiddot Still distracting during codereadingmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

331 Further Comments and Suggestions for Im-provement

One of the users pointed out that in version 3 the iteratorhas been inserted into the code and therefore it now isInterface version 3

can producesyntactically wrong

code

syntactically wrong (for i in 14 becomes for i=2in 14 for the second iteration for example)

Two users suggested that for version 4 and 5 curly bracescould indicate to which part of the code the evaluationrefers

Some users suggested to make the number on the iteratordirectly manipulable so that if one would want to go toSuggestions for

improving the iterator iteration 42 one would not have to press 41 times a buttonbut could just enter 42 In addition some users said theywanted a button for jumping to the first and a button forjumping to the last iteration Two users said it would begood if there is something that indicates directly that theiterator actually shows iterations like the term rdquoIteration rdquoor anything similar

We noticed that the manipulable parameters and theevaluation results have to be distinguishable more easily

33 Results 21

One user proposed that if the cursor moves over somevariable in the evaluation part a small pop-up couldindicate to which variable a value belongs

One user said that it might be helpful that for biggerboolean expressions that fail it would be visible which partof the expression is responsible for the failure

332 Conclusions

With large screens (24rdquo and more) becoming more andmore common in modern work environments of program- Version 1 is the

favored onemers we can override the concerns about loosing too muchspace with version 1 which is the reason why it is the clearfavorite Figure 36 shows the basic interface after the firstuser study

The iterator has been improved in several ways

bull New buttons for jumping to the first and the last iter-ation have been added

bull There is a text field with which the iteration can bemanipulated directly

bull A tooltip now indicates the variable to which avalue belongs when hovering over the value with themouse

bull The label Iteration has been added at the top of theiterator

The way of manipulating the parameters has been changedThe function header now also is copied to the right side A new way of

manipulating theparameters

but the names of the parameters are replaced with a textfield in which the according value can be entered This wayboth the analogy to the code and the separation betweenparameters and evaluation are being enhanced

22 3 First User Study

(a) First iteration

(b) Second iteration

Figure 31 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

33 Results 23

Figure 32 Version 2 The evaluated version of the code ison the right-hand side but is ranged right

Figure 33 Version 3 Small pop-ups indicate the value of avariable

24 3 First User Study

Figure 34 Version 4 The evaluated version of a line isinserted right behind the line itself

33 Results 25

Figure 35 Version 5 Only the values of the variables arein between the lines

26 3 First User Study

Figure 36 Interface after the first user study

27

Chapter 4

Implementation

After the first user study had been finished and the resultshad been analyzed a software prototype with sufficient A software prototype

had to beimplemented

functionality had to be implemented to conduct the seconduser study Hence we wrote an extension for Brackets1 anopen source editor that has recently been released officially

The language we wrote the extension for was JavaScriptWhat we needed was an extension that could calculate thevalues of the variables that had to been substituted

41 First Approach (discharged)

The initial idea was to write an extension that connects viaV8-Node2 (an extension that maintains a connection to a Initial idea Use

V8-Node an Nodejsto get the locals

Nodejs3 V8 debugger) to an external debugger and lets itevaluate the code Our extension was supposed to supplythe actual code to the debugger and set a breakpoint in thefirst line of actual code Then it read the locals (that is thelocal values for all active variables) and stepped over to the

1httpsgithubcomadobebrackets2httpsgithubcomDennisKehrigbrackets-v8-node-

livetreenodeChildProcess3httpnodejsorg

28 4 Implementation

next line After all locals had been collected the extensionshould substitute all necessary values There were prob-lems with this approach Firstly the extension is not stableThis extension was

not usable for a userstudy

enough to be usable in a user study Secondly it is too slowIt needs more than half a second for one evaluation Thisis way too much since we wanted something that couldevaluate the code as the user edits it without any furtherdelay (hence the name rdquoLive Codingrdquo) With this extensionit would only have been possible to update the evaluationusing a special shortcut or to attach this function to anothershortcut like the Ctrl+S shortcut for saving Since thiswas by far not the optimal way we decided to start all overagain

42 Basic Evaluation Technique

In the second approach we did not try to externalize thescript execution but instead made vast use of JavaScriptrsquosbuilt-in eval function The eval function takes a StringSecond approach

using eval toevaluate dynamicallygenerated code thatreturns the value of

interest

as parameter If this String contains valid JavaScript codeeval will evaluate the code and return the last executedstatement eval(var i=1i=i+5) will return 6 forexample The extension basically works as follows Aloop that runs over all DOM-elements in the highlightedcode searches for substitutable variables While searchingfor these elements the loop captures in which contextthe actual DOM-element is The three basic contexts aresimple substitutions substitutions within one loop andsubstitutions within a nested loop (why this is will beexplained below) But there are also other things the loopchecks for For simple assignments like i=x+y x andy have to be substituted while i does not In lines wherea loop begins the iterator has to be appended invariantshave to be treated differently et cetera

Note that right-hand sides of assignments are handledslightly different from what has been presented in theprototypes of the first user study They do not get evalu-ated to the last step and invariants do not get evaluated

43 How Values Are Being Calculated 29

to the penultimate step anymore Both of them get onlyevaluated one step For i=1 and z=2 the right-hand sideof x=i+z does not get evaluated to 3 but to 1+2 Wedecided to do it this way because some users in the firstuser study stated that they did not only want to see whatthe result of an operation is but also how it got there

When a substitutable variable has been found its value willbe computed For that the eval function is used (see sec-tion 43 for details) and the value will be inserted Thiswhole process takes place each time there is a change in thecode or if one of the parameters gets changed If another it- How the extension

reacts to changeseration of a loop that is being shown is selected all valuesinside that loop will be calculated again

43 How Values Are Being Calculated

Depending on the context of the variable a piece ofcode will be generated (and then evaluated) that whenevaluated with eval returns the value of the variable ofinterest For simple substitutions outside of any loops thisworks like follows

Suppose the following code had to be evaluated for x=0

function foo(x)var i=0var j=x+1

Then x in the third line would have to be evaluated To do How the values arecalculated for simplesubstitutions

so the following String would be generatedvar x=0var i=0xSo for simple substitutions the whole code preceding thevariable that has to be substituted is copied and each pa-rameter is initialized as var with its exemplary value at the

30 4 Implementation

beginning

For simple loops this method has to be extended First ithas to be found out how often the loop is executed ThisSubstitutions within

one loop serves two purposes First the iterator now shows thenumber of iterations of the loop Second we have to checkif the user wants to see an iteration that does not exist Thisis being done by introducing a counter that is incrementedin each iteration of the loop and then returned when theloop terminated The following example shows how thevalue of a variable within a loop is being calculated

function test()var i=0var j=0while (ilt5)

i++j=j+2

Suppose we want to obtain the value for j (highlightedin red) in the second iteration of the loop Then theevaluation basically works as follows

var myCounter=0var myResultvar i=0var j=0while (ilt5)

myCounter++i++if(myCounter==2)myResult=j breakj=j+2

myResult

A counter gets inserted to check which iteration the loop is

44 Problems 31

in If it is in the right iteration the assignment myResult=jtakes place so that eval can return the according valueFor a loop within a loop two counters have to be inserted Other casesand both of them have to be checked accordingly Thereare further variants of contexts like the different parts of afor(firstPartsecondPartthirdPart) statementThe first part gets only executed once at the beginningwhile the second part gets evaluated one time more thanthe rest of the loop (the third part does not get evaluated)For all cases the dynamic generation of the code that re-turns the value of interest works similar to the techniquespresented

Obviously this is a brute-force approach For each valuethat has to be calculated all preceding code has to be eval-uated We used this approach anyway since we spent quite Summary Very

inefficient limitedfunctionality stillcompletely sufficientfor the user study

some time developing the first version of the extension (us-ing V8-Node) and were under considerable time pressureThe technique we chose is simple and fast to develop Fur-thermore the prototype only had to work for the seconduser study which is the reason why the limited function-ality (loop nesting depth of 2 for example) was sufficientAnd finally since the tasks in the second user study werenot very expensive with regards to calculation costs eventhis very inefficient version of the extension ran live with-out any delays Some tests showed that it would even havebeen quick enough to operate with hundreds of iterationsmore without any delay and with some few thousand iter-ations more without dispatching the interface

44 Problems

The biggest problem we encountered were infinite loopsBecause the interface runs completely live and tries to The Halting problemevaluate the whole code while the user is editing it infiniteloops may occur while the user is typing Additionallyit might of course be possible that there actually is aninfinite loop even though the user thinks the code iscorrect Without modification the interface crashed in

32 4 Implementation

either of these cases To prevent this from happeningtwo minor modifications had to be done Firstly if wewant to calculate the number of iterations a loop doeswe check whether the counter we use for that calculationis still below a certain limit (we used 999 here becauseit proved to be sufficient and efficient enough to runlive) Now the Halting Problem for this particular loop isapproximated The problem of the infinite loop still existsthough Suppose we have the following code

var x=0var i=0while(x=10)

i++x=i

We now know that the loop does more then 999 iterationsso we do not try to calculate the value for x in the invariantSubstitutions after

infinite loops or for i within the loop for any further iterations The onlyproblem that remains is that we still try to get the value fori in the last line When we try to calculate this value wedo not yet know that there is an infinite loop somewhereabove To find this out a new global counter gets insertedand limited to 5000 iterations

var globalCounter=0var x=0var i=0while(x=10)

if(globalCounter++ gt= 5000)breaki++

i

This way we could keep the incrementation and the checkwhether there are more then 5000 iterations within onestatement (two statements are too many since after a

44 Problems 33

while without curly braces only the first statement wouldbe executed) The incrementation and test get inserted intoevery loop ahead of the position we are checking for Itwould of course have been possible to set a flag if there isan infinite loop somewhere ahead but this way we alsohad the possibility to limit the overall number of iterationsand still get a value The second user study showed that Flaw in the

implementationthis was the wrong approach A user introduced an infiniteloop in a sorting algorithm The condition for the loop wasnever violated although the array was sorted With thetechnique of assigning a value to a variable although thelimitation of iterations has been hit the sorted array wasshown in the return statement Since the user only paidattention to the output he did not see that the number ofiterations of the loop was 999 and thought that the code iscorrect

Another minor problem was that of course while the useris typing the extension tries to evaluate code that is syntac- Suppress errors so

that user does notget disturbed

tically not correct The solution is to set windowonerrorto a function that always returns true This way errorsget just suppressed and the execution stops If there is anew change in the editor the execution will start again andwork exactly the same way so that working code always isexecuted and not working code is not

The last problem concerned hiding code that does notget evaluated For loops this could easily be done If Hiding code within

failed if statementsthe loop is in its very last iteration (the invariant fails)the rest of the loop has to be hidden otherwise it doesnot Finding out if an if statement fails (and hence theaccording code has to be hidden) however was moredifficult because we had not enough information to be ableto do so (remember we only compute the first step whichincludes nothing but substituting all variables) To find outif an if statement holds or not we do the following allvariables in the invariant get substituted and then we leteval evaluate just the invariant For if(xlty) with x=5and y=6 we substitute all values if(5lt6) and then calleval((5lt6))

34 4 Implementation

Figure 41 Screenshot showing Brackets with our extension and a dummy script

Finally Figure 41 shows how the prototype for our seconduser study looks like

35

Chapter 5

Second User Study

51 Course of the User Study

To find out whether our extension brings significantimprovement a second quantitative user study with 20 Hypothesis

Compared to usualdebugging methodsour technique bringssignificantimprovement

participants has been conducted The hypothesis was thatthe debugging speed (in this case the time between the userreading the code the first time and fixing it) compared toa usual debugging engine could significantly be reducedwith our extension If this hypothesis would be supportedby the results of the user study this would indicate that theunderstanding speed of foreign code would be increasedby our method

No user had worked with Brackets or JavaScript before(notwithstanding that all of them had worked with Javabefore) Users were Computer Science students with 4-9(average of 58) years of programming experience Onlythree of them had professional experience (2-4 years) Allusers took part voluntarily and were not rewarded in anyway

To test the hypothesis two searching algorithms (animplementation of Shakersort and an implementation ofBubblesort) have been implemented and a small bug has

36 5 Second User Study

been introduced in either of them (see Appendix A) Thetask of the user was to find both bugs - one time with2 algorithms 1 bug

each 2 tasks debugone time with

Bracketsrsquo tools andone time with our

extension

our extension and one time using the traditional way(that is using the console breakpoints variable viewerweb kit inspector and the step over functionality Bracketsprovides) Each user received an intro to the function-ality of both our extension and the Brackets debuggingengine For both a small dummy script has been shownwith which it has been explained how each part of thefunctionality works until the user confirmed that he or sheunderstood every part of the functionality The task forthe user was to find the bug in the code (users have beentold that there is exactly one bug) fix it and give noticewhen he or she thought that the code is correct (task tocondition assignment and task order were counterbalancedto eliminate side effects)

After the two debugging tasks had been accomplishedeach user filled out two versions of the System UsabilitySystem Usability

Scale to find outwhether users liked

working with thesystem

Scale [Brooke 1996] (SUS) one for our extension and onefor the standard debugging Brackets provides The first10 questions are identical to the initial version that hasbeen presented in 1996 except that the word rdquocumbersomerdquoin point 8 has been changed to rdquoawkwardrdquo as suggestedby [Finstad 2006] For the extension the following ques-tions have been added to the questionnaire

bull I found the additional information very helpful formy task

bull I was distracted by the additional information

bull I found it very easy to follow the control flow of theprogram

bull I found complex control structures hard to under-stand

For the debugging using Brackets the following two ques-tions have been added

bull I found it very easy to follow the control flow of theprogram

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 20: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

14 3 First User Study

313 Interface Version 3

Figure 33 shows version 3 of the interface with an im-plementation of Bubblesort Small pop-ups indicate theInterface variant 3

uses small pop-upsthat indicate the

value of a statement

value of all statements that are evaluable For examplethe (manipulable and therefore red) value of array is[1423716] and the value of arraylength-2 is 2There are no pop-ups in the code when it is not executedand failed invariants are greyed out The iterator workslike the one in the example for version 2 of the interface butis now inserted into the code This interfacersquos purpose wasto consume as little space as possible while still providingall possible evaluations

314 Interface Version 4

Figure 34 shows interface version 4 with an algorithm thatconverts the fractional part of a decimal number into its bi-Interface variant 4

puts the evaluatedversion of a line

between that line andthe next one

nary representation and outputs the result to the consoleThis interface variant slots the evaluated version of a lineof code right between that line and the next one The nameof manipulable parameters is highlighted in red and thereis no syntax highlighting Failed invariants are greyed outand not executed code is not displayed as usual The goalof this interface version was to find out if the link betweencode and evaluation becomes clearer when the evaluationis very close and very similar to the code it results from

315 Interface Version 5

The last variant of the interface can be seen in Figure 35with an algorithm that checks whether a given number nInterface variant 5

indicates the value ofa statement by

writing it underneaththat statement

without keepingkeywords

is a prime number or not It is similar to the previous onebut only shows rdquorelevantrdquo information This includes right-hand sides of equations and for conditions the actual in-variant but excludes left-hand sides of equations and key-words like else or while Apart from this version 4 and5 are exactly the same (parameters in red no syntax high-lighting not executed code is not shown failed invariants

32 Course of the User Study 15

are greyed out) We developed this variant because in ver-sion 4 the user might confuse actual code with evaluatedcode We wanted to check if a variant without replicatingcontrol structures is better (less confusing)

32 Course of the User Study

The first user study was a qualitative user study It wassupposed to reveal what basic interface design is preferredby users and where there might be flaws in the interfacesDifferent techniques for qualitative user studies have beenused in particular the Model Extraction method and theThink Aloud method The basic structure was the following

bull First the experimenter introduced himself and theproject He assured the participant that he would Introduction to the

user studystay anonymous and that he could ask questions ormake comments whenever he likes To make sure theModel Extraction worked fine the part concerningwhat the thesis and the user study where about wasvery firm The solution we have come up with wasto explain the general idea of Live Coding (that weare designing a kind of editor that takes exemplaryparameters for a method the programmer works onand then evaluates it continuously while changes arebeing made) without revealing anything further likerdquoand then there is going to be a button with whichyou can flick from iteration to iterationrdquo

bull Afterwards one version of the interface was pre-sented to the user together with the following ques- Model Extractiontions rdquoWhat do you think you see hererdquo and rdquoWhatdo you think you can do with itrdquo

bull After the user told everything he or she knew or if theuser had a gulf of understanding [Norman 2002] the Explanation of the

interfaceparts of the interface which the user did not under-stand were explained and we went through all itera-tions once to make clear how the interface works

bull The next goal was to find out if the user had prob- Suggestions forimprovementlems in understanding elements of the interface if he

16 3 First User Study

thinks that there are things that can be done better etcetera

bull Lastly the other interface variants were shown to theuser with the same piece of code and it was explainedGoing through the

remaining interfacevariants

how they worked how they differ from the previousones and we went through all iterations once so thatthe user could get a feeling for the interface Aftereach variant the user was asked what he thought ofthe interface (and why) and what he thought was bet-ter or worse compared to the previous ones

Whenever the user had a question or there was an inter-esting topic that lead to more interesting information thispoint was discussed in more depth

33 Results

The user study was conducted with 6 different users All ofthem studied Computer Science were in the sixth semesterand had between 4 and 10 years of programming experi-ence Only one of them had worked with Ruby before Allusers took part voluntarily and were not rewarded in anyway The conditions were randomized and looked like fol-lows

User ID Algorithm Order of versions1 Bubblesort 312452 Selection sort 453213 Fractional part10 rarr Fractional part2 253414 Prime number test 135425 Binary search 524316 Binary search 34152

This means that for example user 1 saw interface version3 first and then versions 1245 in that order In all variantshe worked with the Bubblesort algorithm

The user study yielded two major results

33 Results 17

bull Most users liked version 1 better than the other ones

bull Not a single user understood what the iterator is goodfor or what it might do

In more detail Three users (users 1 2 and 4) stated clearlythat they liked version 1 the most User 3 was not quite 3 users liked version

1 the most 1 userliked version 4 themost 1 user couldnot decide betweenversion 1 and 2 and1 user could notdecide betweenversion 1 and 3

sure whether he liked version 1 or version 2 more Hestated that a mix between the two variants where thereis no syntax highlighting would be the best because hegot confused the first time he saw version 1 and did notdirectly see that the code on the right-hand side of thewindow is not actually code but the evaluation User 6 wasnot sure if he preferred version 1 or version 3 He statedthat if there is enough space on the screen version 1 isbetter but with all the other elements of a typical IDE aline of code might oftentimes be too long so that there is noplace to show the line itself as well as its evaluated version(a concern that user 3 communicated as well) Withoutthis problem though user 6 said that version 3 is worsebecause the interface might get messy and distractingquickly because of all the pop-ups right next to the codeIn addition he said with more complex code it might beproblematic to see which pop-up refers to which part ofthe code Finally user 5 stated that he likes version 4 themost The reason was that he did not have to look to theright if he wanted to see the value of a variable and thensearch it there but that in version 5 the information wasexactly at its source He only said that this version mightbe problematic to work with since the reading flow is influ-enced negatively and he feared that he would loose muchspace on the screen and would therefore have to scroll a lot

Each user stated that the main benefit of version 1 and2 was that the actual code remained unaffected Thiscorresponds to what [DeLine et al 2006] reported on theperceptual model a programmer builds of the code When Perceptual model of

the code shouldremain unaffected

navigating through source code a programmer uses aperceptual model to do so This way the programmer doesnot need to read the code (cognitive) to know which part heis seeing but can do so by looking at its rdquovisual landmarksrdquo(perceptual) While version 3 affects this perceptual modelnegatively but does not break it version 4 and 5 have the

18 3 First User Study

disadvantage to do so

To most of the users the basic structure of the interfaceswas clear after only a few seconds Only two of themconfused actual code with the evaluation for a short timeUser 5 when he saw version 5 the first time and user 4when he saw version 1 the first time However neither ofthem needed help or much time to figure how the interfaceworked after they started to read more precisely Eachuserrsquos first question or comment concerned the font andhow awful they (every one) thought it was After a briefexplanation why we chose this particular font each usergot used to it quickly

No user understood the iterator None of them got theconnection to the while upto or for next to the iteratorIterator is not self

explaining After the explanation what the iterator does some userssaid they thought the left and right buttons were rdquoltrdquo andrdquogtrdquo signs

The following tables summarize the pros and contras foreach interface

Version 1Pros Contrasmiddot The control structure (in-dents) is still presentmiddot Because of the indentsand the syntax highlightingthere is a strong analogy be-tween the code and its eval-uated versionmiddot There is a clear separationbetween code and evalua-tionmiddot The editor remains un-changed

middot Needs much spacemiddot User has to look from theleft to the right and viceversa to see two pieces thatare correlated In otherwords a line of code and itsevaluated version are too faraway from each other

33 Results 19

Version 2Pros Contrasmiddot The editor remains un-changedmiddot Code and evaluation do notget confused because they donot look the samemiddot Needs less space than ver-sion 1

middot Less analogy than in ver-sion 1middot User needs to look fromthe left to the right and viceversa This is being compli-cated by the fact that with-out indents and syntax high-lighting it is even more diffi-cult to find the point of inter-estmiddot Needs a lot of space

Version 3Pros ContrasmiddotNeeds minimalistic amountof spacemiddot The code structure itself re-mains untouched

middot Oftentimes not clear whichpop-up refers to which partof the codemiddot Can get messy quickly

Version 4Pros Contrasmiddot Strong analogymiddot Better for lines of code thatare too long for version 1 and2middot Evaluations right next totheir source

middot Code and evaluation getconfused quickly especiallyusual reading might be prob-lematic since one has to skipevery second line dependingon what he wants to readmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

20 3 First User Study

Version 5Pros Contrasmiddot Evaluations right next totheir sourcemiddot Better for lines of code thatare too long for version 1 and2

middot Oftentimes it is not clearwhat a value refers tomiddot Still distracting during codereadingmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

331 Further Comments and Suggestions for Im-provement

One of the users pointed out that in version 3 the iteratorhas been inserted into the code and therefore it now isInterface version 3

can producesyntactically wrong

code

syntactically wrong (for i in 14 becomes for i=2in 14 for the second iteration for example)

Two users suggested that for version 4 and 5 curly bracescould indicate to which part of the code the evaluationrefers

Some users suggested to make the number on the iteratordirectly manipulable so that if one would want to go toSuggestions for

improving the iterator iteration 42 one would not have to press 41 times a buttonbut could just enter 42 In addition some users said theywanted a button for jumping to the first and a button forjumping to the last iteration Two users said it would begood if there is something that indicates directly that theiterator actually shows iterations like the term rdquoIteration rdquoor anything similar

We noticed that the manipulable parameters and theevaluation results have to be distinguishable more easily

33 Results 21

One user proposed that if the cursor moves over somevariable in the evaluation part a small pop-up couldindicate to which variable a value belongs

One user said that it might be helpful that for biggerboolean expressions that fail it would be visible which partof the expression is responsible for the failure

332 Conclusions

With large screens (24rdquo and more) becoming more andmore common in modern work environments of program- Version 1 is the

favored onemers we can override the concerns about loosing too muchspace with version 1 which is the reason why it is the clearfavorite Figure 36 shows the basic interface after the firstuser study

The iterator has been improved in several ways

bull New buttons for jumping to the first and the last iter-ation have been added

bull There is a text field with which the iteration can bemanipulated directly

bull A tooltip now indicates the variable to which avalue belongs when hovering over the value with themouse

bull The label Iteration has been added at the top of theiterator

The way of manipulating the parameters has been changedThe function header now also is copied to the right side A new way of

manipulating theparameters

but the names of the parameters are replaced with a textfield in which the according value can be entered This wayboth the analogy to the code and the separation betweenparameters and evaluation are being enhanced

22 3 First User Study

(a) First iteration

(b) Second iteration

Figure 31 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

33 Results 23

Figure 32 Version 2 The evaluated version of the code ison the right-hand side but is ranged right

Figure 33 Version 3 Small pop-ups indicate the value of avariable

24 3 First User Study

Figure 34 Version 4 The evaluated version of a line isinserted right behind the line itself

33 Results 25

Figure 35 Version 5 Only the values of the variables arein between the lines

26 3 First User Study

Figure 36 Interface after the first user study

27

Chapter 4

Implementation

After the first user study had been finished and the resultshad been analyzed a software prototype with sufficient A software prototype

had to beimplemented

functionality had to be implemented to conduct the seconduser study Hence we wrote an extension for Brackets1 anopen source editor that has recently been released officially

The language we wrote the extension for was JavaScriptWhat we needed was an extension that could calculate thevalues of the variables that had to been substituted

41 First Approach (discharged)

The initial idea was to write an extension that connects viaV8-Node2 (an extension that maintains a connection to a Initial idea Use

V8-Node an Nodejsto get the locals

Nodejs3 V8 debugger) to an external debugger and lets itevaluate the code Our extension was supposed to supplythe actual code to the debugger and set a breakpoint in thefirst line of actual code Then it read the locals (that is thelocal values for all active variables) and stepped over to the

1httpsgithubcomadobebrackets2httpsgithubcomDennisKehrigbrackets-v8-node-

livetreenodeChildProcess3httpnodejsorg

28 4 Implementation

next line After all locals had been collected the extensionshould substitute all necessary values There were prob-lems with this approach Firstly the extension is not stableThis extension was

not usable for a userstudy

enough to be usable in a user study Secondly it is too slowIt needs more than half a second for one evaluation Thisis way too much since we wanted something that couldevaluate the code as the user edits it without any furtherdelay (hence the name rdquoLive Codingrdquo) With this extensionit would only have been possible to update the evaluationusing a special shortcut or to attach this function to anothershortcut like the Ctrl+S shortcut for saving Since thiswas by far not the optimal way we decided to start all overagain

42 Basic Evaluation Technique

In the second approach we did not try to externalize thescript execution but instead made vast use of JavaScriptrsquosbuilt-in eval function The eval function takes a StringSecond approach

using eval toevaluate dynamicallygenerated code thatreturns the value of

interest

as parameter If this String contains valid JavaScript codeeval will evaluate the code and return the last executedstatement eval(var i=1i=i+5) will return 6 forexample The extension basically works as follows Aloop that runs over all DOM-elements in the highlightedcode searches for substitutable variables While searchingfor these elements the loop captures in which contextthe actual DOM-element is The three basic contexts aresimple substitutions substitutions within one loop andsubstitutions within a nested loop (why this is will beexplained below) But there are also other things the loopchecks for For simple assignments like i=x+y x andy have to be substituted while i does not In lines wherea loop begins the iterator has to be appended invariantshave to be treated differently et cetera

Note that right-hand sides of assignments are handledslightly different from what has been presented in theprototypes of the first user study They do not get evalu-ated to the last step and invariants do not get evaluated

43 How Values Are Being Calculated 29

to the penultimate step anymore Both of them get onlyevaluated one step For i=1 and z=2 the right-hand sideof x=i+z does not get evaluated to 3 but to 1+2 Wedecided to do it this way because some users in the firstuser study stated that they did not only want to see whatthe result of an operation is but also how it got there

When a substitutable variable has been found its value willbe computed For that the eval function is used (see sec-tion 43 for details) and the value will be inserted Thiswhole process takes place each time there is a change in thecode or if one of the parameters gets changed If another it- How the extension

reacts to changeseration of a loop that is being shown is selected all valuesinside that loop will be calculated again

43 How Values Are Being Calculated

Depending on the context of the variable a piece ofcode will be generated (and then evaluated) that whenevaluated with eval returns the value of the variable ofinterest For simple substitutions outside of any loops thisworks like follows

Suppose the following code had to be evaluated for x=0

function foo(x)var i=0var j=x+1

Then x in the third line would have to be evaluated To do How the values arecalculated for simplesubstitutions

so the following String would be generatedvar x=0var i=0xSo for simple substitutions the whole code preceding thevariable that has to be substituted is copied and each pa-rameter is initialized as var with its exemplary value at the

30 4 Implementation

beginning

For simple loops this method has to be extended First ithas to be found out how often the loop is executed ThisSubstitutions within

one loop serves two purposes First the iterator now shows thenumber of iterations of the loop Second we have to checkif the user wants to see an iteration that does not exist Thisis being done by introducing a counter that is incrementedin each iteration of the loop and then returned when theloop terminated The following example shows how thevalue of a variable within a loop is being calculated

function test()var i=0var j=0while (ilt5)

i++j=j+2

Suppose we want to obtain the value for j (highlightedin red) in the second iteration of the loop Then theevaluation basically works as follows

var myCounter=0var myResultvar i=0var j=0while (ilt5)

myCounter++i++if(myCounter==2)myResult=j breakj=j+2

myResult

A counter gets inserted to check which iteration the loop is

44 Problems 31

in If it is in the right iteration the assignment myResult=jtakes place so that eval can return the according valueFor a loop within a loop two counters have to be inserted Other casesand both of them have to be checked accordingly Thereare further variants of contexts like the different parts of afor(firstPartsecondPartthirdPart) statementThe first part gets only executed once at the beginningwhile the second part gets evaluated one time more thanthe rest of the loop (the third part does not get evaluated)For all cases the dynamic generation of the code that re-turns the value of interest works similar to the techniquespresented

Obviously this is a brute-force approach For each valuethat has to be calculated all preceding code has to be eval-uated We used this approach anyway since we spent quite Summary Very

inefficient limitedfunctionality stillcompletely sufficientfor the user study

some time developing the first version of the extension (us-ing V8-Node) and were under considerable time pressureThe technique we chose is simple and fast to develop Fur-thermore the prototype only had to work for the seconduser study which is the reason why the limited function-ality (loop nesting depth of 2 for example) was sufficientAnd finally since the tasks in the second user study werenot very expensive with regards to calculation costs eventhis very inefficient version of the extension ran live with-out any delays Some tests showed that it would even havebeen quick enough to operate with hundreds of iterationsmore without any delay and with some few thousand iter-ations more without dispatching the interface

44 Problems

The biggest problem we encountered were infinite loopsBecause the interface runs completely live and tries to The Halting problemevaluate the whole code while the user is editing it infiniteloops may occur while the user is typing Additionallyit might of course be possible that there actually is aninfinite loop even though the user thinks the code iscorrect Without modification the interface crashed in

32 4 Implementation

either of these cases To prevent this from happeningtwo minor modifications had to be done Firstly if wewant to calculate the number of iterations a loop doeswe check whether the counter we use for that calculationis still below a certain limit (we used 999 here becauseit proved to be sufficient and efficient enough to runlive) Now the Halting Problem for this particular loop isapproximated The problem of the infinite loop still existsthough Suppose we have the following code

var x=0var i=0while(x=10)

i++x=i

We now know that the loop does more then 999 iterationsso we do not try to calculate the value for x in the invariantSubstitutions after

infinite loops or for i within the loop for any further iterations The onlyproblem that remains is that we still try to get the value fori in the last line When we try to calculate this value wedo not yet know that there is an infinite loop somewhereabove To find this out a new global counter gets insertedand limited to 5000 iterations

var globalCounter=0var x=0var i=0while(x=10)

if(globalCounter++ gt= 5000)breaki++

i

This way we could keep the incrementation and the checkwhether there are more then 5000 iterations within onestatement (two statements are too many since after a

44 Problems 33

while without curly braces only the first statement wouldbe executed) The incrementation and test get inserted intoevery loop ahead of the position we are checking for Itwould of course have been possible to set a flag if there isan infinite loop somewhere ahead but this way we alsohad the possibility to limit the overall number of iterationsand still get a value The second user study showed that Flaw in the

implementationthis was the wrong approach A user introduced an infiniteloop in a sorting algorithm The condition for the loop wasnever violated although the array was sorted With thetechnique of assigning a value to a variable although thelimitation of iterations has been hit the sorted array wasshown in the return statement Since the user only paidattention to the output he did not see that the number ofiterations of the loop was 999 and thought that the code iscorrect

Another minor problem was that of course while the useris typing the extension tries to evaluate code that is syntac- Suppress errors so

that user does notget disturbed

tically not correct The solution is to set windowonerrorto a function that always returns true This way errorsget just suppressed and the execution stops If there is anew change in the editor the execution will start again andwork exactly the same way so that working code always isexecuted and not working code is not

The last problem concerned hiding code that does notget evaluated For loops this could easily be done If Hiding code within

failed if statementsthe loop is in its very last iteration (the invariant fails)the rest of the loop has to be hidden otherwise it doesnot Finding out if an if statement fails (and hence theaccording code has to be hidden) however was moredifficult because we had not enough information to be ableto do so (remember we only compute the first step whichincludes nothing but substituting all variables) To find outif an if statement holds or not we do the following allvariables in the invariant get substituted and then we leteval evaluate just the invariant For if(xlty) with x=5and y=6 we substitute all values if(5lt6) and then calleval((5lt6))

34 4 Implementation

Figure 41 Screenshot showing Brackets with our extension and a dummy script

Finally Figure 41 shows how the prototype for our seconduser study looks like

35

Chapter 5

Second User Study

51 Course of the User Study

To find out whether our extension brings significantimprovement a second quantitative user study with 20 Hypothesis

Compared to usualdebugging methodsour technique bringssignificantimprovement

participants has been conducted The hypothesis was thatthe debugging speed (in this case the time between the userreading the code the first time and fixing it) compared toa usual debugging engine could significantly be reducedwith our extension If this hypothesis would be supportedby the results of the user study this would indicate that theunderstanding speed of foreign code would be increasedby our method

No user had worked with Brackets or JavaScript before(notwithstanding that all of them had worked with Javabefore) Users were Computer Science students with 4-9(average of 58) years of programming experience Onlythree of them had professional experience (2-4 years) Allusers took part voluntarily and were not rewarded in anyway

To test the hypothesis two searching algorithms (animplementation of Shakersort and an implementation ofBubblesort) have been implemented and a small bug has

36 5 Second User Study

been introduced in either of them (see Appendix A) Thetask of the user was to find both bugs - one time with2 algorithms 1 bug

each 2 tasks debugone time with

Bracketsrsquo tools andone time with our

extension

our extension and one time using the traditional way(that is using the console breakpoints variable viewerweb kit inspector and the step over functionality Bracketsprovides) Each user received an intro to the function-ality of both our extension and the Brackets debuggingengine For both a small dummy script has been shownwith which it has been explained how each part of thefunctionality works until the user confirmed that he or sheunderstood every part of the functionality The task forthe user was to find the bug in the code (users have beentold that there is exactly one bug) fix it and give noticewhen he or she thought that the code is correct (task tocondition assignment and task order were counterbalancedto eliminate side effects)

After the two debugging tasks had been accomplishedeach user filled out two versions of the System UsabilitySystem Usability

Scale to find outwhether users liked

working with thesystem

Scale [Brooke 1996] (SUS) one for our extension and onefor the standard debugging Brackets provides The first10 questions are identical to the initial version that hasbeen presented in 1996 except that the word rdquocumbersomerdquoin point 8 has been changed to rdquoawkwardrdquo as suggestedby [Finstad 2006] For the extension the following ques-tions have been added to the questionnaire

bull I found the additional information very helpful formy task

bull I was distracted by the additional information

bull I found it very easy to follow the control flow of theprogram

bull I found complex control structures hard to under-stand

For the debugging using Brackets the following two ques-tions have been added

bull I found it very easy to follow the control flow of theprogram

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 21: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

32 Course of the User Study 15

are greyed out) We developed this variant because in ver-sion 4 the user might confuse actual code with evaluatedcode We wanted to check if a variant without replicatingcontrol structures is better (less confusing)

32 Course of the User Study

The first user study was a qualitative user study It wassupposed to reveal what basic interface design is preferredby users and where there might be flaws in the interfacesDifferent techniques for qualitative user studies have beenused in particular the Model Extraction method and theThink Aloud method The basic structure was the following

bull First the experimenter introduced himself and theproject He assured the participant that he would Introduction to the

user studystay anonymous and that he could ask questions ormake comments whenever he likes To make sure theModel Extraction worked fine the part concerningwhat the thesis and the user study where about wasvery firm The solution we have come up with wasto explain the general idea of Live Coding (that weare designing a kind of editor that takes exemplaryparameters for a method the programmer works onand then evaluates it continuously while changes arebeing made) without revealing anything further likerdquoand then there is going to be a button with whichyou can flick from iteration to iterationrdquo

bull Afterwards one version of the interface was pre-sented to the user together with the following ques- Model Extractiontions rdquoWhat do you think you see hererdquo and rdquoWhatdo you think you can do with itrdquo

bull After the user told everything he or she knew or if theuser had a gulf of understanding [Norman 2002] the Explanation of the

interfaceparts of the interface which the user did not under-stand were explained and we went through all itera-tions once to make clear how the interface works

bull The next goal was to find out if the user had prob- Suggestions forimprovementlems in understanding elements of the interface if he

16 3 First User Study

thinks that there are things that can be done better etcetera

bull Lastly the other interface variants were shown to theuser with the same piece of code and it was explainedGoing through the

remaining interfacevariants

how they worked how they differ from the previousones and we went through all iterations once so thatthe user could get a feeling for the interface Aftereach variant the user was asked what he thought ofthe interface (and why) and what he thought was bet-ter or worse compared to the previous ones

Whenever the user had a question or there was an inter-esting topic that lead to more interesting information thispoint was discussed in more depth

33 Results

The user study was conducted with 6 different users All ofthem studied Computer Science were in the sixth semesterand had between 4 and 10 years of programming experi-ence Only one of them had worked with Ruby before Allusers took part voluntarily and were not rewarded in anyway The conditions were randomized and looked like fol-lows

User ID Algorithm Order of versions1 Bubblesort 312452 Selection sort 453213 Fractional part10 rarr Fractional part2 253414 Prime number test 135425 Binary search 524316 Binary search 34152

This means that for example user 1 saw interface version3 first and then versions 1245 in that order In all variantshe worked with the Bubblesort algorithm

The user study yielded two major results

33 Results 17

bull Most users liked version 1 better than the other ones

bull Not a single user understood what the iterator is goodfor or what it might do

In more detail Three users (users 1 2 and 4) stated clearlythat they liked version 1 the most User 3 was not quite 3 users liked version

1 the most 1 userliked version 4 themost 1 user couldnot decide betweenversion 1 and 2 and1 user could notdecide betweenversion 1 and 3

sure whether he liked version 1 or version 2 more Hestated that a mix between the two variants where thereis no syntax highlighting would be the best because hegot confused the first time he saw version 1 and did notdirectly see that the code on the right-hand side of thewindow is not actually code but the evaluation User 6 wasnot sure if he preferred version 1 or version 3 He statedthat if there is enough space on the screen version 1 isbetter but with all the other elements of a typical IDE aline of code might oftentimes be too long so that there is noplace to show the line itself as well as its evaluated version(a concern that user 3 communicated as well) Withoutthis problem though user 6 said that version 3 is worsebecause the interface might get messy and distractingquickly because of all the pop-ups right next to the codeIn addition he said with more complex code it might beproblematic to see which pop-up refers to which part ofthe code Finally user 5 stated that he likes version 4 themost The reason was that he did not have to look to theright if he wanted to see the value of a variable and thensearch it there but that in version 5 the information wasexactly at its source He only said that this version mightbe problematic to work with since the reading flow is influ-enced negatively and he feared that he would loose muchspace on the screen and would therefore have to scroll a lot

Each user stated that the main benefit of version 1 and2 was that the actual code remained unaffected Thiscorresponds to what [DeLine et al 2006] reported on theperceptual model a programmer builds of the code When Perceptual model of

the code shouldremain unaffected

navigating through source code a programmer uses aperceptual model to do so This way the programmer doesnot need to read the code (cognitive) to know which part heis seeing but can do so by looking at its rdquovisual landmarksrdquo(perceptual) While version 3 affects this perceptual modelnegatively but does not break it version 4 and 5 have the

18 3 First User Study

disadvantage to do so

To most of the users the basic structure of the interfaceswas clear after only a few seconds Only two of themconfused actual code with the evaluation for a short timeUser 5 when he saw version 5 the first time and user 4when he saw version 1 the first time However neither ofthem needed help or much time to figure how the interfaceworked after they started to read more precisely Eachuserrsquos first question or comment concerned the font andhow awful they (every one) thought it was After a briefexplanation why we chose this particular font each usergot used to it quickly

No user understood the iterator None of them got theconnection to the while upto or for next to the iteratorIterator is not self

explaining After the explanation what the iterator does some userssaid they thought the left and right buttons were rdquoltrdquo andrdquogtrdquo signs

The following tables summarize the pros and contras foreach interface

Version 1Pros Contrasmiddot The control structure (in-dents) is still presentmiddot Because of the indentsand the syntax highlightingthere is a strong analogy be-tween the code and its eval-uated versionmiddot There is a clear separationbetween code and evalua-tionmiddot The editor remains un-changed

middot Needs much spacemiddot User has to look from theleft to the right and viceversa to see two pieces thatare correlated In otherwords a line of code and itsevaluated version are too faraway from each other

33 Results 19

Version 2Pros Contrasmiddot The editor remains un-changedmiddot Code and evaluation do notget confused because they donot look the samemiddot Needs less space than ver-sion 1

middot Less analogy than in ver-sion 1middot User needs to look fromthe left to the right and viceversa This is being compli-cated by the fact that with-out indents and syntax high-lighting it is even more diffi-cult to find the point of inter-estmiddot Needs a lot of space

Version 3Pros ContrasmiddotNeeds minimalistic amountof spacemiddot The code structure itself re-mains untouched

middot Oftentimes not clear whichpop-up refers to which partof the codemiddot Can get messy quickly

Version 4Pros Contrasmiddot Strong analogymiddot Better for lines of code thatare too long for version 1 and2middot Evaluations right next totheir source

middot Code and evaluation getconfused quickly especiallyusual reading might be prob-lematic since one has to skipevery second line dependingon what he wants to readmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

20 3 First User Study

Version 5Pros Contrasmiddot Evaluations right next totheir sourcemiddot Better for lines of code thatare too long for version 1 and2

middot Oftentimes it is not clearwhat a value refers tomiddot Still distracting during codereadingmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

331 Further Comments and Suggestions for Im-provement

One of the users pointed out that in version 3 the iteratorhas been inserted into the code and therefore it now isInterface version 3

can producesyntactically wrong

code

syntactically wrong (for i in 14 becomes for i=2in 14 for the second iteration for example)

Two users suggested that for version 4 and 5 curly bracescould indicate to which part of the code the evaluationrefers

Some users suggested to make the number on the iteratordirectly manipulable so that if one would want to go toSuggestions for

improving the iterator iteration 42 one would not have to press 41 times a buttonbut could just enter 42 In addition some users said theywanted a button for jumping to the first and a button forjumping to the last iteration Two users said it would begood if there is something that indicates directly that theiterator actually shows iterations like the term rdquoIteration rdquoor anything similar

We noticed that the manipulable parameters and theevaluation results have to be distinguishable more easily

33 Results 21

One user proposed that if the cursor moves over somevariable in the evaluation part a small pop-up couldindicate to which variable a value belongs

One user said that it might be helpful that for biggerboolean expressions that fail it would be visible which partof the expression is responsible for the failure

332 Conclusions

With large screens (24rdquo and more) becoming more andmore common in modern work environments of program- Version 1 is the

favored onemers we can override the concerns about loosing too muchspace with version 1 which is the reason why it is the clearfavorite Figure 36 shows the basic interface after the firstuser study

The iterator has been improved in several ways

bull New buttons for jumping to the first and the last iter-ation have been added

bull There is a text field with which the iteration can bemanipulated directly

bull A tooltip now indicates the variable to which avalue belongs when hovering over the value with themouse

bull The label Iteration has been added at the top of theiterator

The way of manipulating the parameters has been changedThe function header now also is copied to the right side A new way of

manipulating theparameters

but the names of the parameters are replaced with a textfield in which the according value can be entered This wayboth the analogy to the code and the separation betweenparameters and evaluation are being enhanced

22 3 First User Study

(a) First iteration

(b) Second iteration

Figure 31 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

33 Results 23

Figure 32 Version 2 The evaluated version of the code ison the right-hand side but is ranged right

Figure 33 Version 3 Small pop-ups indicate the value of avariable

24 3 First User Study

Figure 34 Version 4 The evaluated version of a line isinserted right behind the line itself

33 Results 25

Figure 35 Version 5 Only the values of the variables arein between the lines

26 3 First User Study

Figure 36 Interface after the first user study

27

Chapter 4

Implementation

After the first user study had been finished and the resultshad been analyzed a software prototype with sufficient A software prototype

had to beimplemented

functionality had to be implemented to conduct the seconduser study Hence we wrote an extension for Brackets1 anopen source editor that has recently been released officially

The language we wrote the extension for was JavaScriptWhat we needed was an extension that could calculate thevalues of the variables that had to been substituted

41 First Approach (discharged)

The initial idea was to write an extension that connects viaV8-Node2 (an extension that maintains a connection to a Initial idea Use

V8-Node an Nodejsto get the locals

Nodejs3 V8 debugger) to an external debugger and lets itevaluate the code Our extension was supposed to supplythe actual code to the debugger and set a breakpoint in thefirst line of actual code Then it read the locals (that is thelocal values for all active variables) and stepped over to the

1httpsgithubcomadobebrackets2httpsgithubcomDennisKehrigbrackets-v8-node-

livetreenodeChildProcess3httpnodejsorg

28 4 Implementation

next line After all locals had been collected the extensionshould substitute all necessary values There were prob-lems with this approach Firstly the extension is not stableThis extension was

not usable for a userstudy

enough to be usable in a user study Secondly it is too slowIt needs more than half a second for one evaluation Thisis way too much since we wanted something that couldevaluate the code as the user edits it without any furtherdelay (hence the name rdquoLive Codingrdquo) With this extensionit would only have been possible to update the evaluationusing a special shortcut or to attach this function to anothershortcut like the Ctrl+S shortcut for saving Since thiswas by far not the optimal way we decided to start all overagain

42 Basic Evaluation Technique

In the second approach we did not try to externalize thescript execution but instead made vast use of JavaScriptrsquosbuilt-in eval function The eval function takes a StringSecond approach

using eval toevaluate dynamicallygenerated code thatreturns the value of

interest

as parameter If this String contains valid JavaScript codeeval will evaluate the code and return the last executedstatement eval(var i=1i=i+5) will return 6 forexample The extension basically works as follows Aloop that runs over all DOM-elements in the highlightedcode searches for substitutable variables While searchingfor these elements the loop captures in which contextthe actual DOM-element is The three basic contexts aresimple substitutions substitutions within one loop andsubstitutions within a nested loop (why this is will beexplained below) But there are also other things the loopchecks for For simple assignments like i=x+y x andy have to be substituted while i does not In lines wherea loop begins the iterator has to be appended invariantshave to be treated differently et cetera

Note that right-hand sides of assignments are handledslightly different from what has been presented in theprototypes of the first user study They do not get evalu-ated to the last step and invariants do not get evaluated

43 How Values Are Being Calculated 29

to the penultimate step anymore Both of them get onlyevaluated one step For i=1 and z=2 the right-hand sideof x=i+z does not get evaluated to 3 but to 1+2 Wedecided to do it this way because some users in the firstuser study stated that they did not only want to see whatthe result of an operation is but also how it got there

When a substitutable variable has been found its value willbe computed For that the eval function is used (see sec-tion 43 for details) and the value will be inserted Thiswhole process takes place each time there is a change in thecode or if one of the parameters gets changed If another it- How the extension

reacts to changeseration of a loop that is being shown is selected all valuesinside that loop will be calculated again

43 How Values Are Being Calculated

Depending on the context of the variable a piece ofcode will be generated (and then evaluated) that whenevaluated with eval returns the value of the variable ofinterest For simple substitutions outside of any loops thisworks like follows

Suppose the following code had to be evaluated for x=0

function foo(x)var i=0var j=x+1

Then x in the third line would have to be evaluated To do How the values arecalculated for simplesubstitutions

so the following String would be generatedvar x=0var i=0xSo for simple substitutions the whole code preceding thevariable that has to be substituted is copied and each pa-rameter is initialized as var with its exemplary value at the

30 4 Implementation

beginning

For simple loops this method has to be extended First ithas to be found out how often the loop is executed ThisSubstitutions within

one loop serves two purposes First the iterator now shows thenumber of iterations of the loop Second we have to checkif the user wants to see an iteration that does not exist Thisis being done by introducing a counter that is incrementedin each iteration of the loop and then returned when theloop terminated The following example shows how thevalue of a variable within a loop is being calculated

function test()var i=0var j=0while (ilt5)

i++j=j+2

Suppose we want to obtain the value for j (highlightedin red) in the second iteration of the loop Then theevaluation basically works as follows

var myCounter=0var myResultvar i=0var j=0while (ilt5)

myCounter++i++if(myCounter==2)myResult=j breakj=j+2

myResult

A counter gets inserted to check which iteration the loop is

44 Problems 31

in If it is in the right iteration the assignment myResult=jtakes place so that eval can return the according valueFor a loop within a loop two counters have to be inserted Other casesand both of them have to be checked accordingly Thereare further variants of contexts like the different parts of afor(firstPartsecondPartthirdPart) statementThe first part gets only executed once at the beginningwhile the second part gets evaluated one time more thanthe rest of the loop (the third part does not get evaluated)For all cases the dynamic generation of the code that re-turns the value of interest works similar to the techniquespresented

Obviously this is a brute-force approach For each valuethat has to be calculated all preceding code has to be eval-uated We used this approach anyway since we spent quite Summary Very

inefficient limitedfunctionality stillcompletely sufficientfor the user study

some time developing the first version of the extension (us-ing V8-Node) and were under considerable time pressureThe technique we chose is simple and fast to develop Fur-thermore the prototype only had to work for the seconduser study which is the reason why the limited function-ality (loop nesting depth of 2 for example) was sufficientAnd finally since the tasks in the second user study werenot very expensive with regards to calculation costs eventhis very inefficient version of the extension ran live with-out any delays Some tests showed that it would even havebeen quick enough to operate with hundreds of iterationsmore without any delay and with some few thousand iter-ations more without dispatching the interface

44 Problems

The biggest problem we encountered were infinite loopsBecause the interface runs completely live and tries to The Halting problemevaluate the whole code while the user is editing it infiniteloops may occur while the user is typing Additionallyit might of course be possible that there actually is aninfinite loop even though the user thinks the code iscorrect Without modification the interface crashed in

32 4 Implementation

either of these cases To prevent this from happeningtwo minor modifications had to be done Firstly if wewant to calculate the number of iterations a loop doeswe check whether the counter we use for that calculationis still below a certain limit (we used 999 here becauseit proved to be sufficient and efficient enough to runlive) Now the Halting Problem for this particular loop isapproximated The problem of the infinite loop still existsthough Suppose we have the following code

var x=0var i=0while(x=10)

i++x=i

We now know that the loop does more then 999 iterationsso we do not try to calculate the value for x in the invariantSubstitutions after

infinite loops or for i within the loop for any further iterations The onlyproblem that remains is that we still try to get the value fori in the last line When we try to calculate this value wedo not yet know that there is an infinite loop somewhereabove To find this out a new global counter gets insertedand limited to 5000 iterations

var globalCounter=0var x=0var i=0while(x=10)

if(globalCounter++ gt= 5000)breaki++

i

This way we could keep the incrementation and the checkwhether there are more then 5000 iterations within onestatement (two statements are too many since after a

44 Problems 33

while without curly braces only the first statement wouldbe executed) The incrementation and test get inserted intoevery loop ahead of the position we are checking for Itwould of course have been possible to set a flag if there isan infinite loop somewhere ahead but this way we alsohad the possibility to limit the overall number of iterationsand still get a value The second user study showed that Flaw in the

implementationthis was the wrong approach A user introduced an infiniteloop in a sorting algorithm The condition for the loop wasnever violated although the array was sorted With thetechnique of assigning a value to a variable although thelimitation of iterations has been hit the sorted array wasshown in the return statement Since the user only paidattention to the output he did not see that the number ofiterations of the loop was 999 and thought that the code iscorrect

Another minor problem was that of course while the useris typing the extension tries to evaluate code that is syntac- Suppress errors so

that user does notget disturbed

tically not correct The solution is to set windowonerrorto a function that always returns true This way errorsget just suppressed and the execution stops If there is anew change in the editor the execution will start again andwork exactly the same way so that working code always isexecuted and not working code is not

The last problem concerned hiding code that does notget evaluated For loops this could easily be done If Hiding code within

failed if statementsthe loop is in its very last iteration (the invariant fails)the rest of the loop has to be hidden otherwise it doesnot Finding out if an if statement fails (and hence theaccording code has to be hidden) however was moredifficult because we had not enough information to be ableto do so (remember we only compute the first step whichincludes nothing but substituting all variables) To find outif an if statement holds or not we do the following allvariables in the invariant get substituted and then we leteval evaluate just the invariant For if(xlty) with x=5and y=6 we substitute all values if(5lt6) and then calleval((5lt6))

34 4 Implementation

Figure 41 Screenshot showing Brackets with our extension and a dummy script

Finally Figure 41 shows how the prototype for our seconduser study looks like

35

Chapter 5

Second User Study

51 Course of the User Study

To find out whether our extension brings significantimprovement a second quantitative user study with 20 Hypothesis

Compared to usualdebugging methodsour technique bringssignificantimprovement

participants has been conducted The hypothesis was thatthe debugging speed (in this case the time between the userreading the code the first time and fixing it) compared toa usual debugging engine could significantly be reducedwith our extension If this hypothesis would be supportedby the results of the user study this would indicate that theunderstanding speed of foreign code would be increasedby our method

No user had worked with Brackets or JavaScript before(notwithstanding that all of them had worked with Javabefore) Users were Computer Science students with 4-9(average of 58) years of programming experience Onlythree of them had professional experience (2-4 years) Allusers took part voluntarily and were not rewarded in anyway

To test the hypothesis two searching algorithms (animplementation of Shakersort and an implementation ofBubblesort) have been implemented and a small bug has

36 5 Second User Study

been introduced in either of them (see Appendix A) Thetask of the user was to find both bugs - one time with2 algorithms 1 bug

each 2 tasks debugone time with

Bracketsrsquo tools andone time with our

extension

our extension and one time using the traditional way(that is using the console breakpoints variable viewerweb kit inspector and the step over functionality Bracketsprovides) Each user received an intro to the function-ality of both our extension and the Brackets debuggingengine For both a small dummy script has been shownwith which it has been explained how each part of thefunctionality works until the user confirmed that he or sheunderstood every part of the functionality The task forthe user was to find the bug in the code (users have beentold that there is exactly one bug) fix it and give noticewhen he or she thought that the code is correct (task tocondition assignment and task order were counterbalancedto eliminate side effects)

After the two debugging tasks had been accomplishedeach user filled out two versions of the System UsabilitySystem Usability

Scale to find outwhether users liked

working with thesystem

Scale [Brooke 1996] (SUS) one for our extension and onefor the standard debugging Brackets provides The first10 questions are identical to the initial version that hasbeen presented in 1996 except that the word rdquocumbersomerdquoin point 8 has been changed to rdquoawkwardrdquo as suggestedby [Finstad 2006] For the extension the following ques-tions have been added to the questionnaire

bull I found the additional information very helpful formy task

bull I was distracted by the additional information

bull I found it very easy to follow the control flow of theprogram

bull I found complex control structures hard to under-stand

For the debugging using Brackets the following two ques-tions have been added

bull I found it very easy to follow the control flow of theprogram

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 22: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

16 3 First User Study

thinks that there are things that can be done better etcetera

bull Lastly the other interface variants were shown to theuser with the same piece of code and it was explainedGoing through the

remaining interfacevariants

how they worked how they differ from the previousones and we went through all iterations once so thatthe user could get a feeling for the interface Aftereach variant the user was asked what he thought ofthe interface (and why) and what he thought was bet-ter or worse compared to the previous ones

Whenever the user had a question or there was an inter-esting topic that lead to more interesting information thispoint was discussed in more depth

33 Results

The user study was conducted with 6 different users All ofthem studied Computer Science were in the sixth semesterand had between 4 and 10 years of programming experi-ence Only one of them had worked with Ruby before Allusers took part voluntarily and were not rewarded in anyway The conditions were randomized and looked like fol-lows

User ID Algorithm Order of versions1 Bubblesort 312452 Selection sort 453213 Fractional part10 rarr Fractional part2 253414 Prime number test 135425 Binary search 524316 Binary search 34152

This means that for example user 1 saw interface version3 first and then versions 1245 in that order In all variantshe worked with the Bubblesort algorithm

The user study yielded two major results

33 Results 17

bull Most users liked version 1 better than the other ones

bull Not a single user understood what the iterator is goodfor or what it might do

In more detail Three users (users 1 2 and 4) stated clearlythat they liked version 1 the most User 3 was not quite 3 users liked version

1 the most 1 userliked version 4 themost 1 user couldnot decide betweenversion 1 and 2 and1 user could notdecide betweenversion 1 and 3

sure whether he liked version 1 or version 2 more Hestated that a mix between the two variants where thereis no syntax highlighting would be the best because hegot confused the first time he saw version 1 and did notdirectly see that the code on the right-hand side of thewindow is not actually code but the evaluation User 6 wasnot sure if he preferred version 1 or version 3 He statedthat if there is enough space on the screen version 1 isbetter but with all the other elements of a typical IDE aline of code might oftentimes be too long so that there is noplace to show the line itself as well as its evaluated version(a concern that user 3 communicated as well) Withoutthis problem though user 6 said that version 3 is worsebecause the interface might get messy and distractingquickly because of all the pop-ups right next to the codeIn addition he said with more complex code it might beproblematic to see which pop-up refers to which part ofthe code Finally user 5 stated that he likes version 4 themost The reason was that he did not have to look to theright if he wanted to see the value of a variable and thensearch it there but that in version 5 the information wasexactly at its source He only said that this version mightbe problematic to work with since the reading flow is influ-enced negatively and he feared that he would loose muchspace on the screen and would therefore have to scroll a lot

Each user stated that the main benefit of version 1 and2 was that the actual code remained unaffected Thiscorresponds to what [DeLine et al 2006] reported on theperceptual model a programmer builds of the code When Perceptual model of

the code shouldremain unaffected

navigating through source code a programmer uses aperceptual model to do so This way the programmer doesnot need to read the code (cognitive) to know which part heis seeing but can do so by looking at its rdquovisual landmarksrdquo(perceptual) While version 3 affects this perceptual modelnegatively but does not break it version 4 and 5 have the

18 3 First User Study

disadvantage to do so

To most of the users the basic structure of the interfaceswas clear after only a few seconds Only two of themconfused actual code with the evaluation for a short timeUser 5 when he saw version 5 the first time and user 4when he saw version 1 the first time However neither ofthem needed help or much time to figure how the interfaceworked after they started to read more precisely Eachuserrsquos first question or comment concerned the font andhow awful they (every one) thought it was After a briefexplanation why we chose this particular font each usergot used to it quickly

No user understood the iterator None of them got theconnection to the while upto or for next to the iteratorIterator is not self

explaining After the explanation what the iterator does some userssaid they thought the left and right buttons were rdquoltrdquo andrdquogtrdquo signs

The following tables summarize the pros and contras foreach interface

Version 1Pros Contrasmiddot The control structure (in-dents) is still presentmiddot Because of the indentsand the syntax highlightingthere is a strong analogy be-tween the code and its eval-uated versionmiddot There is a clear separationbetween code and evalua-tionmiddot The editor remains un-changed

middot Needs much spacemiddot User has to look from theleft to the right and viceversa to see two pieces thatare correlated In otherwords a line of code and itsevaluated version are too faraway from each other

33 Results 19

Version 2Pros Contrasmiddot The editor remains un-changedmiddot Code and evaluation do notget confused because they donot look the samemiddot Needs less space than ver-sion 1

middot Less analogy than in ver-sion 1middot User needs to look fromthe left to the right and viceversa This is being compli-cated by the fact that with-out indents and syntax high-lighting it is even more diffi-cult to find the point of inter-estmiddot Needs a lot of space

Version 3Pros ContrasmiddotNeeds minimalistic amountof spacemiddot The code structure itself re-mains untouched

middot Oftentimes not clear whichpop-up refers to which partof the codemiddot Can get messy quickly

Version 4Pros Contrasmiddot Strong analogymiddot Better for lines of code thatare too long for version 1 and2middot Evaluations right next totheir source

middot Code and evaluation getconfused quickly especiallyusual reading might be prob-lematic since one has to skipevery second line dependingon what he wants to readmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

20 3 First User Study

Version 5Pros Contrasmiddot Evaluations right next totheir sourcemiddot Better for lines of code thatare too long for version 1 and2

middot Oftentimes it is not clearwhat a value refers tomiddot Still distracting during codereadingmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

331 Further Comments and Suggestions for Im-provement

One of the users pointed out that in version 3 the iteratorhas been inserted into the code and therefore it now isInterface version 3

can producesyntactically wrong

code

syntactically wrong (for i in 14 becomes for i=2in 14 for the second iteration for example)

Two users suggested that for version 4 and 5 curly bracescould indicate to which part of the code the evaluationrefers

Some users suggested to make the number on the iteratordirectly manipulable so that if one would want to go toSuggestions for

improving the iterator iteration 42 one would not have to press 41 times a buttonbut could just enter 42 In addition some users said theywanted a button for jumping to the first and a button forjumping to the last iteration Two users said it would begood if there is something that indicates directly that theiterator actually shows iterations like the term rdquoIteration rdquoor anything similar

We noticed that the manipulable parameters and theevaluation results have to be distinguishable more easily

33 Results 21

One user proposed that if the cursor moves over somevariable in the evaluation part a small pop-up couldindicate to which variable a value belongs

One user said that it might be helpful that for biggerboolean expressions that fail it would be visible which partof the expression is responsible for the failure

332 Conclusions

With large screens (24rdquo and more) becoming more andmore common in modern work environments of program- Version 1 is the

favored onemers we can override the concerns about loosing too muchspace with version 1 which is the reason why it is the clearfavorite Figure 36 shows the basic interface after the firstuser study

The iterator has been improved in several ways

bull New buttons for jumping to the first and the last iter-ation have been added

bull There is a text field with which the iteration can bemanipulated directly

bull A tooltip now indicates the variable to which avalue belongs when hovering over the value with themouse

bull The label Iteration has been added at the top of theiterator

The way of manipulating the parameters has been changedThe function header now also is copied to the right side A new way of

manipulating theparameters

but the names of the parameters are replaced with a textfield in which the according value can be entered This wayboth the analogy to the code and the separation betweenparameters and evaluation are being enhanced

22 3 First User Study

(a) First iteration

(b) Second iteration

Figure 31 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

33 Results 23

Figure 32 Version 2 The evaluated version of the code ison the right-hand side but is ranged right

Figure 33 Version 3 Small pop-ups indicate the value of avariable

24 3 First User Study

Figure 34 Version 4 The evaluated version of a line isinserted right behind the line itself

33 Results 25

Figure 35 Version 5 Only the values of the variables arein between the lines

26 3 First User Study

Figure 36 Interface after the first user study

27

Chapter 4

Implementation

After the first user study had been finished and the resultshad been analyzed a software prototype with sufficient A software prototype

had to beimplemented

functionality had to be implemented to conduct the seconduser study Hence we wrote an extension for Brackets1 anopen source editor that has recently been released officially

The language we wrote the extension for was JavaScriptWhat we needed was an extension that could calculate thevalues of the variables that had to been substituted

41 First Approach (discharged)

The initial idea was to write an extension that connects viaV8-Node2 (an extension that maintains a connection to a Initial idea Use

V8-Node an Nodejsto get the locals

Nodejs3 V8 debugger) to an external debugger and lets itevaluate the code Our extension was supposed to supplythe actual code to the debugger and set a breakpoint in thefirst line of actual code Then it read the locals (that is thelocal values for all active variables) and stepped over to the

1httpsgithubcomadobebrackets2httpsgithubcomDennisKehrigbrackets-v8-node-

livetreenodeChildProcess3httpnodejsorg

28 4 Implementation

next line After all locals had been collected the extensionshould substitute all necessary values There were prob-lems with this approach Firstly the extension is not stableThis extension was

not usable for a userstudy

enough to be usable in a user study Secondly it is too slowIt needs more than half a second for one evaluation Thisis way too much since we wanted something that couldevaluate the code as the user edits it without any furtherdelay (hence the name rdquoLive Codingrdquo) With this extensionit would only have been possible to update the evaluationusing a special shortcut or to attach this function to anothershortcut like the Ctrl+S shortcut for saving Since thiswas by far not the optimal way we decided to start all overagain

42 Basic Evaluation Technique

In the second approach we did not try to externalize thescript execution but instead made vast use of JavaScriptrsquosbuilt-in eval function The eval function takes a StringSecond approach

using eval toevaluate dynamicallygenerated code thatreturns the value of

interest

as parameter If this String contains valid JavaScript codeeval will evaluate the code and return the last executedstatement eval(var i=1i=i+5) will return 6 forexample The extension basically works as follows Aloop that runs over all DOM-elements in the highlightedcode searches for substitutable variables While searchingfor these elements the loop captures in which contextthe actual DOM-element is The three basic contexts aresimple substitutions substitutions within one loop andsubstitutions within a nested loop (why this is will beexplained below) But there are also other things the loopchecks for For simple assignments like i=x+y x andy have to be substituted while i does not In lines wherea loop begins the iterator has to be appended invariantshave to be treated differently et cetera

Note that right-hand sides of assignments are handledslightly different from what has been presented in theprototypes of the first user study They do not get evalu-ated to the last step and invariants do not get evaluated

43 How Values Are Being Calculated 29

to the penultimate step anymore Both of them get onlyevaluated one step For i=1 and z=2 the right-hand sideof x=i+z does not get evaluated to 3 but to 1+2 Wedecided to do it this way because some users in the firstuser study stated that they did not only want to see whatthe result of an operation is but also how it got there

When a substitutable variable has been found its value willbe computed For that the eval function is used (see sec-tion 43 for details) and the value will be inserted Thiswhole process takes place each time there is a change in thecode or if one of the parameters gets changed If another it- How the extension

reacts to changeseration of a loop that is being shown is selected all valuesinside that loop will be calculated again

43 How Values Are Being Calculated

Depending on the context of the variable a piece ofcode will be generated (and then evaluated) that whenevaluated with eval returns the value of the variable ofinterest For simple substitutions outside of any loops thisworks like follows

Suppose the following code had to be evaluated for x=0

function foo(x)var i=0var j=x+1

Then x in the third line would have to be evaluated To do How the values arecalculated for simplesubstitutions

so the following String would be generatedvar x=0var i=0xSo for simple substitutions the whole code preceding thevariable that has to be substituted is copied and each pa-rameter is initialized as var with its exemplary value at the

30 4 Implementation

beginning

For simple loops this method has to be extended First ithas to be found out how often the loop is executed ThisSubstitutions within

one loop serves two purposes First the iterator now shows thenumber of iterations of the loop Second we have to checkif the user wants to see an iteration that does not exist Thisis being done by introducing a counter that is incrementedin each iteration of the loop and then returned when theloop terminated The following example shows how thevalue of a variable within a loop is being calculated

function test()var i=0var j=0while (ilt5)

i++j=j+2

Suppose we want to obtain the value for j (highlightedin red) in the second iteration of the loop Then theevaluation basically works as follows

var myCounter=0var myResultvar i=0var j=0while (ilt5)

myCounter++i++if(myCounter==2)myResult=j breakj=j+2

myResult

A counter gets inserted to check which iteration the loop is

44 Problems 31

in If it is in the right iteration the assignment myResult=jtakes place so that eval can return the according valueFor a loop within a loop two counters have to be inserted Other casesand both of them have to be checked accordingly Thereare further variants of contexts like the different parts of afor(firstPartsecondPartthirdPart) statementThe first part gets only executed once at the beginningwhile the second part gets evaluated one time more thanthe rest of the loop (the third part does not get evaluated)For all cases the dynamic generation of the code that re-turns the value of interest works similar to the techniquespresented

Obviously this is a brute-force approach For each valuethat has to be calculated all preceding code has to be eval-uated We used this approach anyway since we spent quite Summary Very

inefficient limitedfunctionality stillcompletely sufficientfor the user study

some time developing the first version of the extension (us-ing V8-Node) and were under considerable time pressureThe technique we chose is simple and fast to develop Fur-thermore the prototype only had to work for the seconduser study which is the reason why the limited function-ality (loop nesting depth of 2 for example) was sufficientAnd finally since the tasks in the second user study werenot very expensive with regards to calculation costs eventhis very inefficient version of the extension ran live with-out any delays Some tests showed that it would even havebeen quick enough to operate with hundreds of iterationsmore without any delay and with some few thousand iter-ations more without dispatching the interface

44 Problems

The biggest problem we encountered were infinite loopsBecause the interface runs completely live and tries to The Halting problemevaluate the whole code while the user is editing it infiniteloops may occur while the user is typing Additionallyit might of course be possible that there actually is aninfinite loop even though the user thinks the code iscorrect Without modification the interface crashed in

32 4 Implementation

either of these cases To prevent this from happeningtwo minor modifications had to be done Firstly if wewant to calculate the number of iterations a loop doeswe check whether the counter we use for that calculationis still below a certain limit (we used 999 here becauseit proved to be sufficient and efficient enough to runlive) Now the Halting Problem for this particular loop isapproximated The problem of the infinite loop still existsthough Suppose we have the following code

var x=0var i=0while(x=10)

i++x=i

We now know that the loop does more then 999 iterationsso we do not try to calculate the value for x in the invariantSubstitutions after

infinite loops or for i within the loop for any further iterations The onlyproblem that remains is that we still try to get the value fori in the last line When we try to calculate this value wedo not yet know that there is an infinite loop somewhereabove To find this out a new global counter gets insertedand limited to 5000 iterations

var globalCounter=0var x=0var i=0while(x=10)

if(globalCounter++ gt= 5000)breaki++

i

This way we could keep the incrementation and the checkwhether there are more then 5000 iterations within onestatement (two statements are too many since after a

44 Problems 33

while without curly braces only the first statement wouldbe executed) The incrementation and test get inserted intoevery loop ahead of the position we are checking for Itwould of course have been possible to set a flag if there isan infinite loop somewhere ahead but this way we alsohad the possibility to limit the overall number of iterationsand still get a value The second user study showed that Flaw in the

implementationthis was the wrong approach A user introduced an infiniteloop in a sorting algorithm The condition for the loop wasnever violated although the array was sorted With thetechnique of assigning a value to a variable although thelimitation of iterations has been hit the sorted array wasshown in the return statement Since the user only paidattention to the output he did not see that the number ofiterations of the loop was 999 and thought that the code iscorrect

Another minor problem was that of course while the useris typing the extension tries to evaluate code that is syntac- Suppress errors so

that user does notget disturbed

tically not correct The solution is to set windowonerrorto a function that always returns true This way errorsget just suppressed and the execution stops If there is anew change in the editor the execution will start again andwork exactly the same way so that working code always isexecuted and not working code is not

The last problem concerned hiding code that does notget evaluated For loops this could easily be done If Hiding code within

failed if statementsthe loop is in its very last iteration (the invariant fails)the rest of the loop has to be hidden otherwise it doesnot Finding out if an if statement fails (and hence theaccording code has to be hidden) however was moredifficult because we had not enough information to be ableto do so (remember we only compute the first step whichincludes nothing but substituting all variables) To find outif an if statement holds or not we do the following allvariables in the invariant get substituted and then we leteval evaluate just the invariant For if(xlty) with x=5and y=6 we substitute all values if(5lt6) and then calleval((5lt6))

34 4 Implementation

Figure 41 Screenshot showing Brackets with our extension and a dummy script

Finally Figure 41 shows how the prototype for our seconduser study looks like

35

Chapter 5

Second User Study

51 Course of the User Study

To find out whether our extension brings significantimprovement a second quantitative user study with 20 Hypothesis

Compared to usualdebugging methodsour technique bringssignificantimprovement

participants has been conducted The hypothesis was thatthe debugging speed (in this case the time between the userreading the code the first time and fixing it) compared toa usual debugging engine could significantly be reducedwith our extension If this hypothesis would be supportedby the results of the user study this would indicate that theunderstanding speed of foreign code would be increasedby our method

No user had worked with Brackets or JavaScript before(notwithstanding that all of them had worked with Javabefore) Users were Computer Science students with 4-9(average of 58) years of programming experience Onlythree of them had professional experience (2-4 years) Allusers took part voluntarily and were not rewarded in anyway

To test the hypothesis two searching algorithms (animplementation of Shakersort and an implementation ofBubblesort) have been implemented and a small bug has

36 5 Second User Study

been introduced in either of them (see Appendix A) Thetask of the user was to find both bugs - one time with2 algorithms 1 bug

each 2 tasks debugone time with

Bracketsrsquo tools andone time with our

extension

our extension and one time using the traditional way(that is using the console breakpoints variable viewerweb kit inspector and the step over functionality Bracketsprovides) Each user received an intro to the function-ality of both our extension and the Brackets debuggingengine For both a small dummy script has been shownwith which it has been explained how each part of thefunctionality works until the user confirmed that he or sheunderstood every part of the functionality The task forthe user was to find the bug in the code (users have beentold that there is exactly one bug) fix it and give noticewhen he or she thought that the code is correct (task tocondition assignment and task order were counterbalancedto eliminate side effects)

After the two debugging tasks had been accomplishedeach user filled out two versions of the System UsabilitySystem Usability

Scale to find outwhether users liked

working with thesystem

Scale [Brooke 1996] (SUS) one for our extension and onefor the standard debugging Brackets provides The first10 questions are identical to the initial version that hasbeen presented in 1996 except that the word rdquocumbersomerdquoin point 8 has been changed to rdquoawkwardrdquo as suggestedby [Finstad 2006] For the extension the following ques-tions have been added to the questionnaire

bull I found the additional information very helpful formy task

bull I was distracted by the additional information

bull I found it very easy to follow the control flow of theprogram

bull I found complex control structures hard to under-stand

For the debugging using Brackets the following two ques-tions have been added

bull I found it very easy to follow the control flow of theprogram

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 23: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

33 Results 17

bull Most users liked version 1 better than the other ones

bull Not a single user understood what the iterator is goodfor or what it might do

In more detail Three users (users 1 2 and 4) stated clearlythat they liked version 1 the most User 3 was not quite 3 users liked version

1 the most 1 userliked version 4 themost 1 user couldnot decide betweenversion 1 and 2 and1 user could notdecide betweenversion 1 and 3

sure whether he liked version 1 or version 2 more Hestated that a mix between the two variants where thereis no syntax highlighting would be the best because hegot confused the first time he saw version 1 and did notdirectly see that the code on the right-hand side of thewindow is not actually code but the evaluation User 6 wasnot sure if he preferred version 1 or version 3 He statedthat if there is enough space on the screen version 1 isbetter but with all the other elements of a typical IDE aline of code might oftentimes be too long so that there is noplace to show the line itself as well as its evaluated version(a concern that user 3 communicated as well) Withoutthis problem though user 6 said that version 3 is worsebecause the interface might get messy and distractingquickly because of all the pop-ups right next to the codeIn addition he said with more complex code it might beproblematic to see which pop-up refers to which part ofthe code Finally user 5 stated that he likes version 4 themost The reason was that he did not have to look to theright if he wanted to see the value of a variable and thensearch it there but that in version 5 the information wasexactly at its source He only said that this version mightbe problematic to work with since the reading flow is influ-enced negatively and he feared that he would loose muchspace on the screen and would therefore have to scroll a lot

Each user stated that the main benefit of version 1 and2 was that the actual code remained unaffected Thiscorresponds to what [DeLine et al 2006] reported on theperceptual model a programmer builds of the code When Perceptual model of

the code shouldremain unaffected

navigating through source code a programmer uses aperceptual model to do so This way the programmer doesnot need to read the code (cognitive) to know which part heis seeing but can do so by looking at its rdquovisual landmarksrdquo(perceptual) While version 3 affects this perceptual modelnegatively but does not break it version 4 and 5 have the

18 3 First User Study

disadvantage to do so

To most of the users the basic structure of the interfaceswas clear after only a few seconds Only two of themconfused actual code with the evaluation for a short timeUser 5 when he saw version 5 the first time and user 4when he saw version 1 the first time However neither ofthem needed help or much time to figure how the interfaceworked after they started to read more precisely Eachuserrsquos first question or comment concerned the font andhow awful they (every one) thought it was After a briefexplanation why we chose this particular font each usergot used to it quickly

No user understood the iterator None of them got theconnection to the while upto or for next to the iteratorIterator is not self

explaining After the explanation what the iterator does some userssaid they thought the left and right buttons were rdquoltrdquo andrdquogtrdquo signs

The following tables summarize the pros and contras foreach interface

Version 1Pros Contrasmiddot The control structure (in-dents) is still presentmiddot Because of the indentsand the syntax highlightingthere is a strong analogy be-tween the code and its eval-uated versionmiddot There is a clear separationbetween code and evalua-tionmiddot The editor remains un-changed

middot Needs much spacemiddot User has to look from theleft to the right and viceversa to see two pieces thatare correlated In otherwords a line of code and itsevaluated version are too faraway from each other

33 Results 19

Version 2Pros Contrasmiddot The editor remains un-changedmiddot Code and evaluation do notget confused because they donot look the samemiddot Needs less space than ver-sion 1

middot Less analogy than in ver-sion 1middot User needs to look fromthe left to the right and viceversa This is being compli-cated by the fact that with-out indents and syntax high-lighting it is even more diffi-cult to find the point of inter-estmiddot Needs a lot of space

Version 3Pros ContrasmiddotNeeds minimalistic amountof spacemiddot The code structure itself re-mains untouched

middot Oftentimes not clear whichpop-up refers to which partof the codemiddot Can get messy quickly

Version 4Pros Contrasmiddot Strong analogymiddot Better for lines of code thatare too long for version 1 and2middot Evaluations right next totheir source

middot Code and evaluation getconfused quickly especiallyusual reading might be prob-lematic since one has to skipevery second line dependingon what he wants to readmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

20 3 First User Study

Version 5Pros Contrasmiddot Evaluations right next totheir sourcemiddot Better for lines of code thatare too long for version 1 and2

middot Oftentimes it is not clearwhat a value refers tomiddot Still distracting during codereadingmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

331 Further Comments and Suggestions for Im-provement

One of the users pointed out that in version 3 the iteratorhas been inserted into the code and therefore it now isInterface version 3

can producesyntactically wrong

code

syntactically wrong (for i in 14 becomes for i=2in 14 for the second iteration for example)

Two users suggested that for version 4 and 5 curly bracescould indicate to which part of the code the evaluationrefers

Some users suggested to make the number on the iteratordirectly manipulable so that if one would want to go toSuggestions for

improving the iterator iteration 42 one would not have to press 41 times a buttonbut could just enter 42 In addition some users said theywanted a button for jumping to the first and a button forjumping to the last iteration Two users said it would begood if there is something that indicates directly that theiterator actually shows iterations like the term rdquoIteration rdquoor anything similar

We noticed that the manipulable parameters and theevaluation results have to be distinguishable more easily

33 Results 21

One user proposed that if the cursor moves over somevariable in the evaluation part a small pop-up couldindicate to which variable a value belongs

One user said that it might be helpful that for biggerboolean expressions that fail it would be visible which partof the expression is responsible for the failure

332 Conclusions

With large screens (24rdquo and more) becoming more andmore common in modern work environments of program- Version 1 is the

favored onemers we can override the concerns about loosing too muchspace with version 1 which is the reason why it is the clearfavorite Figure 36 shows the basic interface after the firstuser study

The iterator has been improved in several ways

bull New buttons for jumping to the first and the last iter-ation have been added

bull There is a text field with which the iteration can bemanipulated directly

bull A tooltip now indicates the variable to which avalue belongs when hovering over the value with themouse

bull The label Iteration has been added at the top of theiterator

The way of manipulating the parameters has been changedThe function header now also is copied to the right side A new way of

manipulating theparameters

but the names of the parameters are replaced with a textfield in which the according value can be entered This wayboth the analogy to the code and the separation betweenparameters and evaluation are being enhanced

22 3 First User Study

(a) First iteration

(b) Second iteration

Figure 31 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

33 Results 23

Figure 32 Version 2 The evaluated version of the code ison the right-hand side but is ranged right

Figure 33 Version 3 Small pop-ups indicate the value of avariable

24 3 First User Study

Figure 34 Version 4 The evaluated version of a line isinserted right behind the line itself

33 Results 25

Figure 35 Version 5 Only the values of the variables arein between the lines

26 3 First User Study

Figure 36 Interface after the first user study

27

Chapter 4

Implementation

After the first user study had been finished and the resultshad been analyzed a software prototype with sufficient A software prototype

had to beimplemented

functionality had to be implemented to conduct the seconduser study Hence we wrote an extension for Brackets1 anopen source editor that has recently been released officially

The language we wrote the extension for was JavaScriptWhat we needed was an extension that could calculate thevalues of the variables that had to been substituted

41 First Approach (discharged)

The initial idea was to write an extension that connects viaV8-Node2 (an extension that maintains a connection to a Initial idea Use

V8-Node an Nodejsto get the locals

Nodejs3 V8 debugger) to an external debugger and lets itevaluate the code Our extension was supposed to supplythe actual code to the debugger and set a breakpoint in thefirst line of actual code Then it read the locals (that is thelocal values for all active variables) and stepped over to the

1httpsgithubcomadobebrackets2httpsgithubcomDennisKehrigbrackets-v8-node-

livetreenodeChildProcess3httpnodejsorg

28 4 Implementation

next line After all locals had been collected the extensionshould substitute all necessary values There were prob-lems with this approach Firstly the extension is not stableThis extension was

not usable for a userstudy

enough to be usable in a user study Secondly it is too slowIt needs more than half a second for one evaluation Thisis way too much since we wanted something that couldevaluate the code as the user edits it without any furtherdelay (hence the name rdquoLive Codingrdquo) With this extensionit would only have been possible to update the evaluationusing a special shortcut or to attach this function to anothershortcut like the Ctrl+S shortcut for saving Since thiswas by far not the optimal way we decided to start all overagain

42 Basic Evaluation Technique

In the second approach we did not try to externalize thescript execution but instead made vast use of JavaScriptrsquosbuilt-in eval function The eval function takes a StringSecond approach

using eval toevaluate dynamicallygenerated code thatreturns the value of

interest

as parameter If this String contains valid JavaScript codeeval will evaluate the code and return the last executedstatement eval(var i=1i=i+5) will return 6 forexample The extension basically works as follows Aloop that runs over all DOM-elements in the highlightedcode searches for substitutable variables While searchingfor these elements the loop captures in which contextthe actual DOM-element is The three basic contexts aresimple substitutions substitutions within one loop andsubstitutions within a nested loop (why this is will beexplained below) But there are also other things the loopchecks for For simple assignments like i=x+y x andy have to be substituted while i does not In lines wherea loop begins the iterator has to be appended invariantshave to be treated differently et cetera

Note that right-hand sides of assignments are handledslightly different from what has been presented in theprototypes of the first user study They do not get evalu-ated to the last step and invariants do not get evaluated

43 How Values Are Being Calculated 29

to the penultimate step anymore Both of them get onlyevaluated one step For i=1 and z=2 the right-hand sideof x=i+z does not get evaluated to 3 but to 1+2 Wedecided to do it this way because some users in the firstuser study stated that they did not only want to see whatthe result of an operation is but also how it got there

When a substitutable variable has been found its value willbe computed For that the eval function is used (see sec-tion 43 for details) and the value will be inserted Thiswhole process takes place each time there is a change in thecode or if one of the parameters gets changed If another it- How the extension

reacts to changeseration of a loop that is being shown is selected all valuesinside that loop will be calculated again

43 How Values Are Being Calculated

Depending on the context of the variable a piece ofcode will be generated (and then evaluated) that whenevaluated with eval returns the value of the variable ofinterest For simple substitutions outside of any loops thisworks like follows

Suppose the following code had to be evaluated for x=0

function foo(x)var i=0var j=x+1

Then x in the third line would have to be evaluated To do How the values arecalculated for simplesubstitutions

so the following String would be generatedvar x=0var i=0xSo for simple substitutions the whole code preceding thevariable that has to be substituted is copied and each pa-rameter is initialized as var with its exemplary value at the

30 4 Implementation

beginning

For simple loops this method has to be extended First ithas to be found out how often the loop is executed ThisSubstitutions within

one loop serves two purposes First the iterator now shows thenumber of iterations of the loop Second we have to checkif the user wants to see an iteration that does not exist Thisis being done by introducing a counter that is incrementedin each iteration of the loop and then returned when theloop terminated The following example shows how thevalue of a variable within a loop is being calculated

function test()var i=0var j=0while (ilt5)

i++j=j+2

Suppose we want to obtain the value for j (highlightedin red) in the second iteration of the loop Then theevaluation basically works as follows

var myCounter=0var myResultvar i=0var j=0while (ilt5)

myCounter++i++if(myCounter==2)myResult=j breakj=j+2

myResult

A counter gets inserted to check which iteration the loop is

44 Problems 31

in If it is in the right iteration the assignment myResult=jtakes place so that eval can return the according valueFor a loop within a loop two counters have to be inserted Other casesand both of them have to be checked accordingly Thereare further variants of contexts like the different parts of afor(firstPartsecondPartthirdPart) statementThe first part gets only executed once at the beginningwhile the second part gets evaluated one time more thanthe rest of the loop (the third part does not get evaluated)For all cases the dynamic generation of the code that re-turns the value of interest works similar to the techniquespresented

Obviously this is a brute-force approach For each valuethat has to be calculated all preceding code has to be eval-uated We used this approach anyway since we spent quite Summary Very

inefficient limitedfunctionality stillcompletely sufficientfor the user study

some time developing the first version of the extension (us-ing V8-Node) and were under considerable time pressureThe technique we chose is simple and fast to develop Fur-thermore the prototype only had to work for the seconduser study which is the reason why the limited function-ality (loop nesting depth of 2 for example) was sufficientAnd finally since the tasks in the second user study werenot very expensive with regards to calculation costs eventhis very inefficient version of the extension ran live with-out any delays Some tests showed that it would even havebeen quick enough to operate with hundreds of iterationsmore without any delay and with some few thousand iter-ations more without dispatching the interface

44 Problems

The biggest problem we encountered were infinite loopsBecause the interface runs completely live and tries to The Halting problemevaluate the whole code while the user is editing it infiniteloops may occur while the user is typing Additionallyit might of course be possible that there actually is aninfinite loop even though the user thinks the code iscorrect Without modification the interface crashed in

32 4 Implementation

either of these cases To prevent this from happeningtwo minor modifications had to be done Firstly if wewant to calculate the number of iterations a loop doeswe check whether the counter we use for that calculationis still below a certain limit (we used 999 here becauseit proved to be sufficient and efficient enough to runlive) Now the Halting Problem for this particular loop isapproximated The problem of the infinite loop still existsthough Suppose we have the following code

var x=0var i=0while(x=10)

i++x=i

We now know that the loop does more then 999 iterationsso we do not try to calculate the value for x in the invariantSubstitutions after

infinite loops or for i within the loop for any further iterations The onlyproblem that remains is that we still try to get the value fori in the last line When we try to calculate this value wedo not yet know that there is an infinite loop somewhereabove To find this out a new global counter gets insertedand limited to 5000 iterations

var globalCounter=0var x=0var i=0while(x=10)

if(globalCounter++ gt= 5000)breaki++

i

This way we could keep the incrementation and the checkwhether there are more then 5000 iterations within onestatement (two statements are too many since after a

44 Problems 33

while without curly braces only the first statement wouldbe executed) The incrementation and test get inserted intoevery loop ahead of the position we are checking for Itwould of course have been possible to set a flag if there isan infinite loop somewhere ahead but this way we alsohad the possibility to limit the overall number of iterationsand still get a value The second user study showed that Flaw in the

implementationthis was the wrong approach A user introduced an infiniteloop in a sorting algorithm The condition for the loop wasnever violated although the array was sorted With thetechnique of assigning a value to a variable although thelimitation of iterations has been hit the sorted array wasshown in the return statement Since the user only paidattention to the output he did not see that the number ofiterations of the loop was 999 and thought that the code iscorrect

Another minor problem was that of course while the useris typing the extension tries to evaluate code that is syntac- Suppress errors so

that user does notget disturbed

tically not correct The solution is to set windowonerrorto a function that always returns true This way errorsget just suppressed and the execution stops If there is anew change in the editor the execution will start again andwork exactly the same way so that working code always isexecuted and not working code is not

The last problem concerned hiding code that does notget evaluated For loops this could easily be done If Hiding code within

failed if statementsthe loop is in its very last iteration (the invariant fails)the rest of the loop has to be hidden otherwise it doesnot Finding out if an if statement fails (and hence theaccording code has to be hidden) however was moredifficult because we had not enough information to be ableto do so (remember we only compute the first step whichincludes nothing but substituting all variables) To find outif an if statement holds or not we do the following allvariables in the invariant get substituted and then we leteval evaluate just the invariant For if(xlty) with x=5and y=6 we substitute all values if(5lt6) and then calleval((5lt6))

34 4 Implementation

Figure 41 Screenshot showing Brackets with our extension and a dummy script

Finally Figure 41 shows how the prototype for our seconduser study looks like

35

Chapter 5

Second User Study

51 Course of the User Study

To find out whether our extension brings significantimprovement a second quantitative user study with 20 Hypothesis

Compared to usualdebugging methodsour technique bringssignificantimprovement

participants has been conducted The hypothesis was thatthe debugging speed (in this case the time between the userreading the code the first time and fixing it) compared toa usual debugging engine could significantly be reducedwith our extension If this hypothesis would be supportedby the results of the user study this would indicate that theunderstanding speed of foreign code would be increasedby our method

No user had worked with Brackets or JavaScript before(notwithstanding that all of them had worked with Javabefore) Users were Computer Science students with 4-9(average of 58) years of programming experience Onlythree of them had professional experience (2-4 years) Allusers took part voluntarily and were not rewarded in anyway

To test the hypothesis two searching algorithms (animplementation of Shakersort and an implementation ofBubblesort) have been implemented and a small bug has

36 5 Second User Study

been introduced in either of them (see Appendix A) Thetask of the user was to find both bugs - one time with2 algorithms 1 bug

each 2 tasks debugone time with

Bracketsrsquo tools andone time with our

extension

our extension and one time using the traditional way(that is using the console breakpoints variable viewerweb kit inspector and the step over functionality Bracketsprovides) Each user received an intro to the function-ality of both our extension and the Brackets debuggingengine For both a small dummy script has been shownwith which it has been explained how each part of thefunctionality works until the user confirmed that he or sheunderstood every part of the functionality The task forthe user was to find the bug in the code (users have beentold that there is exactly one bug) fix it and give noticewhen he or she thought that the code is correct (task tocondition assignment and task order were counterbalancedto eliminate side effects)

After the two debugging tasks had been accomplishedeach user filled out two versions of the System UsabilitySystem Usability

Scale to find outwhether users liked

working with thesystem

Scale [Brooke 1996] (SUS) one for our extension and onefor the standard debugging Brackets provides The first10 questions are identical to the initial version that hasbeen presented in 1996 except that the word rdquocumbersomerdquoin point 8 has been changed to rdquoawkwardrdquo as suggestedby [Finstad 2006] For the extension the following ques-tions have been added to the questionnaire

bull I found the additional information very helpful formy task

bull I was distracted by the additional information

bull I found it very easy to follow the control flow of theprogram

bull I found complex control structures hard to under-stand

For the debugging using Brackets the following two ques-tions have been added

bull I found it very easy to follow the control flow of theprogram

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 24: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

18 3 First User Study

disadvantage to do so

To most of the users the basic structure of the interfaceswas clear after only a few seconds Only two of themconfused actual code with the evaluation for a short timeUser 5 when he saw version 5 the first time and user 4when he saw version 1 the first time However neither ofthem needed help or much time to figure how the interfaceworked after they started to read more precisely Eachuserrsquos first question or comment concerned the font andhow awful they (every one) thought it was After a briefexplanation why we chose this particular font each usergot used to it quickly

No user understood the iterator None of them got theconnection to the while upto or for next to the iteratorIterator is not self

explaining After the explanation what the iterator does some userssaid they thought the left and right buttons were rdquoltrdquo andrdquogtrdquo signs

The following tables summarize the pros and contras foreach interface

Version 1Pros Contrasmiddot The control structure (in-dents) is still presentmiddot Because of the indentsand the syntax highlightingthere is a strong analogy be-tween the code and its eval-uated versionmiddot There is a clear separationbetween code and evalua-tionmiddot The editor remains un-changed

middot Needs much spacemiddot User has to look from theleft to the right and viceversa to see two pieces thatare correlated In otherwords a line of code and itsevaluated version are too faraway from each other

33 Results 19

Version 2Pros Contrasmiddot The editor remains un-changedmiddot Code and evaluation do notget confused because they donot look the samemiddot Needs less space than ver-sion 1

middot Less analogy than in ver-sion 1middot User needs to look fromthe left to the right and viceversa This is being compli-cated by the fact that with-out indents and syntax high-lighting it is even more diffi-cult to find the point of inter-estmiddot Needs a lot of space

Version 3Pros ContrasmiddotNeeds minimalistic amountof spacemiddot The code structure itself re-mains untouched

middot Oftentimes not clear whichpop-up refers to which partof the codemiddot Can get messy quickly

Version 4Pros Contrasmiddot Strong analogymiddot Better for lines of code thatare too long for version 1 and2middot Evaluations right next totheir source

middot Code and evaluation getconfused quickly especiallyusual reading might be prob-lematic since one has to skipevery second line dependingon what he wants to readmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

20 3 First User Study

Version 5Pros Contrasmiddot Evaluations right next totheir sourcemiddot Better for lines of code thatare too long for version 1 and2

middot Oftentimes it is not clearwhat a value refers tomiddot Still distracting during codereadingmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

331 Further Comments and Suggestions for Im-provement

One of the users pointed out that in version 3 the iteratorhas been inserted into the code and therefore it now isInterface version 3

can producesyntactically wrong

code

syntactically wrong (for i in 14 becomes for i=2in 14 for the second iteration for example)

Two users suggested that for version 4 and 5 curly bracescould indicate to which part of the code the evaluationrefers

Some users suggested to make the number on the iteratordirectly manipulable so that if one would want to go toSuggestions for

improving the iterator iteration 42 one would not have to press 41 times a buttonbut could just enter 42 In addition some users said theywanted a button for jumping to the first and a button forjumping to the last iteration Two users said it would begood if there is something that indicates directly that theiterator actually shows iterations like the term rdquoIteration rdquoor anything similar

We noticed that the manipulable parameters and theevaluation results have to be distinguishable more easily

33 Results 21

One user proposed that if the cursor moves over somevariable in the evaluation part a small pop-up couldindicate to which variable a value belongs

One user said that it might be helpful that for biggerboolean expressions that fail it would be visible which partof the expression is responsible for the failure

332 Conclusions

With large screens (24rdquo and more) becoming more andmore common in modern work environments of program- Version 1 is the

favored onemers we can override the concerns about loosing too muchspace with version 1 which is the reason why it is the clearfavorite Figure 36 shows the basic interface after the firstuser study

The iterator has been improved in several ways

bull New buttons for jumping to the first and the last iter-ation have been added

bull There is a text field with which the iteration can bemanipulated directly

bull A tooltip now indicates the variable to which avalue belongs when hovering over the value with themouse

bull The label Iteration has been added at the top of theiterator

The way of manipulating the parameters has been changedThe function header now also is copied to the right side A new way of

manipulating theparameters

but the names of the parameters are replaced with a textfield in which the according value can be entered This wayboth the analogy to the code and the separation betweenparameters and evaluation are being enhanced

22 3 First User Study

(a) First iteration

(b) Second iteration

Figure 31 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

33 Results 23

Figure 32 Version 2 The evaluated version of the code ison the right-hand side but is ranged right

Figure 33 Version 3 Small pop-ups indicate the value of avariable

24 3 First User Study

Figure 34 Version 4 The evaluated version of a line isinserted right behind the line itself

33 Results 25

Figure 35 Version 5 Only the values of the variables arein between the lines

26 3 First User Study

Figure 36 Interface after the first user study

27

Chapter 4

Implementation

After the first user study had been finished and the resultshad been analyzed a software prototype with sufficient A software prototype

had to beimplemented

functionality had to be implemented to conduct the seconduser study Hence we wrote an extension for Brackets1 anopen source editor that has recently been released officially

The language we wrote the extension for was JavaScriptWhat we needed was an extension that could calculate thevalues of the variables that had to been substituted

41 First Approach (discharged)

The initial idea was to write an extension that connects viaV8-Node2 (an extension that maintains a connection to a Initial idea Use

V8-Node an Nodejsto get the locals

Nodejs3 V8 debugger) to an external debugger and lets itevaluate the code Our extension was supposed to supplythe actual code to the debugger and set a breakpoint in thefirst line of actual code Then it read the locals (that is thelocal values for all active variables) and stepped over to the

1httpsgithubcomadobebrackets2httpsgithubcomDennisKehrigbrackets-v8-node-

livetreenodeChildProcess3httpnodejsorg

28 4 Implementation

next line After all locals had been collected the extensionshould substitute all necessary values There were prob-lems with this approach Firstly the extension is not stableThis extension was

not usable for a userstudy

enough to be usable in a user study Secondly it is too slowIt needs more than half a second for one evaluation Thisis way too much since we wanted something that couldevaluate the code as the user edits it without any furtherdelay (hence the name rdquoLive Codingrdquo) With this extensionit would only have been possible to update the evaluationusing a special shortcut or to attach this function to anothershortcut like the Ctrl+S shortcut for saving Since thiswas by far not the optimal way we decided to start all overagain

42 Basic Evaluation Technique

In the second approach we did not try to externalize thescript execution but instead made vast use of JavaScriptrsquosbuilt-in eval function The eval function takes a StringSecond approach

using eval toevaluate dynamicallygenerated code thatreturns the value of

interest

as parameter If this String contains valid JavaScript codeeval will evaluate the code and return the last executedstatement eval(var i=1i=i+5) will return 6 forexample The extension basically works as follows Aloop that runs over all DOM-elements in the highlightedcode searches for substitutable variables While searchingfor these elements the loop captures in which contextthe actual DOM-element is The three basic contexts aresimple substitutions substitutions within one loop andsubstitutions within a nested loop (why this is will beexplained below) But there are also other things the loopchecks for For simple assignments like i=x+y x andy have to be substituted while i does not In lines wherea loop begins the iterator has to be appended invariantshave to be treated differently et cetera

Note that right-hand sides of assignments are handledslightly different from what has been presented in theprototypes of the first user study They do not get evalu-ated to the last step and invariants do not get evaluated

43 How Values Are Being Calculated 29

to the penultimate step anymore Both of them get onlyevaluated one step For i=1 and z=2 the right-hand sideof x=i+z does not get evaluated to 3 but to 1+2 Wedecided to do it this way because some users in the firstuser study stated that they did not only want to see whatthe result of an operation is but also how it got there

When a substitutable variable has been found its value willbe computed For that the eval function is used (see sec-tion 43 for details) and the value will be inserted Thiswhole process takes place each time there is a change in thecode or if one of the parameters gets changed If another it- How the extension

reacts to changeseration of a loop that is being shown is selected all valuesinside that loop will be calculated again

43 How Values Are Being Calculated

Depending on the context of the variable a piece ofcode will be generated (and then evaluated) that whenevaluated with eval returns the value of the variable ofinterest For simple substitutions outside of any loops thisworks like follows

Suppose the following code had to be evaluated for x=0

function foo(x)var i=0var j=x+1

Then x in the third line would have to be evaluated To do How the values arecalculated for simplesubstitutions

so the following String would be generatedvar x=0var i=0xSo for simple substitutions the whole code preceding thevariable that has to be substituted is copied and each pa-rameter is initialized as var with its exemplary value at the

30 4 Implementation

beginning

For simple loops this method has to be extended First ithas to be found out how often the loop is executed ThisSubstitutions within

one loop serves two purposes First the iterator now shows thenumber of iterations of the loop Second we have to checkif the user wants to see an iteration that does not exist Thisis being done by introducing a counter that is incrementedin each iteration of the loop and then returned when theloop terminated The following example shows how thevalue of a variable within a loop is being calculated

function test()var i=0var j=0while (ilt5)

i++j=j+2

Suppose we want to obtain the value for j (highlightedin red) in the second iteration of the loop Then theevaluation basically works as follows

var myCounter=0var myResultvar i=0var j=0while (ilt5)

myCounter++i++if(myCounter==2)myResult=j breakj=j+2

myResult

A counter gets inserted to check which iteration the loop is

44 Problems 31

in If it is in the right iteration the assignment myResult=jtakes place so that eval can return the according valueFor a loop within a loop two counters have to be inserted Other casesand both of them have to be checked accordingly Thereare further variants of contexts like the different parts of afor(firstPartsecondPartthirdPart) statementThe first part gets only executed once at the beginningwhile the second part gets evaluated one time more thanthe rest of the loop (the third part does not get evaluated)For all cases the dynamic generation of the code that re-turns the value of interest works similar to the techniquespresented

Obviously this is a brute-force approach For each valuethat has to be calculated all preceding code has to be eval-uated We used this approach anyway since we spent quite Summary Very

inefficient limitedfunctionality stillcompletely sufficientfor the user study

some time developing the first version of the extension (us-ing V8-Node) and were under considerable time pressureThe technique we chose is simple and fast to develop Fur-thermore the prototype only had to work for the seconduser study which is the reason why the limited function-ality (loop nesting depth of 2 for example) was sufficientAnd finally since the tasks in the second user study werenot very expensive with regards to calculation costs eventhis very inefficient version of the extension ran live with-out any delays Some tests showed that it would even havebeen quick enough to operate with hundreds of iterationsmore without any delay and with some few thousand iter-ations more without dispatching the interface

44 Problems

The biggest problem we encountered were infinite loopsBecause the interface runs completely live and tries to The Halting problemevaluate the whole code while the user is editing it infiniteloops may occur while the user is typing Additionallyit might of course be possible that there actually is aninfinite loop even though the user thinks the code iscorrect Without modification the interface crashed in

32 4 Implementation

either of these cases To prevent this from happeningtwo minor modifications had to be done Firstly if wewant to calculate the number of iterations a loop doeswe check whether the counter we use for that calculationis still below a certain limit (we used 999 here becauseit proved to be sufficient and efficient enough to runlive) Now the Halting Problem for this particular loop isapproximated The problem of the infinite loop still existsthough Suppose we have the following code

var x=0var i=0while(x=10)

i++x=i

We now know that the loop does more then 999 iterationsso we do not try to calculate the value for x in the invariantSubstitutions after

infinite loops or for i within the loop for any further iterations The onlyproblem that remains is that we still try to get the value fori in the last line When we try to calculate this value wedo not yet know that there is an infinite loop somewhereabove To find this out a new global counter gets insertedand limited to 5000 iterations

var globalCounter=0var x=0var i=0while(x=10)

if(globalCounter++ gt= 5000)breaki++

i

This way we could keep the incrementation and the checkwhether there are more then 5000 iterations within onestatement (two statements are too many since after a

44 Problems 33

while without curly braces only the first statement wouldbe executed) The incrementation and test get inserted intoevery loop ahead of the position we are checking for Itwould of course have been possible to set a flag if there isan infinite loop somewhere ahead but this way we alsohad the possibility to limit the overall number of iterationsand still get a value The second user study showed that Flaw in the

implementationthis was the wrong approach A user introduced an infiniteloop in a sorting algorithm The condition for the loop wasnever violated although the array was sorted With thetechnique of assigning a value to a variable although thelimitation of iterations has been hit the sorted array wasshown in the return statement Since the user only paidattention to the output he did not see that the number ofiterations of the loop was 999 and thought that the code iscorrect

Another minor problem was that of course while the useris typing the extension tries to evaluate code that is syntac- Suppress errors so

that user does notget disturbed

tically not correct The solution is to set windowonerrorto a function that always returns true This way errorsget just suppressed and the execution stops If there is anew change in the editor the execution will start again andwork exactly the same way so that working code always isexecuted and not working code is not

The last problem concerned hiding code that does notget evaluated For loops this could easily be done If Hiding code within

failed if statementsthe loop is in its very last iteration (the invariant fails)the rest of the loop has to be hidden otherwise it doesnot Finding out if an if statement fails (and hence theaccording code has to be hidden) however was moredifficult because we had not enough information to be ableto do so (remember we only compute the first step whichincludes nothing but substituting all variables) To find outif an if statement holds or not we do the following allvariables in the invariant get substituted and then we leteval evaluate just the invariant For if(xlty) with x=5and y=6 we substitute all values if(5lt6) and then calleval((5lt6))

34 4 Implementation

Figure 41 Screenshot showing Brackets with our extension and a dummy script

Finally Figure 41 shows how the prototype for our seconduser study looks like

35

Chapter 5

Second User Study

51 Course of the User Study

To find out whether our extension brings significantimprovement a second quantitative user study with 20 Hypothesis

Compared to usualdebugging methodsour technique bringssignificantimprovement

participants has been conducted The hypothesis was thatthe debugging speed (in this case the time between the userreading the code the first time and fixing it) compared toa usual debugging engine could significantly be reducedwith our extension If this hypothesis would be supportedby the results of the user study this would indicate that theunderstanding speed of foreign code would be increasedby our method

No user had worked with Brackets or JavaScript before(notwithstanding that all of them had worked with Javabefore) Users were Computer Science students with 4-9(average of 58) years of programming experience Onlythree of them had professional experience (2-4 years) Allusers took part voluntarily and were not rewarded in anyway

To test the hypothesis two searching algorithms (animplementation of Shakersort and an implementation ofBubblesort) have been implemented and a small bug has

36 5 Second User Study

been introduced in either of them (see Appendix A) Thetask of the user was to find both bugs - one time with2 algorithms 1 bug

each 2 tasks debugone time with

Bracketsrsquo tools andone time with our

extension

our extension and one time using the traditional way(that is using the console breakpoints variable viewerweb kit inspector and the step over functionality Bracketsprovides) Each user received an intro to the function-ality of both our extension and the Brackets debuggingengine For both a small dummy script has been shownwith which it has been explained how each part of thefunctionality works until the user confirmed that he or sheunderstood every part of the functionality The task forthe user was to find the bug in the code (users have beentold that there is exactly one bug) fix it and give noticewhen he or she thought that the code is correct (task tocondition assignment and task order were counterbalancedto eliminate side effects)

After the two debugging tasks had been accomplishedeach user filled out two versions of the System UsabilitySystem Usability

Scale to find outwhether users liked

working with thesystem

Scale [Brooke 1996] (SUS) one for our extension and onefor the standard debugging Brackets provides The first10 questions are identical to the initial version that hasbeen presented in 1996 except that the word rdquocumbersomerdquoin point 8 has been changed to rdquoawkwardrdquo as suggestedby [Finstad 2006] For the extension the following ques-tions have been added to the questionnaire

bull I found the additional information very helpful formy task

bull I was distracted by the additional information

bull I found it very easy to follow the control flow of theprogram

bull I found complex control structures hard to under-stand

For the debugging using Brackets the following two ques-tions have been added

bull I found it very easy to follow the control flow of theprogram

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 25: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

33 Results 19

Version 2Pros Contrasmiddot The editor remains un-changedmiddot Code and evaluation do notget confused because they donot look the samemiddot Needs less space than ver-sion 1

middot Less analogy than in ver-sion 1middot User needs to look fromthe left to the right and viceversa This is being compli-cated by the fact that with-out indents and syntax high-lighting it is even more diffi-cult to find the point of inter-estmiddot Needs a lot of space

Version 3Pros ContrasmiddotNeeds minimalistic amountof spacemiddot The code structure itself re-mains untouched

middot Oftentimes not clear whichpop-up refers to which partof the codemiddot Can get messy quickly

Version 4Pros Contrasmiddot Strong analogymiddot Better for lines of code thatare too long for version 1 and2middot Evaluations right next totheir source

middot Code and evaluation getconfused quickly especiallyusual reading might be prob-lematic since one has to skipevery second line dependingon what he wants to readmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

20 3 First User Study

Version 5Pros Contrasmiddot Evaluations right next totheir sourcemiddot Better for lines of code thatare too long for version 1 and2

middot Oftentimes it is not clearwhat a value refers tomiddot Still distracting during codereadingmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

331 Further Comments and Suggestions for Im-provement

One of the users pointed out that in version 3 the iteratorhas been inserted into the code and therefore it now isInterface version 3

can producesyntactically wrong

code

syntactically wrong (for i in 14 becomes for i=2in 14 for the second iteration for example)

Two users suggested that for version 4 and 5 curly bracescould indicate to which part of the code the evaluationrefers

Some users suggested to make the number on the iteratordirectly manipulable so that if one would want to go toSuggestions for

improving the iterator iteration 42 one would not have to press 41 times a buttonbut could just enter 42 In addition some users said theywanted a button for jumping to the first and a button forjumping to the last iteration Two users said it would begood if there is something that indicates directly that theiterator actually shows iterations like the term rdquoIteration rdquoor anything similar

We noticed that the manipulable parameters and theevaluation results have to be distinguishable more easily

33 Results 21

One user proposed that if the cursor moves over somevariable in the evaluation part a small pop-up couldindicate to which variable a value belongs

One user said that it might be helpful that for biggerboolean expressions that fail it would be visible which partof the expression is responsible for the failure

332 Conclusions

With large screens (24rdquo and more) becoming more andmore common in modern work environments of program- Version 1 is the

favored onemers we can override the concerns about loosing too muchspace with version 1 which is the reason why it is the clearfavorite Figure 36 shows the basic interface after the firstuser study

The iterator has been improved in several ways

bull New buttons for jumping to the first and the last iter-ation have been added

bull There is a text field with which the iteration can bemanipulated directly

bull A tooltip now indicates the variable to which avalue belongs when hovering over the value with themouse

bull The label Iteration has been added at the top of theiterator

The way of manipulating the parameters has been changedThe function header now also is copied to the right side A new way of

manipulating theparameters

but the names of the parameters are replaced with a textfield in which the according value can be entered This wayboth the analogy to the code and the separation betweenparameters and evaluation are being enhanced

22 3 First User Study

(a) First iteration

(b) Second iteration

Figure 31 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

33 Results 23

Figure 32 Version 2 The evaluated version of the code ison the right-hand side but is ranged right

Figure 33 Version 3 Small pop-ups indicate the value of avariable

24 3 First User Study

Figure 34 Version 4 The evaluated version of a line isinserted right behind the line itself

33 Results 25

Figure 35 Version 5 Only the values of the variables arein between the lines

26 3 First User Study

Figure 36 Interface after the first user study

27

Chapter 4

Implementation

After the first user study had been finished and the resultshad been analyzed a software prototype with sufficient A software prototype

had to beimplemented

functionality had to be implemented to conduct the seconduser study Hence we wrote an extension for Brackets1 anopen source editor that has recently been released officially

The language we wrote the extension for was JavaScriptWhat we needed was an extension that could calculate thevalues of the variables that had to been substituted

41 First Approach (discharged)

The initial idea was to write an extension that connects viaV8-Node2 (an extension that maintains a connection to a Initial idea Use

V8-Node an Nodejsto get the locals

Nodejs3 V8 debugger) to an external debugger and lets itevaluate the code Our extension was supposed to supplythe actual code to the debugger and set a breakpoint in thefirst line of actual code Then it read the locals (that is thelocal values for all active variables) and stepped over to the

1httpsgithubcomadobebrackets2httpsgithubcomDennisKehrigbrackets-v8-node-

livetreenodeChildProcess3httpnodejsorg

28 4 Implementation

next line After all locals had been collected the extensionshould substitute all necessary values There were prob-lems with this approach Firstly the extension is not stableThis extension was

not usable for a userstudy

enough to be usable in a user study Secondly it is too slowIt needs more than half a second for one evaluation Thisis way too much since we wanted something that couldevaluate the code as the user edits it without any furtherdelay (hence the name rdquoLive Codingrdquo) With this extensionit would only have been possible to update the evaluationusing a special shortcut or to attach this function to anothershortcut like the Ctrl+S shortcut for saving Since thiswas by far not the optimal way we decided to start all overagain

42 Basic Evaluation Technique

In the second approach we did not try to externalize thescript execution but instead made vast use of JavaScriptrsquosbuilt-in eval function The eval function takes a StringSecond approach

using eval toevaluate dynamicallygenerated code thatreturns the value of

interest

as parameter If this String contains valid JavaScript codeeval will evaluate the code and return the last executedstatement eval(var i=1i=i+5) will return 6 forexample The extension basically works as follows Aloop that runs over all DOM-elements in the highlightedcode searches for substitutable variables While searchingfor these elements the loop captures in which contextthe actual DOM-element is The three basic contexts aresimple substitutions substitutions within one loop andsubstitutions within a nested loop (why this is will beexplained below) But there are also other things the loopchecks for For simple assignments like i=x+y x andy have to be substituted while i does not In lines wherea loop begins the iterator has to be appended invariantshave to be treated differently et cetera

Note that right-hand sides of assignments are handledslightly different from what has been presented in theprototypes of the first user study They do not get evalu-ated to the last step and invariants do not get evaluated

43 How Values Are Being Calculated 29

to the penultimate step anymore Both of them get onlyevaluated one step For i=1 and z=2 the right-hand sideof x=i+z does not get evaluated to 3 but to 1+2 Wedecided to do it this way because some users in the firstuser study stated that they did not only want to see whatthe result of an operation is but also how it got there

When a substitutable variable has been found its value willbe computed For that the eval function is used (see sec-tion 43 for details) and the value will be inserted Thiswhole process takes place each time there is a change in thecode or if one of the parameters gets changed If another it- How the extension

reacts to changeseration of a loop that is being shown is selected all valuesinside that loop will be calculated again

43 How Values Are Being Calculated

Depending on the context of the variable a piece ofcode will be generated (and then evaluated) that whenevaluated with eval returns the value of the variable ofinterest For simple substitutions outside of any loops thisworks like follows

Suppose the following code had to be evaluated for x=0

function foo(x)var i=0var j=x+1

Then x in the third line would have to be evaluated To do How the values arecalculated for simplesubstitutions

so the following String would be generatedvar x=0var i=0xSo for simple substitutions the whole code preceding thevariable that has to be substituted is copied and each pa-rameter is initialized as var with its exemplary value at the

30 4 Implementation

beginning

For simple loops this method has to be extended First ithas to be found out how often the loop is executed ThisSubstitutions within

one loop serves two purposes First the iterator now shows thenumber of iterations of the loop Second we have to checkif the user wants to see an iteration that does not exist Thisis being done by introducing a counter that is incrementedin each iteration of the loop and then returned when theloop terminated The following example shows how thevalue of a variable within a loop is being calculated

function test()var i=0var j=0while (ilt5)

i++j=j+2

Suppose we want to obtain the value for j (highlightedin red) in the second iteration of the loop Then theevaluation basically works as follows

var myCounter=0var myResultvar i=0var j=0while (ilt5)

myCounter++i++if(myCounter==2)myResult=j breakj=j+2

myResult

A counter gets inserted to check which iteration the loop is

44 Problems 31

in If it is in the right iteration the assignment myResult=jtakes place so that eval can return the according valueFor a loop within a loop two counters have to be inserted Other casesand both of them have to be checked accordingly Thereare further variants of contexts like the different parts of afor(firstPartsecondPartthirdPart) statementThe first part gets only executed once at the beginningwhile the second part gets evaluated one time more thanthe rest of the loop (the third part does not get evaluated)For all cases the dynamic generation of the code that re-turns the value of interest works similar to the techniquespresented

Obviously this is a brute-force approach For each valuethat has to be calculated all preceding code has to be eval-uated We used this approach anyway since we spent quite Summary Very

inefficient limitedfunctionality stillcompletely sufficientfor the user study

some time developing the first version of the extension (us-ing V8-Node) and were under considerable time pressureThe technique we chose is simple and fast to develop Fur-thermore the prototype only had to work for the seconduser study which is the reason why the limited function-ality (loop nesting depth of 2 for example) was sufficientAnd finally since the tasks in the second user study werenot very expensive with regards to calculation costs eventhis very inefficient version of the extension ran live with-out any delays Some tests showed that it would even havebeen quick enough to operate with hundreds of iterationsmore without any delay and with some few thousand iter-ations more without dispatching the interface

44 Problems

The biggest problem we encountered were infinite loopsBecause the interface runs completely live and tries to The Halting problemevaluate the whole code while the user is editing it infiniteloops may occur while the user is typing Additionallyit might of course be possible that there actually is aninfinite loop even though the user thinks the code iscorrect Without modification the interface crashed in

32 4 Implementation

either of these cases To prevent this from happeningtwo minor modifications had to be done Firstly if wewant to calculate the number of iterations a loop doeswe check whether the counter we use for that calculationis still below a certain limit (we used 999 here becauseit proved to be sufficient and efficient enough to runlive) Now the Halting Problem for this particular loop isapproximated The problem of the infinite loop still existsthough Suppose we have the following code

var x=0var i=0while(x=10)

i++x=i

We now know that the loop does more then 999 iterationsso we do not try to calculate the value for x in the invariantSubstitutions after

infinite loops or for i within the loop for any further iterations The onlyproblem that remains is that we still try to get the value fori in the last line When we try to calculate this value wedo not yet know that there is an infinite loop somewhereabove To find this out a new global counter gets insertedand limited to 5000 iterations

var globalCounter=0var x=0var i=0while(x=10)

if(globalCounter++ gt= 5000)breaki++

i

This way we could keep the incrementation and the checkwhether there are more then 5000 iterations within onestatement (two statements are too many since after a

44 Problems 33

while without curly braces only the first statement wouldbe executed) The incrementation and test get inserted intoevery loop ahead of the position we are checking for Itwould of course have been possible to set a flag if there isan infinite loop somewhere ahead but this way we alsohad the possibility to limit the overall number of iterationsand still get a value The second user study showed that Flaw in the

implementationthis was the wrong approach A user introduced an infiniteloop in a sorting algorithm The condition for the loop wasnever violated although the array was sorted With thetechnique of assigning a value to a variable although thelimitation of iterations has been hit the sorted array wasshown in the return statement Since the user only paidattention to the output he did not see that the number ofiterations of the loop was 999 and thought that the code iscorrect

Another minor problem was that of course while the useris typing the extension tries to evaluate code that is syntac- Suppress errors so

that user does notget disturbed

tically not correct The solution is to set windowonerrorto a function that always returns true This way errorsget just suppressed and the execution stops If there is anew change in the editor the execution will start again andwork exactly the same way so that working code always isexecuted and not working code is not

The last problem concerned hiding code that does notget evaluated For loops this could easily be done If Hiding code within

failed if statementsthe loop is in its very last iteration (the invariant fails)the rest of the loop has to be hidden otherwise it doesnot Finding out if an if statement fails (and hence theaccording code has to be hidden) however was moredifficult because we had not enough information to be ableto do so (remember we only compute the first step whichincludes nothing but substituting all variables) To find outif an if statement holds or not we do the following allvariables in the invariant get substituted and then we leteval evaluate just the invariant For if(xlty) with x=5and y=6 we substitute all values if(5lt6) and then calleval((5lt6))

34 4 Implementation

Figure 41 Screenshot showing Brackets with our extension and a dummy script

Finally Figure 41 shows how the prototype for our seconduser study looks like

35

Chapter 5

Second User Study

51 Course of the User Study

To find out whether our extension brings significantimprovement a second quantitative user study with 20 Hypothesis

Compared to usualdebugging methodsour technique bringssignificantimprovement

participants has been conducted The hypothesis was thatthe debugging speed (in this case the time between the userreading the code the first time and fixing it) compared toa usual debugging engine could significantly be reducedwith our extension If this hypothesis would be supportedby the results of the user study this would indicate that theunderstanding speed of foreign code would be increasedby our method

No user had worked with Brackets or JavaScript before(notwithstanding that all of them had worked with Javabefore) Users were Computer Science students with 4-9(average of 58) years of programming experience Onlythree of them had professional experience (2-4 years) Allusers took part voluntarily and were not rewarded in anyway

To test the hypothesis two searching algorithms (animplementation of Shakersort and an implementation ofBubblesort) have been implemented and a small bug has

36 5 Second User Study

been introduced in either of them (see Appendix A) Thetask of the user was to find both bugs - one time with2 algorithms 1 bug

each 2 tasks debugone time with

Bracketsrsquo tools andone time with our

extension

our extension and one time using the traditional way(that is using the console breakpoints variable viewerweb kit inspector and the step over functionality Bracketsprovides) Each user received an intro to the function-ality of both our extension and the Brackets debuggingengine For both a small dummy script has been shownwith which it has been explained how each part of thefunctionality works until the user confirmed that he or sheunderstood every part of the functionality The task forthe user was to find the bug in the code (users have beentold that there is exactly one bug) fix it and give noticewhen he or she thought that the code is correct (task tocondition assignment and task order were counterbalancedto eliminate side effects)

After the two debugging tasks had been accomplishedeach user filled out two versions of the System UsabilitySystem Usability

Scale to find outwhether users liked

working with thesystem

Scale [Brooke 1996] (SUS) one for our extension and onefor the standard debugging Brackets provides The first10 questions are identical to the initial version that hasbeen presented in 1996 except that the word rdquocumbersomerdquoin point 8 has been changed to rdquoawkwardrdquo as suggestedby [Finstad 2006] For the extension the following ques-tions have been added to the questionnaire

bull I found the additional information very helpful formy task

bull I was distracted by the additional information

bull I found it very easy to follow the control flow of theprogram

bull I found complex control structures hard to under-stand

For the debugging using Brackets the following two ques-tions have been added

bull I found it very easy to follow the control flow of theprogram

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 26: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

20 3 First User Study

Version 5Pros Contrasmiddot Evaluations right next totheir sourcemiddot Better for lines of code thatare too long for version 1 and2

middot Oftentimes it is not clearwhat a value refers tomiddot Still distracting during codereadingmiddot Perceptual model gets de-stroyedmiddot Might be demanding tonavigate when already longdocuments get double asmuch lines

331 Further Comments and Suggestions for Im-provement

One of the users pointed out that in version 3 the iteratorhas been inserted into the code and therefore it now isInterface version 3

can producesyntactically wrong

code

syntactically wrong (for i in 14 becomes for i=2in 14 for the second iteration for example)

Two users suggested that for version 4 and 5 curly bracescould indicate to which part of the code the evaluationrefers

Some users suggested to make the number on the iteratordirectly manipulable so that if one would want to go toSuggestions for

improving the iterator iteration 42 one would not have to press 41 times a buttonbut could just enter 42 In addition some users said theywanted a button for jumping to the first and a button forjumping to the last iteration Two users said it would begood if there is something that indicates directly that theiterator actually shows iterations like the term rdquoIteration rdquoor anything similar

We noticed that the manipulable parameters and theevaluation results have to be distinguishable more easily

33 Results 21

One user proposed that if the cursor moves over somevariable in the evaluation part a small pop-up couldindicate to which variable a value belongs

One user said that it might be helpful that for biggerboolean expressions that fail it would be visible which partof the expression is responsible for the failure

332 Conclusions

With large screens (24rdquo and more) becoming more andmore common in modern work environments of program- Version 1 is the

favored onemers we can override the concerns about loosing too muchspace with version 1 which is the reason why it is the clearfavorite Figure 36 shows the basic interface after the firstuser study

The iterator has been improved in several ways

bull New buttons for jumping to the first and the last iter-ation have been added

bull There is a text field with which the iteration can bemanipulated directly

bull A tooltip now indicates the variable to which avalue belongs when hovering over the value with themouse

bull The label Iteration has been added at the top of theiterator

The way of manipulating the parameters has been changedThe function header now also is copied to the right side A new way of

manipulating theparameters

but the names of the parameters are replaced with a textfield in which the according value can be entered This wayboth the analogy to the code and the separation betweenparameters and evaluation are being enhanced

22 3 First User Study

(a) First iteration

(b) Second iteration

Figure 31 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

33 Results 23

Figure 32 Version 2 The evaluated version of the code ison the right-hand side but is ranged right

Figure 33 Version 3 Small pop-ups indicate the value of avariable

24 3 First User Study

Figure 34 Version 4 The evaluated version of a line isinserted right behind the line itself

33 Results 25

Figure 35 Version 5 Only the values of the variables arein between the lines

26 3 First User Study

Figure 36 Interface after the first user study

27

Chapter 4

Implementation

After the first user study had been finished and the resultshad been analyzed a software prototype with sufficient A software prototype

had to beimplemented

functionality had to be implemented to conduct the seconduser study Hence we wrote an extension for Brackets1 anopen source editor that has recently been released officially

The language we wrote the extension for was JavaScriptWhat we needed was an extension that could calculate thevalues of the variables that had to been substituted

41 First Approach (discharged)

The initial idea was to write an extension that connects viaV8-Node2 (an extension that maintains a connection to a Initial idea Use

V8-Node an Nodejsto get the locals

Nodejs3 V8 debugger) to an external debugger and lets itevaluate the code Our extension was supposed to supplythe actual code to the debugger and set a breakpoint in thefirst line of actual code Then it read the locals (that is thelocal values for all active variables) and stepped over to the

1httpsgithubcomadobebrackets2httpsgithubcomDennisKehrigbrackets-v8-node-

livetreenodeChildProcess3httpnodejsorg

28 4 Implementation

next line After all locals had been collected the extensionshould substitute all necessary values There were prob-lems with this approach Firstly the extension is not stableThis extension was

not usable for a userstudy

enough to be usable in a user study Secondly it is too slowIt needs more than half a second for one evaluation Thisis way too much since we wanted something that couldevaluate the code as the user edits it without any furtherdelay (hence the name rdquoLive Codingrdquo) With this extensionit would only have been possible to update the evaluationusing a special shortcut or to attach this function to anothershortcut like the Ctrl+S shortcut for saving Since thiswas by far not the optimal way we decided to start all overagain

42 Basic Evaluation Technique

In the second approach we did not try to externalize thescript execution but instead made vast use of JavaScriptrsquosbuilt-in eval function The eval function takes a StringSecond approach

using eval toevaluate dynamicallygenerated code thatreturns the value of

interest

as parameter If this String contains valid JavaScript codeeval will evaluate the code and return the last executedstatement eval(var i=1i=i+5) will return 6 forexample The extension basically works as follows Aloop that runs over all DOM-elements in the highlightedcode searches for substitutable variables While searchingfor these elements the loop captures in which contextthe actual DOM-element is The three basic contexts aresimple substitutions substitutions within one loop andsubstitutions within a nested loop (why this is will beexplained below) But there are also other things the loopchecks for For simple assignments like i=x+y x andy have to be substituted while i does not In lines wherea loop begins the iterator has to be appended invariantshave to be treated differently et cetera

Note that right-hand sides of assignments are handledslightly different from what has been presented in theprototypes of the first user study They do not get evalu-ated to the last step and invariants do not get evaluated

43 How Values Are Being Calculated 29

to the penultimate step anymore Both of them get onlyevaluated one step For i=1 and z=2 the right-hand sideof x=i+z does not get evaluated to 3 but to 1+2 Wedecided to do it this way because some users in the firstuser study stated that they did not only want to see whatthe result of an operation is but also how it got there

When a substitutable variable has been found its value willbe computed For that the eval function is used (see sec-tion 43 for details) and the value will be inserted Thiswhole process takes place each time there is a change in thecode or if one of the parameters gets changed If another it- How the extension

reacts to changeseration of a loop that is being shown is selected all valuesinside that loop will be calculated again

43 How Values Are Being Calculated

Depending on the context of the variable a piece ofcode will be generated (and then evaluated) that whenevaluated with eval returns the value of the variable ofinterest For simple substitutions outside of any loops thisworks like follows

Suppose the following code had to be evaluated for x=0

function foo(x)var i=0var j=x+1

Then x in the third line would have to be evaluated To do How the values arecalculated for simplesubstitutions

so the following String would be generatedvar x=0var i=0xSo for simple substitutions the whole code preceding thevariable that has to be substituted is copied and each pa-rameter is initialized as var with its exemplary value at the

30 4 Implementation

beginning

For simple loops this method has to be extended First ithas to be found out how often the loop is executed ThisSubstitutions within

one loop serves two purposes First the iterator now shows thenumber of iterations of the loop Second we have to checkif the user wants to see an iteration that does not exist Thisis being done by introducing a counter that is incrementedin each iteration of the loop and then returned when theloop terminated The following example shows how thevalue of a variable within a loop is being calculated

function test()var i=0var j=0while (ilt5)

i++j=j+2

Suppose we want to obtain the value for j (highlightedin red) in the second iteration of the loop Then theevaluation basically works as follows

var myCounter=0var myResultvar i=0var j=0while (ilt5)

myCounter++i++if(myCounter==2)myResult=j breakj=j+2

myResult

A counter gets inserted to check which iteration the loop is

44 Problems 31

in If it is in the right iteration the assignment myResult=jtakes place so that eval can return the according valueFor a loop within a loop two counters have to be inserted Other casesand both of them have to be checked accordingly Thereare further variants of contexts like the different parts of afor(firstPartsecondPartthirdPart) statementThe first part gets only executed once at the beginningwhile the second part gets evaluated one time more thanthe rest of the loop (the third part does not get evaluated)For all cases the dynamic generation of the code that re-turns the value of interest works similar to the techniquespresented

Obviously this is a brute-force approach For each valuethat has to be calculated all preceding code has to be eval-uated We used this approach anyway since we spent quite Summary Very

inefficient limitedfunctionality stillcompletely sufficientfor the user study

some time developing the first version of the extension (us-ing V8-Node) and were under considerable time pressureThe technique we chose is simple and fast to develop Fur-thermore the prototype only had to work for the seconduser study which is the reason why the limited function-ality (loop nesting depth of 2 for example) was sufficientAnd finally since the tasks in the second user study werenot very expensive with regards to calculation costs eventhis very inefficient version of the extension ran live with-out any delays Some tests showed that it would even havebeen quick enough to operate with hundreds of iterationsmore without any delay and with some few thousand iter-ations more without dispatching the interface

44 Problems

The biggest problem we encountered were infinite loopsBecause the interface runs completely live and tries to The Halting problemevaluate the whole code while the user is editing it infiniteloops may occur while the user is typing Additionallyit might of course be possible that there actually is aninfinite loop even though the user thinks the code iscorrect Without modification the interface crashed in

32 4 Implementation

either of these cases To prevent this from happeningtwo minor modifications had to be done Firstly if wewant to calculate the number of iterations a loop doeswe check whether the counter we use for that calculationis still below a certain limit (we used 999 here becauseit proved to be sufficient and efficient enough to runlive) Now the Halting Problem for this particular loop isapproximated The problem of the infinite loop still existsthough Suppose we have the following code

var x=0var i=0while(x=10)

i++x=i

We now know that the loop does more then 999 iterationsso we do not try to calculate the value for x in the invariantSubstitutions after

infinite loops or for i within the loop for any further iterations The onlyproblem that remains is that we still try to get the value fori in the last line When we try to calculate this value wedo not yet know that there is an infinite loop somewhereabove To find this out a new global counter gets insertedand limited to 5000 iterations

var globalCounter=0var x=0var i=0while(x=10)

if(globalCounter++ gt= 5000)breaki++

i

This way we could keep the incrementation and the checkwhether there are more then 5000 iterations within onestatement (two statements are too many since after a

44 Problems 33

while without curly braces only the first statement wouldbe executed) The incrementation and test get inserted intoevery loop ahead of the position we are checking for Itwould of course have been possible to set a flag if there isan infinite loop somewhere ahead but this way we alsohad the possibility to limit the overall number of iterationsand still get a value The second user study showed that Flaw in the

implementationthis was the wrong approach A user introduced an infiniteloop in a sorting algorithm The condition for the loop wasnever violated although the array was sorted With thetechnique of assigning a value to a variable although thelimitation of iterations has been hit the sorted array wasshown in the return statement Since the user only paidattention to the output he did not see that the number ofiterations of the loop was 999 and thought that the code iscorrect

Another minor problem was that of course while the useris typing the extension tries to evaluate code that is syntac- Suppress errors so

that user does notget disturbed

tically not correct The solution is to set windowonerrorto a function that always returns true This way errorsget just suppressed and the execution stops If there is anew change in the editor the execution will start again andwork exactly the same way so that working code always isexecuted and not working code is not

The last problem concerned hiding code that does notget evaluated For loops this could easily be done If Hiding code within

failed if statementsthe loop is in its very last iteration (the invariant fails)the rest of the loop has to be hidden otherwise it doesnot Finding out if an if statement fails (and hence theaccording code has to be hidden) however was moredifficult because we had not enough information to be ableto do so (remember we only compute the first step whichincludes nothing but substituting all variables) To find outif an if statement holds or not we do the following allvariables in the invariant get substituted and then we leteval evaluate just the invariant For if(xlty) with x=5and y=6 we substitute all values if(5lt6) and then calleval((5lt6))

34 4 Implementation

Figure 41 Screenshot showing Brackets with our extension and a dummy script

Finally Figure 41 shows how the prototype for our seconduser study looks like

35

Chapter 5

Second User Study

51 Course of the User Study

To find out whether our extension brings significantimprovement a second quantitative user study with 20 Hypothesis

Compared to usualdebugging methodsour technique bringssignificantimprovement

participants has been conducted The hypothesis was thatthe debugging speed (in this case the time between the userreading the code the first time and fixing it) compared toa usual debugging engine could significantly be reducedwith our extension If this hypothesis would be supportedby the results of the user study this would indicate that theunderstanding speed of foreign code would be increasedby our method

No user had worked with Brackets or JavaScript before(notwithstanding that all of them had worked with Javabefore) Users were Computer Science students with 4-9(average of 58) years of programming experience Onlythree of them had professional experience (2-4 years) Allusers took part voluntarily and were not rewarded in anyway

To test the hypothesis two searching algorithms (animplementation of Shakersort and an implementation ofBubblesort) have been implemented and a small bug has

36 5 Second User Study

been introduced in either of them (see Appendix A) Thetask of the user was to find both bugs - one time with2 algorithms 1 bug

each 2 tasks debugone time with

Bracketsrsquo tools andone time with our

extension

our extension and one time using the traditional way(that is using the console breakpoints variable viewerweb kit inspector and the step over functionality Bracketsprovides) Each user received an intro to the function-ality of both our extension and the Brackets debuggingengine For both a small dummy script has been shownwith which it has been explained how each part of thefunctionality works until the user confirmed that he or sheunderstood every part of the functionality The task forthe user was to find the bug in the code (users have beentold that there is exactly one bug) fix it and give noticewhen he or she thought that the code is correct (task tocondition assignment and task order were counterbalancedto eliminate side effects)

After the two debugging tasks had been accomplishedeach user filled out two versions of the System UsabilitySystem Usability

Scale to find outwhether users liked

working with thesystem

Scale [Brooke 1996] (SUS) one for our extension and onefor the standard debugging Brackets provides The first10 questions are identical to the initial version that hasbeen presented in 1996 except that the word rdquocumbersomerdquoin point 8 has been changed to rdquoawkwardrdquo as suggestedby [Finstad 2006] For the extension the following ques-tions have been added to the questionnaire

bull I found the additional information very helpful formy task

bull I was distracted by the additional information

bull I found it very easy to follow the control flow of theprogram

bull I found complex control structures hard to under-stand

For the debugging using Brackets the following two ques-tions have been added

bull I found it very easy to follow the control flow of theprogram

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 27: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

33 Results 21

One user proposed that if the cursor moves over somevariable in the evaluation part a small pop-up couldindicate to which variable a value belongs

One user said that it might be helpful that for biggerboolean expressions that fail it would be visible which partof the expression is responsible for the failure

332 Conclusions

With large screens (24rdquo and more) becoming more andmore common in modern work environments of program- Version 1 is the

favored onemers we can override the concerns about loosing too muchspace with version 1 which is the reason why it is the clearfavorite Figure 36 shows the basic interface after the firstuser study

The iterator has been improved in several ways

bull New buttons for jumping to the first and the last iter-ation have been added

bull There is a text field with which the iteration can bemanipulated directly

bull A tooltip now indicates the variable to which avalue belongs when hovering over the value with themouse

bull The label Iteration has been added at the top of theiterator

The way of manipulating the parameters has been changedThe function header now also is copied to the right side A new way of

manipulating theparameters

but the names of the parameters are replaced with a textfield in which the according value can be entered This wayboth the analogy to the code and the separation betweenparameters and evaluation are being enhanced

22 3 First User Study

(a) First iteration

(b) Second iteration

Figure 31 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

33 Results 23

Figure 32 Version 2 The evaluated version of the code ison the right-hand side but is ranged right

Figure 33 Version 3 Small pop-ups indicate the value of avariable

24 3 First User Study

Figure 34 Version 4 The evaluated version of a line isinserted right behind the line itself

33 Results 25

Figure 35 Version 5 Only the values of the variables arein between the lines

26 3 First User Study

Figure 36 Interface after the first user study

27

Chapter 4

Implementation

After the first user study had been finished and the resultshad been analyzed a software prototype with sufficient A software prototype

had to beimplemented

functionality had to be implemented to conduct the seconduser study Hence we wrote an extension for Brackets1 anopen source editor that has recently been released officially

The language we wrote the extension for was JavaScriptWhat we needed was an extension that could calculate thevalues of the variables that had to been substituted

41 First Approach (discharged)

The initial idea was to write an extension that connects viaV8-Node2 (an extension that maintains a connection to a Initial idea Use

V8-Node an Nodejsto get the locals

Nodejs3 V8 debugger) to an external debugger and lets itevaluate the code Our extension was supposed to supplythe actual code to the debugger and set a breakpoint in thefirst line of actual code Then it read the locals (that is thelocal values for all active variables) and stepped over to the

1httpsgithubcomadobebrackets2httpsgithubcomDennisKehrigbrackets-v8-node-

livetreenodeChildProcess3httpnodejsorg

28 4 Implementation

next line After all locals had been collected the extensionshould substitute all necessary values There were prob-lems with this approach Firstly the extension is not stableThis extension was

not usable for a userstudy

enough to be usable in a user study Secondly it is too slowIt needs more than half a second for one evaluation Thisis way too much since we wanted something that couldevaluate the code as the user edits it without any furtherdelay (hence the name rdquoLive Codingrdquo) With this extensionit would only have been possible to update the evaluationusing a special shortcut or to attach this function to anothershortcut like the Ctrl+S shortcut for saving Since thiswas by far not the optimal way we decided to start all overagain

42 Basic Evaluation Technique

In the second approach we did not try to externalize thescript execution but instead made vast use of JavaScriptrsquosbuilt-in eval function The eval function takes a StringSecond approach

using eval toevaluate dynamicallygenerated code thatreturns the value of

interest

as parameter If this String contains valid JavaScript codeeval will evaluate the code and return the last executedstatement eval(var i=1i=i+5) will return 6 forexample The extension basically works as follows Aloop that runs over all DOM-elements in the highlightedcode searches for substitutable variables While searchingfor these elements the loop captures in which contextthe actual DOM-element is The three basic contexts aresimple substitutions substitutions within one loop andsubstitutions within a nested loop (why this is will beexplained below) But there are also other things the loopchecks for For simple assignments like i=x+y x andy have to be substituted while i does not In lines wherea loop begins the iterator has to be appended invariantshave to be treated differently et cetera

Note that right-hand sides of assignments are handledslightly different from what has been presented in theprototypes of the first user study They do not get evalu-ated to the last step and invariants do not get evaluated

43 How Values Are Being Calculated 29

to the penultimate step anymore Both of them get onlyevaluated one step For i=1 and z=2 the right-hand sideof x=i+z does not get evaluated to 3 but to 1+2 Wedecided to do it this way because some users in the firstuser study stated that they did not only want to see whatthe result of an operation is but also how it got there

When a substitutable variable has been found its value willbe computed For that the eval function is used (see sec-tion 43 for details) and the value will be inserted Thiswhole process takes place each time there is a change in thecode or if one of the parameters gets changed If another it- How the extension

reacts to changeseration of a loop that is being shown is selected all valuesinside that loop will be calculated again

43 How Values Are Being Calculated

Depending on the context of the variable a piece ofcode will be generated (and then evaluated) that whenevaluated with eval returns the value of the variable ofinterest For simple substitutions outside of any loops thisworks like follows

Suppose the following code had to be evaluated for x=0

function foo(x)var i=0var j=x+1

Then x in the third line would have to be evaluated To do How the values arecalculated for simplesubstitutions

so the following String would be generatedvar x=0var i=0xSo for simple substitutions the whole code preceding thevariable that has to be substituted is copied and each pa-rameter is initialized as var with its exemplary value at the

30 4 Implementation

beginning

For simple loops this method has to be extended First ithas to be found out how often the loop is executed ThisSubstitutions within

one loop serves two purposes First the iterator now shows thenumber of iterations of the loop Second we have to checkif the user wants to see an iteration that does not exist Thisis being done by introducing a counter that is incrementedin each iteration of the loop and then returned when theloop terminated The following example shows how thevalue of a variable within a loop is being calculated

function test()var i=0var j=0while (ilt5)

i++j=j+2

Suppose we want to obtain the value for j (highlightedin red) in the second iteration of the loop Then theevaluation basically works as follows

var myCounter=0var myResultvar i=0var j=0while (ilt5)

myCounter++i++if(myCounter==2)myResult=j breakj=j+2

myResult

A counter gets inserted to check which iteration the loop is

44 Problems 31

in If it is in the right iteration the assignment myResult=jtakes place so that eval can return the according valueFor a loop within a loop two counters have to be inserted Other casesand both of them have to be checked accordingly Thereare further variants of contexts like the different parts of afor(firstPartsecondPartthirdPart) statementThe first part gets only executed once at the beginningwhile the second part gets evaluated one time more thanthe rest of the loop (the third part does not get evaluated)For all cases the dynamic generation of the code that re-turns the value of interest works similar to the techniquespresented

Obviously this is a brute-force approach For each valuethat has to be calculated all preceding code has to be eval-uated We used this approach anyway since we spent quite Summary Very

inefficient limitedfunctionality stillcompletely sufficientfor the user study

some time developing the first version of the extension (us-ing V8-Node) and were under considerable time pressureThe technique we chose is simple and fast to develop Fur-thermore the prototype only had to work for the seconduser study which is the reason why the limited function-ality (loop nesting depth of 2 for example) was sufficientAnd finally since the tasks in the second user study werenot very expensive with regards to calculation costs eventhis very inefficient version of the extension ran live with-out any delays Some tests showed that it would even havebeen quick enough to operate with hundreds of iterationsmore without any delay and with some few thousand iter-ations more without dispatching the interface

44 Problems

The biggest problem we encountered were infinite loopsBecause the interface runs completely live and tries to The Halting problemevaluate the whole code while the user is editing it infiniteloops may occur while the user is typing Additionallyit might of course be possible that there actually is aninfinite loop even though the user thinks the code iscorrect Without modification the interface crashed in

32 4 Implementation

either of these cases To prevent this from happeningtwo minor modifications had to be done Firstly if wewant to calculate the number of iterations a loop doeswe check whether the counter we use for that calculationis still below a certain limit (we used 999 here becauseit proved to be sufficient and efficient enough to runlive) Now the Halting Problem for this particular loop isapproximated The problem of the infinite loop still existsthough Suppose we have the following code

var x=0var i=0while(x=10)

i++x=i

We now know that the loop does more then 999 iterationsso we do not try to calculate the value for x in the invariantSubstitutions after

infinite loops or for i within the loop for any further iterations The onlyproblem that remains is that we still try to get the value fori in the last line When we try to calculate this value wedo not yet know that there is an infinite loop somewhereabove To find this out a new global counter gets insertedand limited to 5000 iterations

var globalCounter=0var x=0var i=0while(x=10)

if(globalCounter++ gt= 5000)breaki++

i

This way we could keep the incrementation and the checkwhether there are more then 5000 iterations within onestatement (two statements are too many since after a

44 Problems 33

while without curly braces only the first statement wouldbe executed) The incrementation and test get inserted intoevery loop ahead of the position we are checking for Itwould of course have been possible to set a flag if there isan infinite loop somewhere ahead but this way we alsohad the possibility to limit the overall number of iterationsand still get a value The second user study showed that Flaw in the

implementationthis was the wrong approach A user introduced an infiniteloop in a sorting algorithm The condition for the loop wasnever violated although the array was sorted With thetechnique of assigning a value to a variable although thelimitation of iterations has been hit the sorted array wasshown in the return statement Since the user only paidattention to the output he did not see that the number ofiterations of the loop was 999 and thought that the code iscorrect

Another minor problem was that of course while the useris typing the extension tries to evaluate code that is syntac- Suppress errors so

that user does notget disturbed

tically not correct The solution is to set windowonerrorto a function that always returns true This way errorsget just suppressed and the execution stops If there is anew change in the editor the execution will start again andwork exactly the same way so that working code always isexecuted and not working code is not

The last problem concerned hiding code that does notget evaluated For loops this could easily be done If Hiding code within

failed if statementsthe loop is in its very last iteration (the invariant fails)the rest of the loop has to be hidden otherwise it doesnot Finding out if an if statement fails (and hence theaccording code has to be hidden) however was moredifficult because we had not enough information to be ableto do so (remember we only compute the first step whichincludes nothing but substituting all variables) To find outif an if statement holds or not we do the following allvariables in the invariant get substituted and then we leteval evaluate just the invariant For if(xlty) with x=5and y=6 we substitute all values if(5lt6) and then calleval((5lt6))

34 4 Implementation

Figure 41 Screenshot showing Brackets with our extension and a dummy script

Finally Figure 41 shows how the prototype for our seconduser study looks like

35

Chapter 5

Second User Study

51 Course of the User Study

To find out whether our extension brings significantimprovement a second quantitative user study with 20 Hypothesis

Compared to usualdebugging methodsour technique bringssignificantimprovement

participants has been conducted The hypothesis was thatthe debugging speed (in this case the time between the userreading the code the first time and fixing it) compared toa usual debugging engine could significantly be reducedwith our extension If this hypothesis would be supportedby the results of the user study this would indicate that theunderstanding speed of foreign code would be increasedby our method

No user had worked with Brackets or JavaScript before(notwithstanding that all of them had worked with Javabefore) Users were Computer Science students with 4-9(average of 58) years of programming experience Onlythree of them had professional experience (2-4 years) Allusers took part voluntarily and were not rewarded in anyway

To test the hypothesis two searching algorithms (animplementation of Shakersort and an implementation ofBubblesort) have been implemented and a small bug has

36 5 Second User Study

been introduced in either of them (see Appendix A) Thetask of the user was to find both bugs - one time with2 algorithms 1 bug

each 2 tasks debugone time with

Bracketsrsquo tools andone time with our

extension

our extension and one time using the traditional way(that is using the console breakpoints variable viewerweb kit inspector and the step over functionality Bracketsprovides) Each user received an intro to the function-ality of both our extension and the Brackets debuggingengine For both a small dummy script has been shownwith which it has been explained how each part of thefunctionality works until the user confirmed that he or sheunderstood every part of the functionality The task forthe user was to find the bug in the code (users have beentold that there is exactly one bug) fix it and give noticewhen he or she thought that the code is correct (task tocondition assignment and task order were counterbalancedto eliminate side effects)

After the two debugging tasks had been accomplishedeach user filled out two versions of the System UsabilitySystem Usability

Scale to find outwhether users liked

working with thesystem

Scale [Brooke 1996] (SUS) one for our extension and onefor the standard debugging Brackets provides The first10 questions are identical to the initial version that hasbeen presented in 1996 except that the word rdquocumbersomerdquoin point 8 has been changed to rdquoawkwardrdquo as suggestedby [Finstad 2006] For the extension the following ques-tions have been added to the questionnaire

bull I found the additional information very helpful formy task

bull I was distracted by the additional information

bull I found it very easy to follow the control flow of theprogram

bull I found complex control structures hard to under-stand

For the debugging using Brackets the following two ques-tions have been added

bull I found it very easy to follow the control flow of theprogram

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 28: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

22 3 First User Study

(a) First iteration

(b) Second iteration

Figure 31 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

33 Results 23

Figure 32 Version 2 The evaluated version of the code ison the right-hand side but is ranged right

Figure 33 Version 3 Small pop-ups indicate the value of avariable

24 3 First User Study

Figure 34 Version 4 The evaluated version of a line isinserted right behind the line itself

33 Results 25

Figure 35 Version 5 Only the values of the variables arein between the lines

26 3 First User Study

Figure 36 Interface after the first user study

27

Chapter 4

Implementation

After the first user study had been finished and the resultshad been analyzed a software prototype with sufficient A software prototype

had to beimplemented

functionality had to be implemented to conduct the seconduser study Hence we wrote an extension for Brackets1 anopen source editor that has recently been released officially

The language we wrote the extension for was JavaScriptWhat we needed was an extension that could calculate thevalues of the variables that had to been substituted

41 First Approach (discharged)

The initial idea was to write an extension that connects viaV8-Node2 (an extension that maintains a connection to a Initial idea Use

V8-Node an Nodejsto get the locals

Nodejs3 V8 debugger) to an external debugger and lets itevaluate the code Our extension was supposed to supplythe actual code to the debugger and set a breakpoint in thefirst line of actual code Then it read the locals (that is thelocal values for all active variables) and stepped over to the

1httpsgithubcomadobebrackets2httpsgithubcomDennisKehrigbrackets-v8-node-

livetreenodeChildProcess3httpnodejsorg

28 4 Implementation

next line After all locals had been collected the extensionshould substitute all necessary values There were prob-lems with this approach Firstly the extension is not stableThis extension was

not usable for a userstudy

enough to be usable in a user study Secondly it is too slowIt needs more than half a second for one evaluation Thisis way too much since we wanted something that couldevaluate the code as the user edits it without any furtherdelay (hence the name rdquoLive Codingrdquo) With this extensionit would only have been possible to update the evaluationusing a special shortcut or to attach this function to anothershortcut like the Ctrl+S shortcut for saving Since thiswas by far not the optimal way we decided to start all overagain

42 Basic Evaluation Technique

In the second approach we did not try to externalize thescript execution but instead made vast use of JavaScriptrsquosbuilt-in eval function The eval function takes a StringSecond approach

using eval toevaluate dynamicallygenerated code thatreturns the value of

interest

as parameter If this String contains valid JavaScript codeeval will evaluate the code and return the last executedstatement eval(var i=1i=i+5) will return 6 forexample The extension basically works as follows Aloop that runs over all DOM-elements in the highlightedcode searches for substitutable variables While searchingfor these elements the loop captures in which contextthe actual DOM-element is The three basic contexts aresimple substitutions substitutions within one loop andsubstitutions within a nested loop (why this is will beexplained below) But there are also other things the loopchecks for For simple assignments like i=x+y x andy have to be substituted while i does not In lines wherea loop begins the iterator has to be appended invariantshave to be treated differently et cetera

Note that right-hand sides of assignments are handledslightly different from what has been presented in theprototypes of the first user study They do not get evalu-ated to the last step and invariants do not get evaluated

43 How Values Are Being Calculated 29

to the penultimate step anymore Both of them get onlyevaluated one step For i=1 and z=2 the right-hand sideof x=i+z does not get evaluated to 3 but to 1+2 Wedecided to do it this way because some users in the firstuser study stated that they did not only want to see whatthe result of an operation is but also how it got there

When a substitutable variable has been found its value willbe computed For that the eval function is used (see sec-tion 43 for details) and the value will be inserted Thiswhole process takes place each time there is a change in thecode or if one of the parameters gets changed If another it- How the extension

reacts to changeseration of a loop that is being shown is selected all valuesinside that loop will be calculated again

43 How Values Are Being Calculated

Depending on the context of the variable a piece ofcode will be generated (and then evaluated) that whenevaluated with eval returns the value of the variable ofinterest For simple substitutions outside of any loops thisworks like follows

Suppose the following code had to be evaluated for x=0

function foo(x)var i=0var j=x+1

Then x in the third line would have to be evaluated To do How the values arecalculated for simplesubstitutions

so the following String would be generatedvar x=0var i=0xSo for simple substitutions the whole code preceding thevariable that has to be substituted is copied and each pa-rameter is initialized as var with its exemplary value at the

30 4 Implementation

beginning

For simple loops this method has to be extended First ithas to be found out how often the loop is executed ThisSubstitutions within

one loop serves two purposes First the iterator now shows thenumber of iterations of the loop Second we have to checkif the user wants to see an iteration that does not exist Thisis being done by introducing a counter that is incrementedin each iteration of the loop and then returned when theloop terminated The following example shows how thevalue of a variable within a loop is being calculated

function test()var i=0var j=0while (ilt5)

i++j=j+2

Suppose we want to obtain the value for j (highlightedin red) in the second iteration of the loop Then theevaluation basically works as follows

var myCounter=0var myResultvar i=0var j=0while (ilt5)

myCounter++i++if(myCounter==2)myResult=j breakj=j+2

myResult

A counter gets inserted to check which iteration the loop is

44 Problems 31

in If it is in the right iteration the assignment myResult=jtakes place so that eval can return the according valueFor a loop within a loop two counters have to be inserted Other casesand both of them have to be checked accordingly Thereare further variants of contexts like the different parts of afor(firstPartsecondPartthirdPart) statementThe first part gets only executed once at the beginningwhile the second part gets evaluated one time more thanthe rest of the loop (the third part does not get evaluated)For all cases the dynamic generation of the code that re-turns the value of interest works similar to the techniquespresented

Obviously this is a brute-force approach For each valuethat has to be calculated all preceding code has to be eval-uated We used this approach anyway since we spent quite Summary Very

inefficient limitedfunctionality stillcompletely sufficientfor the user study

some time developing the first version of the extension (us-ing V8-Node) and were under considerable time pressureThe technique we chose is simple and fast to develop Fur-thermore the prototype only had to work for the seconduser study which is the reason why the limited function-ality (loop nesting depth of 2 for example) was sufficientAnd finally since the tasks in the second user study werenot very expensive with regards to calculation costs eventhis very inefficient version of the extension ran live with-out any delays Some tests showed that it would even havebeen quick enough to operate with hundreds of iterationsmore without any delay and with some few thousand iter-ations more without dispatching the interface

44 Problems

The biggest problem we encountered were infinite loopsBecause the interface runs completely live and tries to The Halting problemevaluate the whole code while the user is editing it infiniteloops may occur while the user is typing Additionallyit might of course be possible that there actually is aninfinite loop even though the user thinks the code iscorrect Without modification the interface crashed in

32 4 Implementation

either of these cases To prevent this from happeningtwo minor modifications had to be done Firstly if wewant to calculate the number of iterations a loop doeswe check whether the counter we use for that calculationis still below a certain limit (we used 999 here becauseit proved to be sufficient and efficient enough to runlive) Now the Halting Problem for this particular loop isapproximated The problem of the infinite loop still existsthough Suppose we have the following code

var x=0var i=0while(x=10)

i++x=i

We now know that the loop does more then 999 iterationsso we do not try to calculate the value for x in the invariantSubstitutions after

infinite loops or for i within the loop for any further iterations The onlyproblem that remains is that we still try to get the value fori in the last line When we try to calculate this value wedo not yet know that there is an infinite loop somewhereabove To find this out a new global counter gets insertedand limited to 5000 iterations

var globalCounter=0var x=0var i=0while(x=10)

if(globalCounter++ gt= 5000)breaki++

i

This way we could keep the incrementation and the checkwhether there are more then 5000 iterations within onestatement (two statements are too many since after a

44 Problems 33

while without curly braces only the first statement wouldbe executed) The incrementation and test get inserted intoevery loop ahead of the position we are checking for Itwould of course have been possible to set a flag if there isan infinite loop somewhere ahead but this way we alsohad the possibility to limit the overall number of iterationsand still get a value The second user study showed that Flaw in the

implementationthis was the wrong approach A user introduced an infiniteloop in a sorting algorithm The condition for the loop wasnever violated although the array was sorted With thetechnique of assigning a value to a variable although thelimitation of iterations has been hit the sorted array wasshown in the return statement Since the user only paidattention to the output he did not see that the number ofiterations of the loop was 999 and thought that the code iscorrect

Another minor problem was that of course while the useris typing the extension tries to evaluate code that is syntac- Suppress errors so

that user does notget disturbed

tically not correct The solution is to set windowonerrorto a function that always returns true This way errorsget just suppressed and the execution stops If there is anew change in the editor the execution will start again andwork exactly the same way so that working code always isexecuted and not working code is not

The last problem concerned hiding code that does notget evaluated For loops this could easily be done If Hiding code within

failed if statementsthe loop is in its very last iteration (the invariant fails)the rest of the loop has to be hidden otherwise it doesnot Finding out if an if statement fails (and hence theaccording code has to be hidden) however was moredifficult because we had not enough information to be ableto do so (remember we only compute the first step whichincludes nothing but substituting all variables) To find outif an if statement holds or not we do the following allvariables in the invariant get substituted and then we leteval evaluate just the invariant For if(xlty) with x=5and y=6 we substitute all values if(5lt6) and then calleval((5lt6))

34 4 Implementation

Figure 41 Screenshot showing Brackets with our extension and a dummy script

Finally Figure 41 shows how the prototype for our seconduser study looks like

35

Chapter 5

Second User Study

51 Course of the User Study

To find out whether our extension brings significantimprovement a second quantitative user study with 20 Hypothesis

Compared to usualdebugging methodsour technique bringssignificantimprovement

participants has been conducted The hypothesis was thatthe debugging speed (in this case the time between the userreading the code the first time and fixing it) compared toa usual debugging engine could significantly be reducedwith our extension If this hypothesis would be supportedby the results of the user study this would indicate that theunderstanding speed of foreign code would be increasedby our method

No user had worked with Brackets or JavaScript before(notwithstanding that all of them had worked with Javabefore) Users were Computer Science students with 4-9(average of 58) years of programming experience Onlythree of them had professional experience (2-4 years) Allusers took part voluntarily and were not rewarded in anyway

To test the hypothesis two searching algorithms (animplementation of Shakersort and an implementation ofBubblesort) have been implemented and a small bug has

36 5 Second User Study

been introduced in either of them (see Appendix A) Thetask of the user was to find both bugs - one time with2 algorithms 1 bug

each 2 tasks debugone time with

Bracketsrsquo tools andone time with our

extension

our extension and one time using the traditional way(that is using the console breakpoints variable viewerweb kit inspector and the step over functionality Bracketsprovides) Each user received an intro to the function-ality of both our extension and the Brackets debuggingengine For both a small dummy script has been shownwith which it has been explained how each part of thefunctionality works until the user confirmed that he or sheunderstood every part of the functionality The task forthe user was to find the bug in the code (users have beentold that there is exactly one bug) fix it and give noticewhen he or she thought that the code is correct (task tocondition assignment and task order were counterbalancedto eliminate side effects)

After the two debugging tasks had been accomplishedeach user filled out two versions of the System UsabilitySystem Usability

Scale to find outwhether users liked

working with thesystem

Scale [Brooke 1996] (SUS) one for our extension and onefor the standard debugging Brackets provides The first10 questions are identical to the initial version that hasbeen presented in 1996 except that the word rdquocumbersomerdquoin point 8 has been changed to rdquoawkwardrdquo as suggestedby [Finstad 2006] For the extension the following ques-tions have been added to the questionnaire

bull I found the additional information very helpful formy task

bull I was distracted by the additional information

bull I found it very easy to follow the control flow of theprogram

bull I found complex control structures hard to under-stand

For the debugging using Brackets the following two ques-tions have been added

bull I found it very easy to follow the control flow of theprogram

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 29: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

33 Results 23

Figure 32 Version 2 The evaluated version of the code ison the right-hand side but is ranged right

Figure 33 Version 3 Small pop-ups indicate the value of avariable

24 3 First User Study

Figure 34 Version 4 The evaluated version of a line isinserted right behind the line itself

33 Results 25

Figure 35 Version 5 Only the values of the variables arein between the lines

26 3 First User Study

Figure 36 Interface after the first user study

27

Chapter 4

Implementation

After the first user study had been finished and the resultshad been analyzed a software prototype with sufficient A software prototype

had to beimplemented

functionality had to be implemented to conduct the seconduser study Hence we wrote an extension for Brackets1 anopen source editor that has recently been released officially

The language we wrote the extension for was JavaScriptWhat we needed was an extension that could calculate thevalues of the variables that had to been substituted

41 First Approach (discharged)

The initial idea was to write an extension that connects viaV8-Node2 (an extension that maintains a connection to a Initial idea Use

V8-Node an Nodejsto get the locals

Nodejs3 V8 debugger) to an external debugger and lets itevaluate the code Our extension was supposed to supplythe actual code to the debugger and set a breakpoint in thefirst line of actual code Then it read the locals (that is thelocal values for all active variables) and stepped over to the

1httpsgithubcomadobebrackets2httpsgithubcomDennisKehrigbrackets-v8-node-

livetreenodeChildProcess3httpnodejsorg

28 4 Implementation

next line After all locals had been collected the extensionshould substitute all necessary values There were prob-lems with this approach Firstly the extension is not stableThis extension was

not usable for a userstudy

enough to be usable in a user study Secondly it is too slowIt needs more than half a second for one evaluation Thisis way too much since we wanted something that couldevaluate the code as the user edits it without any furtherdelay (hence the name rdquoLive Codingrdquo) With this extensionit would only have been possible to update the evaluationusing a special shortcut or to attach this function to anothershortcut like the Ctrl+S shortcut for saving Since thiswas by far not the optimal way we decided to start all overagain

42 Basic Evaluation Technique

In the second approach we did not try to externalize thescript execution but instead made vast use of JavaScriptrsquosbuilt-in eval function The eval function takes a StringSecond approach

using eval toevaluate dynamicallygenerated code thatreturns the value of

interest

as parameter If this String contains valid JavaScript codeeval will evaluate the code and return the last executedstatement eval(var i=1i=i+5) will return 6 forexample The extension basically works as follows Aloop that runs over all DOM-elements in the highlightedcode searches for substitutable variables While searchingfor these elements the loop captures in which contextthe actual DOM-element is The three basic contexts aresimple substitutions substitutions within one loop andsubstitutions within a nested loop (why this is will beexplained below) But there are also other things the loopchecks for For simple assignments like i=x+y x andy have to be substituted while i does not In lines wherea loop begins the iterator has to be appended invariantshave to be treated differently et cetera

Note that right-hand sides of assignments are handledslightly different from what has been presented in theprototypes of the first user study They do not get evalu-ated to the last step and invariants do not get evaluated

43 How Values Are Being Calculated 29

to the penultimate step anymore Both of them get onlyevaluated one step For i=1 and z=2 the right-hand sideof x=i+z does not get evaluated to 3 but to 1+2 Wedecided to do it this way because some users in the firstuser study stated that they did not only want to see whatthe result of an operation is but also how it got there

When a substitutable variable has been found its value willbe computed For that the eval function is used (see sec-tion 43 for details) and the value will be inserted Thiswhole process takes place each time there is a change in thecode or if one of the parameters gets changed If another it- How the extension

reacts to changeseration of a loop that is being shown is selected all valuesinside that loop will be calculated again

43 How Values Are Being Calculated

Depending on the context of the variable a piece ofcode will be generated (and then evaluated) that whenevaluated with eval returns the value of the variable ofinterest For simple substitutions outside of any loops thisworks like follows

Suppose the following code had to be evaluated for x=0

function foo(x)var i=0var j=x+1

Then x in the third line would have to be evaluated To do How the values arecalculated for simplesubstitutions

so the following String would be generatedvar x=0var i=0xSo for simple substitutions the whole code preceding thevariable that has to be substituted is copied and each pa-rameter is initialized as var with its exemplary value at the

30 4 Implementation

beginning

For simple loops this method has to be extended First ithas to be found out how often the loop is executed ThisSubstitutions within

one loop serves two purposes First the iterator now shows thenumber of iterations of the loop Second we have to checkif the user wants to see an iteration that does not exist Thisis being done by introducing a counter that is incrementedin each iteration of the loop and then returned when theloop terminated The following example shows how thevalue of a variable within a loop is being calculated

function test()var i=0var j=0while (ilt5)

i++j=j+2

Suppose we want to obtain the value for j (highlightedin red) in the second iteration of the loop Then theevaluation basically works as follows

var myCounter=0var myResultvar i=0var j=0while (ilt5)

myCounter++i++if(myCounter==2)myResult=j breakj=j+2

myResult

A counter gets inserted to check which iteration the loop is

44 Problems 31

in If it is in the right iteration the assignment myResult=jtakes place so that eval can return the according valueFor a loop within a loop two counters have to be inserted Other casesand both of them have to be checked accordingly Thereare further variants of contexts like the different parts of afor(firstPartsecondPartthirdPart) statementThe first part gets only executed once at the beginningwhile the second part gets evaluated one time more thanthe rest of the loop (the third part does not get evaluated)For all cases the dynamic generation of the code that re-turns the value of interest works similar to the techniquespresented

Obviously this is a brute-force approach For each valuethat has to be calculated all preceding code has to be eval-uated We used this approach anyway since we spent quite Summary Very

inefficient limitedfunctionality stillcompletely sufficientfor the user study

some time developing the first version of the extension (us-ing V8-Node) and were under considerable time pressureThe technique we chose is simple and fast to develop Fur-thermore the prototype only had to work for the seconduser study which is the reason why the limited function-ality (loop nesting depth of 2 for example) was sufficientAnd finally since the tasks in the second user study werenot very expensive with regards to calculation costs eventhis very inefficient version of the extension ran live with-out any delays Some tests showed that it would even havebeen quick enough to operate with hundreds of iterationsmore without any delay and with some few thousand iter-ations more without dispatching the interface

44 Problems

The biggest problem we encountered were infinite loopsBecause the interface runs completely live and tries to The Halting problemevaluate the whole code while the user is editing it infiniteloops may occur while the user is typing Additionallyit might of course be possible that there actually is aninfinite loop even though the user thinks the code iscorrect Without modification the interface crashed in

32 4 Implementation

either of these cases To prevent this from happeningtwo minor modifications had to be done Firstly if wewant to calculate the number of iterations a loop doeswe check whether the counter we use for that calculationis still below a certain limit (we used 999 here becauseit proved to be sufficient and efficient enough to runlive) Now the Halting Problem for this particular loop isapproximated The problem of the infinite loop still existsthough Suppose we have the following code

var x=0var i=0while(x=10)

i++x=i

We now know that the loop does more then 999 iterationsso we do not try to calculate the value for x in the invariantSubstitutions after

infinite loops or for i within the loop for any further iterations The onlyproblem that remains is that we still try to get the value fori in the last line When we try to calculate this value wedo not yet know that there is an infinite loop somewhereabove To find this out a new global counter gets insertedand limited to 5000 iterations

var globalCounter=0var x=0var i=0while(x=10)

if(globalCounter++ gt= 5000)breaki++

i

This way we could keep the incrementation and the checkwhether there are more then 5000 iterations within onestatement (two statements are too many since after a

44 Problems 33

while without curly braces only the first statement wouldbe executed) The incrementation and test get inserted intoevery loop ahead of the position we are checking for Itwould of course have been possible to set a flag if there isan infinite loop somewhere ahead but this way we alsohad the possibility to limit the overall number of iterationsand still get a value The second user study showed that Flaw in the

implementationthis was the wrong approach A user introduced an infiniteloop in a sorting algorithm The condition for the loop wasnever violated although the array was sorted With thetechnique of assigning a value to a variable although thelimitation of iterations has been hit the sorted array wasshown in the return statement Since the user only paidattention to the output he did not see that the number ofiterations of the loop was 999 and thought that the code iscorrect

Another minor problem was that of course while the useris typing the extension tries to evaluate code that is syntac- Suppress errors so

that user does notget disturbed

tically not correct The solution is to set windowonerrorto a function that always returns true This way errorsget just suppressed and the execution stops If there is anew change in the editor the execution will start again andwork exactly the same way so that working code always isexecuted and not working code is not

The last problem concerned hiding code that does notget evaluated For loops this could easily be done If Hiding code within

failed if statementsthe loop is in its very last iteration (the invariant fails)the rest of the loop has to be hidden otherwise it doesnot Finding out if an if statement fails (and hence theaccording code has to be hidden) however was moredifficult because we had not enough information to be ableto do so (remember we only compute the first step whichincludes nothing but substituting all variables) To find outif an if statement holds or not we do the following allvariables in the invariant get substituted and then we leteval evaluate just the invariant For if(xlty) with x=5and y=6 we substitute all values if(5lt6) and then calleval((5lt6))

34 4 Implementation

Figure 41 Screenshot showing Brackets with our extension and a dummy script

Finally Figure 41 shows how the prototype for our seconduser study looks like

35

Chapter 5

Second User Study

51 Course of the User Study

To find out whether our extension brings significantimprovement a second quantitative user study with 20 Hypothesis

Compared to usualdebugging methodsour technique bringssignificantimprovement

participants has been conducted The hypothesis was thatthe debugging speed (in this case the time between the userreading the code the first time and fixing it) compared toa usual debugging engine could significantly be reducedwith our extension If this hypothesis would be supportedby the results of the user study this would indicate that theunderstanding speed of foreign code would be increasedby our method

No user had worked with Brackets or JavaScript before(notwithstanding that all of them had worked with Javabefore) Users were Computer Science students with 4-9(average of 58) years of programming experience Onlythree of them had professional experience (2-4 years) Allusers took part voluntarily and were not rewarded in anyway

To test the hypothesis two searching algorithms (animplementation of Shakersort and an implementation ofBubblesort) have been implemented and a small bug has

36 5 Second User Study

been introduced in either of them (see Appendix A) Thetask of the user was to find both bugs - one time with2 algorithms 1 bug

each 2 tasks debugone time with

Bracketsrsquo tools andone time with our

extension

our extension and one time using the traditional way(that is using the console breakpoints variable viewerweb kit inspector and the step over functionality Bracketsprovides) Each user received an intro to the function-ality of both our extension and the Brackets debuggingengine For both a small dummy script has been shownwith which it has been explained how each part of thefunctionality works until the user confirmed that he or sheunderstood every part of the functionality The task forthe user was to find the bug in the code (users have beentold that there is exactly one bug) fix it and give noticewhen he or she thought that the code is correct (task tocondition assignment and task order were counterbalancedto eliminate side effects)

After the two debugging tasks had been accomplishedeach user filled out two versions of the System UsabilitySystem Usability

Scale to find outwhether users liked

working with thesystem

Scale [Brooke 1996] (SUS) one for our extension and onefor the standard debugging Brackets provides The first10 questions are identical to the initial version that hasbeen presented in 1996 except that the word rdquocumbersomerdquoin point 8 has been changed to rdquoawkwardrdquo as suggestedby [Finstad 2006] For the extension the following ques-tions have been added to the questionnaire

bull I found the additional information very helpful formy task

bull I was distracted by the additional information

bull I found it very easy to follow the control flow of theprogram

bull I found complex control structures hard to under-stand

For the debugging using Brackets the following two ques-tions have been added

bull I found it very easy to follow the control flow of theprogram

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 30: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

24 3 First User Study

Figure 34 Version 4 The evaluated version of a line isinserted right behind the line itself

33 Results 25

Figure 35 Version 5 Only the values of the variables arein between the lines

26 3 First User Study

Figure 36 Interface after the first user study

27

Chapter 4

Implementation

After the first user study had been finished and the resultshad been analyzed a software prototype with sufficient A software prototype

had to beimplemented

functionality had to be implemented to conduct the seconduser study Hence we wrote an extension for Brackets1 anopen source editor that has recently been released officially

The language we wrote the extension for was JavaScriptWhat we needed was an extension that could calculate thevalues of the variables that had to been substituted

41 First Approach (discharged)

The initial idea was to write an extension that connects viaV8-Node2 (an extension that maintains a connection to a Initial idea Use

V8-Node an Nodejsto get the locals

Nodejs3 V8 debugger) to an external debugger and lets itevaluate the code Our extension was supposed to supplythe actual code to the debugger and set a breakpoint in thefirst line of actual code Then it read the locals (that is thelocal values for all active variables) and stepped over to the

1httpsgithubcomadobebrackets2httpsgithubcomDennisKehrigbrackets-v8-node-

livetreenodeChildProcess3httpnodejsorg

28 4 Implementation

next line After all locals had been collected the extensionshould substitute all necessary values There were prob-lems with this approach Firstly the extension is not stableThis extension was

not usable for a userstudy

enough to be usable in a user study Secondly it is too slowIt needs more than half a second for one evaluation Thisis way too much since we wanted something that couldevaluate the code as the user edits it without any furtherdelay (hence the name rdquoLive Codingrdquo) With this extensionit would only have been possible to update the evaluationusing a special shortcut or to attach this function to anothershortcut like the Ctrl+S shortcut for saving Since thiswas by far not the optimal way we decided to start all overagain

42 Basic Evaluation Technique

In the second approach we did not try to externalize thescript execution but instead made vast use of JavaScriptrsquosbuilt-in eval function The eval function takes a StringSecond approach

using eval toevaluate dynamicallygenerated code thatreturns the value of

interest

as parameter If this String contains valid JavaScript codeeval will evaluate the code and return the last executedstatement eval(var i=1i=i+5) will return 6 forexample The extension basically works as follows Aloop that runs over all DOM-elements in the highlightedcode searches for substitutable variables While searchingfor these elements the loop captures in which contextthe actual DOM-element is The three basic contexts aresimple substitutions substitutions within one loop andsubstitutions within a nested loop (why this is will beexplained below) But there are also other things the loopchecks for For simple assignments like i=x+y x andy have to be substituted while i does not In lines wherea loop begins the iterator has to be appended invariantshave to be treated differently et cetera

Note that right-hand sides of assignments are handledslightly different from what has been presented in theprototypes of the first user study They do not get evalu-ated to the last step and invariants do not get evaluated

43 How Values Are Being Calculated 29

to the penultimate step anymore Both of them get onlyevaluated one step For i=1 and z=2 the right-hand sideof x=i+z does not get evaluated to 3 but to 1+2 Wedecided to do it this way because some users in the firstuser study stated that they did not only want to see whatthe result of an operation is but also how it got there

When a substitutable variable has been found its value willbe computed For that the eval function is used (see sec-tion 43 for details) and the value will be inserted Thiswhole process takes place each time there is a change in thecode or if one of the parameters gets changed If another it- How the extension

reacts to changeseration of a loop that is being shown is selected all valuesinside that loop will be calculated again

43 How Values Are Being Calculated

Depending on the context of the variable a piece ofcode will be generated (and then evaluated) that whenevaluated with eval returns the value of the variable ofinterest For simple substitutions outside of any loops thisworks like follows

Suppose the following code had to be evaluated for x=0

function foo(x)var i=0var j=x+1

Then x in the third line would have to be evaluated To do How the values arecalculated for simplesubstitutions

so the following String would be generatedvar x=0var i=0xSo for simple substitutions the whole code preceding thevariable that has to be substituted is copied and each pa-rameter is initialized as var with its exemplary value at the

30 4 Implementation

beginning

For simple loops this method has to be extended First ithas to be found out how often the loop is executed ThisSubstitutions within

one loop serves two purposes First the iterator now shows thenumber of iterations of the loop Second we have to checkif the user wants to see an iteration that does not exist Thisis being done by introducing a counter that is incrementedin each iteration of the loop and then returned when theloop terminated The following example shows how thevalue of a variable within a loop is being calculated

function test()var i=0var j=0while (ilt5)

i++j=j+2

Suppose we want to obtain the value for j (highlightedin red) in the second iteration of the loop Then theevaluation basically works as follows

var myCounter=0var myResultvar i=0var j=0while (ilt5)

myCounter++i++if(myCounter==2)myResult=j breakj=j+2

myResult

A counter gets inserted to check which iteration the loop is

44 Problems 31

in If it is in the right iteration the assignment myResult=jtakes place so that eval can return the according valueFor a loop within a loop two counters have to be inserted Other casesand both of them have to be checked accordingly Thereare further variants of contexts like the different parts of afor(firstPartsecondPartthirdPart) statementThe first part gets only executed once at the beginningwhile the second part gets evaluated one time more thanthe rest of the loop (the third part does not get evaluated)For all cases the dynamic generation of the code that re-turns the value of interest works similar to the techniquespresented

Obviously this is a brute-force approach For each valuethat has to be calculated all preceding code has to be eval-uated We used this approach anyway since we spent quite Summary Very

inefficient limitedfunctionality stillcompletely sufficientfor the user study

some time developing the first version of the extension (us-ing V8-Node) and were under considerable time pressureThe technique we chose is simple and fast to develop Fur-thermore the prototype only had to work for the seconduser study which is the reason why the limited function-ality (loop nesting depth of 2 for example) was sufficientAnd finally since the tasks in the second user study werenot very expensive with regards to calculation costs eventhis very inefficient version of the extension ran live with-out any delays Some tests showed that it would even havebeen quick enough to operate with hundreds of iterationsmore without any delay and with some few thousand iter-ations more without dispatching the interface

44 Problems

The biggest problem we encountered were infinite loopsBecause the interface runs completely live and tries to The Halting problemevaluate the whole code while the user is editing it infiniteloops may occur while the user is typing Additionallyit might of course be possible that there actually is aninfinite loop even though the user thinks the code iscorrect Without modification the interface crashed in

32 4 Implementation

either of these cases To prevent this from happeningtwo minor modifications had to be done Firstly if wewant to calculate the number of iterations a loop doeswe check whether the counter we use for that calculationis still below a certain limit (we used 999 here becauseit proved to be sufficient and efficient enough to runlive) Now the Halting Problem for this particular loop isapproximated The problem of the infinite loop still existsthough Suppose we have the following code

var x=0var i=0while(x=10)

i++x=i

We now know that the loop does more then 999 iterationsso we do not try to calculate the value for x in the invariantSubstitutions after

infinite loops or for i within the loop for any further iterations The onlyproblem that remains is that we still try to get the value fori in the last line When we try to calculate this value wedo not yet know that there is an infinite loop somewhereabove To find this out a new global counter gets insertedand limited to 5000 iterations

var globalCounter=0var x=0var i=0while(x=10)

if(globalCounter++ gt= 5000)breaki++

i

This way we could keep the incrementation and the checkwhether there are more then 5000 iterations within onestatement (two statements are too many since after a

44 Problems 33

while without curly braces only the first statement wouldbe executed) The incrementation and test get inserted intoevery loop ahead of the position we are checking for Itwould of course have been possible to set a flag if there isan infinite loop somewhere ahead but this way we alsohad the possibility to limit the overall number of iterationsand still get a value The second user study showed that Flaw in the

implementationthis was the wrong approach A user introduced an infiniteloop in a sorting algorithm The condition for the loop wasnever violated although the array was sorted With thetechnique of assigning a value to a variable although thelimitation of iterations has been hit the sorted array wasshown in the return statement Since the user only paidattention to the output he did not see that the number ofiterations of the loop was 999 and thought that the code iscorrect

Another minor problem was that of course while the useris typing the extension tries to evaluate code that is syntac- Suppress errors so

that user does notget disturbed

tically not correct The solution is to set windowonerrorto a function that always returns true This way errorsget just suppressed and the execution stops If there is anew change in the editor the execution will start again andwork exactly the same way so that working code always isexecuted and not working code is not

The last problem concerned hiding code that does notget evaluated For loops this could easily be done If Hiding code within

failed if statementsthe loop is in its very last iteration (the invariant fails)the rest of the loop has to be hidden otherwise it doesnot Finding out if an if statement fails (and hence theaccording code has to be hidden) however was moredifficult because we had not enough information to be ableto do so (remember we only compute the first step whichincludes nothing but substituting all variables) To find outif an if statement holds or not we do the following allvariables in the invariant get substituted and then we leteval evaluate just the invariant For if(xlty) with x=5and y=6 we substitute all values if(5lt6) and then calleval((5lt6))

34 4 Implementation

Figure 41 Screenshot showing Brackets with our extension and a dummy script

Finally Figure 41 shows how the prototype for our seconduser study looks like

35

Chapter 5

Second User Study

51 Course of the User Study

To find out whether our extension brings significantimprovement a second quantitative user study with 20 Hypothesis

Compared to usualdebugging methodsour technique bringssignificantimprovement

participants has been conducted The hypothesis was thatthe debugging speed (in this case the time between the userreading the code the first time and fixing it) compared toa usual debugging engine could significantly be reducedwith our extension If this hypothesis would be supportedby the results of the user study this would indicate that theunderstanding speed of foreign code would be increasedby our method

No user had worked with Brackets or JavaScript before(notwithstanding that all of them had worked with Javabefore) Users were Computer Science students with 4-9(average of 58) years of programming experience Onlythree of them had professional experience (2-4 years) Allusers took part voluntarily and were not rewarded in anyway

To test the hypothesis two searching algorithms (animplementation of Shakersort and an implementation ofBubblesort) have been implemented and a small bug has

36 5 Second User Study

been introduced in either of them (see Appendix A) Thetask of the user was to find both bugs - one time with2 algorithms 1 bug

each 2 tasks debugone time with

Bracketsrsquo tools andone time with our

extension

our extension and one time using the traditional way(that is using the console breakpoints variable viewerweb kit inspector and the step over functionality Bracketsprovides) Each user received an intro to the function-ality of both our extension and the Brackets debuggingengine For both a small dummy script has been shownwith which it has been explained how each part of thefunctionality works until the user confirmed that he or sheunderstood every part of the functionality The task forthe user was to find the bug in the code (users have beentold that there is exactly one bug) fix it and give noticewhen he or she thought that the code is correct (task tocondition assignment and task order were counterbalancedto eliminate side effects)

After the two debugging tasks had been accomplishedeach user filled out two versions of the System UsabilitySystem Usability

Scale to find outwhether users liked

working with thesystem

Scale [Brooke 1996] (SUS) one for our extension and onefor the standard debugging Brackets provides The first10 questions are identical to the initial version that hasbeen presented in 1996 except that the word rdquocumbersomerdquoin point 8 has been changed to rdquoawkwardrdquo as suggestedby [Finstad 2006] For the extension the following ques-tions have been added to the questionnaire

bull I found the additional information very helpful formy task

bull I was distracted by the additional information

bull I found it very easy to follow the control flow of theprogram

bull I found complex control structures hard to under-stand

For the debugging using Brackets the following two ques-tions have been added

bull I found it very easy to follow the control flow of theprogram

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 31: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

33 Results 25

Figure 35 Version 5 Only the values of the variables arein between the lines

26 3 First User Study

Figure 36 Interface after the first user study

27

Chapter 4

Implementation

After the first user study had been finished and the resultshad been analyzed a software prototype with sufficient A software prototype

had to beimplemented

functionality had to be implemented to conduct the seconduser study Hence we wrote an extension for Brackets1 anopen source editor that has recently been released officially

The language we wrote the extension for was JavaScriptWhat we needed was an extension that could calculate thevalues of the variables that had to been substituted

41 First Approach (discharged)

The initial idea was to write an extension that connects viaV8-Node2 (an extension that maintains a connection to a Initial idea Use

V8-Node an Nodejsto get the locals

Nodejs3 V8 debugger) to an external debugger and lets itevaluate the code Our extension was supposed to supplythe actual code to the debugger and set a breakpoint in thefirst line of actual code Then it read the locals (that is thelocal values for all active variables) and stepped over to the

1httpsgithubcomadobebrackets2httpsgithubcomDennisKehrigbrackets-v8-node-

livetreenodeChildProcess3httpnodejsorg

28 4 Implementation

next line After all locals had been collected the extensionshould substitute all necessary values There were prob-lems with this approach Firstly the extension is not stableThis extension was

not usable for a userstudy

enough to be usable in a user study Secondly it is too slowIt needs more than half a second for one evaluation Thisis way too much since we wanted something that couldevaluate the code as the user edits it without any furtherdelay (hence the name rdquoLive Codingrdquo) With this extensionit would only have been possible to update the evaluationusing a special shortcut or to attach this function to anothershortcut like the Ctrl+S shortcut for saving Since thiswas by far not the optimal way we decided to start all overagain

42 Basic Evaluation Technique

In the second approach we did not try to externalize thescript execution but instead made vast use of JavaScriptrsquosbuilt-in eval function The eval function takes a StringSecond approach

using eval toevaluate dynamicallygenerated code thatreturns the value of

interest

as parameter If this String contains valid JavaScript codeeval will evaluate the code and return the last executedstatement eval(var i=1i=i+5) will return 6 forexample The extension basically works as follows Aloop that runs over all DOM-elements in the highlightedcode searches for substitutable variables While searchingfor these elements the loop captures in which contextthe actual DOM-element is The three basic contexts aresimple substitutions substitutions within one loop andsubstitutions within a nested loop (why this is will beexplained below) But there are also other things the loopchecks for For simple assignments like i=x+y x andy have to be substituted while i does not In lines wherea loop begins the iterator has to be appended invariantshave to be treated differently et cetera

Note that right-hand sides of assignments are handledslightly different from what has been presented in theprototypes of the first user study They do not get evalu-ated to the last step and invariants do not get evaluated

43 How Values Are Being Calculated 29

to the penultimate step anymore Both of them get onlyevaluated one step For i=1 and z=2 the right-hand sideof x=i+z does not get evaluated to 3 but to 1+2 Wedecided to do it this way because some users in the firstuser study stated that they did not only want to see whatthe result of an operation is but also how it got there

When a substitutable variable has been found its value willbe computed For that the eval function is used (see sec-tion 43 for details) and the value will be inserted Thiswhole process takes place each time there is a change in thecode or if one of the parameters gets changed If another it- How the extension

reacts to changeseration of a loop that is being shown is selected all valuesinside that loop will be calculated again

43 How Values Are Being Calculated

Depending on the context of the variable a piece ofcode will be generated (and then evaluated) that whenevaluated with eval returns the value of the variable ofinterest For simple substitutions outside of any loops thisworks like follows

Suppose the following code had to be evaluated for x=0

function foo(x)var i=0var j=x+1

Then x in the third line would have to be evaluated To do How the values arecalculated for simplesubstitutions

so the following String would be generatedvar x=0var i=0xSo for simple substitutions the whole code preceding thevariable that has to be substituted is copied and each pa-rameter is initialized as var with its exemplary value at the

30 4 Implementation

beginning

For simple loops this method has to be extended First ithas to be found out how often the loop is executed ThisSubstitutions within

one loop serves two purposes First the iterator now shows thenumber of iterations of the loop Second we have to checkif the user wants to see an iteration that does not exist Thisis being done by introducing a counter that is incrementedin each iteration of the loop and then returned when theloop terminated The following example shows how thevalue of a variable within a loop is being calculated

function test()var i=0var j=0while (ilt5)

i++j=j+2

Suppose we want to obtain the value for j (highlightedin red) in the second iteration of the loop Then theevaluation basically works as follows

var myCounter=0var myResultvar i=0var j=0while (ilt5)

myCounter++i++if(myCounter==2)myResult=j breakj=j+2

myResult

A counter gets inserted to check which iteration the loop is

44 Problems 31

in If it is in the right iteration the assignment myResult=jtakes place so that eval can return the according valueFor a loop within a loop two counters have to be inserted Other casesand both of them have to be checked accordingly Thereare further variants of contexts like the different parts of afor(firstPartsecondPartthirdPart) statementThe first part gets only executed once at the beginningwhile the second part gets evaluated one time more thanthe rest of the loop (the third part does not get evaluated)For all cases the dynamic generation of the code that re-turns the value of interest works similar to the techniquespresented

Obviously this is a brute-force approach For each valuethat has to be calculated all preceding code has to be eval-uated We used this approach anyway since we spent quite Summary Very

inefficient limitedfunctionality stillcompletely sufficientfor the user study

some time developing the first version of the extension (us-ing V8-Node) and were under considerable time pressureThe technique we chose is simple and fast to develop Fur-thermore the prototype only had to work for the seconduser study which is the reason why the limited function-ality (loop nesting depth of 2 for example) was sufficientAnd finally since the tasks in the second user study werenot very expensive with regards to calculation costs eventhis very inefficient version of the extension ran live with-out any delays Some tests showed that it would even havebeen quick enough to operate with hundreds of iterationsmore without any delay and with some few thousand iter-ations more without dispatching the interface

44 Problems

The biggest problem we encountered were infinite loopsBecause the interface runs completely live and tries to The Halting problemevaluate the whole code while the user is editing it infiniteloops may occur while the user is typing Additionallyit might of course be possible that there actually is aninfinite loop even though the user thinks the code iscorrect Without modification the interface crashed in

32 4 Implementation

either of these cases To prevent this from happeningtwo minor modifications had to be done Firstly if wewant to calculate the number of iterations a loop doeswe check whether the counter we use for that calculationis still below a certain limit (we used 999 here becauseit proved to be sufficient and efficient enough to runlive) Now the Halting Problem for this particular loop isapproximated The problem of the infinite loop still existsthough Suppose we have the following code

var x=0var i=0while(x=10)

i++x=i

We now know that the loop does more then 999 iterationsso we do not try to calculate the value for x in the invariantSubstitutions after

infinite loops or for i within the loop for any further iterations The onlyproblem that remains is that we still try to get the value fori in the last line When we try to calculate this value wedo not yet know that there is an infinite loop somewhereabove To find this out a new global counter gets insertedand limited to 5000 iterations

var globalCounter=0var x=0var i=0while(x=10)

if(globalCounter++ gt= 5000)breaki++

i

This way we could keep the incrementation and the checkwhether there are more then 5000 iterations within onestatement (two statements are too many since after a

44 Problems 33

while without curly braces only the first statement wouldbe executed) The incrementation and test get inserted intoevery loop ahead of the position we are checking for Itwould of course have been possible to set a flag if there isan infinite loop somewhere ahead but this way we alsohad the possibility to limit the overall number of iterationsand still get a value The second user study showed that Flaw in the

implementationthis was the wrong approach A user introduced an infiniteloop in a sorting algorithm The condition for the loop wasnever violated although the array was sorted With thetechnique of assigning a value to a variable although thelimitation of iterations has been hit the sorted array wasshown in the return statement Since the user only paidattention to the output he did not see that the number ofiterations of the loop was 999 and thought that the code iscorrect

Another minor problem was that of course while the useris typing the extension tries to evaluate code that is syntac- Suppress errors so

that user does notget disturbed

tically not correct The solution is to set windowonerrorto a function that always returns true This way errorsget just suppressed and the execution stops If there is anew change in the editor the execution will start again andwork exactly the same way so that working code always isexecuted and not working code is not

The last problem concerned hiding code that does notget evaluated For loops this could easily be done If Hiding code within

failed if statementsthe loop is in its very last iteration (the invariant fails)the rest of the loop has to be hidden otherwise it doesnot Finding out if an if statement fails (and hence theaccording code has to be hidden) however was moredifficult because we had not enough information to be ableto do so (remember we only compute the first step whichincludes nothing but substituting all variables) To find outif an if statement holds or not we do the following allvariables in the invariant get substituted and then we leteval evaluate just the invariant For if(xlty) with x=5and y=6 we substitute all values if(5lt6) and then calleval((5lt6))

34 4 Implementation

Figure 41 Screenshot showing Brackets with our extension and a dummy script

Finally Figure 41 shows how the prototype for our seconduser study looks like

35

Chapter 5

Second User Study

51 Course of the User Study

To find out whether our extension brings significantimprovement a second quantitative user study with 20 Hypothesis

Compared to usualdebugging methodsour technique bringssignificantimprovement

participants has been conducted The hypothesis was thatthe debugging speed (in this case the time between the userreading the code the first time and fixing it) compared toa usual debugging engine could significantly be reducedwith our extension If this hypothesis would be supportedby the results of the user study this would indicate that theunderstanding speed of foreign code would be increasedby our method

No user had worked with Brackets or JavaScript before(notwithstanding that all of them had worked with Javabefore) Users were Computer Science students with 4-9(average of 58) years of programming experience Onlythree of them had professional experience (2-4 years) Allusers took part voluntarily and were not rewarded in anyway

To test the hypothesis two searching algorithms (animplementation of Shakersort and an implementation ofBubblesort) have been implemented and a small bug has

36 5 Second User Study

been introduced in either of them (see Appendix A) Thetask of the user was to find both bugs - one time with2 algorithms 1 bug

each 2 tasks debugone time with

Bracketsrsquo tools andone time with our

extension

our extension and one time using the traditional way(that is using the console breakpoints variable viewerweb kit inspector and the step over functionality Bracketsprovides) Each user received an intro to the function-ality of both our extension and the Brackets debuggingengine For both a small dummy script has been shownwith which it has been explained how each part of thefunctionality works until the user confirmed that he or sheunderstood every part of the functionality The task forthe user was to find the bug in the code (users have beentold that there is exactly one bug) fix it and give noticewhen he or she thought that the code is correct (task tocondition assignment and task order were counterbalancedto eliminate side effects)

After the two debugging tasks had been accomplishedeach user filled out two versions of the System UsabilitySystem Usability

Scale to find outwhether users liked

working with thesystem

Scale [Brooke 1996] (SUS) one for our extension and onefor the standard debugging Brackets provides The first10 questions are identical to the initial version that hasbeen presented in 1996 except that the word rdquocumbersomerdquoin point 8 has been changed to rdquoawkwardrdquo as suggestedby [Finstad 2006] For the extension the following ques-tions have been added to the questionnaire

bull I found the additional information very helpful formy task

bull I was distracted by the additional information

bull I found it very easy to follow the control flow of theprogram

bull I found complex control structures hard to under-stand

For the debugging using Brackets the following two ques-tions have been added

bull I found it very easy to follow the control flow of theprogram

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 32: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

26 3 First User Study

Figure 36 Interface after the first user study

27

Chapter 4

Implementation

After the first user study had been finished and the resultshad been analyzed a software prototype with sufficient A software prototype

had to beimplemented

functionality had to be implemented to conduct the seconduser study Hence we wrote an extension for Brackets1 anopen source editor that has recently been released officially

The language we wrote the extension for was JavaScriptWhat we needed was an extension that could calculate thevalues of the variables that had to been substituted

41 First Approach (discharged)

The initial idea was to write an extension that connects viaV8-Node2 (an extension that maintains a connection to a Initial idea Use

V8-Node an Nodejsto get the locals

Nodejs3 V8 debugger) to an external debugger and lets itevaluate the code Our extension was supposed to supplythe actual code to the debugger and set a breakpoint in thefirst line of actual code Then it read the locals (that is thelocal values for all active variables) and stepped over to the

1httpsgithubcomadobebrackets2httpsgithubcomDennisKehrigbrackets-v8-node-

livetreenodeChildProcess3httpnodejsorg

28 4 Implementation

next line After all locals had been collected the extensionshould substitute all necessary values There were prob-lems with this approach Firstly the extension is not stableThis extension was

not usable for a userstudy

enough to be usable in a user study Secondly it is too slowIt needs more than half a second for one evaluation Thisis way too much since we wanted something that couldevaluate the code as the user edits it without any furtherdelay (hence the name rdquoLive Codingrdquo) With this extensionit would only have been possible to update the evaluationusing a special shortcut or to attach this function to anothershortcut like the Ctrl+S shortcut for saving Since thiswas by far not the optimal way we decided to start all overagain

42 Basic Evaluation Technique

In the second approach we did not try to externalize thescript execution but instead made vast use of JavaScriptrsquosbuilt-in eval function The eval function takes a StringSecond approach

using eval toevaluate dynamicallygenerated code thatreturns the value of

interest

as parameter If this String contains valid JavaScript codeeval will evaluate the code and return the last executedstatement eval(var i=1i=i+5) will return 6 forexample The extension basically works as follows Aloop that runs over all DOM-elements in the highlightedcode searches for substitutable variables While searchingfor these elements the loop captures in which contextthe actual DOM-element is The three basic contexts aresimple substitutions substitutions within one loop andsubstitutions within a nested loop (why this is will beexplained below) But there are also other things the loopchecks for For simple assignments like i=x+y x andy have to be substituted while i does not In lines wherea loop begins the iterator has to be appended invariantshave to be treated differently et cetera

Note that right-hand sides of assignments are handledslightly different from what has been presented in theprototypes of the first user study They do not get evalu-ated to the last step and invariants do not get evaluated

43 How Values Are Being Calculated 29

to the penultimate step anymore Both of them get onlyevaluated one step For i=1 and z=2 the right-hand sideof x=i+z does not get evaluated to 3 but to 1+2 Wedecided to do it this way because some users in the firstuser study stated that they did not only want to see whatthe result of an operation is but also how it got there

When a substitutable variable has been found its value willbe computed For that the eval function is used (see sec-tion 43 for details) and the value will be inserted Thiswhole process takes place each time there is a change in thecode or if one of the parameters gets changed If another it- How the extension

reacts to changeseration of a loop that is being shown is selected all valuesinside that loop will be calculated again

43 How Values Are Being Calculated

Depending on the context of the variable a piece ofcode will be generated (and then evaluated) that whenevaluated with eval returns the value of the variable ofinterest For simple substitutions outside of any loops thisworks like follows

Suppose the following code had to be evaluated for x=0

function foo(x)var i=0var j=x+1

Then x in the third line would have to be evaluated To do How the values arecalculated for simplesubstitutions

so the following String would be generatedvar x=0var i=0xSo for simple substitutions the whole code preceding thevariable that has to be substituted is copied and each pa-rameter is initialized as var with its exemplary value at the

30 4 Implementation

beginning

For simple loops this method has to be extended First ithas to be found out how often the loop is executed ThisSubstitutions within

one loop serves two purposes First the iterator now shows thenumber of iterations of the loop Second we have to checkif the user wants to see an iteration that does not exist Thisis being done by introducing a counter that is incrementedin each iteration of the loop and then returned when theloop terminated The following example shows how thevalue of a variable within a loop is being calculated

function test()var i=0var j=0while (ilt5)

i++j=j+2

Suppose we want to obtain the value for j (highlightedin red) in the second iteration of the loop Then theevaluation basically works as follows

var myCounter=0var myResultvar i=0var j=0while (ilt5)

myCounter++i++if(myCounter==2)myResult=j breakj=j+2

myResult

A counter gets inserted to check which iteration the loop is

44 Problems 31

in If it is in the right iteration the assignment myResult=jtakes place so that eval can return the according valueFor a loop within a loop two counters have to be inserted Other casesand both of them have to be checked accordingly Thereare further variants of contexts like the different parts of afor(firstPartsecondPartthirdPart) statementThe first part gets only executed once at the beginningwhile the second part gets evaluated one time more thanthe rest of the loop (the third part does not get evaluated)For all cases the dynamic generation of the code that re-turns the value of interest works similar to the techniquespresented

Obviously this is a brute-force approach For each valuethat has to be calculated all preceding code has to be eval-uated We used this approach anyway since we spent quite Summary Very

inefficient limitedfunctionality stillcompletely sufficientfor the user study

some time developing the first version of the extension (us-ing V8-Node) and were under considerable time pressureThe technique we chose is simple and fast to develop Fur-thermore the prototype only had to work for the seconduser study which is the reason why the limited function-ality (loop nesting depth of 2 for example) was sufficientAnd finally since the tasks in the second user study werenot very expensive with regards to calculation costs eventhis very inefficient version of the extension ran live with-out any delays Some tests showed that it would even havebeen quick enough to operate with hundreds of iterationsmore without any delay and with some few thousand iter-ations more without dispatching the interface

44 Problems

The biggest problem we encountered were infinite loopsBecause the interface runs completely live and tries to The Halting problemevaluate the whole code while the user is editing it infiniteloops may occur while the user is typing Additionallyit might of course be possible that there actually is aninfinite loop even though the user thinks the code iscorrect Without modification the interface crashed in

32 4 Implementation

either of these cases To prevent this from happeningtwo minor modifications had to be done Firstly if wewant to calculate the number of iterations a loop doeswe check whether the counter we use for that calculationis still below a certain limit (we used 999 here becauseit proved to be sufficient and efficient enough to runlive) Now the Halting Problem for this particular loop isapproximated The problem of the infinite loop still existsthough Suppose we have the following code

var x=0var i=0while(x=10)

i++x=i

We now know that the loop does more then 999 iterationsso we do not try to calculate the value for x in the invariantSubstitutions after

infinite loops or for i within the loop for any further iterations The onlyproblem that remains is that we still try to get the value fori in the last line When we try to calculate this value wedo not yet know that there is an infinite loop somewhereabove To find this out a new global counter gets insertedand limited to 5000 iterations

var globalCounter=0var x=0var i=0while(x=10)

if(globalCounter++ gt= 5000)breaki++

i

This way we could keep the incrementation and the checkwhether there are more then 5000 iterations within onestatement (two statements are too many since after a

44 Problems 33

while without curly braces only the first statement wouldbe executed) The incrementation and test get inserted intoevery loop ahead of the position we are checking for Itwould of course have been possible to set a flag if there isan infinite loop somewhere ahead but this way we alsohad the possibility to limit the overall number of iterationsand still get a value The second user study showed that Flaw in the

implementationthis was the wrong approach A user introduced an infiniteloop in a sorting algorithm The condition for the loop wasnever violated although the array was sorted With thetechnique of assigning a value to a variable although thelimitation of iterations has been hit the sorted array wasshown in the return statement Since the user only paidattention to the output he did not see that the number ofiterations of the loop was 999 and thought that the code iscorrect

Another minor problem was that of course while the useris typing the extension tries to evaluate code that is syntac- Suppress errors so

that user does notget disturbed

tically not correct The solution is to set windowonerrorto a function that always returns true This way errorsget just suppressed and the execution stops If there is anew change in the editor the execution will start again andwork exactly the same way so that working code always isexecuted and not working code is not

The last problem concerned hiding code that does notget evaluated For loops this could easily be done If Hiding code within

failed if statementsthe loop is in its very last iteration (the invariant fails)the rest of the loop has to be hidden otherwise it doesnot Finding out if an if statement fails (and hence theaccording code has to be hidden) however was moredifficult because we had not enough information to be ableto do so (remember we only compute the first step whichincludes nothing but substituting all variables) To find outif an if statement holds or not we do the following allvariables in the invariant get substituted and then we leteval evaluate just the invariant For if(xlty) with x=5and y=6 we substitute all values if(5lt6) and then calleval((5lt6))

34 4 Implementation

Figure 41 Screenshot showing Brackets with our extension and a dummy script

Finally Figure 41 shows how the prototype for our seconduser study looks like

35

Chapter 5

Second User Study

51 Course of the User Study

To find out whether our extension brings significantimprovement a second quantitative user study with 20 Hypothesis

Compared to usualdebugging methodsour technique bringssignificantimprovement

participants has been conducted The hypothesis was thatthe debugging speed (in this case the time between the userreading the code the first time and fixing it) compared toa usual debugging engine could significantly be reducedwith our extension If this hypothesis would be supportedby the results of the user study this would indicate that theunderstanding speed of foreign code would be increasedby our method

No user had worked with Brackets or JavaScript before(notwithstanding that all of them had worked with Javabefore) Users were Computer Science students with 4-9(average of 58) years of programming experience Onlythree of them had professional experience (2-4 years) Allusers took part voluntarily and were not rewarded in anyway

To test the hypothesis two searching algorithms (animplementation of Shakersort and an implementation ofBubblesort) have been implemented and a small bug has

36 5 Second User Study

been introduced in either of them (see Appendix A) Thetask of the user was to find both bugs - one time with2 algorithms 1 bug

each 2 tasks debugone time with

Bracketsrsquo tools andone time with our

extension

our extension and one time using the traditional way(that is using the console breakpoints variable viewerweb kit inspector and the step over functionality Bracketsprovides) Each user received an intro to the function-ality of both our extension and the Brackets debuggingengine For both a small dummy script has been shownwith which it has been explained how each part of thefunctionality works until the user confirmed that he or sheunderstood every part of the functionality The task forthe user was to find the bug in the code (users have beentold that there is exactly one bug) fix it and give noticewhen he or she thought that the code is correct (task tocondition assignment and task order were counterbalancedto eliminate side effects)

After the two debugging tasks had been accomplishedeach user filled out two versions of the System UsabilitySystem Usability

Scale to find outwhether users liked

working with thesystem

Scale [Brooke 1996] (SUS) one for our extension and onefor the standard debugging Brackets provides The first10 questions are identical to the initial version that hasbeen presented in 1996 except that the word rdquocumbersomerdquoin point 8 has been changed to rdquoawkwardrdquo as suggestedby [Finstad 2006] For the extension the following ques-tions have been added to the questionnaire

bull I found the additional information very helpful formy task

bull I was distracted by the additional information

bull I found it very easy to follow the control flow of theprogram

bull I found complex control structures hard to under-stand

For the debugging using Brackets the following two ques-tions have been added

bull I found it very easy to follow the control flow of theprogram

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 33: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

27

Chapter 4

Implementation

After the first user study had been finished and the resultshad been analyzed a software prototype with sufficient A software prototype

had to beimplemented

functionality had to be implemented to conduct the seconduser study Hence we wrote an extension for Brackets1 anopen source editor that has recently been released officially

The language we wrote the extension for was JavaScriptWhat we needed was an extension that could calculate thevalues of the variables that had to been substituted

41 First Approach (discharged)

The initial idea was to write an extension that connects viaV8-Node2 (an extension that maintains a connection to a Initial idea Use

V8-Node an Nodejsto get the locals

Nodejs3 V8 debugger) to an external debugger and lets itevaluate the code Our extension was supposed to supplythe actual code to the debugger and set a breakpoint in thefirst line of actual code Then it read the locals (that is thelocal values for all active variables) and stepped over to the

1httpsgithubcomadobebrackets2httpsgithubcomDennisKehrigbrackets-v8-node-

livetreenodeChildProcess3httpnodejsorg

28 4 Implementation

next line After all locals had been collected the extensionshould substitute all necessary values There were prob-lems with this approach Firstly the extension is not stableThis extension was

not usable for a userstudy

enough to be usable in a user study Secondly it is too slowIt needs more than half a second for one evaluation Thisis way too much since we wanted something that couldevaluate the code as the user edits it without any furtherdelay (hence the name rdquoLive Codingrdquo) With this extensionit would only have been possible to update the evaluationusing a special shortcut or to attach this function to anothershortcut like the Ctrl+S shortcut for saving Since thiswas by far not the optimal way we decided to start all overagain

42 Basic Evaluation Technique

In the second approach we did not try to externalize thescript execution but instead made vast use of JavaScriptrsquosbuilt-in eval function The eval function takes a StringSecond approach

using eval toevaluate dynamicallygenerated code thatreturns the value of

interest

as parameter If this String contains valid JavaScript codeeval will evaluate the code and return the last executedstatement eval(var i=1i=i+5) will return 6 forexample The extension basically works as follows Aloop that runs over all DOM-elements in the highlightedcode searches for substitutable variables While searchingfor these elements the loop captures in which contextthe actual DOM-element is The three basic contexts aresimple substitutions substitutions within one loop andsubstitutions within a nested loop (why this is will beexplained below) But there are also other things the loopchecks for For simple assignments like i=x+y x andy have to be substituted while i does not In lines wherea loop begins the iterator has to be appended invariantshave to be treated differently et cetera

Note that right-hand sides of assignments are handledslightly different from what has been presented in theprototypes of the first user study They do not get evalu-ated to the last step and invariants do not get evaluated

43 How Values Are Being Calculated 29

to the penultimate step anymore Both of them get onlyevaluated one step For i=1 and z=2 the right-hand sideof x=i+z does not get evaluated to 3 but to 1+2 Wedecided to do it this way because some users in the firstuser study stated that they did not only want to see whatthe result of an operation is but also how it got there

When a substitutable variable has been found its value willbe computed For that the eval function is used (see sec-tion 43 for details) and the value will be inserted Thiswhole process takes place each time there is a change in thecode or if one of the parameters gets changed If another it- How the extension

reacts to changeseration of a loop that is being shown is selected all valuesinside that loop will be calculated again

43 How Values Are Being Calculated

Depending on the context of the variable a piece ofcode will be generated (and then evaluated) that whenevaluated with eval returns the value of the variable ofinterest For simple substitutions outside of any loops thisworks like follows

Suppose the following code had to be evaluated for x=0

function foo(x)var i=0var j=x+1

Then x in the third line would have to be evaluated To do How the values arecalculated for simplesubstitutions

so the following String would be generatedvar x=0var i=0xSo for simple substitutions the whole code preceding thevariable that has to be substituted is copied and each pa-rameter is initialized as var with its exemplary value at the

30 4 Implementation

beginning

For simple loops this method has to be extended First ithas to be found out how often the loop is executed ThisSubstitutions within

one loop serves two purposes First the iterator now shows thenumber of iterations of the loop Second we have to checkif the user wants to see an iteration that does not exist Thisis being done by introducing a counter that is incrementedin each iteration of the loop and then returned when theloop terminated The following example shows how thevalue of a variable within a loop is being calculated

function test()var i=0var j=0while (ilt5)

i++j=j+2

Suppose we want to obtain the value for j (highlightedin red) in the second iteration of the loop Then theevaluation basically works as follows

var myCounter=0var myResultvar i=0var j=0while (ilt5)

myCounter++i++if(myCounter==2)myResult=j breakj=j+2

myResult

A counter gets inserted to check which iteration the loop is

44 Problems 31

in If it is in the right iteration the assignment myResult=jtakes place so that eval can return the according valueFor a loop within a loop two counters have to be inserted Other casesand both of them have to be checked accordingly Thereare further variants of contexts like the different parts of afor(firstPartsecondPartthirdPart) statementThe first part gets only executed once at the beginningwhile the second part gets evaluated one time more thanthe rest of the loop (the third part does not get evaluated)For all cases the dynamic generation of the code that re-turns the value of interest works similar to the techniquespresented

Obviously this is a brute-force approach For each valuethat has to be calculated all preceding code has to be eval-uated We used this approach anyway since we spent quite Summary Very

inefficient limitedfunctionality stillcompletely sufficientfor the user study

some time developing the first version of the extension (us-ing V8-Node) and were under considerable time pressureThe technique we chose is simple and fast to develop Fur-thermore the prototype only had to work for the seconduser study which is the reason why the limited function-ality (loop nesting depth of 2 for example) was sufficientAnd finally since the tasks in the second user study werenot very expensive with regards to calculation costs eventhis very inefficient version of the extension ran live with-out any delays Some tests showed that it would even havebeen quick enough to operate with hundreds of iterationsmore without any delay and with some few thousand iter-ations more without dispatching the interface

44 Problems

The biggest problem we encountered were infinite loopsBecause the interface runs completely live and tries to The Halting problemevaluate the whole code while the user is editing it infiniteloops may occur while the user is typing Additionallyit might of course be possible that there actually is aninfinite loop even though the user thinks the code iscorrect Without modification the interface crashed in

32 4 Implementation

either of these cases To prevent this from happeningtwo minor modifications had to be done Firstly if wewant to calculate the number of iterations a loop doeswe check whether the counter we use for that calculationis still below a certain limit (we used 999 here becauseit proved to be sufficient and efficient enough to runlive) Now the Halting Problem for this particular loop isapproximated The problem of the infinite loop still existsthough Suppose we have the following code

var x=0var i=0while(x=10)

i++x=i

We now know that the loop does more then 999 iterationsso we do not try to calculate the value for x in the invariantSubstitutions after

infinite loops or for i within the loop for any further iterations The onlyproblem that remains is that we still try to get the value fori in the last line When we try to calculate this value wedo not yet know that there is an infinite loop somewhereabove To find this out a new global counter gets insertedand limited to 5000 iterations

var globalCounter=0var x=0var i=0while(x=10)

if(globalCounter++ gt= 5000)breaki++

i

This way we could keep the incrementation and the checkwhether there are more then 5000 iterations within onestatement (two statements are too many since after a

44 Problems 33

while without curly braces only the first statement wouldbe executed) The incrementation and test get inserted intoevery loop ahead of the position we are checking for Itwould of course have been possible to set a flag if there isan infinite loop somewhere ahead but this way we alsohad the possibility to limit the overall number of iterationsand still get a value The second user study showed that Flaw in the

implementationthis was the wrong approach A user introduced an infiniteloop in a sorting algorithm The condition for the loop wasnever violated although the array was sorted With thetechnique of assigning a value to a variable although thelimitation of iterations has been hit the sorted array wasshown in the return statement Since the user only paidattention to the output he did not see that the number ofiterations of the loop was 999 and thought that the code iscorrect

Another minor problem was that of course while the useris typing the extension tries to evaluate code that is syntac- Suppress errors so

that user does notget disturbed

tically not correct The solution is to set windowonerrorto a function that always returns true This way errorsget just suppressed and the execution stops If there is anew change in the editor the execution will start again andwork exactly the same way so that working code always isexecuted and not working code is not

The last problem concerned hiding code that does notget evaluated For loops this could easily be done If Hiding code within

failed if statementsthe loop is in its very last iteration (the invariant fails)the rest of the loop has to be hidden otherwise it doesnot Finding out if an if statement fails (and hence theaccording code has to be hidden) however was moredifficult because we had not enough information to be ableto do so (remember we only compute the first step whichincludes nothing but substituting all variables) To find outif an if statement holds or not we do the following allvariables in the invariant get substituted and then we leteval evaluate just the invariant For if(xlty) with x=5and y=6 we substitute all values if(5lt6) and then calleval((5lt6))

34 4 Implementation

Figure 41 Screenshot showing Brackets with our extension and a dummy script

Finally Figure 41 shows how the prototype for our seconduser study looks like

35

Chapter 5

Second User Study

51 Course of the User Study

To find out whether our extension brings significantimprovement a second quantitative user study with 20 Hypothesis

Compared to usualdebugging methodsour technique bringssignificantimprovement

participants has been conducted The hypothesis was thatthe debugging speed (in this case the time between the userreading the code the first time and fixing it) compared toa usual debugging engine could significantly be reducedwith our extension If this hypothesis would be supportedby the results of the user study this would indicate that theunderstanding speed of foreign code would be increasedby our method

No user had worked with Brackets or JavaScript before(notwithstanding that all of them had worked with Javabefore) Users were Computer Science students with 4-9(average of 58) years of programming experience Onlythree of them had professional experience (2-4 years) Allusers took part voluntarily and were not rewarded in anyway

To test the hypothesis two searching algorithms (animplementation of Shakersort and an implementation ofBubblesort) have been implemented and a small bug has

36 5 Second User Study

been introduced in either of them (see Appendix A) Thetask of the user was to find both bugs - one time with2 algorithms 1 bug

each 2 tasks debugone time with

Bracketsrsquo tools andone time with our

extension

our extension and one time using the traditional way(that is using the console breakpoints variable viewerweb kit inspector and the step over functionality Bracketsprovides) Each user received an intro to the function-ality of both our extension and the Brackets debuggingengine For both a small dummy script has been shownwith which it has been explained how each part of thefunctionality works until the user confirmed that he or sheunderstood every part of the functionality The task forthe user was to find the bug in the code (users have beentold that there is exactly one bug) fix it and give noticewhen he or she thought that the code is correct (task tocondition assignment and task order were counterbalancedto eliminate side effects)

After the two debugging tasks had been accomplishedeach user filled out two versions of the System UsabilitySystem Usability

Scale to find outwhether users liked

working with thesystem

Scale [Brooke 1996] (SUS) one for our extension and onefor the standard debugging Brackets provides The first10 questions are identical to the initial version that hasbeen presented in 1996 except that the word rdquocumbersomerdquoin point 8 has been changed to rdquoawkwardrdquo as suggestedby [Finstad 2006] For the extension the following ques-tions have been added to the questionnaire

bull I found the additional information very helpful formy task

bull I was distracted by the additional information

bull I found it very easy to follow the control flow of theprogram

bull I found complex control structures hard to under-stand

For the debugging using Brackets the following two ques-tions have been added

bull I found it very easy to follow the control flow of theprogram

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 34: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

28 4 Implementation

next line After all locals had been collected the extensionshould substitute all necessary values There were prob-lems with this approach Firstly the extension is not stableThis extension was

not usable for a userstudy

enough to be usable in a user study Secondly it is too slowIt needs more than half a second for one evaluation Thisis way too much since we wanted something that couldevaluate the code as the user edits it without any furtherdelay (hence the name rdquoLive Codingrdquo) With this extensionit would only have been possible to update the evaluationusing a special shortcut or to attach this function to anothershortcut like the Ctrl+S shortcut for saving Since thiswas by far not the optimal way we decided to start all overagain

42 Basic Evaluation Technique

In the second approach we did not try to externalize thescript execution but instead made vast use of JavaScriptrsquosbuilt-in eval function The eval function takes a StringSecond approach

using eval toevaluate dynamicallygenerated code thatreturns the value of

interest

as parameter If this String contains valid JavaScript codeeval will evaluate the code and return the last executedstatement eval(var i=1i=i+5) will return 6 forexample The extension basically works as follows Aloop that runs over all DOM-elements in the highlightedcode searches for substitutable variables While searchingfor these elements the loop captures in which contextthe actual DOM-element is The three basic contexts aresimple substitutions substitutions within one loop andsubstitutions within a nested loop (why this is will beexplained below) But there are also other things the loopchecks for For simple assignments like i=x+y x andy have to be substituted while i does not In lines wherea loop begins the iterator has to be appended invariantshave to be treated differently et cetera

Note that right-hand sides of assignments are handledslightly different from what has been presented in theprototypes of the first user study They do not get evalu-ated to the last step and invariants do not get evaluated

43 How Values Are Being Calculated 29

to the penultimate step anymore Both of them get onlyevaluated one step For i=1 and z=2 the right-hand sideof x=i+z does not get evaluated to 3 but to 1+2 Wedecided to do it this way because some users in the firstuser study stated that they did not only want to see whatthe result of an operation is but also how it got there

When a substitutable variable has been found its value willbe computed For that the eval function is used (see sec-tion 43 for details) and the value will be inserted Thiswhole process takes place each time there is a change in thecode or if one of the parameters gets changed If another it- How the extension

reacts to changeseration of a loop that is being shown is selected all valuesinside that loop will be calculated again

43 How Values Are Being Calculated

Depending on the context of the variable a piece ofcode will be generated (and then evaluated) that whenevaluated with eval returns the value of the variable ofinterest For simple substitutions outside of any loops thisworks like follows

Suppose the following code had to be evaluated for x=0

function foo(x)var i=0var j=x+1

Then x in the third line would have to be evaluated To do How the values arecalculated for simplesubstitutions

so the following String would be generatedvar x=0var i=0xSo for simple substitutions the whole code preceding thevariable that has to be substituted is copied and each pa-rameter is initialized as var with its exemplary value at the

30 4 Implementation

beginning

For simple loops this method has to be extended First ithas to be found out how often the loop is executed ThisSubstitutions within

one loop serves two purposes First the iterator now shows thenumber of iterations of the loop Second we have to checkif the user wants to see an iteration that does not exist Thisis being done by introducing a counter that is incrementedin each iteration of the loop and then returned when theloop terminated The following example shows how thevalue of a variable within a loop is being calculated

function test()var i=0var j=0while (ilt5)

i++j=j+2

Suppose we want to obtain the value for j (highlightedin red) in the second iteration of the loop Then theevaluation basically works as follows

var myCounter=0var myResultvar i=0var j=0while (ilt5)

myCounter++i++if(myCounter==2)myResult=j breakj=j+2

myResult

A counter gets inserted to check which iteration the loop is

44 Problems 31

in If it is in the right iteration the assignment myResult=jtakes place so that eval can return the according valueFor a loop within a loop two counters have to be inserted Other casesand both of them have to be checked accordingly Thereare further variants of contexts like the different parts of afor(firstPartsecondPartthirdPart) statementThe first part gets only executed once at the beginningwhile the second part gets evaluated one time more thanthe rest of the loop (the third part does not get evaluated)For all cases the dynamic generation of the code that re-turns the value of interest works similar to the techniquespresented

Obviously this is a brute-force approach For each valuethat has to be calculated all preceding code has to be eval-uated We used this approach anyway since we spent quite Summary Very

inefficient limitedfunctionality stillcompletely sufficientfor the user study

some time developing the first version of the extension (us-ing V8-Node) and were under considerable time pressureThe technique we chose is simple and fast to develop Fur-thermore the prototype only had to work for the seconduser study which is the reason why the limited function-ality (loop nesting depth of 2 for example) was sufficientAnd finally since the tasks in the second user study werenot very expensive with regards to calculation costs eventhis very inefficient version of the extension ran live with-out any delays Some tests showed that it would even havebeen quick enough to operate with hundreds of iterationsmore without any delay and with some few thousand iter-ations more without dispatching the interface

44 Problems

The biggest problem we encountered were infinite loopsBecause the interface runs completely live and tries to The Halting problemevaluate the whole code while the user is editing it infiniteloops may occur while the user is typing Additionallyit might of course be possible that there actually is aninfinite loop even though the user thinks the code iscorrect Without modification the interface crashed in

32 4 Implementation

either of these cases To prevent this from happeningtwo minor modifications had to be done Firstly if wewant to calculate the number of iterations a loop doeswe check whether the counter we use for that calculationis still below a certain limit (we used 999 here becauseit proved to be sufficient and efficient enough to runlive) Now the Halting Problem for this particular loop isapproximated The problem of the infinite loop still existsthough Suppose we have the following code

var x=0var i=0while(x=10)

i++x=i

We now know that the loop does more then 999 iterationsso we do not try to calculate the value for x in the invariantSubstitutions after

infinite loops or for i within the loop for any further iterations The onlyproblem that remains is that we still try to get the value fori in the last line When we try to calculate this value wedo not yet know that there is an infinite loop somewhereabove To find this out a new global counter gets insertedand limited to 5000 iterations

var globalCounter=0var x=0var i=0while(x=10)

if(globalCounter++ gt= 5000)breaki++

i

This way we could keep the incrementation and the checkwhether there are more then 5000 iterations within onestatement (two statements are too many since after a

44 Problems 33

while without curly braces only the first statement wouldbe executed) The incrementation and test get inserted intoevery loop ahead of the position we are checking for Itwould of course have been possible to set a flag if there isan infinite loop somewhere ahead but this way we alsohad the possibility to limit the overall number of iterationsand still get a value The second user study showed that Flaw in the

implementationthis was the wrong approach A user introduced an infiniteloop in a sorting algorithm The condition for the loop wasnever violated although the array was sorted With thetechnique of assigning a value to a variable although thelimitation of iterations has been hit the sorted array wasshown in the return statement Since the user only paidattention to the output he did not see that the number ofiterations of the loop was 999 and thought that the code iscorrect

Another minor problem was that of course while the useris typing the extension tries to evaluate code that is syntac- Suppress errors so

that user does notget disturbed

tically not correct The solution is to set windowonerrorto a function that always returns true This way errorsget just suppressed and the execution stops If there is anew change in the editor the execution will start again andwork exactly the same way so that working code always isexecuted and not working code is not

The last problem concerned hiding code that does notget evaluated For loops this could easily be done If Hiding code within

failed if statementsthe loop is in its very last iteration (the invariant fails)the rest of the loop has to be hidden otherwise it doesnot Finding out if an if statement fails (and hence theaccording code has to be hidden) however was moredifficult because we had not enough information to be ableto do so (remember we only compute the first step whichincludes nothing but substituting all variables) To find outif an if statement holds or not we do the following allvariables in the invariant get substituted and then we leteval evaluate just the invariant For if(xlty) with x=5and y=6 we substitute all values if(5lt6) and then calleval((5lt6))

34 4 Implementation

Figure 41 Screenshot showing Brackets with our extension and a dummy script

Finally Figure 41 shows how the prototype for our seconduser study looks like

35

Chapter 5

Second User Study

51 Course of the User Study

To find out whether our extension brings significantimprovement a second quantitative user study with 20 Hypothesis

Compared to usualdebugging methodsour technique bringssignificantimprovement

participants has been conducted The hypothesis was thatthe debugging speed (in this case the time between the userreading the code the first time and fixing it) compared toa usual debugging engine could significantly be reducedwith our extension If this hypothesis would be supportedby the results of the user study this would indicate that theunderstanding speed of foreign code would be increasedby our method

No user had worked with Brackets or JavaScript before(notwithstanding that all of them had worked with Javabefore) Users were Computer Science students with 4-9(average of 58) years of programming experience Onlythree of them had professional experience (2-4 years) Allusers took part voluntarily and were not rewarded in anyway

To test the hypothesis two searching algorithms (animplementation of Shakersort and an implementation ofBubblesort) have been implemented and a small bug has

36 5 Second User Study

been introduced in either of them (see Appendix A) Thetask of the user was to find both bugs - one time with2 algorithms 1 bug

each 2 tasks debugone time with

Bracketsrsquo tools andone time with our

extension

our extension and one time using the traditional way(that is using the console breakpoints variable viewerweb kit inspector and the step over functionality Bracketsprovides) Each user received an intro to the function-ality of both our extension and the Brackets debuggingengine For both a small dummy script has been shownwith which it has been explained how each part of thefunctionality works until the user confirmed that he or sheunderstood every part of the functionality The task forthe user was to find the bug in the code (users have beentold that there is exactly one bug) fix it and give noticewhen he or she thought that the code is correct (task tocondition assignment and task order were counterbalancedto eliminate side effects)

After the two debugging tasks had been accomplishedeach user filled out two versions of the System UsabilitySystem Usability

Scale to find outwhether users liked

working with thesystem

Scale [Brooke 1996] (SUS) one for our extension and onefor the standard debugging Brackets provides The first10 questions are identical to the initial version that hasbeen presented in 1996 except that the word rdquocumbersomerdquoin point 8 has been changed to rdquoawkwardrdquo as suggestedby [Finstad 2006] For the extension the following ques-tions have been added to the questionnaire

bull I found the additional information very helpful formy task

bull I was distracted by the additional information

bull I found it very easy to follow the control flow of theprogram

bull I found complex control structures hard to under-stand

For the debugging using Brackets the following two ques-tions have been added

bull I found it very easy to follow the control flow of theprogram

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 35: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

43 How Values Are Being Calculated 29

to the penultimate step anymore Both of them get onlyevaluated one step For i=1 and z=2 the right-hand sideof x=i+z does not get evaluated to 3 but to 1+2 Wedecided to do it this way because some users in the firstuser study stated that they did not only want to see whatthe result of an operation is but also how it got there

When a substitutable variable has been found its value willbe computed For that the eval function is used (see sec-tion 43 for details) and the value will be inserted Thiswhole process takes place each time there is a change in thecode or if one of the parameters gets changed If another it- How the extension

reacts to changeseration of a loop that is being shown is selected all valuesinside that loop will be calculated again

43 How Values Are Being Calculated

Depending on the context of the variable a piece ofcode will be generated (and then evaluated) that whenevaluated with eval returns the value of the variable ofinterest For simple substitutions outside of any loops thisworks like follows

Suppose the following code had to be evaluated for x=0

function foo(x)var i=0var j=x+1

Then x in the third line would have to be evaluated To do How the values arecalculated for simplesubstitutions

so the following String would be generatedvar x=0var i=0xSo for simple substitutions the whole code preceding thevariable that has to be substituted is copied and each pa-rameter is initialized as var with its exemplary value at the

30 4 Implementation

beginning

For simple loops this method has to be extended First ithas to be found out how often the loop is executed ThisSubstitutions within

one loop serves two purposes First the iterator now shows thenumber of iterations of the loop Second we have to checkif the user wants to see an iteration that does not exist Thisis being done by introducing a counter that is incrementedin each iteration of the loop and then returned when theloop terminated The following example shows how thevalue of a variable within a loop is being calculated

function test()var i=0var j=0while (ilt5)

i++j=j+2

Suppose we want to obtain the value for j (highlightedin red) in the second iteration of the loop Then theevaluation basically works as follows

var myCounter=0var myResultvar i=0var j=0while (ilt5)

myCounter++i++if(myCounter==2)myResult=j breakj=j+2

myResult

A counter gets inserted to check which iteration the loop is

44 Problems 31

in If it is in the right iteration the assignment myResult=jtakes place so that eval can return the according valueFor a loop within a loop two counters have to be inserted Other casesand both of them have to be checked accordingly Thereare further variants of contexts like the different parts of afor(firstPartsecondPartthirdPart) statementThe first part gets only executed once at the beginningwhile the second part gets evaluated one time more thanthe rest of the loop (the third part does not get evaluated)For all cases the dynamic generation of the code that re-turns the value of interest works similar to the techniquespresented

Obviously this is a brute-force approach For each valuethat has to be calculated all preceding code has to be eval-uated We used this approach anyway since we spent quite Summary Very

inefficient limitedfunctionality stillcompletely sufficientfor the user study

some time developing the first version of the extension (us-ing V8-Node) and were under considerable time pressureThe technique we chose is simple and fast to develop Fur-thermore the prototype only had to work for the seconduser study which is the reason why the limited function-ality (loop nesting depth of 2 for example) was sufficientAnd finally since the tasks in the second user study werenot very expensive with regards to calculation costs eventhis very inefficient version of the extension ran live with-out any delays Some tests showed that it would even havebeen quick enough to operate with hundreds of iterationsmore without any delay and with some few thousand iter-ations more without dispatching the interface

44 Problems

The biggest problem we encountered were infinite loopsBecause the interface runs completely live and tries to The Halting problemevaluate the whole code while the user is editing it infiniteloops may occur while the user is typing Additionallyit might of course be possible that there actually is aninfinite loop even though the user thinks the code iscorrect Without modification the interface crashed in

32 4 Implementation

either of these cases To prevent this from happeningtwo minor modifications had to be done Firstly if wewant to calculate the number of iterations a loop doeswe check whether the counter we use for that calculationis still below a certain limit (we used 999 here becauseit proved to be sufficient and efficient enough to runlive) Now the Halting Problem for this particular loop isapproximated The problem of the infinite loop still existsthough Suppose we have the following code

var x=0var i=0while(x=10)

i++x=i

We now know that the loop does more then 999 iterationsso we do not try to calculate the value for x in the invariantSubstitutions after

infinite loops or for i within the loop for any further iterations The onlyproblem that remains is that we still try to get the value fori in the last line When we try to calculate this value wedo not yet know that there is an infinite loop somewhereabove To find this out a new global counter gets insertedand limited to 5000 iterations

var globalCounter=0var x=0var i=0while(x=10)

if(globalCounter++ gt= 5000)breaki++

i

This way we could keep the incrementation and the checkwhether there are more then 5000 iterations within onestatement (two statements are too many since after a

44 Problems 33

while without curly braces only the first statement wouldbe executed) The incrementation and test get inserted intoevery loop ahead of the position we are checking for Itwould of course have been possible to set a flag if there isan infinite loop somewhere ahead but this way we alsohad the possibility to limit the overall number of iterationsand still get a value The second user study showed that Flaw in the

implementationthis was the wrong approach A user introduced an infiniteloop in a sorting algorithm The condition for the loop wasnever violated although the array was sorted With thetechnique of assigning a value to a variable although thelimitation of iterations has been hit the sorted array wasshown in the return statement Since the user only paidattention to the output he did not see that the number ofiterations of the loop was 999 and thought that the code iscorrect

Another minor problem was that of course while the useris typing the extension tries to evaluate code that is syntac- Suppress errors so

that user does notget disturbed

tically not correct The solution is to set windowonerrorto a function that always returns true This way errorsget just suppressed and the execution stops If there is anew change in the editor the execution will start again andwork exactly the same way so that working code always isexecuted and not working code is not

The last problem concerned hiding code that does notget evaluated For loops this could easily be done If Hiding code within

failed if statementsthe loop is in its very last iteration (the invariant fails)the rest of the loop has to be hidden otherwise it doesnot Finding out if an if statement fails (and hence theaccording code has to be hidden) however was moredifficult because we had not enough information to be ableto do so (remember we only compute the first step whichincludes nothing but substituting all variables) To find outif an if statement holds or not we do the following allvariables in the invariant get substituted and then we leteval evaluate just the invariant For if(xlty) with x=5and y=6 we substitute all values if(5lt6) and then calleval((5lt6))

34 4 Implementation

Figure 41 Screenshot showing Brackets with our extension and a dummy script

Finally Figure 41 shows how the prototype for our seconduser study looks like

35

Chapter 5

Second User Study

51 Course of the User Study

To find out whether our extension brings significantimprovement a second quantitative user study with 20 Hypothesis

Compared to usualdebugging methodsour technique bringssignificantimprovement

participants has been conducted The hypothesis was thatthe debugging speed (in this case the time between the userreading the code the first time and fixing it) compared toa usual debugging engine could significantly be reducedwith our extension If this hypothesis would be supportedby the results of the user study this would indicate that theunderstanding speed of foreign code would be increasedby our method

No user had worked with Brackets or JavaScript before(notwithstanding that all of them had worked with Javabefore) Users were Computer Science students with 4-9(average of 58) years of programming experience Onlythree of them had professional experience (2-4 years) Allusers took part voluntarily and were not rewarded in anyway

To test the hypothesis two searching algorithms (animplementation of Shakersort and an implementation ofBubblesort) have been implemented and a small bug has

36 5 Second User Study

been introduced in either of them (see Appendix A) Thetask of the user was to find both bugs - one time with2 algorithms 1 bug

each 2 tasks debugone time with

Bracketsrsquo tools andone time with our

extension

our extension and one time using the traditional way(that is using the console breakpoints variable viewerweb kit inspector and the step over functionality Bracketsprovides) Each user received an intro to the function-ality of both our extension and the Brackets debuggingengine For both a small dummy script has been shownwith which it has been explained how each part of thefunctionality works until the user confirmed that he or sheunderstood every part of the functionality The task forthe user was to find the bug in the code (users have beentold that there is exactly one bug) fix it and give noticewhen he or she thought that the code is correct (task tocondition assignment and task order were counterbalancedto eliminate side effects)

After the two debugging tasks had been accomplishedeach user filled out two versions of the System UsabilitySystem Usability

Scale to find outwhether users liked

working with thesystem

Scale [Brooke 1996] (SUS) one for our extension and onefor the standard debugging Brackets provides The first10 questions are identical to the initial version that hasbeen presented in 1996 except that the word rdquocumbersomerdquoin point 8 has been changed to rdquoawkwardrdquo as suggestedby [Finstad 2006] For the extension the following ques-tions have been added to the questionnaire

bull I found the additional information very helpful formy task

bull I was distracted by the additional information

bull I found it very easy to follow the control flow of theprogram

bull I found complex control structures hard to under-stand

For the debugging using Brackets the following two ques-tions have been added

bull I found it very easy to follow the control flow of theprogram

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 36: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

30 4 Implementation

beginning

For simple loops this method has to be extended First ithas to be found out how often the loop is executed ThisSubstitutions within

one loop serves two purposes First the iterator now shows thenumber of iterations of the loop Second we have to checkif the user wants to see an iteration that does not exist Thisis being done by introducing a counter that is incrementedin each iteration of the loop and then returned when theloop terminated The following example shows how thevalue of a variable within a loop is being calculated

function test()var i=0var j=0while (ilt5)

i++j=j+2

Suppose we want to obtain the value for j (highlightedin red) in the second iteration of the loop Then theevaluation basically works as follows

var myCounter=0var myResultvar i=0var j=0while (ilt5)

myCounter++i++if(myCounter==2)myResult=j breakj=j+2

myResult

A counter gets inserted to check which iteration the loop is

44 Problems 31

in If it is in the right iteration the assignment myResult=jtakes place so that eval can return the according valueFor a loop within a loop two counters have to be inserted Other casesand both of them have to be checked accordingly Thereare further variants of contexts like the different parts of afor(firstPartsecondPartthirdPart) statementThe first part gets only executed once at the beginningwhile the second part gets evaluated one time more thanthe rest of the loop (the third part does not get evaluated)For all cases the dynamic generation of the code that re-turns the value of interest works similar to the techniquespresented

Obviously this is a brute-force approach For each valuethat has to be calculated all preceding code has to be eval-uated We used this approach anyway since we spent quite Summary Very

inefficient limitedfunctionality stillcompletely sufficientfor the user study

some time developing the first version of the extension (us-ing V8-Node) and were under considerable time pressureThe technique we chose is simple and fast to develop Fur-thermore the prototype only had to work for the seconduser study which is the reason why the limited function-ality (loop nesting depth of 2 for example) was sufficientAnd finally since the tasks in the second user study werenot very expensive with regards to calculation costs eventhis very inefficient version of the extension ran live with-out any delays Some tests showed that it would even havebeen quick enough to operate with hundreds of iterationsmore without any delay and with some few thousand iter-ations more without dispatching the interface

44 Problems

The biggest problem we encountered were infinite loopsBecause the interface runs completely live and tries to The Halting problemevaluate the whole code while the user is editing it infiniteloops may occur while the user is typing Additionallyit might of course be possible that there actually is aninfinite loop even though the user thinks the code iscorrect Without modification the interface crashed in

32 4 Implementation

either of these cases To prevent this from happeningtwo minor modifications had to be done Firstly if wewant to calculate the number of iterations a loop doeswe check whether the counter we use for that calculationis still below a certain limit (we used 999 here becauseit proved to be sufficient and efficient enough to runlive) Now the Halting Problem for this particular loop isapproximated The problem of the infinite loop still existsthough Suppose we have the following code

var x=0var i=0while(x=10)

i++x=i

We now know that the loop does more then 999 iterationsso we do not try to calculate the value for x in the invariantSubstitutions after

infinite loops or for i within the loop for any further iterations The onlyproblem that remains is that we still try to get the value fori in the last line When we try to calculate this value wedo not yet know that there is an infinite loop somewhereabove To find this out a new global counter gets insertedand limited to 5000 iterations

var globalCounter=0var x=0var i=0while(x=10)

if(globalCounter++ gt= 5000)breaki++

i

This way we could keep the incrementation and the checkwhether there are more then 5000 iterations within onestatement (two statements are too many since after a

44 Problems 33

while without curly braces only the first statement wouldbe executed) The incrementation and test get inserted intoevery loop ahead of the position we are checking for Itwould of course have been possible to set a flag if there isan infinite loop somewhere ahead but this way we alsohad the possibility to limit the overall number of iterationsand still get a value The second user study showed that Flaw in the

implementationthis was the wrong approach A user introduced an infiniteloop in a sorting algorithm The condition for the loop wasnever violated although the array was sorted With thetechnique of assigning a value to a variable although thelimitation of iterations has been hit the sorted array wasshown in the return statement Since the user only paidattention to the output he did not see that the number ofiterations of the loop was 999 and thought that the code iscorrect

Another minor problem was that of course while the useris typing the extension tries to evaluate code that is syntac- Suppress errors so

that user does notget disturbed

tically not correct The solution is to set windowonerrorto a function that always returns true This way errorsget just suppressed and the execution stops If there is anew change in the editor the execution will start again andwork exactly the same way so that working code always isexecuted and not working code is not

The last problem concerned hiding code that does notget evaluated For loops this could easily be done If Hiding code within

failed if statementsthe loop is in its very last iteration (the invariant fails)the rest of the loop has to be hidden otherwise it doesnot Finding out if an if statement fails (and hence theaccording code has to be hidden) however was moredifficult because we had not enough information to be ableto do so (remember we only compute the first step whichincludes nothing but substituting all variables) To find outif an if statement holds or not we do the following allvariables in the invariant get substituted and then we leteval evaluate just the invariant For if(xlty) with x=5and y=6 we substitute all values if(5lt6) and then calleval((5lt6))

34 4 Implementation

Figure 41 Screenshot showing Brackets with our extension and a dummy script

Finally Figure 41 shows how the prototype for our seconduser study looks like

35

Chapter 5

Second User Study

51 Course of the User Study

To find out whether our extension brings significantimprovement a second quantitative user study with 20 Hypothesis

Compared to usualdebugging methodsour technique bringssignificantimprovement

participants has been conducted The hypothesis was thatthe debugging speed (in this case the time between the userreading the code the first time and fixing it) compared toa usual debugging engine could significantly be reducedwith our extension If this hypothesis would be supportedby the results of the user study this would indicate that theunderstanding speed of foreign code would be increasedby our method

No user had worked with Brackets or JavaScript before(notwithstanding that all of them had worked with Javabefore) Users were Computer Science students with 4-9(average of 58) years of programming experience Onlythree of them had professional experience (2-4 years) Allusers took part voluntarily and were not rewarded in anyway

To test the hypothesis two searching algorithms (animplementation of Shakersort and an implementation ofBubblesort) have been implemented and a small bug has

36 5 Second User Study

been introduced in either of them (see Appendix A) Thetask of the user was to find both bugs - one time with2 algorithms 1 bug

each 2 tasks debugone time with

Bracketsrsquo tools andone time with our

extension

our extension and one time using the traditional way(that is using the console breakpoints variable viewerweb kit inspector and the step over functionality Bracketsprovides) Each user received an intro to the function-ality of both our extension and the Brackets debuggingengine For both a small dummy script has been shownwith which it has been explained how each part of thefunctionality works until the user confirmed that he or sheunderstood every part of the functionality The task forthe user was to find the bug in the code (users have beentold that there is exactly one bug) fix it and give noticewhen he or she thought that the code is correct (task tocondition assignment and task order were counterbalancedto eliminate side effects)

After the two debugging tasks had been accomplishedeach user filled out two versions of the System UsabilitySystem Usability

Scale to find outwhether users liked

working with thesystem

Scale [Brooke 1996] (SUS) one for our extension and onefor the standard debugging Brackets provides The first10 questions are identical to the initial version that hasbeen presented in 1996 except that the word rdquocumbersomerdquoin point 8 has been changed to rdquoawkwardrdquo as suggestedby [Finstad 2006] For the extension the following ques-tions have been added to the questionnaire

bull I found the additional information very helpful formy task

bull I was distracted by the additional information

bull I found it very easy to follow the control flow of theprogram

bull I found complex control structures hard to under-stand

For the debugging using Brackets the following two ques-tions have been added

bull I found it very easy to follow the control flow of theprogram

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 37: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

44 Problems 31

in If it is in the right iteration the assignment myResult=jtakes place so that eval can return the according valueFor a loop within a loop two counters have to be inserted Other casesand both of them have to be checked accordingly Thereare further variants of contexts like the different parts of afor(firstPartsecondPartthirdPart) statementThe first part gets only executed once at the beginningwhile the second part gets evaluated one time more thanthe rest of the loop (the third part does not get evaluated)For all cases the dynamic generation of the code that re-turns the value of interest works similar to the techniquespresented

Obviously this is a brute-force approach For each valuethat has to be calculated all preceding code has to be eval-uated We used this approach anyway since we spent quite Summary Very

inefficient limitedfunctionality stillcompletely sufficientfor the user study

some time developing the first version of the extension (us-ing V8-Node) and were under considerable time pressureThe technique we chose is simple and fast to develop Fur-thermore the prototype only had to work for the seconduser study which is the reason why the limited function-ality (loop nesting depth of 2 for example) was sufficientAnd finally since the tasks in the second user study werenot very expensive with regards to calculation costs eventhis very inefficient version of the extension ran live with-out any delays Some tests showed that it would even havebeen quick enough to operate with hundreds of iterationsmore without any delay and with some few thousand iter-ations more without dispatching the interface

44 Problems

The biggest problem we encountered were infinite loopsBecause the interface runs completely live and tries to The Halting problemevaluate the whole code while the user is editing it infiniteloops may occur while the user is typing Additionallyit might of course be possible that there actually is aninfinite loop even though the user thinks the code iscorrect Without modification the interface crashed in

32 4 Implementation

either of these cases To prevent this from happeningtwo minor modifications had to be done Firstly if wewant to calculate the number of iterations a loop doeswe check whether the counter we use for that calculationis still below a certain limit (we used 999 here becauseit proved to be sufficient and efficient enough to runlive) Now the Halting Problem for this particular loop isapproximated The problem of the infinite loop still existsthough Suppose we have the following code

var x=0var i=0while(x=10)

i++x=i

We now know that the loop does more then 999 iterationsso we do not try to calculate the value for x in the invariantSubstitutions after

infinite loops or for i within the loop for any further iterations The onlyproblem that remains is that we still try to get the value fori in the last line When we try to calculate this value wedo not yet know that there is an infinite loop somewhereabove To find this out a new global counter gets insertedand limited to 5000 iterations

var globalCounter=0var x=0var i=0while(x=10)

if(globalCounter++ gt= 5000)breaki++

i

This way we could keep the incrementation and the checkwhether there are more then 5000 iterations within onestatement (two statements are too many since after a

44 Problems 33

while without curly braces only the first statement wouldbe executed) The incrementation and test get inserted intoevery loop ahead of the position we are checking for Itwould of course have been possible to set a flag if there isan infinite loop somewhere ahead but this way we alsohad the possibility to limit the overall number of iterationsand still get a value The second user study showed that Flaw in the

implementationthis was the wrong approach A user introduced an infiniteloop in a sorting algorithm The condition for the loop wasnever violated although the array was sorted With thetechnique of assigning a value to a variable although thelimitation of iterations has been hit the sorted array wasshown in the return statement Since the user only paidattention to the output he did not see that the number ofiterations of the loop was 999 and thought that the code iscorrect

Another minor problem was that of course while the useris typing the extension tries to evaluate code that is syntac- Suppress errors so

that user does notget disturbed

tically not correct The solution is to set windowonerrorto a function that always returns true This way errorsget just suppressed and the execution stops If there is anew change in the editor the execution will start again andwork exactly the same way so that working code always isexecuted and not working code is not

The last problem concerned hiding code that does notget evaluated For loops this could easily be done If Hiding code within

failed if statementsthe loop is in its very last iteration (the invariant fails)the rest of the loop has to be hidden otherwise it doesnot Finding out if an if statement fails (and hence theaccording code has to be hidden) however was moredifficult because we had not enough information to be ableto do so (remember we only compute the first step whichincludes nothing but substituting all variables) To find outif an if statement holds or not we do the following allvariables in the invariant get substituted and then we leteval evaluate just the invariant For if(xlty) with x=5and y=6 we substitute all values if(5lt6) and then calleval((5lt6))

34 4 Implementation

Figure 41 Screenshot showing Brackets with our extension and a dummy script

Finally Figure 41 shows how the prototype for our seconduser study looks like

35

Chapter 5

Second User Study

51 Course of the User Study

To find out whether our extension brings significantimprovement a second quantitative user study with 20 Hypothesis

Compared to usualdebugging methodsour technique bringssignificantimprovement

participants has been conducted The hypothesis was thatthe debugging speed (in this case the time between the userreading the code the first time and fixing it) compared toa usual debugging engine could significantly be reducedwith our extension If this hypothesis would be supportedby the results of the user study this would indicate that theunderstanding speed of foreign code would be increasedby our method

No user had worked with Brackets or JavaScript before(notwithstanding that all of them had worked with Javabefore) Users were Computer Science students with 4-9(average of 58) years of programming experience Onlythree of them had professional experience (2-4 years) Allusers took part voluntarily and were not rewarded in anyway

To test the hypothesis two searching algorithms (animplementation of Shakersort and an implementation ofBubblesort) have been implemented and a small bug has

36 5 Second User Study

been introduced in either of them (see Appendix A) Thetask of the user was to find both bugs - one time with2 algorithms 1 bug

each 2 tasks debugone time with

Bracketsrsquo tools andone time with our

extension

our extension and one time using the traditional way(that is using the console breakpoints variable viewerweb kit inspector and the step over functionality Bracketsprovides) Each user received an intro to the function-ality of both our extension and the Brackets debuggingengine For both a small dummy script has been shownwith which it has been explained how each part of thefunctionality works until the user confirmed that he or sheunderstood every part of the functionality The task forthe user was to find the bug in the code (users have beentold that there is exactly one bug) fix it and give noticewhen he or she thought that the code is correct (task tocondition assignment and task order were counterbalancedto eliminate side effects)

After the two debugging tasks had been accomplishedeach user filled out two versions of the System UsabilitySystem Usability

Scale to find outwhether users liked

working with thesystem

Scale [Brooke 1996] (SUS) one for our extension and onefor the standard debugging Brackets provides The first10 questions are identical to the initial version that hasbeen presented in 1996 except that the word rdquocumbersomerdquoin point 8 has been changed to rdquoawkwardrdquo as suggestedby [Finstad 2006] For the extension the following ques-tions have been added to the questionnaire

bull I found the additional information very helpful formy task

bull I was distracted by the additional information

bull I found it very easy to follow the control flow of theprogram

bull I found complex control structures hard to under-stand

For the debugging using Brackets the following two ques-tions have been added

bull I found it very easy to follow the control flow of theprogram

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 38: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

32 4 Implementation

either of these cases To prevent this from happeningtwo minor modifications had to be done Firstly if wewant to calculate the number of iterations a loop doeswe check whether the counter we use for that calculationis still below a certain limit (we used 999 here becauseit proved to be sufficient and efficient enough to runlive) Now the Halting Problem for this particular loop isapproximated The problem of the infinite loop still existsthough Suppose we have the following code

var x=0var i=0while(x=10)

i++x=i

We now know that the loop does more then 999 iterationsso we do not try to calculate the value for x in the invariantSubstitutions after

infinite loops or for i within the loop for any further iterations The onlyproblem that remains is that we still try to get the value fori in the last line When we try to calculate this value wedo not yet know that there is an infinite loop somewhereabove To find this out a new global counter gets insertedand limited to 5000 iterations

var globalCounter=0var x=0var i=0while(x=10)

if(globalCounter++ gt= 5000)breaki++

i

This way we could keep the incrementation and the checkwhether there are more then 5000 iterations within onestatement (two statements are too many since after a

44 Problems 33

while without curly braces only the first statement wouldbe executed) The incrementation and test get inserted intoevery loop ahead of the position we are checking for Itwould of course have been possible to set a flag if there isan infinite loop somewhere ahead but this way we alsohad the possibility to limit the overall number of iterationsand still get a value The second user study showed that Flaw in the

implementationthis was the wrong approach A user introduced an infiniteloop in a sorting algorithm The condition for the loop wasnever violated although the array was sorted With thetechnique of assigning a value to a variable although thelimitation of iterations has been hit the sorted array wasshown in the return statement Since the user only paidattention to the output he did not see that the number ofiterations of the loop was 999 and thought that the code iscorrect

Another minor problem was that of course while the useris typing the extension tries to evaluate code that is syntac- Suppress errors so

that user does notget disturbed

tically not correct The solution is to set windowonerrorto a function that always returns true This way errorsget just suppressed and the execution stops If there is anew change in the editor the execution will start again andwork exactly the same way so that working code always isexecuted and not working code is not

The last problem concerned hiding code that does notget evaluated For loops this could easily be done If Hiding code within

failed if statementsthe loop is in its very last iteration (the invariant fails)the rest of the loop has to be hidden otherwise it doesnot Finding out if an if statement fails (and hence theaccording code has to be hidden) however was moredifficult because we had not enough information to be ableto do so (remember we only compute the first step whichincludes nothing but substituting all variables) To find outif an if statement holds or not we do the following allvariables in the invariant get substituted and then we leteval evaluate just the invariant For if(xlty) with x=5and y=6 we substitute all values if(5lt6) and then calleval((5lt6))

34 4 Implementation

Figure 41 Screenshot showing Brackets with our extension and a dummy script

Finally Figure 41 shows how the prototype for our seconduser study looks like

35

Chapter 5

Second User Study

51 Course of the User Study

To find out whether our extension brings significantimprovement a second quantitative user study with 20 Hypothesis

Compared to usualdebugging methodsour technique bringssignificantimprovement

participants has been conducted The hypothesis was thatthe debugging speed (in this case the time between the userreading the code the first time and fixing it) compared toa usual debugging engine could significantly be reducedwith our extension If this hypothesis would be supportedby the results of the user study this would indicate that theunderstanding speed of foreign code would be increasedby our method

No user had worked with Brackets or JavaScript before(notwithstanding that all of them had worked with Javabefore) Users were Computer Science students with 4-9(average of 58) years of programming experience Onlythree of them had professional experience (2-4 years) Allusers took part voluntarily and were not rewarded in anyway

To test the hypothesis two searching algorithms (animplementation of Shakersort and an implementation ofBubblesort) have been implemented and a small bug has

36 5 Second User Study

been introduced in either of them (see Appendix A) Thetask of the user was to find both bugs - one time with2 algorithms 1 bug

each 2 tasks debugone time with

Bracketsrsquo tools andone time with our

extension

our extension and one time using the traditional way(that is using the console breakpoints variable viewerweb kit inspector and the step over functionality Bracketsprovides) Each user received an intro to the function-ality of both our extension and the Brackets debuggingengine For both a small dummy script has been shownwith which it has been explained how each part of thefunctionality works until the user confirmed that he or sheunderstood every part of the functionality The task forthe user was to find the bug in the code (users have beentold that there is exactly one bug) fix it and give noticewhen he or she thought that the code is correct (task tocondition assignment and task order were counterbalancedto eliminate side effects)

After the two debugging tasks had been accomplishedeach user filled out two versions of the System UsabilitySystem Usability

Scale to find outwhether users liked

working with thesystem

Scale [Brooke 1996] (SUS) one for our extension and onefor the standard debugging Brackets provides The first10 questions are identical to the initial version that hasbeen presented in 1996 except that the word rdquocumbersomerdquoin point 8 has been changed to rdquoawkwardrdquo as suggestedby [Finstad 2006] For the extension the following ques-tions have been added to the questionnaire

bull I found the additional information very helpful formy task

bull I was distracted by the additional information

bull I found it very easy to follow the control flow of theprogram

bull I found complex control structures hard to under-stand

For the debugging using Brackets the following two ques-tions have been added

bull I found it very easy to follow the control flow of theprogram

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 39: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

44 Problems 33

while without curly braces only the first statement wouldbe executed) The incrementation and test get inserted intoevery loop ahead of the position we are checking for Itwould of course have been possible to set a flag if there isan infinite loop somewhere ahead but this way we alsohad the possibility to limit the overall number of iterationsand still get a value The second user study showed that Flaw in the

implementationthis was the wrong approach A user introduced an infiniteloop in a sorting algorithm The condition for the loop wasnever violated although the array was sorted With thetechnique of assigning a value to a variable although thelimitation of iterations has been hit the sorted array wasshown in the return statement Since the user only paidattention to the output he did not see that the number ofiterations of the loop was 999 and thought that the code iscorrect

Another minor problem was that of course while the useris typing the extension tries to evaluate code that is syntac- Suppress errors so

that user does notget disturbed

tically not correct The solution is to set windowonerrorto a function that always returns true This way errorsget just suppressed and the execution stops If there is anew change in the editor the execution will start again andwork exactly the same way so that working code always isexecuted and not working code is not

The last problem concerned hiding code that does notget evaluated For loops this could easily be done If Hiding code within

failed if statementsthe loop is in its very last iteration (the invariant fails)the rest of the loop has to be hidden otherwise it doesnot Finding out if an if statement fails (and hence theaccording code has to be hidden) however was moredifficult because we had not enough information to be ableto do so (remember we only compute the first step whichincludes nothing but substituting all variables) To find outif an if statement holds or not we do the following allvariables in the invariant get substituted and then we leteval evaluate just the invariant For if(xlty) with x=5and y=6 we substitute all values if(5lt6) and then calleval((5lt6))

34 4 Implementation

Figure 41 Screenshot showing Brackets with our extension and a dummy script

Finally Figure 41 shows how the prototype for our seconduser study looks like

35

Chapter 5

Second User Study

51 Course of the User Study

To find out whether our extension brings significantimprovement a second quantitative user study with 20 Hypothesis

Compared to usualdebugging methodsour technique bringssignificantimprovement

participants has been conducted The hypothesis was thatthe debugging speed (in this case the time between the userreading the code the first time and fixing it) compared toa usual debugging engine could significantly be reducedwith our extension If this hypothesis would be supportedby the results of the user study this would indicate that theunderstanding speed of foreign code would be increasedby our method

No user had worked with Brackets or JavaScript before(notwithstanding that all of them had worked with Javabefore) Users were Computer Science students with 4-9(average of 58) years of programming experience Onlythree of them had professional experience (2-4 years) Allusers took part voluntarily and were not rewarded in anyway

To test the hypothesis two searching algorithms (animplementation of Shakersort and an implementation ofBubblesort) have been implemented and a small bug has

36 5 Second User Study

been introduced in either of them (see Appendix A) Thetask of the user was to find both bugs - one time with2 algorithms 1 bug

each 2 tasks debugone time with

Bracketsrsquo tools andone time with our

extension

our extension and one time using the traditional way(that is using the console breakpoints variable viewerweb kit inspector and the step over functionality Bracketsprovides) Each user received an intro to the function-ality of both our extension and the Brackets debuggingengine For both a small dummy script has been shownwith which it has been explained how each part of thefunctionality works until the user confirmed that he or sheunderstood every part of the functionality The task forthe user was to find the bug in the code (users have beentold that there is exactly one bug) fix it and give noticewhen he or she thought that the code is correct (task tocondition assignment and task order were counterbalancedto eliminate side effects)

After the two debugging tasks had been accomplishedeach user filled out two versions of the System UsabilitySystem Usability

Scale to find outwhether users liked

working with thesystem

Scale [Brooke 1996] (SUS) one for our extension and onefor the standard debugging Brackets provides The first10 questions are identical to the initial version that hasbeen presented in 1996 except that the word rdquocumbersomerdquoin point 8 has been changed to rdquoawkwardrdquo as suggestedby [Finstad 2006] For the extension the following ques-tions have been added to the questionnaire

bull I found the additional information very helpful formy task

bull I was distracted by the additional information

bull I found it very easy to follow the control flow of theprogram

bull I found complex control structures hard to under-stand

For the debugging using Brackets the following two ques-tions have been added

bull I found it very easy to follow the control flow of theprogram

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 40: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

34 4 Implementation

Figure 41 Screenshot showing Brackets with our extension and a dummy script

Finally Figure 41 shows how the prototype for our seconduser study looks like

35

Chapter 5

Second User Study

51 Course of the User Study

To find out whether our extension brings significantimprovement a second quantitative user study with 20 Hypothesis

Compared to usualdebugging methodsour technique bringssignificantimprovement

participants has been conducted The hypothesis was thatthe debugging speed (in this case the time between the userreading the code the first time and fixing it) compared toa usual debugging engine could significantly be reducedwith our extension If this hypothesis would be supportedby the results of the user study this would indicate that theunderstanding speed of foreign code would be increasedby our method

No user had worked with Brackets or JavaScript before(notwithstanding that all of them had worked with Javabefore) Users were Computer Science students with 4-9(average of 58) years of programming experience Onlythree of them had professional experience (2-4 years) Allusers took part voluntarily and were not rewarded in anyway

To test the hypothesis two searching algorithms (animplementation of Shakersort and an implementation ofBubblesort) have been implemented and a small bug has

36 5 Second User Study

been introduced in either of them (see Appendix A) Thetask of the user was to find both bugs - one time with2 algorithms 1 bug

each 2 tasks debugone time with

Bracketsrsquo tools andone time with our

extension

our extension and one time using the traditional way(that is using the console breakpoints variable viewerweb kit inspector and the step over functionality Bracketsprovides) Each user received an intro to the function-ality of both our extension and the Brackets debuggingengine For both a small dummy script has been shownwith which it has been explained how each part of thefunctionality works until the user confirmed that he or sheunderstood every part of the functionality The task forthe user was to find the bug in the code (users have beentold that there is exactly one bug) fix it and give noticewhen he or she thought that the code is correct (task tocondition assignment and task order were counterbalancedto eliminate side effects)

After the two debugging tasks had been accomplishedeach user filled out two versions of the System UsabilitySystem Usability

Scale to find outwhether users liked

working with thesystem

Scale [Brooke 1996] (SUS) one for our extension and onefor the standard debugging Brackets provides The first10 questions are identical to the initial version that hasbeen presented in 1996 except that the word rdquocumbersomerdquoin point 8 has been changed to rdquoawkwardrdquo as suggestedby [Finstad 2006] For the extension the following ques-tions have been added to the questionnaire

bull I found the additional information very helpful formy task

bull I was distracted by the additional information

bull I found it very easy to follow the control flow of theprogram

bull I found complex control structures hard to under-stand

For the debugging using Brackets the following two ques-tions have been added

bull I found it very easy to follow the control flow of theprogram

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 41: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

35

Chapter 5

Second User Study

51 Course of the User Study

To find out whether our extension brings significantimprovement a second quantitative user study with 20 Hypothesis

Compared to usualdebugging methodsour technique bringssignificantimprovement

participants has been conducted The hypothesis was thatthe debugging speed (in this case the time between the userreading the code the first time and fixing it) compared toa usual debugging engine could significantly be reducedwith our extension If this hypothesis would be supportedby the results of the user study this would indicate that theunderstanding speed of foreign code would be increasedby our method

No user had worked with Brackets or JavaScript before(notwithstanding that all of them had worked with Javabefore) Users were Computer Science students with 4-9(average of 58) years of programming experience Onlythree of them had professional experience (2-4 years) Allusers took part voluntarily and were not rewarded in anyway

To test the hypothesis two searching algorithms (animplementation of Shakersort and an implementation ofBubblesort) have been implemented and a small bug has

36 5 Second User Study

been introduced in either of them (see Appendix A) Thetask of the user was to find both bugs - one time with2 algorithms 1 bug

each 2 tasks debugone time with

Bracketsrsquo tools andone time with our

extension

our extension and one time using the traditional way(that is using the console breakpoints variable viewerweb kit inspector and the step over functionality Bracketsprovides) Each user received an intro to the function-ality of both our extension and the Brackets debuggingengine For both a small dummy script has been shownwith which it has been explained how each part of thefunctionality works until the user confirmed that he or sheunderstood every part of the functionality The task forthe user was to find the bug in the code (users have beentold that there is exactly one bug) fix it and give noticewhen he or she thought that the code is correct (task tocondition assignment and task order were counterbalancedto eliminate side effects)

After the two debugging tasks had been accomplishedeach user filled out two versions of the System UsabilitySystem Usability

Scale to find outwhether users liked

working with thesystem

Scale [Brooke 1996] (SUS) one for our extension and onefor the standard debugging Brackets provides The first10 questions are identical to the initial version that hasbeen presented in 1996 except that the word rdquocumbersomerdquoin point 8 has been changed to rdquoawkwardrdquo as suggestedby [Finstad 2006] For the extension the following ques-tions have been added to the questionnaire

bull I found the additional information very helpful formy task

bull I was distracted by the additional information

bull I found it very easy to follow the control flow of theprogram

bull I found complex control structures hard to under-stand

For the debugging using Brackets the following two ques-tions have been added

bull I found it very easy to follow the control flow of theprogram

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 42: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

36 5 Second User Study

been introduced in either of them (see Appendix A) Thetask of the user was to find both bugs - one time with2 algorithms 1 bug

each 2 tasks debugone time with

Bracketsrsquo tools andone time with our

extension

our extension and one time using the traditional way(that is using the console breakpoints variable viewerweb kit inspector and the step over functionality Bracketsprovides) Each user received an intro to the function-ality of both our extension and the Brackets debuggingengine For both a small dummy script has been shownwith which it has been explained how each part of thefunctionality works until the user confirmed that he or sheunderstood every part of the functionality The task forthe user was to find the bug in the code (users have beentold that there is exactly one bug) fix it and give noticewhen he or she thought that the code is correct (task tocondition assignment and task order were counterbalancedto eliminate side effects)

After the two debugging tasks had been accomplishedeach user filled out two versions of the System UsabilitySystem Usability

Scale to find outwhether users liked

working with thesystem

Scale [Brooke 1996] (SUS) one for our extension and onefor the standard debugging Brackets provides The first10 questions are identical to the initial version that hasbeen presented in 1996 except that the word rdquocumbersomerdquoin point 8 has been changed to rdquoawkwardrdquo as suggestedby [Finstad 2006] For the extension the following ques-tions have been added to the questionnaire

bull I found the additional information very helpful formy task

bull I was distracted by the additional information

bull I found it very easy to follow the control flow of theprogram

bull I found complex control structures hard to under-stand

For the debugging using Brackets the following two ques-tions have been added

bull I found it very easy to follow the control flow of theprogram

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 43: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

52 Results 37

bull I found complex control structures hard to under-stand

See Appendix B for a complete version of both question-naires

Finally a short qualitative review has been done with eachuser Topics were not set a priori so the discussion con-cerned mainly subjects we observed during the debuggingtasks (see Section 523)

52 Results

521 Debugging Times

The box plot in Figure 51 represents the time users neededto accomplish the different tasks

Since it is not possible to assume that debugging the twodifferent code examples is equally difficult without loosingthe meaningfulness of the results they have to be analyzedindependently

On average participants were faster debugging Shak-ersort using our extension (M = 40200 SE = 48464) On average users

were faster using ourextension but notsignificantly

than using Brackets (M = 4952 SE = 61896) Thisdifference was not significant with t(9) = 1159 andp gt 005 However it did represent a medium-sizedeffect with r = 036037 The results for Bubblesort areanalogous Using the extension users were were faster(M = 41633 SE = 47885) than without it (M = 47756SE = 41597) The result also is non-significant witht(8) = 1046 and p gt 005 And with r = 034685the effect is medium-sized for Bubblesort as well

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 44: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

38 5 Second User Study

Bubblesort with extension

Bubblesort without extension

Shakersort with extension

Shakersort without extension

seco

nds

800

600

400

200

Figure 51 Box plot for the different task completion times

EXCURSUSAssuming that debugging both algorithms is equally dif-ficult leads to significant results (t(37) = 1705 and p lt005) As mentioned above this is not possible with-out loosing the meaningfulness of the results Howeveralong with the medium-sized effects of both tasks takenindividually this indicates that just continuing the userstudy (with 10 or 20 more users) will lead to significantresults

522 System Usability Scale

According to the SUS-values users liked the Live CodingAccording to theSUS-values users

liked our systembetter than Brackets

extension considerably better (M = 8195 SE = 1804)than the developer tools Brackets provides (M = 5120SE = 2117) This result is significant with t(19) = 15831

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 45: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

52 Results 39

Standard DebuggingLive Debugging

100

8 0

6 0

4 0

SUS-score

Figure 52 Box plot for the SUS-values with and without extension

and p lt 001 This can also be seen in the box plot of theSUS-scores in Figure 52 Finally Figure 53 shows how theusers responded to the additional questions on the ques-tionnaire

523 Qualitative Results

We found two major issues in the qualitative follow-upsurvey Firstly the iterator and parameter fields are too Iterator and

parameter fields aretoo small

small Users had difficulties hitting the intended buttonson the iterator and reading the content of the parameterfields Secondly many users expected the interface to showthe first iteration of a loop first and not the last one This Users expected

interface to show firstiteration first

was particularly observable with nested loops When theouter iteration had been changed most users expected theinterface to first show the first iteration of the inner loopbecause that reflects their way of comprehending the code

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 46: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

40 5 Second User Study

Strongly agreeStrongly disagree

I found the additional informationvery helpful for my task (Live Debugging)

I was distracted by the additional information (Live Debugging)

I found it very easy to follow the control flow of the program (Live Debugging)

I found it very easy to follow the control flow of the program (Standard Debugging)

I found complex control structures hard to understand (Live Debugging)

I found complex control structures hard to understand (Standard Debugging)

Figure 53 Box plot for the additional points on the questionnaire

One user uncovered a flaw in the implementation Asalready explained in section 44 even if an infinite loopoccurs somewhere in the code values are still assigned tothe variables (either after the 999th iteration of one loop orafter there have globally been more than 5000 iterations)Flaw in the

implementationDespite endless loop

values are assigned

In this particular case the algorithm was finished withsorting the array but the invariant was wrong in such away that it did not get violated anyway Neverthelessthe extension assigned a value to the return statementThe user could have seen that there is an infinite loop bylooking at the iterator which showed rdquo999999rdquo after eachedit However since he changed the exemplary input arrayand looked only at the result (which was a correctly sortedarray) he thought that the code was correct although itwas not This means that in a future implementation nosubstitution should take place in unreachable assignments

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 47: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

52 Results 41

Finally many users said it would be good to highlightwhich element of an array is addressed by an index For[45324542753][5] for example users should not haveto count which element is the fifth one but that elementshould modestly be highlighted

524 Discussion

Most users stated that the extension would help improvethe understanding (debugging) process Considering theSystem Usability Scales the additional questions on thequestionnaires and the qualitative statements of the usersusers liked our extension notably better than the devel-oper tools Brackets provides However no significant re- Users liked extension

and said that it wouldhelp Neverthelessno significantimprovement couldbe shown in the userstudy

sult could be obtained in this quantitative user study thatconfirms the initial hypothesis Some users reported thatdespite the introduction to the extension it was not as easyto involve the extension in the understanding process as itwas for the Brackets tools because they were used to theway the traditional tools work and not to the way the ex-tension works It is quite possible that just continuing theuser study or refining the prototype and the user study willboth lead to significant results The user study presentedin this chapter however does not provide any conclusivemanifestations concerning improvements in debugging orunderstanding speeds

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 48: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

43

Chapter 6

Summary and futurework

61 Summary

Live Coding enables the programmer to illuminate codeduring editing it in order to understand its internal behav-ior A user study with low fidelity prototypes has been con-ducted to find out how the basic interconnection betweencode and a continuously updated evaluated version of thecode can best be displayed A software prototype of the re-sult an evaluated copy of the code with syntax highlight-ing and indents in a separate section of the IDE has beenimplemented in form of an extension for Brackets It hasbeen explained how this prototype essentially works whatdrawbacks it has and which problems occurred during theimplementation With this prototype a second user studyhas been conducted which should reveal if there is a signif-icant enhancement in the debugging speed of experiencedprogrammers Although this user study only yielded non-significant results there is evidence that significant resultscan be obtained by continuing the user study

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 49: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

44 6 Summary and future work

62 Future Work

The quantitative user study only yielded non-significantresults but there is evidence that a significant result can2 possible

approaches tosignificant results

be obtained (see Chapter 5) Therefore two possibilitiesof continuation exist Firstly it is possible to refine thesoftware-prototype then to design and conduct a furtheruser study Secondly it is possible that simply continuingthe user study presented in Chapter 5 with another 10 or20 users will already yield a significant result

The biggest issues for a continuation of the design of theinterface are caused by complex data structures Firstlycomplex data structures have to be displayed In theprototypes presented in this paper only simple data types(Strings Arrays Integers) have been displayed Ifhowever the interface is supposed to be used on a day-to-day basis complex data structures (eg DOM-elementsor self-defined structures) have to be representable Itwould be conceivable to use a representation similar tothe one Bracketrsquos web kit inspector uses Figure 61(a)shows how this looks likes for the DOM-element this IfComplex data

structures have to bedisplayed

the user presses the small arrow the element rdquounfoldsrdquoand its attributes can be inspected (Figure 61(b)) Thisapproach however has the drawback that information isnot quickly available and therefore another approach thatconcentrates on instantly showing relevant informationmight be interesting

Furthermore complex data structures do not only haveto be displayed but also to be entered An approach thatand to be enteredmight be a solution for this problem has been presentedby [Edwards 2004] When a program calls an externalfunction in the IDE presented by Edwards only the fact oftheir execution and their return value are traced Similarlyit might be possible for our environment to rememberwith what parameters the function that has to be evaluatedhas been called for an exemplary execution of the contextthe function is used in This way not only does the usersave time by not having to enter the example but also anexample is available that indeed occurs

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 50: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

62 Future Work 45

(a) DOM-element folded (b) DOM-element unfolded

Figure 61 Version 1 The evaluated version of the code ison the right-hand side and keeps its structure and syntaxhighlighting

Another aspect that might be a sensible enhancement isthe incorporation of the Test-Driven Development [Williams Potential field of

researchcongruence withTDD and ContinuousTesting

et al 2003][Beck 2002] [Saff and Ernst 2004a] and theContinuous Testing approach [Saff and Ernst 2004b] Theidea of the Test-Driven Development is to first write testcases and then implement the according production codeAlthough Continuous Testing is predominantly used forlarge scale systems there is common ground between Con-tinuous Testing and the Test-Driven Development There-fore checking how our approach can be integrated into andcombined with these techniques is a promising field of re-search

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 51: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

47

Appendix A

Code for the SecondUser Study

function sort(inputArray)var array=inputArrayvar n=arraylengthvar newnvar swapwhile(ngt1)

newn=1for(var i=0 iltn-1i++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapnewn=i+1

n=newn

return array

The Bubblesort implementation (red highlighted +1 has been left out)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 52: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

48 A Code for the Second User Study

function sort(inputArray)

var array=inputArrayvar begin=-1var end=arraylength-2var swapped=truevar swap

while(swapped)swapped=falsebegin++for(var i=begin ilt=endi++)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

end--for(var i = end igt=begini--)

if(array[i]gtarray[i+1])

swap=array[i]array[i]=array[i+1]array[i+1]=swapswapped=true

return array

The Shakersort implementation (red highlighted -1 has been changed to 0)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 53: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

49

Appendix B

System Usability Scales

On the following two pages the System Usability Scalequestionnaires used in our second user study can be seen

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 54: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

50 B System Usability Scales

System Usability Scale - Standard Debugging

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found it very easy to follow the control flow of the program

12 I found complex control structures hard to understand

Figure B1 System Usability Scale questionnaire for debugging using Brackets

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 55: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

51

System Usability Scale - Live Coding Extension

Strongly Disagree

Strongly Agree

1 I think that I would like to use this system frequently

2 I found the system unnecessarily complex

3 I thought the system was easy to use

4 I think that I would need the support of a technical person to be able to use this system

5 I found the various functions in this system were well integrated

6 I thought there was too much inconsistency in this system

7 I would imagine that most people would learn to use this system very quickly

8 I found the system very awkward to use

9 I felt very confident using the system

10 I needed to learn a lot of things before I could get going with this system

11 I found the additional information very helpful for my task

12 I was distracted by the additional information

13 I found it very easy to follow the control flow of the program

14 I found complex control structures hard to understand

Figure B2 System Usability Scale questionnaire for debugging using the Live Cod-ing Extension

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 56: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

53

Bibliography

Kent Beck Test Driven Development By Example (Addison-Wesley Signature) Addison-Wesley Longman Amster-dam 2002 ISBN 0321146530

B W Boehm Classics in software engineering chap-ter Software engineering pages 323ndash361 Yourdon PressUpper Saddle River NJ USA 1979 ISBN 0-917072-14-6 URL httpdlacmorgcitationcfmid=12415151241536

Joel Brandt Vignan Pattamatta William Choi Ben Hsiehand Scott R Klemmer Rehearse Helping programmersadapt examples by visualizing execution and highlight-ing related code

Joel Brandt Philip J Guo Joel Lewenstein and Scott RKlemmer Opportunistic programming how rapidideation and prototyping occur in practice In Proceed-ings of the 4th international workshop on End-user soft-ware engineering WEUSE rsquo08 pages 1ndash5 New York NYUSA 2008 ACM ISBN 978-1-60558-034-0 doi 10114513708471370848 URL httpdoiacmorg10114513708471370848

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Opportunistic pro-gramming Writing code to prototype ideate and dis-cover IEEE Softw 26(5)18ndash24 September 2009a ISSN0740-7459 doi 101109MS2009147 URL httpdxdoiorg101109MS2009147

Joel Brandt Philip J Guo Joel Lewenstein MiraDontcheva and Scott R Klemmer Two studies ofopportunistic programming interleaving web forag-ing learning and writing code In Proceedings of the

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 57: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

54 Bibliography

27th international conference on Human factors in comput-ing systems CHI rsquo09 pages 1589ndash1598 New York NYUSA 2009b ACM ISBN 978-1-60558-246-7 doi 10114515187011518944 URL httpdoiacmorg10114515187011518944

J Brooke SUS A quick and dirty usability scale In P WJordan B Weerdmeester A Thomas and I L Mclellandeditors Usability evaluation in industry Taylor and Fran-cis London 1996

Robert DeLine Mary Czerwinski Brian Meyers GinaVenolia Steven Drucker and George Robertson Codethumbnails Using spatial memory to navigate sourcecode In Proceedings of the Visual Languages and Human-Centric Computing VLHCC rsquo06 pages 11ndash18 Washing-ton DC USA 2006 IEEE Computer Society ISBN 0-7695-2586-5 doi 101109VLHCC200614 URL httpdxdoiorg101109VLHCC200614

Jonathan Edwards Example centric programming InCompanion to the 19th annual ACM SIGPLAN conferenceon Object-oriented programming systems languages andapplications OOPSLA rsquo04 pages 124ndash124 New YorkNY USA 2004 ACM ISBN 1-58113-833-4 doi 10114510286641028713 URL httpdoiacmorg10114510286641028713

Hakan Erdogmus Maurizio Morisio and Marco Torchi-ano On the effectiveness of the test-first approach toprogramming IEEE Transactions on Software Engineer-ing 31226ndash237 2005 ISSN 0098-5589 doi httpdoiieeecomputersocietyorg101109TSE200537

Kraig Finstad The System Usability Scale and Non-NativeEnglish Speakers Journal of Usability studies 1(4)185ndash188 2006 URL httpwwwupassocorgupa_publicationsjus2006_augustfinstad_sus_non_native_speakerspdf

Peter Henderson and Mark Weiser Continuous executionthe visiprog environment In Proceedings of the 8th inter-national conference on Software engineering ICSE rsquo85 pages68ndash74 Los Alamitos CA USA 1985 IEEE Computer So-ciety Press ISBN 0-8186-0620-7 URL httpdlacmorgcitationcfmid=319568319582

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 58: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

Bibliography 55

Andrew J Ko Brad A Myers and Htet Htet Aung Sixlearning barriers in end-user programming systems InProceedings of the 2004 IEEE Symposium on Visual Lan-guages - Human Centric Computing VLHCC rsquo04 pages199ndash206 Washington DC USA 2004 IEEE ComputerSociety ISBN 0-7803-8696-5 doi 101109VLHCC200447 URL httpdxdoiorg101109VLHCC200447

Donald A Norman The design of everyday things BasicBooks [New York] 1 basic paperback ed [nachdr] edi-tion 2002 ISBN 0-465-06710-7

J F Pane and B A Myers Usability issues in the design ofnovice programming systems Technical report CarnegieMellon University August 1996

David Saff and Michael D Ernst Continuous testing ineclipse Electron Notes Theor Comput Sci 107103ndash117December 2004a ISSN 1571-0661 doi 101016jentcs200402051 URL httpdxdoiorg101016jentcs200402051

David Saff and Michael D Ernst An experimental eval-uation of continuous testing during development SIG-SOFT Softw Eng Notes 29(4)76ndash85 July 2004b ISSN0163-5948 doi 10114510138861007523 URL httpdoiacmorg10114510138861007523

Bret Victor Inventing on principle cusec 2012

Laurie Williams E Michael Maximilien and Mladen VoukTest-driven development as a defect-reduction practiceIn Proceedings of the 14th International Symposium on Soft-ware Reliability Engineering ISSRE rsquo03 pages 34ndash Wash-ington DC USA 2003 IEEE Computer Society ISBN0-7695-2007-3 URL httpdlacmorgcitationcfmid=951952952364

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography
Page 59: A Live Coding Editor · prototype is very low fidelity and has some basic usabil-ity flaws (most importantly: Only 5 iterations can be dis-played and the functionality is very limited)

Typeset September 12 2012

  • Abstract
  • Acknowledgements
  • Introduction
  • Related Work
  • First User Study
    • The Prototypes
      • Interface Version 1
      • Interface Version 2
      • Interface Version 3
      • Interface Version 4
      • Interface Version 5
        • Course of the User Study
        • Results
          • Further Comments and Suggestions for Improvement
          • Conclusions
              • Implementation
                • First Approach (discharged)
                • Basic Evaluation Technique
                • How Values Are Being Calculated
                • Problems
                  • Second User Study
                    • Course of the User Study
                    • Results
                      • Debugging Times
                      • System Usability Scale
                      • Qualitative Results
                      • Discussion
                          • Summary and future work
                            • Summary
                            • Future Work
                              • Code for the Second User Study
                              • System Usability Scales
                              • Bibliography