inf219 – software environments

Download INF219 – Software Environments

If you can't read please download the document

Upload: yasuo

Post on 25-Feb-2016

45 views

Category:

Documents


1 download

DESCRIPTION

INF219 – Software Environments. Second class Understanding a problem: Empirical Evaluation of Software Environments. Marco Aurélio Gerosa. University of California, Irvine Spring/2014. Creating a new tool. Inventor-oriented development of a tool. My tool is great!. - PowerPoint PPT Presentation

TRANSCRIPT

INF219 Software Environments

INF219 Software EnvironmentsUniversity of California, IrvineSpring/2014Marco Aurlio Gerosa

Second classUnderstanding a problem: Empirical Evaluation of Software EnvironmentsCreating a new toolInventor-oriented development of a toolSpring/2014Marco Aurlio Gerosa ([email protected])3

Clipart from http://www.clker.com/

My tool is great!Issues- Are you a typical developer? Really?- How do you know that your problem is really a problem?- How do you know the extension of your problem?- How do you know that you actually solved or mitigated the problem?- How do you identify collateral effects of your solution?Spring/2014Marco Aurlio Gerosa ([email protected])4Spring/2014Marco Aurlio Gerosa ([email protected])5Most of the developed tools are never used in practice

AnswerConduct empirical/experimental studiesSpring/2014Marco Aurlio Gerosa ([email protected])6

Spring/2014Marco Aurlio Gerosa ([email protected])7Why and when to do empirical studies while developing tools for software environments?

Researcher-based development of a toolSpring/2014Marco Aurlio Gerosa ([email protected])8

UnderstandConfirm (if necessary)

PrototypeEvaluateSurveyEmpirical Research vs Pure InventionResearch is more solid, systematic, and based on relatively low biased dataBut, it takes more time and it can refrain creativityIn practice, try to reach a balance However, if you want to publish, nowadays you really need to overkill but it is how science works

Remember:If I had asked people what they wanted, they would have said fasterhorses.(Henry Ford)You can avoid that if you collect the right data using the right meansAlso, you may understand better the real context and effects of the tool, making a better case for the tool adoptionHow many tools end up not being used?80s SigChi bulletin: ~90% of evaluative studies found no benets of tool

Spring/2014Marco Aurlio Gerosa ([email protected])9

LaToza (2011) http://www.cs.cmu.edu/~bam/uicourse/2011hasd/lecture02-HCI%20Methods%201.pdfTypes of empirical studiesGoalExploratorydefine and understand problems and raise hypotheses Descriptivedescribe a situationConfirmatory / causal / explanatorytest hypotheses

Spring/2014Marco Aurlio Gerosa ([email protected])10Objects of analysisQuantitativeNumerical measurements / analysisQualitativeInterpretive data acquisition / analysis

What kinds of research can be used? Spring/2014Marco Aurlio Gerosa ([email protected])11

- Exploratory?- Descriptive?- Causal?- Quantitative?- Qualitative?

Empirical researchStudies provide evidences for or against theories

Theories of developer activityA model describing the strategy by which developers frequently do an activity that describes problems that can be addressed (design implications) through a better designed tool, language, or process that more effectively supports this strategy.

Spring/2014Marco Aurlio Gerosa ([email protected])12LaToza (2011) http://www.cs.cmu.edu/~bam/uicourse/2011hasd/lecture02-HCI%20Methods%201.pdfModels & theoriesReality

EvidencesScientific studyExerciseLets improve how developers design software systems

How can we do that in a research-based way?

Spring/2014Marco Aurlio Gerosa ([email protected])13A single study will not answer all questions- Set scope- Describe limitations of study- Pick population to recruit participants from- Plan follow-up complementary studiesSpring/2014Marco Aurlio Gerosa ([email protected])14LaToza (2011) http://www.cs.cmu.edu/~bam/uicourse/2011hasd/lecture02-HCI%20Methods%201.pdfRemember!Research MethodsExample of a cycleSpring/2014Marco Aurlio Gerosa ([email protected])16

LaToza (2011) http://www.cs.cmu.edu/~bam/uicourse/2011hasd/lecture02-HCI%20Methods%201.pdfSome methods for exploratory studiesField observations / ethnographyObserve developers at work in the eldSurveys Ask many developers specic questionsInterviewsAsk a few developers open-ended questionsContextual inquiryAsk questions while developers workIndirect observationsStudy artifacts (e.g., code, code history, bugs, emails, ...)

To do high quality studies is quite difficult, but even quick-and-dirty ones can provide useful [Check the literature for additional guidelines]

Spring/2014Marco Aurlio Gerosa ([email protected])17LaToza (2011) http://www.cs.cmu.edu/~bam/uicourse/2011hasd/lecture02-HCI%20Methods%201.pdfField observations / ethnographyFind software developersPick developers likely to be doing relevant workWatch developers do their work in their ofceAsk developers to think-aloudStream of consciousness: whatever they are thinking aboutThoughts, ideas, questions, hypotheses, etc.Register itSometimes can be invasive, but permits detailed analysisAudio: can analyze tasks, questions, goals, timingVideo: can analyze navigation, tool use, strategiesNotes: high level view of task, interesting observationsSpring/2014Marco Aurlio Gerosa ([email protected])18LaToza (2011) http://www.cs.cmu.edu/~bam/uicourse/2011hasd/lecture02-HCI%20Methods%201.pdfSurveysCan reach many (100s, 1000s) developers Websites to run surveys (e.g., SurveyMonkey, Google Docs)Find participantsProbabilistic sampling is only possible for small and well defined population (e.g. within a company)Snowball sampling (mailing list, twitter etc.)Prepare multiple choice & free response questionsMultiple choice: faster, standardized responseFree response: more time, more detail, open-endedBackground & demographics questionsE.g., experience, time in team, state of project, ....Open commentsSpring/2014Marco Aurlio Gerosa ([email protected])19LaToza (2011) http://www.cs.cmu.edu/~bam/uicourse/2011hasd/lecture02-HCI%20Methods%201.pdfSemi-structured interviewsDefine a scriptPrompt developer with question on focus areasLet developer talkFollow to lead discussion towards interesting topics Manage timeMove to next topic to ensure all topics coveredIt is hard to not bias the interviewee

Spring/2014Marco Aurlio Gerosa ([email protected])20LaToza (2011) http://www.cs.cmu.edu/~bam/uicourse/2011hasd/lecture02-HCI%20Methods%201.pdfContextual inquiryInterview while doing eld observationsLearn about environment, work, tasks, culture, breakdownsPrinciples of contextual inquiryContext - understand work in natural environmentAsk to see current work being doneSeek concrete data - ask to show work, not tellBad: usually, generally Good: Heres how, let me show youPartnership - close collaboration with userUser is the expertInterpretation - make sense of work activityRephrase, ask for examples, question terms & concepts Focus - perspective that denes questions of interestSpring/2014Marco Aurlio Gerosa ([email protected])21LaToza (2011) http://www.cs.cmu.edu/~bam/uicourse/2011hasd/lecture02-HCI%20Methods%201.pdf

Indirect observationsIndirect record of developer activityExamples of artifacts (where to get it)Code & code changes (version control systems)Code changesBugs (bug tracking software)Emails (project mailing lists, help lists for APIs)You can also collect data from instrumented tool (e.g., Hackstat)Advantage:Lots of data, easy to obtainDisadvantages:Can only observe what is in the dataSpring/2014Marco Aurlio Gerosa ([email protected])22LaToza (2011) http://www.cs.cmu.edu/~bam/uicourse/2011hasd/lecture02-HCI%20Methods%201.pdfExamplesWhich methods would you use in these situations?1. Youd like to design a tool to help web developers reuse code more easily.2. Youd like to help developers better prioritize bugs to be xed.Spring/2014Marco Aurlio Gerosa ([email protected])23LaToza (2011) http://www.cs.cmu.edu/~bam/uicourse/2011hasd/lecture02-HCI%20Methods%201.pdfField observations?Surveys?Interviews?Contextual inquiry?Indirect observations?EvaluationOk, you figured out a problem and conceived a tool.But is this the right tool? Would it really help?Which features are most important to implement?Solution: low cost evaluation studiesEvaluate mockups and prototypes before you build the tool!Tool isnt helpful: come up with a new ideaUsers have problems using tool: x the problemsSpring/2014Marco Aurlio Gerosa ([email protected])24LaToza (2011) http://www.cs.cmu.edu/~bam/uicourse/2011hasd/lecture02-HCI%20Methods%201.pdfLow cost evaluation methodsPaper prototypingDo tasks on paper mockups of real toolSimulate tool on paperWizard of OzSimulate tool by computing results by handHeuristic evaluationAssess tool for good usability designCognitive walkthroughSimulate actions needed to complete taskSpring/2014Marco Aurlio Gerosa ([email protected])25LaToza (2011) http://www.cs.cmu.edu/~bam/uicourse/2011hasd/lecture02-HCI%20Methods%201.pdfPaper prototypingBuild paper mockup of the toolMay be rough sketch or realistic screenshotsOften surprisingly effective Experimenter plays the computerExperimenter simulates tool by adding / changing papersGood for checking if userUnderstands interface terminologyCommands users want match actual commandsUnderstands what tool doesFinds the tool usefulChallenges - must anticipate commands usedIteratively add commands from previous participantsPrompt users to try it a different waySpring/2014Marco Aurlio Gerosa ([email protected])26LaToza (2011) http://www.cs.cmu.edu/~bam/uicourse/2011hasd/lecture02-HCI%20Methods%201.pdfWizard of OzParticipant believes (or pretends) to interact with real toolExperimenter simulates (behind the curtain) toolComputes data used by tool by handParticipants computer is slave to experimenters computer Especially for AI and other hard-to-implement systems E.g.: Voice user interface - experimenter translates speech to textAdvantagesHigh delity - user can use actual tool before its builtDisadvantagesRequires working GUI, unlike paper prototypesSpring/2014Marco Aurlio Gerosa ([email protected])27LaToza (2011) http://www.cs.cmu.edu/~bam/uicourse/2011hasd/lecture02-HCI%20Methods%201.pdfPrototypingPaperImplemented UI (no business logic)Wizard of OzImplemented prototype Real systemSpring/2014Marco Aurlio Gerosa ([email protected])28LaToza (2011) http://www.cs.cmu.edu/~bam/uicourse/2011hasd/lecture02-HCI%20Methods%201.pdfIncreased fidelityBetter if sketchier for early design - Use paper or sketchy tools, not real widgets - People focus on wrong issues: colors, alignment, names - Rather than overall structure and fundamental design Heuristic evaluation [Nielsen]Multiple evaluators use dimensions to identify usability problemsEvaluators aggregate problems & clarifyStructured assessment based on experienceProblems may be are categorized according to their estimated impact on user performance or acceptanceExamples of heuristics: (Jakob Nielsen)Visibility of system status; Match between system and the real world; User control and freedom; Consistency and standards; Error prevention; Recognition rather than recall; Flexibility and efficiency of use; Aesthetic andminimalist design; Help users recognize, diagnose, and recover from errors; Help and documentationAdvantage:Users are not necessaryDisadvantageHighly influenced by the knowledge of the expert reviewerSpring/2014Marco Aurlio Gerosa ([email protected])29LaToza (2011) http://www.cs.cmu.edu/~bam/uicourse/2011hasd/lecture02-HCI%20Methods%201.pdfCognitive walkthroughHow easy it is for new users to accomplish tasks with the system?Cognitive walkthroughis task-specificTask analysis - sequence of actions required by a user to accomplish a task and the responses from the system to those actionsEvaluators walk through the steps, asking themselves a set of questionsWill the user try to achieve the effect that the subtask has?Does the user understand that this subtask is needed to reach the user's goal?Will the user notice that the correct action is available?Will the user understand that the wanted subtask can be achieved by the action?Does the user get feedback?Will the user know that they have done the right thing after performing the action?

Spring/2014Marco Aurlio Gerosa ([email protected])30ExerciseHow would you use the evaluation methods in this situation?Youre designing a new notation for visualizing softwareSpring/2014Marco Aurlio Gerosa ([email protected])31LaToza (2011) http://www.cs.cmu.edu/~bam/uicourse/2011hasd/lecture02-HCI%20Methods%201.pdfCodecity: http://www.inf.usi.ch/phd/wettel/codecity.html

Paper prototypingWizard of OzHeuristic evaluationCognitive walkthrough

More robust evaluationsYou want to write a paper claiming that your tool is usefulYou want to get a company to try it out.Solution: run a higher cost, but more convincing evaluation study

Lab experiments - controlled experiment to compare toolsMeasure differences of your tool w/ competitorsUsually based on quantitative evidenceField deploymentsUsers try your tool in their own workData: usefulness perceptions, how use toolCan be more qualitativeSpring/2014Marco Aurlio Gerosa ([email protected])32LaToza (2011) http://www.cs.cmu.edu/~bam/uicourse/2011hasd/lecture02-HCI%20Methods%201.pdfLab studiesUsers complete tasks using your tool or competitorsWithin subjects design - all participants use bothBetween subjects design - participants use oneTypical measures - time, bugs, quality, user perceptionAlso measures from exploratory observations (think-aloud)More detailed measures = better understood resultsAdvantage Controlled experiment (more precision)Disadvantages Less realism and generalizabilityUsers still learning how to use tool, unfamiliar with codeBenets may require longer taskSpring/2014Marco Aurlio Gerosa ([email protected])33LaToza (2011) http://www.cs.cmu.edu/~bam/uicourse/2011hasd/lecture02-HCI%20Methods%201.pdfField deploymentsGive your tool to developers. See how they use itLow control, more realismData collection: interviews, logging data, observationsQualitative measuresPerception: do they like the tool?Use frequency: how often do they use it?Uses: how do they use it? what questions? tasks? why?Wishes: what else would they like to use it for?Quantitative comparison is possible, but it is hard Different tasks, users, codeSpring/2014Marco Aurlio Gerosa ([email protected])34LaToza (2011) http://www.cs.cmu.edu/~bam/uicourse/2011hasd/lecture02-HCI%20Methods%201.pdfAnalysisTechniques for qualitative data analysisContextual designSet of models for understanding how work is done Content analysis / grounded theoryTechnique for analyzing textsUsed both to nd patterns in data & convert to quantitative dataProcess modelsModels of steps users do in a taskTaxonomiesWhat things exist, how are they different, and how are they related?Afnity diagramsTechnique for synthesizing many disparate observations or interpretations into a coherent wholeSpring/2014Marco Aurlio Gerosa ([email protected])36LaToza (2011) http://www.cs.cmu.edu/~bam/uicourse/2011hasd/lecture02-HCI%20Methods%201.pdfQuantitative data analysisFrequencyHow often do things occur? (counts, %s, avg times, ...)Descriptive statisticsCorrelationalHow are multiple variables related? (correlations, ....)Estimate variable x from variables a, b, c (regression, classiers, ....)Controlled experiment (causal)Statistical test

Spring/2014Marco Aurlio Gerosa ([email protected])37Qualitative vs. quantitativeQualitative analysis most useful forFiguring out whats there, whats being doneWhats important?How are things done?Why is person doing / using / thinking something?LimitationsSmall n: few examples, may not generalizeInterpretations could be biased Quantitative analysis most useful forTesting hypothesesInvestigating relationships between variablesPredictingLimitationsLack interpretationTherefore, great studies mix both approachesSpring/2014Marco Aurlio Gerosa ([email protected])38LaToza (2011) http://www.cs.cmu.edu/~bam/uicourse/2011hasd/lecture02-HCI%20Methods%201.pdfOther aspects of a scientific studyWhat to observeLearning a new toolPilot studies IRB ApprovalMaximize validitymore participants, data collected, measureslonger tasksmore realistic conditionsMinimize costfewer participants, data collected, measuresshorter tasksless realistic, easier to replicate conditions

Studies are not proofs - results could always be invalidSoftware Engineering is very context dependent Dont sample all developers x tasks x situations, and measures are imperfectSearch results that are interestingrelevant to research questionsvalid enough so that your target audience believes them

Spring/2014Marco Aurlio Gerosa ([email protected])40LaToza (2011) http://www.cs.cmu.edu/~bam/uicourse/2011hasd/lecture02-HCI%20Methods%201.pdfSome types of validityValidity = should we believe the results?Construct validityDoes measure correspond to construct or something else?External validityDo results generalize from participants to population?Internal validityAre the differences between conditions caused only by experimental manipulation and not other variables? (confounds)Spring/2014Marco Aurlio Gerosa ([email protected])41LaToza (2011) http://www.cs.cmu.edu/~bam/uicourse/2011hasd/lecture02-HCI%20Methods%201.pdfSee also:https://en.wikipedia.org/wiki/Bias_(statistics)https://en.wikipedia.org/wiki/Cognitive_bias ConclusionFinal considerationsField studies of programmers reveal interesting new areas for tool research and developmentCan focus research on important problemsDesign from Data about real problems, barriers, opportunitiesFollowing research methods will result in better quality toolsMore usable, effective, etc.Software Engineering tools and methods often benefit from valid evaluation with peopleNeed real evidence to answer questions about what is better/faster/easierOften demanded by reviewersRelevant to any claims of better/faster/easier for peopleThere are valid evaluation criteria beyond Taste, Intuition, My experience, and anecdotes

Spring/2014Marco Aurlio Gerosa ([email protected])43LaToza (2011) http://www.cs.cmu.edu/~bam/uicourse/2011hasd/lecture02-HCI%20Methods%201.pdfSummaryPure invention x empirical research approach for developing innovative toolsWhen to do empirical studiesTypes of empirical studiesResearch methodsexploratory studieslow cost evaluationrobust evaluationQualitative and quantitative analysisSome points to observe when conducting studiesValidity

Spring/2014Marco Aurlio Gerosa ([email protected])44To DoDefine the groups and the topic of your seminarRead the papers assigned for the 3rd class (check your group):http://www.ime.usp.br/~gerosa/classes/uci/inf219Produce the slides-based summary for the two papers

Wherever you go, go with all yourheart. (Confucius)

Spring/2014Marco Aurlio Gerosa ([email protected])45Spring/2014Marco Aurlio Gerosa ([email protected])46Thank you!See you next class

Marco GerosaWeb site: http://www.ime.usp.br/~gerosaEmail: [email protected] / [email protected] Office: DBH 5228 (by appointment) AcknowledgmentsThis class was based on the slides from Thomas LaToza:http://www.cs.cmu.edu/~bam/uicourse/2011hasd/lecture02-HCI%20Methods%201.pdfhttp://www.cs.cmu.edu/~bam/uicourse/2011hasd/lecture03-HCI%20Methods%202.pdf

Spring/2014Marco Aurlio Gerosa ([email protected])47