Experimental design - University of California, Berkeley · anca/AHRI-S2020/... · 2020. 3. 5.


Post on 19-Aug-2020


Experimental design

WHAT is an experiment?
What is a GOOD experiment?
    identify insights, test them without confounds

1 What is an experiment

1.1 Origins
    medical: drug effectiveness
    agriculture: pesticide effectiveness (Fisher 1926)
    then psychology: understand people
    focus:
        HRI: test design decisions
        Robotics: compare algorithms, test insights

1.2 Components
    treatments (conditions)          responses (measures)
        drug vs. placebo                 symptom progression
        random vs. optimal (motion)      comfort
        CHOMP vs. GSCHOMP                cost

    experimental units (subjects)    assignment (subject allocation)
        patients                         random: drug or placebo
        users                            every user sees both
        problems                         every problem sees both

1.3 Assignment
    Between subjects: one condition per user; no biases


    Within subjects: all conditions per user; eliminates inter-user variation
        cases where within-subjects is not possible:
            giving patients both drugs: not realistic
            robot collaboration / coordination: once you finish the task, you know it
    Mixed

1.4 Operationalization of variables
    Independent variables (conditions): what you manipulate, e.g. drug type, motion type
    Dependent variables (measures): what you measure, e.g. symptoms, comfort

1.5 Hypothesis
    IV x affects DV y
    motivate by a mechanism or by analogous studies
    Better: IV x positively affects DV y
        Because optimal motion is more predictable, we hypothesize that
        H1: Optimizing motion increases user comfort.
        Because goal sets add flexibility, we hypothesize that
        H2: Including goal sets in optimization reduces the final trajectory cost.


    Hypotheses extract key insights
    [illegible]
    think about the IV -> science, not just engineering

2 What is a good experiment

2.1 Good experiments are controlled
    controlled: the experimenter assigns experimental units to treatments
    (as opposed to observational)

    e.g. 1: How does estrogen treatment affect health outcomes?
        Obs. study: 93,676 women, 8 years
            estrogen positively affected health. What is wrong?
            CONFOUND: health consciousness
        controlled study: estrogen negatively affects health

    e.g. 2: left-handedness
        Obs. study: 2000 people who died; LH die younger


            2000 people who died; LH died 9 years younger
            LH die younger? What is wrong?
            CONFOUND: LH are predominantly younger people
                equal amounts, but older people are right-handed

2.2 Good experiments avoid confounds
    confound: a variable whose effect cannot be distinguished from the effect of the IV
        e.g. health consciousness
             artificial decrease of LH
             gender
             experience with robots
             trajectory execution time
             algorithm metaparameters (e.g. optimizer step size)

    Tools for avoiding confounds:
        Randomization
            (figure: 50 subjects split at random into two groups of 25)
            Note: haphazard is not randomized
                e.g. alternate ABABAB...
                     all A in the morning, all B in the evening -> different populations
            goal: similar populations in each condition
        Within subjects: counterbalance the order
            N conditions -> N! orderings


            or use a Latin square
        Stack the cards against yourself
            e.g. baseline optimizes path length, CHOMP optimizes squared velocity
                 -> DM: path length
        Optimize the baseline
            e.g. tune its step size
            RRT: improve RRT as well in all other ways except parent selection and rewiring
            -> test the key insight
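The randomization and counterbalancing tools above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the function names, the fixed seed, and the cyclic construction of the Latin square are all assumptions made for the example.

```python
import itertools
import random


def randomize_between(subjects, conditions, seed=0):
    """Randomly split subjects into near-equal groups, one group per condition."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the split is reproducible
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    # Deal the shuffled subjects round-robin: group sizes differ by at most one.
    return {c: shuffled[i::len(conditions)] for i, c in enumerate(conditions)}


def all_orderings(conditions):
    """Full counterbalancing: all N! orders of N conditions."""
    return list(itertools.permutations(conditions))


def latin_square(conditions):
    """Cyclic Latin square: N orders, each condition once in every position."""
    n = len(conditions)
    return [[conditions[(row + col) % n] for col in range(n)] for row in range(n)]


groups = randomize_between(range(50), ["A", "B"])  # two groups of 25
orders = all_orderings(["A", "B", "C"])            # 3! = 6 orderings
square = latin_square(["A", "B", "C"])             # 3 orderings instead of 6
```

A cyclic square balances the position of each condition but not which condition precedes which, so it is a cheaper, weaker substitute for full counterbalancing.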

2.2 Good experiments are reliable
    reliability: low experimental error (low variance)

    Tools for improving reliability:
        Within subjects: exact same user, no inherent bias
        Blocking
            block: group of homogeneous experimental units
            blocking: arrangement of units into blocks
            reduces known but irrelevant sources of variation between conditions

            $\mathrm{Var}(X - Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) - 2\,\mathrm{Cov}(X, Y)$
            minimize $\mathrm{Var}(X - Y)$ by maximizing $\mathrm{Cov}(X, Y)$


            blocking factors / covariates: secondary variables that can affect the DM
        Multi-item scales: e.g. predictable, expected, surprising
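The variance identity behind blocking and within-subjects designs can be checked numerically. The sketch below uses made-up costs for two planners run on the same five problems; the data and the helper names `var` and `cov` are hypothetical, chosen only to show that a large covariance between paired measurements shrinks the variance of their difference.

```python
from statistics import mean


def var(xs):
    """Population variance (divides by N)."""
    m = mean(xs)
    return mean([(x - m) ** 2 for x in xs])


def cov(xs, ys):
    """Population covariance of two equally long samples."""
    mx, my = mean(xs), mean(ys)
    return mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])


# Hypothetical costs: both planners are run on the same five problems,
# so hard problems are expensive for both and the costs are strongly correlated.
x = [10.0, 20.0, 30.0, 40.0, 50.0]
y = [12.0, 21.0, 33.0, 41.0, 53.0]
diff = [a - b for a, b in zip(x, y)]

lhs = var(diff)                         # Var(X - Y), computed directly
rhs = var(x) + var(y) - 2 * cov(x, y)   # the identity from the notes
unpaired = var(x) + var(y)              # what Var(X - Y) would be if Cov(X, Y) = 0
```

Here `lhs` and `rhs` agree up to rounding and are far smaller than `unpaired`, which is why comparing conditions on the same user or problem is more reliable.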

2.3 Good experiments have construct validity
    construct validity: the measures actually measure what you want
        IQ test -> intelligence?
        rating of predictability -> predictability?
    Tools:
        objective + subjective measures
        [illegible]
    (figure, "Construct validity": reliable but invalid vs. reliable and valid)

2.4 Good experiments have external validity
    external validity: conclusions generalize
    Tools:


        Sample from the target population
        Apply your insight across problems, problem types, even algorithms

            goal sets         NO:  CHOMP,   STOMP,   Trajopt
                              YES: GSCHOMP, GSSTOMP, GSTrajopt

            representation    waypoints: CHOMP-WP
                              RKHS:      CHOMP-RKHS

2.5 Good experiments are factorial
    factorial: conditions = all combinations of all levels of the IVs
        e.g. GS, RKHS
    common pitfall: all changes at once
        CHOMP vs. Trajopt differ in several ways:
            optimizer order: 1st vs. 2nd
            treatment of obstacles: soft vs. hard
            obstacle representation: point / SDF vs. geometric
        factorial conditions in between, e.g. 2nd-order CHOMP, geom CHOMP


        CHOMP-WP vs. CHOMP-RKHS vs. Trajopt-RKHS
            What if RKHS helps CHOMP but not Trajopt?

    Factorials break down the changes and extract what actually matters.
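Enumerating a factorial design is mechanical once the factors and their levels are written down. The sketch below is illustrative only: the factor names and levels are hypothetical labels loosely modelled on the goal-set and representation examples in these notes.

```python
import itertools

# Hypothetical factors and levels (labels are assumptions for illustration).
factors = {
    "goal_sets": ["no", "yes"],
    "representation": ["waypoints", "RKHS"],
}


def factorial_conditions(factors):
    """Return every combination of every level of every factor."""
    names = list(factors)
    return [
        dict(zip(names, levels))
        for levels in itertools.product(*factors.values())
    ]


conditions = factorial_conditions(factors)  # 2 x 2 = 4 conditions
for condition in conditions:
    print(condition)
```

Running every cell, instead of only "old method" versus "new method with everything changed", is what lets the analysis attribute an improvement to one factor or to an interaction.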

Recap
    treatment, response, experimental units, assignment
    IVs (manipulated variables), dependent measures, subject allocation
    controlled, not confounded, reliable, valid (construct + external), factorial

3 Analysis starting point

3.1 1 IV, 2 levels, within subjects
    (table: MP problem, level, cost, and the per-problem difference D in cost; values illegible)


    t-test:
        $t = \dfrac{\bar{D}}{s_D / \sqrt{N}}$
        $\bar{D}$: sample mean, $s_D$: sample SD, $N$: sample size
    p-value: probability of obtaining a result equal to or more extreme than what was actually observed, when the null hypothesis is true
        artificial threshold for significance: p < .05
        $\bar{D}$ larger -> more confident
        $N$ larger -> more confident
        $s_D$ larger -> less confident
    if the DV is binary / categorical: chi-squared ($\chi^2$)
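As a small worked example of the within-subjects t statistic, the sketch below computes t from per-problem cost differences. The cost values and the function name `paired_t` are hypothetical. Turning t into a p-value needs the t distribution with N - 1 degrees of freedom, which the Python standard library does not provide, so that step is left to a statistics package (for instance SciPy's `ttest_rel`).

```python
import math
from statistics import mean, stdev


def paired_t(cost_a, cost_b):
    """t statistic and degrees of freedom for paired samples."""
    diffs = [a - b for a, b in zip(cost_a, cost_b)]
    n = len(diffs)
    d_bar = mean(diffs)   # sample mean of the differences
    s_d = stdev(diffs)    # sample SD (divides by n - 1)
    t = d_bar / (s_d / math.sqrt(n))
    return t, n - 1


# Hypothetical costs of two planners on the same four problems.
baseline = [5.0, 6.0, 7.0, 8.0]
proposed = [4.0, 4.0, 6.0, 6.0]

t, df = paired_t(baseline, proposed)
print(f"t = {t:.3f}, df = {df}")
```

The three levers in the notes are visible in the formula: a larger mean difference or a larger N raises t, while a larger spread of the differences lowers it.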

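3.2 1 IV, 2 levels, between subjects
    t-test:
        $t = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_1^2 / N_1 + s_2^2 / N_2}}$

3.3 2 IVs, 2 levels each, between subjects
    4 conditions -> 6 t-tests? Multiple comparisons problem
        each comparison: $P(\text{error}) = .05$
        $P(\geq 1 \text{ error in 100 comparisons})$
            $= 1 - P(\text{always correct})$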


            $= 1 - P(\text{correct})^{100}$, treating the comparisons as independent
            $= 1 - (95/100)^{100} = 99.41\%$
    conservative correction: Bonferroni ($\alpha$ / number of comparisons)
    better: Tukey comparisons
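The multiple-comparisons arithmetic above is easy to reproduce. The sketch below assumes independent comparisons, as the notes do, and uses hypothetical function names; it also shows the Bonferroni-adjusted threshold for the six pairwise tests among four conditions.

```python
def familywise_error(alpha, m):
    """P(at least one false positive in m independent tests at level alpha)."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_alpha(alpha, m):
    """Per-comparison threshold that keeps the familywise error at most alpha."""
    return alpha / m


fwer_100 = familywise_error(0.05, 100)  # the 100-comparison case from the notes
fwer_6 = familywise_error(0.05, 6)      # 6 pairwise t-tests among 4 conditions
alpha_6 = bonferroni_alpha(0.05, 6)     # threshold to use for each of the 6 tests

print(f"{fwer_100:.4f} {fwer_6:.4f} {alpha_6:.4f}")
```

Even with only six comparisons the chance of at least one spurious "significant" result is roughly one in four, which is why the correction matters in small factorial studies too.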

Factorial designs OR an IV with > 2 levels: ANOVA (Analysis of Variance)
    precursor to multiple comparisons
    main effects: IV1, IV2
    interaction effects: IV1 x IV2

    indicator variables: a = 1 if IV1 = 1, b = 1 if IV2 = 1
    $\text{response} = \beta_a a + \beta_b b + \beta_{ab}\, ab + \dots$
    add user / problem as a variable for within subjects

    useful in robotics when there are multiple factors, or one factor with multiple levels

    Tools: JMP, Minitab
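To make the indicator-variable model concrete, the sketch below fits a saturated 2 x 2 model directly from cell means. Several things here are assumptions rather than part of the notes: the 0/1 coding of a and b, the added intercept term, the function name `fit_2x2`, and the response values, which are invented. It estimates coefficients only; the significance tests that ANOVA adds are left to a statistics package.

```python
from statistics import mean


def fit_2x2(cells):
    """Coefficients of response = b0 + ba*a + bb*b + bab*a*b with 0/1 coding.

    `cells` maps (a, b) to the list of responses observed in that condition.
    With four cell means and four coefficients the fit is exact.
    """
    m = {key: mean(values) for key, values in cells.items()}
    b0 = m[(0, 0)]                                        # baseline cell
    ba = m[(1, 0)] - m[(0, 0)]                            # effect of a when b = 0
    bb = m[(0, 1)] - m[(0, 0)]                            # effect of b when a = 0
    bab = m[(1, 1)] - m[(1, 0)] - m[(0, 1)] + m[(0, 0)]   # interaction
    return b0, ba, bb, bab


# Hypothetical responses (e.g. trajectory cost) in the four conditions.
cells = {
    (0, 0): [10.0, 12.0],
    (1, 0): [8.0, 8.0],
    (0, 1): [9.0, 11.0],
    (1, 1): [4.0, 6.0],
}

b0, ba, bb, bab = fit_2x2(cells)
print(b0, ba, bb, bab)
```

A nonzero interaction coefficient means the effect of one IV depends on the level of the other. Note that with 0/1 coding `ba` and `bb` are effects at the other factor's baseline level, not the averaged main effects an ANOVA table reports.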