experiment design - colorado state university

Experiment Design, CS 510 Lecture #26, April 28th, 2014


Page 1: Experiment Design - Colorado State University

Experiment Design

CS 510 Lecture #26

April 28th, 2014

Page 2: Experiment Design - Colorado State University

PA4
•  Task:
  – Conduct an experiment
    •  Ablation
    •  Replacement
    •  Sensitivity
  – Write a report
    •  8 pages max, IEEE conference format
•  Due May 9th

4/30/14  CS 510, Image Computation, ©Ross Beveridge & Bruce Draper

Page 3: Experiment Design - Colorado State University

PA4 Details: Report Outline
•  Six sections:
  1.  Abstract
  2.  Introduction
  3.  Prior Work
  4.  Methodology
  5.  Experimental Results
  6.  Conclusion & Future Work
•  Note: this is the basic outline of (almost) every CS paper


Page 4: Experiment Design - Colorado State University

Introduction
•  Hardest part of the paper to write. It covers:
  – What question are you trying to answer?
  – Why is this question important?
  – What is the context?
  – How will the question be answered?
  – What (briefly) will the reader learn?
•  Assume the reader is a computer scientist
•  Total length: 1 page or less


Page 5: Experiment Design - Colorado State University

Prior Work
•  Your experiment depends on your PA3 system
  – Describe it
  – Focus on the issues relevant to the experiment
•  In a “real” paper, you would also cover other related work
•  Describe how your paper adds to the field


Page 6: Experiment Design - Colorado State University

Methodology

•  Describe your experiment design
  – Goal: what question is being answered
  – Input/Output
    •  Training Data
    •  Test Data
    •  Output
  – Performance Metric(s)


Page 7: Experiment Design - Colorado State University

Experimental Results
•  Describe results of experiment
  – In text
  – In figures/plots
  – With numbers (tables)
•  Interpret results for the reader
  – Present your conclusions
  – Link them to data
  – Hypothesize reasons, if appropriate


Page 8: Experiment Design - Colorado State University

Conclusion & Future Work
•  Conclusion
  – Remind reader of goal
  – Remind reader why it's important
  – Briefly restate conclusion with key supporting data
  – 1 paragraph (sometimes 2)
•  Future Work
  – Describe the next experiment
  – 1 paragraph


Page 9: Experiment Design - Colorado State University

Abstract
•  Comes first, but write it last
•  Max 2 paragraphs
  – 1 paragraph is better
•  Summarizes the whole paper
  – What is the goal
  – Why is it important
  – How is it tested
  – What are the results
  – What is the conclusion
•  OK to grab sentences from introduction & conclusion


Page 10: Experiment Design - Colorado State University

Experiment Design
•  Experiment design is goal driven
  – What are you trying to show?
  – Formally: what is your hypothesis?
•  In support of that, choose
  – Training data
  – Test data
  – Ground truth data
  – Methodology
  – Performance metrics


Page 11: Experiment Design - Colorado State University

Rules of Design

•  Select data
  – Enough to prove your point
  – Not more than you can process
•  Never test on training data
  – Partition training & test data
    •  No overlapping data
    •  Overlapping actions?
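One way to honor the "never test on training data" rule is to split the sample IDs once, up front, and check that the two sides are disjoint. The sketch below is only an illustration: `split_train_test`, `test_fraction`, and `seed` are hypothetical names, not part of the assignment. It also does not settle the "overlapping actions?" question; whether the same action class may appear on both sides is a design decision to state in the report.

```python
import random


def split_train_test(sample_ids, test_fraction=0.25, seed=0):
    """Shuffle sample IDs and split them into disjoint train and test lists."""
    # Drop duplicate IDs so one sample cannot land on both sides.
    ids = sorted(set(sample_ids))
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(ids)
    n_test = max(1, round(len(ids) * test_fraction))
    return ids[n_test:], ids[:n_test]


train, test = split_train_test(range(20))
# The rule from the slide: no sample may appear in both partitions.
assert set(train).isdisjoint(test)
```

Fixing the seed means the same split can be reported and reused across experimental conditions.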


Page 12: Experiment Design - Colorado State University

Data Analysis
•  Quantitative over qualitative
•  Never make a statement that is not supported by data
•  Keep context in mind
  – You experimented with a single system
  – How far do the results extend?
  – You can expand the reach with small sensitivity studies
    •  E.g. we repeated the experiment for multiple codebook sizes…


Page 13: Experiment Design - Colorado State University

Review: ROC Curve

[Figure: ROC curve. Horizontal axis: False Positive Percentage (0% to 100%). Vertical axis: True Positive Percentage (0% to 100%). One region of the plot is labeled "Better than random".]

Page 14: Experiment Design - Colorado State University

Review: Computing ROCs
•  For every test sample, generate a (score, label) pair
  – Score is the similarity score
  – Label is true/false based on ground truth
•  Sort pairs based on scores
  – Descending order for similarity scores
  – Ascending order for distance scores
•  Put a threshold between every pair of adjacent non-equal scores
•  For every threshold,
  – compute the true positive percent
  – compute the false positive percent
  – plot the point


Score   Label
0.12    T
0.13    T
0.37    F
1.01    T
1.04    F
1.05    T
1.27    F
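The ROC procedure can be sketched in a few lines of Python. This is a minimal illustration, not the course's reference code: the function names `roc_points` and `auc` are made up here, rates are returned as fractions rather than percentages, and the scores in the table are assumed to be distances (smaller means a better match), since they are listed in ascending order. The sketch also assumes the data contains at least one true and one false label.

```python
def roc_points(pairs, higher_is_match=False):
    """Return (false positive rate, true positive rate) points, one per threshold.

    pairs: list of (score, label); label True means a true match.
    higher_is_match: True for similarity scores, False for distance scores.
    """
    # Descending order for similarity scores, ascending for distance scores.
    ordered = sorted(pairs, key=lambda p: p[0], reverse=higher_is_match)
    n_pos = sum(1 for _, label in ordered if label)
    n_neg = len(ordered) - n_pos
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ordered):
        if label:
            tp += 1
        else:
            fp += 1
        # Place a threshold only between non-equal scores (or after the last one).
        is_last = i == len(ordered) - 1
        if is_last or ordered[i + 1][0] != score:
            points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the curve by the trapezoid rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# The (score, label) table from the slide, read as distance scores.
pairs = [(0.12, True), (0.13, True), (0.37, False), (1.01, True),
         (1.04, False), (1.05, True), (1.27, False)]
points = roc_points(pairs)
print(points)
print(auc(points))
```

For similarity scores, pass `higher_is_match=True` so the pairs are sorted in descending order instead.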

Page 15: Experiment Design - Colorado State University

Data Analysis: Uncertainty

•  If you repeated the experiment with different data, would you get the same result?

•  Basic Approaches:
  – Run the experiment multiple times
  – Significance testing


This is a complex topic. I am just going to present two simple methods. There are many more.

Page 16: Experiment Design - Colorado State University

N-fold Cross Validation

•  Divide your data into N partitions
•  For each run:
  – Train on N-1 partitions
  – Test on the remaining partition
•  Repeat N times with different test partitions
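The N-fold procedure can be sketched as a small generator. This is an illustrative sketch under assumptions: `n_fold_splits` is a hypothetical helper, and the sample list is assumed to be shuffled already so that each fold is a fair mix of the data.

```python
def n_fold_splits(samples, n_folds):
    """Yield (train, test) lists; each sample is tested exactly once."""
    # Deal the samples into n_folds partitions, round-robin.
    folds = [samples[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        test = folds[k]
        # Train on the other N-1 partitions.
        train = [s for j, fold in enumerate(folds) if j != k for s in fold]
        yield train, test


for train, test in n_fold_splits(list(range(10)), n_folds=5):
    # Train and test never share a sample within a run.
    assert set(train).isdisjoint(test)
    print(len(train), len(test))
```

Each run yields one ROC curve and one AUC value, so N folds give N measurements per experimental condition.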


Page 17: Experiment Design - Colorado State University

Analyzing Cross Validation Results
•  AUC (area under the ROC curve) is a scalar
  – Compute mean, st. dev., min, max
•  ROCs are curves
  – Compute bounding curve
  – Min & max score for every false positive %
•  Remember
  – You are not comparing one cross-validation run to another
  – You are comparing sets of runs for two experimental conditions
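A minimal sketch of both summaries follows. The helper names are hypothetical, and the AUC numbers are made-up placeholders, not results from any experiment. The bounding curve is interpreted here as the minimum and maximum true positive rate at each false positive rate, which assumes every fold's curve has been sampled on the same false positive grid.

```python
from statistics import mean, stdev


def summarize_auc(aucs):
    """Summary statistics for the AUC values of one experimental condition."""
    return {"mean": mean(aucs), "stdev": stdev(aucs),
            "min": min(aucs), "max": max(aucs)}


def bounding_curves(fold_curves):
    """Lower and upper envelope of several ROC curves.

    fold_curves: one list of true positive rates per fold, all sampled
    at the same false positive rates (an assumption of this sketch).
    """
    lower = [min(column) for column in zip(*fold_curves)]
    upper = [max(column) for column in zip(*fold_curves)]
    return lower, upper


# Placeholder AUCs for two conditions (illustrative values only).
condition_a = [0.80, 0.82, 0.78, 0.81, 0.79]
condition_b = [0.70, 0.75, 0.72, 0.74, 0.69]
print(summarize_auc(condition_a))
print(summarize_auc(condition_b))
```

Reporting both summaries side by side compares the two sets of runs, rather than any single fold against another.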


Page 18: Experiment Design - Colorado State University

Significance Testing
•  This is a big topic in statistics
•  One simple example: McNemar’s test
  – Two algorithms
  – Run on the same data
  – Returning true/false for every sample
    •  E.g. pick an operating point on the ROC curve
  – Answers whether one is significantly better than another, based on sample size


Page 19: Experiment Design - Colorado State University

McNemar’s Test

                       Algorithm A
                       True     False
Algorithm B   True      a        b
              False     c        d

$$\chi^2 = \frac{(b - c)^2}{b + c}$$

Page 20: Experiment Design - Colorado State University

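χ² → p
•  χ² increases when the algorithms disagree on many samples and the disagreements favor one algorithm (b and c are far apart)
•  χ² is small when the algorithms make similar mistakes, or when their disagreements are evenly split (b ≈ c)
•  p is the probability that a difference this large between the two algorithms would arise by chance
•  Statistics calculators convert χ² and the degrees of freedom (1 for McNemar’s test) to p
  – http://www.socscistatistics.com/pvalues/chidistribution.aspx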
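As an alternative to an online calculator, the conversion can be sketched with the Python standard library. This is a hedged illustration: `mcnemar` is a hypothetical helper, it uses the slide's formula without a continuity correction, and it relies on the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal variable, so p = erfc(sqrt(χ²/2)). The counts `b` and `c` are the two cells where the algorithms disagree.

```python
import math


def mcnemar(b, c):
    """Return (chi2, p) from the two discordant counts of the 2x2 table."""
    if b + c == 0:
        # The algorithms never disagree, so there is no evidence of a difference.
        return 0.0, 1.0
    chi2 = (b - c) ** 2 / (b + c)
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Illustrative counts only: the algorithms disagree on 12 samples, split 10 to 2.
chi2, p = mcnemar(10, 2)
print(chi2, p)
```

A small p (commonly below 0.05) suggests the imbalance between b and c is unlikely to be due to chance alone.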


Page 21: Experiment Design - Colorado State University

Let’s Design an Experiment

•  Let’s get started:
  – Describe your system
  – Brainstorm hypotheses

•  Now let’s design an experiment…
