
Empirical Research Methods in Computer Science

Lecture 2, Part 1. October 19, 2005. Noah Smith

Some tips

Perl scripts can be named encode instead of encode.pl.
Note that ./encode foo ≢ ./encode < foo.
Make the script executable: chmod u+x encode
Instead of making us run "java Encode", write a shell script:

    #!/bin/sh
    cd `dirname $0`
    java Encode

Check that it works on (say) ugrad10.

Assignment 1

If you didn't turn in a first version yesterday, don't bother; just turn in the final version.
Final version due Tuesday 10/25, 8pm.
We will post a few exercises soon. Questions?

Today

Standard error
Bootstrap for standard error
Confidence intervals
Hypothesis testing

Notation

P is a population. S = [s1, s2, …, sn] is a sample from P.
Let X = [x1, x2, …, xn] be some numerical measurement on the si, distributed over P according to an unknown distribution F.
We may use Y, Z for other measurements.

Mean

What does "mean" mean? μx is the population mean of x (it depends on F).
μx is in general unknown.
How do we estimate the mean? With the sample mean:

\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i
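Not from the slides, but as a quick sketch, the sample mean translates to a couple of lines of Python (the data values here are hypothetical):

```python
def sample_mean(xs):
    # Sample mean: (1/n) * sum of the n measurements.
    return sum(xs) / len(xs)

data = [3.0, 2.8, 3.7, 3.4, 3.5]  # hypothetical measurements
print(sample_mean(data))
```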

Gzip compression rate

[Figure: histogram of gzip compression rates; usually < 1, but not always]

Accuracy

How good an estimate is the sample mean?
Standard error (se) of a statistic: we picked one S from P. How would x̄ vary if we picked a lot of samples from P? There is some "true" se value.

Extreme cases:
n → ∞: the sample mean converges to μx, so the se goes to 0.
n = 1: the se is just σx, the population standard deviation.

Standard error (of the sample mean)

This one is known in closed form. "Standard error" = standard deviation of a statistic.

se(\bar{x}) = \frac{\sigma_x}{\sqrt{n}}

where σx is the true standard deviation of x under F.

[Figure: gzip compression rate example]

Central Limit Theorem

The sampling distribution of the sample mean approaches a normal distribution as n increases:

\bar{x} \approx N\left(\mu_x, \frac{\sigma_x^2}{n}\right)
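A small simulation (my own sketch, not from the slides) can make the theorem concrete: draw many samples from a skewed population and check that the spread of the sample means shrinks like σ/√n.

```python
import random
import statistics

random.seed(0)
n, trials = 100, 2000

# Population: exponential with mean 1 and standard deviation 1 (skewed,
# clearly non-normal). Draw `trials` independent samples of size n and
# record each sample's mean.
means = [statistics.fmean(random.expovariate(1.0) for _ in range(n))
         for _ in range(trials)]

# By the CLT the means are approximately N(1, 1/n), so their standard
# deviation should be close to 1/sqrt(n) = 0.1.
print(statistics.fmean(means), statistics.stdev(means))
```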

How to estimate σx

Use the "plug-in principle":

\hat{\sigma} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2}

Therefore:

\hat{se}(\bar{x}) = \frac{\hat{\sigma}}{\sqrt{n}} = \frac{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}}{n}
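The two formulas above translate directly into code; a sketch with hypothetical data (note the 1/n inside the square root, per the plug-in principle, rather than 1/(n-1)):

```python
import math

def se_hat(xs):
    # Plug-in estimate of the standard error of the sample mean:
    # sigma_hat / sqrt(n), with sigma_hat = sqrt((1/n) * sum((x_i - xbar)^2)).
    n = len(xs)
    xbar = sum(xs) / n
    sigma_hat = math.sqrt(sum((x - xbar) ** 2 for x in xs) / n)
    return sigma_hat / math.sqrt(n)

data = [3.0, 2.8, 3.7, 3.4, 3.5]  # hypothetical measurements
print(se_hat(data))
```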

Plug-in principle

We don't have (and can't get) P. We don't know F, the true distribution over X.
We do have S (the sample), and we do know the empirical distribution F̂ over X.
To estimate a statistic, use F̂ in place of F.

Good and Bad News

Good news: we have a formula to estimate the standard error of the sample mean.
Bad news: it is only for the sample mean. What about the variance, the median, a trimmed mean, a ratio of means of x and y, the correlation between x and y?

Bootstrap world

Real world:      unknown distribution F  →  observed random sample X  →  statistic of interest θ̂ = s(X)
Bootstrap world: empirical distribution F̂  →  bootstrap random sample X*  →  bootstrap replication θ̂* = s(X*)

From the replications we get statistics about the estimate (e.g., its standard error).

Bootstrap sample

X = [3.0, 2.8, 3.7, 3.4, 3.5]. X* could be:
[2.8, 3.4, 3.7, 3.4, 3.5]
[3.5, 3.0, 3.4, 2.8, 3.7]
[3.5, 3.5, 3.4, 3.0, 2.8]
Draw n elements with replacement.
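Drawing a bootstrap sample is a one-liner; this sketch uses Python's random module (the data values are hypothetical):

```python
import random

def bootstrap_sample(xs, rng=random):
    # Draw len(xs) elements from xs uniformly at random, WITH replacement,
    # so the same element can appear more than once.
    return [rng.choice(xs) for _ in xs]

X = [3.0, 2.8, 3.7, 3.4, 3.5]
print(bootstrap_sample(X))
```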

Reflection

Imagine doing this with a pencil and paper.
The bootstrap was born in 1979. Typically sampling is costly and computation is cheap.
In (empirical) CS, sampling isn't even necessarily all that costly.

Bootstrap estimate of se

Let s(·) be a function for computing an estimate θ̂ = s(X).
True value of the standard error: se_F(θ̂).
Ideal bootstrap estimate: se_F̂(θ̂*).
Bootstrap estimate with B bootstrap samples: ŝe_B.

Bootstrap estimate of se

\hat{se}_B = \sqrt{\frac{1}{B - 1} \sum_{i=1}^{B} \left( \hat{\theta}^*[i] - \hat{\theta}^*(\cdot) \right)^2}

where θ̂*(·) is the mean of the B replications, and

\lim_{B \to \infty} \hat{se}_B = se_{\hat{F}}
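Putting the pieces together, a sketch of ŝe_B for an arbitrary statistic s(·); function and variable names here are my own:

```python
import math
import random

def bootstrap_se(xs, stat, B=200, rng=random):
    # Bootstrap estimate of the standard error of stat(xs): draw B bootstrap
    # samples (with replacement), compute the statistic on each replication,
    # and take the standard deviation of the replications (B - 1 denominator).
    reps = [stat([rng.choice(xs) for _ in xs]) for _ in range(B)]
    mean_rep = sum(reps) / B
    return math.sqrt(sum((r - mean_rep) ** 2 for r in reps) / (B - 1))

random.seed(0)
data = [3.0, 2.8, 3.7, 3.4, 3.5]  # hypothetical measurements
# For the sample mean this should land near the closed-form estimate.
print(bootstrap_se(data, lambda s: sum(s) / len(s)))
```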

Bootstrap intuitively

We don't know F. We would like lots of samples from P, but we only have one (S).
We approximate F by F̂ (the plug-in principle).
It is easy to generate lots of "samples" from F̂.

[Figures: histograms of bootstrap replications of the mean compression rate for B = 25, B = 50, and B = 200]

Correlation (another statistic)

Population P, sample S. Two values xi and yi for each element of the sample.
Correlation coefficient ρ; sample correlation coefficient:

r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \; \sum_{i=1}^{n} (y_i - \bar{y})^2}}
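The formula for r, line by line, as a sketch (the paired data here are hypothetical):

```python
import math

def sample_correlation(xs, ys):
    # Sample correlation coefficient:
    # r = sum((x - xbar)(y - ybar)) / sqrt(sum((x - xbar)^2) * sum((y - ybar)^2))
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    num = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - xbar) ** 2 for x in xs) *
                    sum((y - ybar) ** 2 for y in ys))
    return num / den

xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.1, 1.9, 3.2, 3.8]  # hypothetical paired measurements
print(sample_correlation(xs, ys))
```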

Example: gzip compression

r = 0.9616

Accuracy of r

There is no general closed form for se(r). If we assume x and y are bivariate Gaussian:

se_{normal}(r) = \frac{1 - r^2}{\sqrt{n - 3}}

[Figure: se_normal(r) plotted as a function of r (from -1 to 1) and n (from 10 to 100)]

Normality

Why assume the data are Gaussian? An alternative is the bootstrap estimate of the standard error of r:

\hat{se}_B(r) = \sqrt{\frac{1}{B - 1} \sum_{i=1}^{B} \left( r^*[i] - r^*(\cdot) \right)^2}
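A sketch comparing the two estimates (names and data are my own, not from the slides); the key point is that bootstrap resampling must keep each (x, y) pair together:

```python
import math
import random

def corr(pairs):
    # Sample correlation of a list of (x, y) pairs.
    n = len(pairs)
    xbar = sum(x for x, _ in pairs) / n
    ybar = sum(y for _, y in pairs) / n
    num = sum((x - xbar) * (y - ybar) for x, y in pairs)
    den = math.sqrt(sum((x - xbar) ** 2 for x, _ in pairs) *
                    sum((y - ybar) ** 2 for _, y in pairs))
    return num / den

def se_normal(r, n):
    # Normal-theory formula from the slide: (1 - r^2) / sqrt(n - 3).
    return (1 - r ** 2) / math.sqrt(n - 3)

def bootstrap_se_r(pairs, B=200, rng=random):
    # Resample whole (x, y) pairs with replacement, so each replication
    # preserves the pairing between the two measurements.
    reps = [corr([rng.choice(pairs) for _ in pairs]) for _ in range(B)]
    mean_rep = sum(reps) / B
    return math.sqrt(sum((r - mean_rep) ** 2 for r in reps) / (B - 1))

random.seed(0)
pairs = [(x, 1.1 * x + random.gauss(0.0, 0.3)) for x in range(1, 11)]  # hypothetical
r = corr(pairs)
print(r, se_normal(r, len(pairs)), bootstrap_se_r(pairs))
```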

Example: gzip compression

r = 0.9616
se_normal(r) = 0.0024
se_200(r) = 0.0298

se bootstrap advice

Plot the data. Consider the runtime. Efron and Tibshirani:
B = 25 is informative; B = 50 is often enough; you seldom need B > 200 (for se).

Summary so far

A statistic is a "true fact" about the distribution F. We don't know F.
For some parameter θ we want: an estimate θ̂, and the accuracy of that estimate (e.g., its standard error).
For the mean μ we have a closed form. For other θ, the bootstrap will help.


error) For the mean μ we have a closed

form For other θ the bootstrap will help

Extreme cases

n rarr infin

n = 1

Standard error (of the sample mean)

Known

ldquoStandard errorrdquo = standard deviation of a statistic

n)x(se x

true standard deviation of x under F

Gzip compression rate

Central Limit Theorem

The sampling distribution of the sample mean approaches a normal distribution as n increases

nμx

2xN

How to estimate σx

ldquoPlug-in principlerdquo

Therefore

n

1i

2i xx

n1

ˆ

n

1i

2

i

nxx

xse

Plug-in principle

We donrsquot have (and canrsquot get) P We donrsquot know F the true distribution

over X We do have S (the sample)

We do know the sample distribution over X

Estimating a statistic use for F

F

F

Good and Bad News

We have a formula to estimate the standard error of the sample mean

We have a formula to estimate only the standard error of the sample mean variance median trimmed mean ratio of means of x and y correlation between x and y

Bootstrap world

unknown distribution F

observed random sample X

statistic of interest )X(sˆ

empirical distribution

bootstrap random sample X

bootstrap replication )X(sˆ

F

statistics about the estimate (eg standard error)

Bootstrap sample

X = [30 28 37 34 35] X could be

[28 34 37 34 35] [35 30 34 28 37] [35 35 34 30 28]

Draw n elements with replacement

Reflection

Imagine doing this with a pencil and paper

The bootstrap was born in 1979 Typically sampling is costly and

computation is cheap In (empirical) CS sampling isnrsquot even

necessarily all that costly

Bootstrap estimate of se

Let s() be a function for computing an estimate

True value of the standard error Ideal bootstrap estimate Bootstrap estimate with B boostrap

samples

seF

FF

seˆse

BB seˆse

Bootstrap estimate of se

B

1i

2

B1B

ˆ]i[ˆˆse

FBB

seselim

Bootstrap intuitively

We donrsquot know F We would like lots of samples from P

but we only have one (S) We approximate F by

Plug-in principle Easy to generate lots of ldquosamplesrdquo

from

F

F

B = 25 (mean compression)

B = 50 (mean compression)

B = 200 (mean compression)

Correlation (another statistic)

Population P sample S Two values xi and yi for each element

of the sample Correlation coefficient ρ Sample correlation coefficient

n

1i

2i

n

1i

2i

n

1iii

yyxx

yyxxr

Example gzip compression

r = 09616

Accuracy of r

No general closed form for se(r) If we assume x and y are bivariate

Gaussian

3n

r1)r(se

2

normal

-1-05

005

110

2030

4050

6070

8090

100

-05

0

05

1

senormal

rn

senormal

Normality

Why assume the data are Gaussian

Alternative bootstrap estimate of the standard error of r

B

1i

2

B1B

r]i[rrse

Example gzip compression

r = 09616

senormal(r) = 00024

se200(r) = 00298

se bootstrap advice

Plot the data Runtime Efron and Tibshirani

B = 25 is informative B = 50 often enough seldom need B gt 200 (for se)

Summary so far

A statistic is a ldquotrue factrdquo about the distribution F

We donrsquot know F For some parameter θ we want

estimate ldquoθ hatrdquo accuracy of that estimate (eg standard

error) For the mean μ we have a closed

form For other θ the bootstrap will help

Standard error (of the sample mean)

Known

ldquoStandard errorrdquo = standard deviation of a statistic

n)x(se x

true standard deviation of x under F

Gzip compression rate

Central Limit Theorem

The sampling distribution of the sample mean approaches a normal distribution as n increases

nμx

2xN

How to estimate σx

ldquoPlug-in principlerdquo

Therefore

n

1i

2i xx

n1

ˆ

n

1i

2

i

nxx

xse

Plug-in principle

We donrsquot have (and canrsquot get) P We donrsquot know F the true distribution

over X We do have S (the sample)

We do know the sample distribution over X

Estimating a statistic use for F

F

F

Good and Bad News

We have a formula to estimate the standard error of the sample mean

We have a formula to estimate only the standard error of the sample mean variance median trimmed mean ratio of means of x and y correlation between x and y

Bootstrap world

unknown distribution F

observed random sample X

statistic of interest )X(sˆ

empirical distribution

bootstrap random sample X

bootstrap replication )X(sˆ

F

statistics about the estimate (eg standard error)

Bootstrap sample

X = [30 28 37 34 35] X could be

[28 34 37 34 35] [35 30 34 28 37] [35 35 34 30 28]

Draw n elements with replacement

Reflection

Imagine doing this with a pencil and paper

The bootstrap was born in 1979 Typically sampling is costly and

computation is cheap In (empirical) CS sampling isnrsquot even

necessarily all that costly

Bootstrap estimate of se

Let s() be a function for computing an estimate

True value of the standard error Ideal bootstrap estimate Bootstrap estimate with B boostrap

samples

seF

FF

seˆse

BB seˆse

Bootstrap estimate of se

B

1i

2

B1B

ˆ]i[ˆˆse

FBB

seselim

Bootstrap intuitively

We donrsquot know F We would like lots of samples from P

but we only have one (S) We approximate F by

Plug-in principle Easy to generate lots of ldquosamplesrdquo

from

F

F

B = 25 (mean compression)

B = 50 (mean compression)

B = 200 (mean compression)

Correlation (another statistic)

Population P sample S Two values xi and yi for each element

of the sample Correlation coefficient ρ Sample correlation coefficient

n

1i

2i

n

1i

2i

n

1iii

yyxx

yyxxr

Example gzip compression

r = 09616

Accuracy of r

No general closed form for se(r) If we assume x and y are bivariate

Gaussian

3n

r1)r(se

2

normal

-1-05

005

110

2030

4050

6070

8090

100

-05

0

05

1

senormal

rn

senormal

Normality

Why assume the data are Gaussian

Alternative bootstrap estimate of the standard error of r

B

1i

2

B1B

r]i[rrse

Example gzip compression

r = 09616

senormal(r) = 00024

se200(r) = 00298

se bootstrap advice

Plot the data Runtime Efron and Tibshirani

B = 25 is informative B = 50 often enough seldom need B gt 200 (for se)

Summary so far

A statistic is a ldquotrue factrdquo about the distribution F

We donrsquot know F For some parameter θ we want

estimate ldquoθ hatrdquo accuracy of that estimate (eg standard

error) For the mean μ we have a closed

form For other θ the bootstrap will help

Gzip compression rate

Central Limit Theorem

The sampling distribution of the sample mean approaches a normal distribution as n increases

nμx

2xN

How to estimate σx

ldquoPlug-in principlerdquo

Therefore

n

1i

2i xx

n1

ˆ

n

1i

2

i

nxx

xse

Plug-in principle

We donrsquot have (and canrsquot get) P We donrsquot know F the true distribution

over X We do have S (the sample)

We do know the sample distribution over X

Estimating a statistic use for F

F

F

Good and Bad News

We have a formula to estimate the standard error of the sample mean

We have a formula to estimate only the standard error of the sample mean variance median trimmed mean ratio of means of x and y correlation between x and y

Bootstrap world

unknown distribution F

observed random sample X

statistic of interest )X(sˆ

empirical distribution

bootstrap random sample X

bootstrap replication )X(sˆ

F

statistics about the estimate (eg standard error)

Bootstrap sample

X = [30 28 37 34 35] X could be

[28 34 37 34 35] [35 30 34 28 37] [35 35 34 30 28]

Draw n elements with replacement

Reflection

Imagine doing this with a pencil and paper

The bootstrap was born in 1979 Typically sampling is costly and

computation is cheap In (empirical) CS sampling isnrsquot even

necessarily all that costly

Bootstrap estimate of se

Let s() be a function for computing an estimate

True value of the standard error Ideal bootstrap estimate Bootstrap estimate with B boostrap

samples

seF

FF

seˆse

BB seˆse

Bootstrap estimate of se

B

1i

2

B1B

ˆ]i[ˆˆse

FBB

seselim

Bootstrap intuitively

We donrsquot know F We would like lots of samples from P

but we only have one (S) We approximate F by

Plug-in principle Easy to generate lots of ldquosamplesrdquo

from

F

F

B = 25 (mean compression)

B = 50 (mean compression)

B = 200 (mean compression)

Correlation (another statistic)

Population P sample S Two values xi and yi for each element

of the sample Correlation coefficient ρ Sample correlation coefficient

n

1i

2i

n

1i

2i

n

1iii

yyxx

yyxxr

Example gzip compression

r = 09616

Accuracy of r

No general closed form for se(r) If we assume x and y are bivariate

Gaussian

3n

r1)r(se

2

normal

-1-05

005

110

2030

4050

6070

8090

100

-05

0

05

1

senormal

rn

senormal

Normality

Why assume the data are Gaussian

Alternative bootstrap estimate of the standard error of r

B

1i

2

B1B

r]i[rrse

Example gzip compression

r = 09616

senormal(r) = 00024

se200(r) = 00298

se bootstrap advice

Plot the data Runtime Efron and Tibshirani

B = 25 is informative B = 50 often enough seldom need B gt 200 (for se)

Summary so far

A statistic is a ldquotrue factrdquo about the distribution F

We donrsquot know F For some parameter θ we want

estimate ldquoθ hatrdquo accuracy of that estimate (eg standard

error) For the mean μ we have a closed

form For other θ the bootstrap will help

Central Limit Theorem

The sampling distribution of the sample mean approaches a normal distribution as n increases

nμx

2xN

How to estimate σx

ldquoPlug-in principlerdquo

Therefore

n

1i

2i xx

n1

ˆ

n

1i

2

i

nxx

xse

Plug-in principle

We donrsquot have (and canrsquot get) P We donrsquot know F the true distribution

over X We do have S (the sample)

We do know the sample distribution over X

Estimating a statistic use for F

F

F

Good and Bad News

We have a formula to estimate the standard error of the sample mean

We have a formula to estimate only the standard error of the sample mean variance median trimmed mean ratio of means of x and y correlation between x and y

Bootstrap world

unknown distribution F

observed random sample X

statistic of interest )X(sˆ

empirical distribution

bootstrap random sample X

bootstrap replication )X(sˆ

F

statistics about the estimate (eg standard error)

Bootstrap sample

X = [30 28 37 34 35] X could be

[28 34 37 34 35] [35 30 34 28 37] [35 35 34 30 28]

Draw n elements with replacement

Reflection

Imagine doing this with a pencil and paper

The bootstrap was born in 1979 Typically sampling is costly and

computation is cheap In (empirical) CS sampling isnrsquot even

necessarily all that costly

Bootstrap estimate of se

Let s() be a function for computing an estimate

True value of the standard error Ideal bootstrap estimate Bootstrap estimate with B boostrap

samples

seF

FF

seˆse

BB seˆse

Bootstrap estimate of se

B

1i

2

B1B

ˆ]i[ˆˆse

FBB

seselim

Bootstrap intuitively

We donrsquot know F We would like lots of samples from P

but we only have one (S) We approximate F by

Plug-in principle Easy to generate lots of ldquosamplesrdquo

from

F

F

B = 25 (mean compression)

B = 50 (mean compression)

B = 200 (mean compression)

Correlation (another statistic)

Population P sample S Two values xi and yi for each element

of the sample Correlation coefficient ρ Sample correlation coefficient

n

1i

2i

n

1i

2i

n

1iii

yyxx

yyxxr

Example gzip compression

r = 09616

Accuracy of r

No general closed form for se(r) If we assume x and y are bivariate

Gaussian

3n

r1)r(se

2

normal

-1-05

005

110

2030

4050

6070

8090

100

-05

0

05

1

senormal

rn

senormal

Normality

Why assume the data are Gaussian

Alternative bootstrap estimate of the standard error of r

B

1i

2

B1B

r]i[rrse

Example gzip compression

r = 09616

senormal(r) = 00024

se200(r) = 00298

se bootstrap advice

Plot the data Runtime Efron and Tibshirani

B = 25 is informative B = 50 often enough seldom need B gt 200 (for se)

Summary so far

A statistic is a ldquotrue factrdquo about the distribution F

We donrsquot know F For some parameter θ we want

estimate ldquoθ hatrdquo accuracy of that estimate (eg standard

error) For the mean μ we have a closed

form For other θ the bootstrap will help

How to estimate σx

ldquoPlug-in principlerdquo

Therefore

n

1i

2i xx

n1

ˆ

n

1i

2

i

nxx

xse

Plug-in principle

We donrsquot have (and canrsquot get) P We donrsquot know F the true distribution

over X We do have S (the sample)

We do know the sample distribution over X

Estimating a statistic use for F

F

F

Good and Bad News

We have a formula to estimate the standard error of the sample mean

We have a formula to estimate only the standard error of the sample mean variance median trimmed mean ratio of means of x and y correlation between x and y

Bootstrap world

unknown distribution F

observed random sample X

statistic of interest )X(sˆ

empirical distribution

bootstrap random sample X

bootstrap replication )X(sˆ

F

statistics about the estimate (eg standard error)

Bootstrap sample

X = [30 28 37 34 35] X could be

[28 34 37 34 35] [35 30 34 28 37] [35 35 34 30 28]

Draw n elements with replacement

Reflection

Imagine doing this with a pencil and paper

The bootstrap was born in 1979 Typically sampling is costly and

computation is cheap In (empirical) CS sampling isnrsquot even

necessarily all that costly

Bootstrap estimate of se

Let s() be a function for computing an estimate

True value of the standard error Ideal bootstrap estimate Bootstrap estimate with B boostrap

samples

seF

FF

seˆse

BB seˆse

Bootstrap estimate of se

B

1i

2

B1B

ˆ]i[ˆˆse

FBB

seselim

Bootstrap intuitively

We donrsquot know F We would like lots of samples from P

but we only have one (S) We approximate F by

Plug-in principle Easy to generate lots of ldquosamplesrdquo

from

F

F

B = 25 (mean compression)

B = 50 (mean compression)

B = 200 (mean compression)

Correlation (another statistic)

Population P sample S Two values xi and yi for each element

of the sample Correlation coefficient ρ Sample correlation coefficient

n

1i

2i

n

1i

2i

n

1iii

yyxx

yyxxr

Example gzip compression

r = 09616

Accuracy of r

No general closed form for se(r) If we assume x and y are bivariate

Gaussian

3n

r1)r(se

2

normal

-1-05

005

110

2030

4050

6070

8090

100

-05

0

05

1

senormal

rn

senormal

Normality

Why assume the data are Gaussian

Alternative bootstrap estimate of the standard error of r

B

1i

2

B1B

r]i[rrse

Example gzip compression

r = 09616

senormal(r) = 00024

se200(r) = 00298

se bootstrap advice

Plot the data Runtime Efron and Tibshirani

B = 25 is informative B = 50 often enough seldom need B gt 200 (for se)

Summary so far

A statistic is a ldquotrue factrdquo about the distribution F

We donrsquot know F For some parameter θ we want

estimate ldquoθ hatrdquo accuracy of that estimate (eg standard

error) For the mean μ we have a closed

form For other θ the bootstrap will help

Plug-in principle

We donrsquot have (and canrsquot get) P We donrsquot know F the true distribution

over X We do have S (the sample)

We do know the sample distribution over X

Estimating a statistic use for F

F

F

Good and Bad News

We have a formula to estimate the standard error of the sample mean

We have a formula to estimate only the standard error of the sample mean variance median trimmed mean ratio of means of x and y correlation between x and y

Bootstrap world

unknown distribution F

observed random sample X

statistic of interest )X(sˆ

empirical distribution

bootstrap random sample X

bootstrap replication )X(sˆ

F

statistics about the estimate (eg standard error)

Bootstrap sample

X = [30 28 37 34 35] X could be

[28 34 37 34 35] [35 30 34 28 37] [35 35 34 30 28]

Draw n elements with replacement

Reflection

Imagine doing this with a pencil and paper

The bootstrap was born in 1979 Typically sampling is costly and

computation is cheap In (empirical) CS sampling isnrsquot even

necessarily all that costly

Bootstrap estimate of se

Let s() be a function for computing an estimate

True value of the standard error Ideal bootstrap estimate Bootstrap estimate with B boostrap

samples

seF

FF

seˆse

BB seˆse

Bootstrap estimate of se

B

1i

2

B1B

ˆ]i[ˆˆse

FBB

seselim

Bootstrap intuitively

We donrsquot know F We would like lots of samples from P

but we only have one (S) We approximate F by

Plug-in principle Easy to generate lots of ldquosamplesrdquo

from

F

F

B = 25 (mean compression)

B = 50 (mean compression)

B = 200 (mean compression)

Correlation (another statistic)

Population P sample S Two values xi and yi for each element

of the sample Correlation coefficient ρ Sample correlation coefficient

n

1i

2i

n

1i

2i

n

1iii

yyxx

yyxxr

Example gzip compression

r = 09616

Accuracy of r

No general closed form for se(r) If we assume x and y are bivariate

Gaussian

3n

r1)r(se

2

normal

-1-05

005

110

2030

4050

6070

8090

100

-05

0

05

1

senormal

rn

senormal

Normality

Why assume the data are Gaussian

Alternative bootstrap estimate of the standard error of r

B

1i

2

B1B

r]i[rrse

Example gzip compression

r = 09616

senormal(r) = 00024

se200(r) = 00298

se bootstrap advice

Plot the data Runtime Efron and Tibshirani

B = 25 is informative B = 50 often enough seldom need B gt 200 (for se)

Summary so far

A statistic is a ldquotrue factrdquo about the distribution F

We donrsquot know F For some parameter θ we want

estimate ldquoθ hatrdquo accuracy of that estimate (eg standard

error) For the mean μ we have a closed

form For other θ the bootstrap will help

Good and Bad News

We have a formula to estimate the standard error of the sample mean

We have a formula to estimate only the standard error of the sample mean variance median trimmed mean ratio of means of x and y correlation between x and y

Bootstrap world

unknown distribution F

observed random sample X

statistic of interest )X(sˆ

empirical distribution

bootstrap random sample X

bootstrap replication )X(sˆ

F

statistics about the estimate (eg standard error)

Bootstrap sample

X = [30 28 37 34 35] X could be

[28 34 37 34 35] [35 30 34 28 37] [35 35 34 30 28]

Draw n elements with replacement

Reflection

Imagine doing this with a pencil and paper

The bootstrap was born in 1979 Typically sampling is costly and

computation is cheap In (empirical) CS sampling isnrsquot even

necessarily all that costly

Bootstrap estimate of se

Let s() be a function for computing an estimate

True value of the standard error Ideal bootstrap estimate Bootstrap estimate with B boostrap

samples

seF

FF

seˆse

BB seˆse

Bootstrap estimate of se

B

1i

2

B1B

ˆ]i[ˆˆse

FBB

seselim

Bootstrap intuitively

We donrsquot know F We would like lots of samples from P

but we only have one (S) We approximate F by

Plug-in principle Easy to generate lots of ldquosamplesrdquo

from

F

F

B = 25 (mean compression)

B = 50 (mean compression)

B = 200 (mean compression)

Correlation (another statistic)

Population P sample S Two values xi and yi for each element

of the sample Correlation coefficient ρ Sample correlation coefficient

n

1i

2i

n

1i

2i

n

1iii

yyxx

yyxxr

Example gzip compression

r = 09616

Accuracy of r

No general closed form for se(r) If we assume x and y are bivariate

Gaussian

3n

r1)r(se

2

normal

-1-05

005

110

2030

4050

6070

8090

100

-05

0

05

1

senormal

rn

senormal

Normality

Why assume the data are Gaussian

Alternative bootstrap estimate of the standard error of r

B

1i

2

B1B

r]i[rrse

Example gzip compression

r = 09616

senormal(r) = 00024

se200(r) = 00298

se bootstrap advice

Plot the data Runtime Efron and Tibshirani

B = 25 is informative B = 50 often enough seldom need B gt 200 (for se)

Summary so far

A statistic is a ldquotrue factrdquo about the distribution F

We donrsquot know F For some parameter θ we want

estimate ldquoθ hatrdquo accuracy of that estimate (eg standard

error) For the mean μ we have a closed

form For other θ the bootstrap will help

Bootstrap world

unknown distribution F

observed random sample X

statistic of interest )X(sˆ

empirical distribution

bootstrap random sample X

bootstrap replication )X(sˆ

F

statistics about the estimate (eg standard error)

Bootstrap sample

X = [30 28 37 34 35] X could be

[28 34 37 34 35] [35 30 34 28 37] [35 35 34 30 28]

Draw n elements with replacement

Reflection

Imagine doing this with a pencil and paper

The bootstrap was born in 1979 Typically sampling is costly and

computation is cheap In (empirical) CS sampling isnrsquot even

necessarily all that costly

Bootstrap estimate of se

Let s() be a function for computing an estimate

True value of the standard error Ideal bootstrap estimate Bootstrap estimate with B boostrap

samples

seF

FF

seˆse

BB seˆse

Bootstrap estimate of se

B

1i

2

B1B

ˆ]i[ˆˆse

FBB

seselim

Bootstrap intuitively

We donrsquot know F We would like lots of samples from P

but we only have one (S) We approximate F by

Plug-in principle Easy to generate lots of ldquosamplesrdquo

from

F

F

B = 25 (mean compression)

B = 50 (mean compression)

B = 200 (mean compression)

Correlation (another statistic)

Population P sample S Two values xi and yi for each element

of the sample Correlation coefficient ρ Sample correlation coefficient

n

1i

2i

n

1i

2i

n

1iii

yyxx

yyxxr

Example gzip compression

r = 09616

Accuracy of r

No general closed form for se(r) If we assume x and y are bivariate

Gaussian

3n

r1)r(se

2

normal

-1-05

005

110

2030

4050

6070

8090

100

-05

0

05

1

senormal

rn

senormal

Normality

Why assume the data are Gaussian

Alternative bootstrap estimate of the standard error of r

B

1i

2

B1B

r]i[rrse

Example gzip compression

r = 09616

senormal(r) = 00024

se200(r) = 00298

se bootstrap advice

Plot the data Runtime Efron and Tibshirani

B = 25 is informative B = 50 often enough seldom need B gt 200 (for se)

Summary so far

A statistic is a ldquotrue factrdquo about the distribution F

We donrsquot know F For some parameter θ we want

estimate ldquoθ hatrdquo accuracy of that estimate (eg standard

error) For the mean μ we have a closed

form For other θ the bootstrap will help

Bootstrap sample

X = [30 28 37 34 35] X could be

[28 34 37 34 35] [35 30 34 28 37] [35 35 34 30 28]

Draw n elements with replacement

Reflection

Imagine doing this with a pencil and paper

The bootstrap was born in 1979 Typically sampling is costly and

computation is cheap In (empirical) CS sampling isnrsquot even

necessarily all that costly

Bootstrap estimate of se

Let s() be a function for computing an estimate

True value of the standard error Ideal bootstrap estimate Bootstrap estimate with B boostrap

samples

seF

FF

seˆse

BB seˆse

Bootstrap estimate of se

B

1i

2

B1B

ˆ]i[ˆˆse

FBB

seselim

Bootstrap intuitively

We donrsquot know F We would like lots of samples from P

but we only have one (S) We approximate F by

Plug-in principle Easy to generate lots of ldquosamplesrdquo

from

F

F

B = 25 (mean compression)

B = 50 (mean compression)

B = 200 (mean compression)

Correlation (another statistic)

Population P sample S Two values xi and yi for each element

of the sample Correlation coefficient ρ Sample correlation coefficient

n

1i

2i

n

1i

2i

n

1iii

yyxx

yyxxr

Example gzip compression

r = 09616

Accuracy of r

No general closed form for se(r) If we assume x and y are bivariate

Gaussian

3n

r1)r(se

2

normal

-1-05

005

110

2030

4050

6070

8090

100

-05

0

05

1

senormal

rn

senormal

Normality

Why assume the data are Gaussian

Alternative bootstrap estimate of the standard error of r

B

1i

2

B1B

r]i[rrse

Example gzip compression

r = 09616

senormal(r) = 00024

se200(r) = 00298

se bootstrap advice

Plot the data Runtime Efron and Tibshirani

B = 25 is informative B = 50 often enough seldom need B gt 200 (for se)

Summary so far

A statistic is a ldquotrue factrdquo about the distribution F

We donrsquot know F For some parameter θ we want

estimate ldquoθ hatrdquo accuracy of that estimate (eg standard

error) For the mean μ we have a closed

form For other θ the bootstrap will help

Reflection

Imagine doing this with a pencil and paper

The bootstrap was born in 1979 Typically sampling is costly and

computation is cheap In (empirical) CS sampling isnrsquot even

necessarily all that costly

Bootstrap estimate of se

Let s() be a function for computing an estimate

True value of the standard error Ideal bootstrap estimate Bootstrap estimate with B boostrap

samples

seF

FF

seˆse

BB seˆse

Bootstrap estimate of se

B

1i

2

B1B

ˆ]i[ˆˆse

FBB

seselim

Bootstrap intuitively

We donrsquot know F We would like lots of samples from P

but we only have one (S) We approximate F by

Plug-in principle Easy to generate lots of ldquosamplesrdquo

from

F

F

B = 25 (mean compression)

B = 50 (mean compression)

B = 200 (mean compression)

Correlation (another statistic)

Population P sample S Two values xi and yi for each element

of the sample Correlation coefficient ρ Sample correlation coefficient

n

1i

2i

n

1i

2i

n

1iii

yyxx

yyxxr

Example gzip compression

r = 09616

Accuracy of r

No general closed form for se(r) If we assume x and y are bivariate

Gaussian

3n

r1)r(se

2

normal

-1-05

005

110

2030

4050

6070

8090

100

-05

0

05

1

senormal

rn

senormal

Normality

Why assume the data are Gaussian

Alternative bootstrap estimate of the standard error of r

B

1i

2

B1B

r]i[rrse

Example gzip compression

r = 09616

senormal(r) = 00024

se200(r) = 00298

se bootstrap advice

Plot the data Runtime Efron and Tibshirani

B = 25 is informative B = 50 often enough seldom need B gt 200 (for se)

Summary so far

A statistic is a ldquotrue factrdquo about the distribution F

We donrsquot know F For some parameter θ we want

estimate ldquoθ hatrdquo accuracy of that estimate (eg standard

error) For the mean μ we have a closed

form For other θ the bootstrap will help

Bootstrap estimate of se

Let s() be a function for computing an estimate

True value of the standard error Ideal bootstrap estimate Bootstrap estimate with B boostrap

samples

seF

FF

seˆse

BB seˆse

Bootstrap estimate of se

B

1i

2

B1B

ˆ]i[ˆˆse

FBB

seselim

Bootstrap intuitively

We donrsquot know F We would like lots of samples from P

but we only have one (S) We approximate F by

Plug-in principle Easy to generate lots of ldquosamplesrdquo

from

F

F

B = 25 (mean compression)

B = 50 (mean compression)

B = 200 (mean compression)

Correlation (another statistic)

Population P sample S Two values xi and yi for each element

of the sample Correlation coefficient ρ Sample correlation coefficient

n

1i

2i

n

1i

2i

n

1iii

yyxx

yyxxr

Example gzip compression

r = 09616

Accuracy of r

No general closed form for se(r) If we assume x and y are bivariate

Gaussian

3n

r1)r(se

2

normal

-1-05

005

110

2030

4050

6070

8090

100

-05

0

05

1

senormal

rn

senormal

Normality

Why assume the data are Gaussian

Alternative bootstrap estimate of the standard error of r

B

1i

2

B1B

r]i[rrse

Example gzip compression

r = 09616

senormal(r) = 00024

se200(r) = 00298

se bootstrap advice

Plot the data Runtime Efron and Tibshirani

B = 25 is informative B = 50 often enough seldom need B gt 200 (for se)

Summary so far

A statistic is a ldquotrue factrdquo about the distribution F

We donrsquot know F For some parameter θ we want

estimate ldquoθ hatrdquo accuracy of that estimate (eg standard

error) For the mean μ we have a closed

form For other θ the bootstrap will help

Bootstrap estimate of se

B

1i

2

B1B

ˆ]i[ˆˆse

FBB

seselim

Bootstrap intuitively

We donrsquot know F We would like lots of samples from P

but we only have one (S) We approximate F by

Plug-in principle Easy to generate lots of ldquosamplesrdquo

from

F

F

B = 25 (mean compression)

B = 50 (mean compression)

B = 200 (mean compression)

Correlation (another statistic)

Population P sample S Two values xi and yi for each element

of the sample Correlation coefficient ρ Sample correlation coefficient

n

1i

2i

n

1i

2i

n

1iii

yyxx

yyxxr

Example gzip compression

r = 09616

Accuracy of r

No general closed form for se(r) If we assume x and y are bivariate

Gaussian

3n

r1)r(se

2

normal

-1-05

005

110

2030

4050

6070

8090

100

-05

0

05

1

senormal

rn

senormal

Normality

Why assume the data are Gaussian

Alternative bootstrap estimate of the standard error of r

B

1i

2

B1B

r]i[rrse

Example gzip compression

r = 09616

senormal(r) = 00024

se200(r) = 00298

se bootstrap advice

Plot the data Runtime Efron and Tibshirani

B = 25 is informative B = 50 often enough seldom need B gt 200 (for se)

Summary so far

A statistic is a ldquotrue factrdquo about the distribution F

We donrsquot know F For some parameter θ we want

estimate ldquoθ hatrdquo accuracy of that estimate (eg standard

error) For the mean μ we have a closed

form For other θ the bootstrap will help

Bootstrap intuitively

We donrsquot know F We would like lots of samples from P

but we only have one (S) We approximate F by

Plug-in principle Easy to generate lots of ldquosamplesrdquo

from

F

F

B = 25 (mean compression)

B = 50 (mean compression)

B = 200 (mean compression)
