Page 1: Statistics – OR 155 Section 1 J. S. Marron, Professor Department of Statistics and Operations Research

Statistics – OR 155

Section 1

J. S. Marron, Professor

Department of Statistics

and Operations Research

Class Information

Handouts: http://www.stat-or.unc.edu/webspace/courses/marron/UNCstor155-2009/ClassInfo/Stor155-09FirstHandout.pdf

With:

• Blackboard Info

• Student Survey

(please fill out & return after class)

Class Information

Go to Blackboard (for class details):

• Website: http://blackboard.unc.edu/

• Log-in with Onyen

• Choose this course

• Control Panel > Content Areas

• Course Information

• Choose Item “Course Information”

Relationship to Textbook

• Ordering of material in textbook is usual

• But I don’t like it

(poorly motivated)

• So will change the order of the material

(for better motivation)

• Will jump around a lot through the text

Reading In Textbook

Approximate Reading for Today’s Material:

Pages 1-5, 197-203, 203-208

Approximate Reading for Next Class:

Pages 237-250

What is Statistics?

Definition 1:

Gaining Insight from Numbers

(similar to text’s definition)

Definition 2:

The Science of Managing Uncertainty

What is Statistics?

Subtopics:

• Gathering the Numbers

– E.g. Statistician at a ball game

– Will see: how this is done is critical

• Forming Conclusions

– Will use math, etc.

– Major focus of this course

Key Themes

I. Uncertainty

II. Variability

(will get quantitative about these)

Favorite Quote:

“I was never good at math, but statistics is easy, since it is just common sense”

Motivating Examples

1. Political Polls

– Try to predict outcome of election

– Too expensive to ask everyone

– So ask some (hope they are “representative”)

2. Measurement Error

– No measurement is exact

– Can improve by multiple measurements

– How to model?

Lessons of these are broadly applicable

Common Structure

For both, find out about truth from a sample

         Truth                        Sample
E.g. 1:  % for Cand. in population    % for Cand. in sample
E.g. 2:  true size                    observed measurement

Motivating Examples

1. Political Polls

2. Measurement Error

Will study each using mathematical models

Do E.g. 1 first, since easier

Appropriate Models?

Political Polls

Appropriate Mathematical Models?

Depends on how data are gathered.

See Text, pages 171-177

• Seems easy???

• “Just choose some”???

• Take a look at history…

How to sample?

History of Presidential Election Polls

During Campaigns, constantly hear in news “polls say …” How good are these? Why?

1936 Landon vs. Roosevelt

Literary Digest Poll: 43% for R

Result: 62% for R

What happened?

Sample size not big enough? 2.4 million

Biggest Poll ever done (before or since)

Bias in Sampling

Bias: Systematically favoring one outcome

(need to think carefully)

Selection Bias: Addresses from L. D. readers, phone books, club memberships

(representative of population?)

Non-Response Bias: Return-mail survey

(who had time?)

How to sample?

1936 Presidential Election (cont.)

Interesting Alternative Poll:

Gallup: 56% for R (sample size ~ 50,000)

Gallup's prediction of the L. D. poll: 44% for R (sample size ~ 3,000)

Predicted both the correct result (62% for R) and the L. D. error (43% for R)!

(how was improvement done?)

Improved Sampling

Gallup’s Improvements:

(i) Personal Interviews

(attacks non-response bias)

(ii) Quota Sampling

(attacks selection bias)

Quota Sampling

Idea: make “sample like population”

So surveyor chooses people to give:

i. Right % male

ii. Right % “young”

iii. Right % “blue collar”

iv. …

This worked fairly well (~5% error), until …

How to sample?

1948        Dewey    Truman    Sample size
Crossley    50%      45%
Gallup      50%      44%       ~50,000
Roper       53%      38%       ~15,000
Actual      45%      50%       -

Note: Embarrassing for polls, famous photo of Truman + Headline “Dewey Wins”

What went wrong?

Problem: Unintentional Bias

(surveyors understood bias, but still made choices)

Lesson: Human Choice cannot give a Representative Sample

Surprising Improvement: Random Sampling

Now called “scientific sampling”

Random = Scientific???

Random Sampling

Key Idea: “random error” is smaller than “unintentional bias”, for large enough sample sizes

How large?

Current sample sizes: ~1,000 - 3,000

Note: now << 50,000 used in 1948.

So surveys are much cheaper

(thus many more done now…)
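
A small simulation can illustrate the key idea that random error shrinks with sample size while bias does not. This is only a sketch under made-up assumptions: the population size, the "biased list" and its 43% share, the sample sizes, and the seed are all hypothetical numbers chosen to echo the 1936 example, not data from any poll.

```python
import random

rng = random.Random(155)  # arbitrary seed, only so reruns give the same numbers

# Hypothetical population: 100,000 voters, 62% for R (1 = for R, 0 = against).
population = [1] * 62_000 + [0] * 38_000

# Hypothetical biased list (think: addresses from phone books and clubs):
# 50,000 of those voters, but only 43% of the people on the list are for R.
biased_list = [1] * 21_500 + [0] * 28_500

def percent_for_r(sample):
    """Percent of a sample that is for R."""
    return 100.0 * sum(sample) / len(sample)

# Big sample from the biased list vs. small simple random sample from everyone.
big_biased = percent_for_r(rng.sample(biased_list, 20_000))
small_random = percent_for_r(rng.sample(population, 1_000))

print(f"truth: 62.0%  big biased: {big_biased:.1f}%  small random: {small_random:.1f}%")
```

In this toy setup the big sample stays near 43% no matter how large it gets, because it can only reflect the list it was drawn from, while the small random sample lands within a few points of the truth.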

Random Sampling

How Accurate?

• Can (& will) calculate using “probability”

• Justifies term “scientific sampling”

• 2nd improvement over quota sampling
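
As a preview of the kind of calculation probability makes possible, here is a minimal sketch using the standard large-sample approximation for a 95% margin of error of a sample percentage. The formula and the function name `margin_of_error` are not taken from these slides; they are assumptions brought in for illustration, and the approximation applies only to simple random samples.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error, in percentage points,
    for a simple random sample of size n (worst case at p = 0.5)."""
    return 100.0 * z * math.sqrt(p * (1.0 - p) / n)

# Sample sizes mentioned on the slides: current polls vs. 1948.
for n in (1_000, 3_000, 50_000):
    print(f"n = {n:>6,}: about +/- {margin_of_error(n):.1f} percentage points")
```

Under this approximation a random sample of about 1,000 is already accurate to roughly 3 percentage points, which is one way to see why modern polls can be so much smaller than the 1948 ones.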

Random Sampling

What is random?

Simple Random Sampling:

Each member of population is equally likely to be in sample

Key Idea: Different from “just choose some”

Random Sampling

An old (but still fun?) experiment:

Choose a number among 1,2,3,4

Old typical results: about 70% choose “3”

(perhaps you have seen this before…)

Main lesson: human choice does not give “equally likely” (i.e. random sample)

Random Sampling

How to choose a random sample?

Old Approaches:

– Random Number Table

– Roll Dice

Modern Approach:

– Computer Generated
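
One way a computer-generated simple random sample might look, sketched in Python rather than the spreadsheet tools used in the homework. The population of 25 ID numbers, the sample size of 5, and the seed are hypothetical choices for illustration.

```python
import random

# Hypothetical population: ID numbers 1..25 standing in for 25 people.
population = list(range(1, 26))

rng = random.Random(2009)  # arbitrary seed, only so reruns give the same sample

# Draw 5 distinct members; each member is equally likely to be in the sample.
sample = rng.sample(population, 5)
print(sorted(sample))
```

The computer makes no judgment calls here, which is exactly what separates this from “just choose some”.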

Random Sampling HW

Interesting Question:

What is the % of Male Students at UNC?

(Your chance of a date, or take 100% minus this to get your chance)

HW:

C1: Class Handout

http://stat-or.unc.edu/webspace/courses/marron/UNCstor155-2009/HWAsst/Stor155HWC1.pdf

Random Sampling HW

Notes on HW C1:

• 3 dumb ways to sample, 1 good one

• Goal is to learn about sampling, not “get right answer”

• Part 1, put symbol for yourself, Ms and Fs for others

• Put both count & % (% = 100 x count / 25)

• Part 2, “tally” is:

• Part 4, student phone directory available in Student Union?

Random Sampling HW

Notes on HW C1,

• Hints on Part 4:

– For each draw, first draw a “random page”

– Tools > Data Analysis > Random Number Generation > Uniform is one way to do this

– In “Uniform”, you need to set “Parameters” to 0 and “number of pages”

– This gives a random decimal; to get an integer, round up, using CEILING

– In CEILING, set “significance” to 1

Random Sampling HW

Notes on HW C1,

• Hints on Part 4 (cont.):

– Next Choose Random Column

– Next Choose Random Name

– Caution: Different numbers on each page.

– Challenge: still make equally likely

– Approach: choose larger number

– Approach: when not there, just toss it out

– Approach: then do a “redraw”

– Also redraw if can’t tell gender
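
The Part 4 hints can be sketched in Python instead of the spreadsheet. Everything here is a hypothetical stand-in: the toy `directory`, the names in it, and the helpers `random_index` and `draw_name` are made up for illustration, and the homework itself uses the spreadsheet's Uniform generator and CEILING.

```python
import math
import random

def random_index(count, rng):
    """Random integer 1..count: a uniform decimal rounded up, like CEILING."""
    # max(1, ...) guards the (extremely unlikely) case of drawing exactly 0.
    return max(1, math.ceil(rng.random() * count))

def draw_name(directory, max_columns, max_names, rng):
    """Draw one name so that every listed name is equally likely.

    directory[page][column] is a list of names. Pages may have fewer
    columns or names than the maximums, so draw from the larger range
    and redraw whenever the chosen slot is not there.
    """
    while True:
        page = directory[random_index(len(directory), rng) - 1]
        column = random_index(max_columns, rng)
        row = random_index(max_names, rng)
        if column > len(page) or row > len(page[column - 1]):
            continue  # slot not there: toss it out and redraw
        return page[column - 1][row - 1]

# Hypothetical two-page directory with uneven pages.
directory = [
    [["Ann", "Bob", "Cal"], ["Dee", "Eli"]],  # page 1: two columns
    [["Fay"]],                                # page 2: one short column
]
rng = random.Random(1)  # arbitrary seed for reproducibility
draws = [draw_name(directory, 2, 3, rng) for _ in range(6000)]
print({name: draws.count(name) for name in sorted(set(draws))})
```

Because every (page, column, row) slot is drawn with the same chance and empty slots are simply thrown away, each of the six names should come up about equally often, even though page 2 holds only one of them. A redraw for a name whose gender cannot be determined would be one more `continue` in the same loop.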