week 3 - sampling and study design · 2021. 2. 1. · week 3 - sampling and study design...


Week 3 - Sampling and Study Design

Ryan Miller


Week #3 Outline

- Video #1: Sampling from a Population
- Video #2: Sampling Examples
- Video #3: Study Design
- Video #4: Randomized Experiments


Introduction

- So far, we’ve focused on methods for describing data
  - Descriptive statistics and graphs allow us to understand the trends within our data
- While what’s happening in our data is interesting, we would really like to generalize these findings to a broader set of cases
  - For example, we might use the fact that 14 of 16 infants chose the “helper” character as evidence that all infants can identify friendly behavior


Populations

- Statisticians call this broader, comprehensive set of cases a population
  - It is extremely unusual for researchers to have data on the entire population; it is much more common to have access to a sample (a subset of cases from the population)


Sampling

When analyzing sample data, statisticians have two primary concerns:

1. Sampling Bias: If some cases are preferentially selected into the sample, the data may no longer represent the entire population

2. Sampling Variability: Even if all cases had an equal chance of being selected, many different subsets are possible. So, any trends seen in a single sample could possibly be explained by the randomness involved in which cases ended up in that sample.
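Sampling variability can be seen in a short simulation. The sketch below is illustrative only: the population is synthetic (an assumed right-skewed distribution, not real salary data), and the names `make_population` and `sample_means` are made up for this example. It draws several simple random samples of the same size and prints their means, which differ from one another even though every case had an equal chance of selection.

```python
import random
import statistics


def make_population(size=2000, seed=1):
    """Build a synthetic, right-skewed population (values are hypothetical)."""
    rng = random.Random(seed)
    # expovariate(0.5) draws from an exponential distribution with mean 2.
    return [rng.expovariate(0.5) for _ in range(size)]


def sample_means(population, n, repeats, seed=2):
    """Return the mean of each of `repeats` simple random samples of size n."""
    rng = random.Random(seed)
    # rng.sample draws without replacement, so every case is equally likely.
    return [statistics.mean(rng.sample(population, n)) for _ in range(repeats)]


population = make_population()
means = sample_means(population, n=20, repeats=4)

print("population mean:", round(statistics.mean(population), 2))
print("sample means:   ", [round(m, 2) for m in means])
```

Changing the `seed` arguments gives a different set of samples, and therefore a different set of sample means.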


Example - Sampling Variability

If every NFL player has an equal chance of being sampled:

[Figure: histograms of Yearly Salary (millions) against Frequency for the Population (N = 2099 players), followed by four sample panels, the first labeled Sample #1 (n = 20 players)]


Example - Sampling Bias

If Quarterbacks are 10x more likely to be sampled:

[Figure: histograms of Yearly Salary (millions) against Frequency for the Population (N = 2099 players), followed by four sample panels]


Statistical Inference

- A fundamental goal of statisticians is to use information from a sample to make reliable statements about a population
  - This idea is called statistical inference
- Sampling bias and sampling variability are two obstacles to statistical inference (as are confounding variables)


Statistical Inference - Notation

Statisticians use different mathematical notation to distinguish population parameters (things we want to know) from estimates (things derived from a sample). The table below provides a summary:

                      Population Parameter    Estimate (from sample)
Mean                  µ                       x̄
Standard Deviation    σ                       s
Proportion            p                       p̂
Correlation           ρ                       r
Regression            β0, β1                  b0, b1

For example, µ is the mean of the target population, while x̄ is the mean of the cases that ended up in the sample.
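To make the right-hand column concrete, the sketch below computes each estimate from a tiny sample. The five (x, y) pairs are invented for illustration, and the choice to define the proportion as "y is at least 5" is an arbitrary assumption. The formulas are the usual sample versions (dividing by n - 1 for the standard deviation and covariance).

```python
import math

# Hypothetical sample of (x, y) pairs; the values are made up for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)

x_bar = sum(x) / n  # sample mean of x
y_bar = sum(y) / n  # sample mean of y

# Sample standard deviations (divide by n - 1).
s_x = math.sqrt(sum((v - x_bar) ** 2 for v in x) / (n - 1))
s_y = math.sqrt(sum((v - y_bar) ** 2 for v in y) / (n - 1))

# Sample proportion: share of cases with y of at least 5 (arbitrary cutoff).
p_hat = sum(1 for v in y if v >= 5) / n

# Sample covariance, then correlation r and regression estimates b0, b1.
s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / (n - 1)
r = s_xy / (s_x * s_y)
b1 = s_xy / s_x ** 2      # slope
b0 = y_bar - b1 * x_bar   # intercept

print(f"x_bar={x_bar:.2f}  s={s_x:.2f}  p_hat={p_hat:.2f}")
print(f"r={r:.3f}  b0={b0:.2f}  b1={b1:.2f}")
```

None of these numbers are the parameters µ, σ, p, ρ, β0, or β1; they are only what this particular sample suggests about them.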


Bias and Variability

In summary, we’ve now discussed two reasons why an estimate might not accurately reflect a population parameter: bias and variability.


The Role of Sample Size

- Taking a larger sample will reduce sampling variability, but it will not reduce sampling bias
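A rough simulation can illustrate this claim. Everything below is an assumption made for the sketch: the population is synthetic, the biased scheme gives 100 high-valued cases 10x the selection weight (loosely echoing the quarterback example), and the biased draws are made with replacement for simplicity. The helper `center_and_spread` is a made-up name.

```python
import random
import statistics

rng = random.Random(0)

# Hypothetical population: 1900 lower values plus 100 much higher values.
population = ([rng.uniform(0, 4) for _ in range(1900)]
              + [rng.uniform(5, 20) for _ in range(100)])
# Biased scheme (assumed): the 100 high-valued cases are 10x more likely.
weights = [1] * 1900 + [10] * 100


def center_and_spread(n, biased, repeats=500):
    """Mean and standard deviation of the sample mean across repeated samples."""
    means = []
    for _ in range(repeats):
        if biased:
            # Weighted draws, made with replacement for simplicity.
            draw = rng.choices(population, weights=weights, k=n)
        else:
            draw = rng.sample(population, n)  # simple random sample
        means.append(statistics.mean(draw))
    return statistics.mean(means), statistics.stdev(means)


true_mean = statistics.mean(population)
results = {(n, biased): center_and_spread(n, biased)
           for n in (20, 200) for biased in (False, True)}

print(f"population mean: {true_mean:.2f}")
for (n, biased), (center, spread) in results.items():
    label = "biased" if biased else "random"
    print(f"n={n:3d} {label}: center={center:.2f} spread={spread:.2f}")
```

In runs of this sketch, the spread of the sample means shrinks when n grows from 20 to 200 under both schemes, while the center of the biased samples stays well above the population mean at either sample size.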


Sampling Words from The Gettysburg Address

- This fall, I showed my MATH-256 students the text of the Gettysburg Address (target population), asking each of them to come up with a representative sample of 5 words
  - I then had them summarize their sample using the sample mean (i.e., the average word length of their 5 words)
- If their samples were actually representative, their sample means would cluster randomly around the population mean


The Gettysburg Address


Results

- The population’s mean word length, in statistical notation, is µ = 4.295
  - Why were the MATH-256 students’ samples so far off?

[Figure: distribution of the students’ sample means (x-axis: wordavgs, from 4.0 to 9.0; y-axis: count)]


Sampling Bias

- The answer is sampling bias, or a systematic tendency for some cases from the population to be more likely to make it into a sample than others
  - In the Gettysburg Address example, students tended to pick longer words
    - For example, perhaps the longer words stood out more prominently, or the students felt awkward selecting multiple two-letter words in their sample
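The effect of favoring longer words can be sketched in code. Two loud assumptions: only the opening sentence of the Gettysburg Address is used as the "population" (so its mean differs from the full-text value of 4.295), and the biased selection rule, where a word's chance of being picked is proportional to its length, is a hypothetical model rather than a description of how the students actually chose.

```python
import random
import re
import statistics

# Opening sentence of the Gettysburg Address (a stand-in for the full text).
text = ("Four score and seven years ago our fathers brought forth on this "
        "continent, a new nation, conceived in Liberty, and dedicated to the "
        "proposition that all men are created equal.")
words = re.findall(r"[A-Za-z]+", text)
lengths = [len(w) for w in words]
mu = statistics.mean(lengths)  # mean word length of this small "population"

rng = random.Random(256)


def average_sample_mean(biased, n=5, repeats=2000):
    """Average of many sample means, each from a sample of n word lengths."""
    means = []
    for _ in range(repeats):
        if biased:
            # Assumed selection model: chance of being picked is proportional
            # to word length (draws are made with replacement).
            draw = rng.choices(lengths, weights=lengths, k=n)
        else:
            draw = rng.sample(lengths, n)  # simple random sample
        means.append(statistics.mean(draw))
    return statistics.mean(means)


srs_center = average_sample_mean(biased=False)
biased_center = average_sample_mean(biased=True)

print(f"mean word length of the sentence: {mu:.2f}")
print(f"average sample mean, simple random samples: {srs_center:.2f}")
print(f"average sample mean, length-biased samples: {biased_center:.2f}")
```

Under these assumptions the simple random samples center near the sentence's true mean word length, while the length-biased samples center noticeably higher, which is the same pattern seen in the students' results.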


Simple Random Sampling

- The ideal sampling procedure is simple random sampling, a protocol where each case in the target population has an identical chance of ending up in the sample
- Do you think it would be easy or hard to collect a simple random sample of Xavier students?
  - It would actually be quite hard, since the University is unlikely to give you a list of all enrolled students
  - Often statisticians will attempt to get representative samples in other ways
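If a complete list of the population (a sampling frame) were available, drawing a simple random sample would be a one-line operation. The sketch below assumes such a list exists; the roster, its size, and the ID format are all invented for illustration.

```python
import random

# Hypothetical sampling frame: a complete roster of enrolled students.
roster = [f"student_{i:04d}" for i in range(1, 5001)]

rng = random.Random(2021)
# rng.sample draws without replacement, giving every student on the roster
# the same chance of ending up in the sample.
srs = rng.sample(roster, k=50)

print(srs[:5])
```

The hard part in practice is obtaining the roster, not running the draw.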



Other Sampling Protocols

- A convenience sample is exactly what the name suggests: a sample that is easily collected (i.e., low monetary or time costs)
  - Convenience samples are not random, but they can be representative if carefully selected
  - You might be able to get a representative sample by standing near the center of campus on a typical day and stopping everyone who walks by
- A stratified sample is a more complex scheme where the population is broken into similar subcategories, which are sampled separately (typically by simple random sampling)
  - The statistical methods we cover in this course won’t be sufficient for this sampling scheme
  - Nevertheless, we should be able to recognize stratified sampling for precisely that reason
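A stratified sample can be sketched as "group the cases, then take a simple random sample inside each group". The function name `stratified_sample`, the student list, and the use of class year as the stratifying variable are all hypothetical choices for this illustration.

```python
import random
from collections import defaultdict


def stratified_sample(cases, stratum_of, per_stratum, seed=0):
    """Take a simple random sample of size per_stratum within each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for case in cases:
        strata[stratum_of(case)].append(case)  # group cases by stratum label
    sample = []
    for label in sorted(strata):
        members = strata[label]
        # Never ask for more cases than the stratum contains.
        sample.extend(rng.sample(members, min(per_stratum, len(members))))
    return sample


# Hypothetical population: 400 students tagged with a class year.
years = ("first-year", "sophomore", "junior", "senior")
students = [(f"student_{i:04d}", years[i % 4]) for i in range(400)]

picked = stratified_sample(students, stratum_of=lambda s: s[1], per_stratum=5)
print(len(picked), "students sampled,", "5 from each class year")
```

Taking the same number from every stratum is only one possible allocation; strata could also be sampled in proportion to their size.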


Practice

For each scenario, determine whether it describes a population or a sample, as well as whether or not the sample is biased.

1. To estimate the size of trout in a lake, an angler records the weight of the 12 trout he catches over a weekend

2. A subscription-based music website tracks the listening history of its active users

3. The Department of Transportation announces that of the 250 million registered cars in the US, 2.1% are hybrids

4. A car rental company installs an experimental data collection device on the first 20 vehicles from an alphabetized list of license plates


Practice (solution)

1. This is a sample and it’s biased; there is no way that the angler has an equal chance of catching every trout in the lake

2. This is a population because it includes all of the website’s users

3. This is a population because it’s safe to assume the DOT has registration data on the overwhelming majority of cars in the US

4. This is a sample and it’s unbiased; there is no reason to believe that license plate numbers are related to anything meaningful about each vehicle


Introduction (Study Design)

- So far, we’ve seen how sampling can influence the trends seen in our data, but there are other aspects of data collection that we need to consider
- Suppose researchers want to test a new COVID-19 treatment. How would you design a study to determine whether it is effective?
- A meaningful study must compare the new treatment with something else
  - Therefore, we’ll either need to collect two samples, or proactively split a single sample into two


Two Types of Studies

These two possibilities lead us to distinguish between two types of studies:

- Observational studies: the explanatory and response variables are observed by the researchers (separate samples)
- Experimental studies: the explanatory variable is assigned by the researchers (the researchers split up a single sample)


Observational Studies

- We’ve already seen an example observational study in the Florida Death Penalty Sentencing case study
- Recall the researchers recorded the race of the offender, as well as whether the offender was sentenced to the death penalty or not
  - Did the offender’s race appear associated with their sentence?

         death    not
black    38       142
white    46       152
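The counts in the table above can be turned into death-sentence rates for each group of offenders. This is only a small arithmetic sketch; the dictionary layout and variable names are illustrative choices.

```python
# Counts copied from the table: sentence outcome by race of the offender.
counts = {
    "black": {"death": 38, "not": 142},
    "white": {"death": 46, "not": 152},
}

# Rate = death sentences divided by all cases for that group of offenders.
rates = {race: row["death"] / (row["death"] + row["not"])
         for race, row in counts.items()}

for race, rate in rates.items():
    print(f"{race} offenders: {rate:.1%} sentenced to death")
```

This gives 38/180, about 21.1%, for black offenders and 46/198, about 23.2%, for white offenders, before the victim's race is taken into account.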


Confounding Variables

Overall, white offenders received the death penalty slightly more often, but this ignored the influence of the victim’s race:

[Figure: stacked bars showing the proportion of each Sentence (death vs. not) for Black and White offenders, in separate panels for Black Victim and White Victim]


Confounding Variables

Because offenders disproportionately committed crimes against victims of their own race, the overall death penalty rates were skewed in a way that obscured the racially biased sentencing:

[Figure: bars showing the number of cases (0 to 150) by Sentence (death vs. not) for Black and White offenders, in separate panels for Black Victim and White Victim]


Balance

- We can view the problems caused by confounding variables as an issue of imbalanced groups
  - Offenders were more likely to victimize their own race, and crimes against whites tended to be punished more severely
  - The groups white offenders and black offenders were systematically different in an important way (victim’s race)
- Going back to the COVID-19 example, if study participants can choose whether they receive the vaccine, then the vaccine group will likely be disproportionately older, sicker, working riskier jobs, etc.
  - However, these factors would, on average, occur in equal proportions in the vaccinated and control groups if we randomly assigned which participants received the vaccine
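The contrast between self-selection and random assignment can be simulated. The sketch below is built on assumptions: participants are described only by an invented age, and the self-selection rule (the chance of opting in rises linearly with age) is a hypothetical model chosen to mimic the "disproportionately older" concern.

```python
import random
import statistics

rng = random.Random(3)

# Hypothetical participants, described only by age.
ages = [rng.randint(18, 90) for _ in range(1000)]

# Self-selection (assumed model): the chance of opting in rises with age.
opted_in = [rng.random() < (age - 18) / 72 for age in ages]
chosen_treated = [a for a, yes in zip(ages, opted_in) if yes]
chosen_control = [a for a, yes in zip(ages, opted_in) if not yes]

# Random assignment: shuffle everyone, then split the list in half.
shuffled = ages[:]
rng.shuffle(shuffled)
random_treated, random_control = shuffled[:500], shuffled[500:]

gap_self = statistics.mean(chosen_treated) - statistics.mean(chosen_control)
gap_random = statistics.mean(random_treated) - statistics.mean(random_control)

print(f"age gap with self-selection:    {gap_self:.1f} years")
print(f"age gap with random assignment: {gap_random:.1f} years")
```

With self-selection the two groups differ sharply in average age, so age is tangled up with the treatment. With random assignment the gap is small, and the same would be expected of any other baseline characteristic, measured or not.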


Random Assignment


Barriers

- Obviously random assignment isn’t always feasible; some explanatory variables are too unethical or costly to randomly assign
  - For example, we couldn’t assign cases to consume toxic chemicals or expose themselves to harm
  - We also cannot randomly assign explanatory variables that universally pre-date the study, like genetics
- Despite their flaws, observational studies are still very valuable
  - But they will always fall short of randomized experiments when it comes to ruling out confounding variables


Example - Study Design

- Suppose we want to know: “Is arthroscopic surgery effective in treating arthritis of the knee?” Describe both an observational study and a randomized experiment that you could conduct to answer this question. Be sure to address the following during your discussion:

1. How costly will it be for the researchers to collect data with each design?

2. Are there any feasibility problems or ethical issues with each design?


Possible Responses

- An observational study might use medical records to find patients with knee arthritis and identify whether they ever received surgery. It then might compare various measures of recovery (recurrent visits, etc.) among those who had or didn’t have surgery.
- A randomized experiment would take a sample of patients with arthritis and randomly assign half to have surgery and the other half to not have surgery. It then might compare various measures of recovery.

1. The observational study costs almost nothing, while the randomized experiment likely will cost quite a bit

2. The observational study doesn’t present any ethical barriers (assuming sufficient data privacy), but withholding knee surgery seems like it could be problematic


Sham Knee Surgery

In the 1990s a study was conducted on 10 men with arthritic knees who were scheduled for surgery. They were all treated identically except for one key distinction: only half of them actually got surgery! Once each subject was in the operating room and anesthetized, the surgeon looked at a randomly generated code indicating whether he should do the full surgery or just make three small incisions in the knee and stitch up the patient to leave a scar. All patients received the same post-operative care and rehabilitation, and were later evaluated by staff who didn’t know whether they had actually received the surgery or not. The result? Both the sham knee surgery and the real knee surgery showed indistinguishable levels of improvement.

Source: https://www.nytimes.com/2000/01/09/magazine/the-placebo-prescription.html



Control Groups, Placebos, and Blinding

The Sham Knee Surgery example illustrates several important aspects of a well-designed experiment that we’ve yet to discuss:

- Control Group - Some patients were randomly assigned not to receive the knee surgery, providing a comparison group that is, on average, balanced with the surgery group in all baseline characteristics
- Placebo - Patients in the control group received a fake surgery
- Blinding - Using a placebo is not helpful if patients know the group they’re in. Similarly, the staff interacting with the patients might treat them differently if they knew the patient’s group
  - Single-blind - the participants don’t know the treatment assignments
  - Double-blind - the participants and everyone interacting with the participants don’t know the treatment assignments


Control Groups, Placebos, and Blinding

- The goal of each of these design elements is to prevent biasing the measurement of the outcome variable in a particular direction
- Other types of studies are susceptible to other types of biases, which we should also carefully consider

1. Social Desirability Bias - Respondents tend to answer questions in ways that portray themselves in a positive light

2. Habituation Bias - Respondents tend to provide similar answers for similarly worded or structured questions (the brain going on autopilot)

3. Leading Questions - The wording of a question impacts how people respond; there are great examples in the textbook

4. Cultural Bias - Questions are often constructed with one’s own culture in mind; they might not even make sense to people from other cultures

Links:
https://www.iser.essex.ac.uk/research/publications/working-papers/iser/2013-04.pdf
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2605404/


Discussion - Can Randomization Fail?

- A University of Iowa researcher was conducting an experiment on lab monkeys
- Lab monkeys are expensive, so his experiment had n = 8
  - Having taken a statistics course, he randomly assigned treatment/control groups
  - After conducting the experiment and seeing surprising results, the researcher recognized that the 4 monkeys in the control group were also the oldest 4 monkeys
- The researcher knew that the age of the monkey had an important effect on the outcome variable, but he expected randomization to handle that

Should he report his results? What could he have done differently?


Can Randomization Fail?

- Randomization is not guaranteed to properly balance the treatment and control groups; good balance only becomes very likely when the sample size is relatively large
- At smaller sample sizes, strategies such as blocking can be used
  - In this design, cases are first split using a blocking variable, then random assignment is done within each block
  - This ensures the blocking variable is balanced in each group
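Both points can be sketched for the monkey example. With 8 cases split 4 and 4, there are C(8, 4) = 70 equally likely control groups, so complete randomization puts exactly the four oldest in control about 1 time in 70. The blocked design below is one possible version, assumed for illustration: cases are paired by age and one member of each pair is randomly assigned to treatment. The ages and the function name `blocked_assignment` are made up.

```python
import random
from math import comb

# Chance that complete randomization puts exactly the 4 oldest of 8 in control.
p_oldest_four_control = 1 / comb(8, 4)


def blocked_assignment(ages, seed=0):
    """Pair cases by age, then randomly assign one of each pair to treatment.

    Assumes an even number of cases. Returns {case index: group label}.
    """
    rng = random.Random(seed)
    # Case indices ordered from youngest to oldest.
    order = sorted(range(len(ages)), key=lambda i: ages[i])
    groups = {}
    for k in range(0, len(order), 2):
        pair = [order[k], order[k + 1]]  # one block: two cases of similar age
        rng.shuffle(pair)                # random assignment within the block
        groups[pair[0]] = "treatment"
        groups[pair[1]] = "control"
    return groups


monkey_ages = [3, 4, 6, 7, 9, 11, 14, 15]  # hypothetical ages in years
assignment = blocked_assignment(monkey_ages)

print(f"P(4 oldest all in control) = {p_oldest_four_control:.3f}")
for i, age in enumerate(monkey_ages):
    print(f"monkey {i} (age {age}): {assignment[i]}")
```

Because every age-matched pair contributes one monkey to each group, the control group can never consist of the four oldest monkeys under this design.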


Blocking



Study Design and Statistical Inference

- The overarching goal of a statistician is to rule out as many possible explanations for an observed association as possible
- So far we’ve considered the following design-related explanations, as well as methods for addressing them
  - Sampling bias - Simple random sampling
  - Confounding variables - Random assignment of the explanatory variable, or stratification
  - Other biases - Using placebo, double-blinding, proper instruments, etc.


Study Design and Statistical Inference

- As we discussed in Week 1 (babies choosing the helper or hinderer toys), when a study is well-designed the only viable explanations for trends that are observed are:
  - Random chance
  - The association is real
- We’ll soon learn how statisticians use probability theory to decide between these explanations
