week 3 - sampling and study design · 2021. 2. 1. · week 3 - sampling and study design...
TRANSCRIPT
-
1/38
Week 3 - Sampling and Study Design
Ryan Miller
-
2/38
Week #3 Outline
I Video #1I Sampling from a Population
I Video #2I Sampling Examples
I Video #3I Study Design
I Video #4I Randomized Experiments
-
3/38
Introduction
I So far, we’ve focused on methods for describing dataI Descriptive statistics and graphs allow us to understand the
trends within our data
I While what’s happening in our data is interesting, we reallywould like generalize these findings to a broader set of casesI For example, we might use the fact that 14 of 16 infants chose
the “helper” character as evidence that all infants can identifyfriendly behavior
-
3/38
Introduction
I So far, we’ve focused on methods for describing dataI Descriptive statistics and graphs allow us to understand the
trends within our data
I While what’s happening in our data is interesting, we reallywould like generalize these findings to a broader set of casesI For example, we might use the fact that 14 of 16 infants chose
the “helper” character as evidence that all infants can identifyfriendly behavior
-
4/38
Populations
I Statisticians call this broader, comprehensive set of cases apopulationI It is extremely unusual for researchers to have data on the
entire population, it is much more common to have access to asample (a subset of cases from the population)
-
5/38
Sampling
When analyzing sample data, statisticians have two primaryconcerns:
1. Sampling Bias: If some cases are preferentially selected intothe sample, the data may no longer represent the entirepopulation
2. Sampling Variability: Even if all cases had an equal chance ofbeing selected, many different subsets are possible. So, anytrends seen in a single sample could possibly be explained bythe randomness involved in which cases ended up in thatsample.
-
5/38
Sampling
When analyzing sample data, statisticians have two primaryconcerns:
1. Sampling Bias: If some cases are preferentially selected intothe sample, the data may no longer represent the entirepopulation
2. Sampling Variability: Even if all cases had an equal chance ofbeing selected, many different subsets are possible. So, anytrends seen in a single sample could possibly be explained bythe randomness involved in which cases ended up in thatsample.
-
6/38
Example - Sampling VariabilityIf every NFL player has an equal chance of being sampled:
Population (N = 2099 players)
Yearly Salary (millions)
Fre
quen
cy
0 5 10 15 20
050
010
0015
00
Sample #1 (n = 20 players)
Salary (millions)
Fre
quen
cy
0 5 10 15 20
05
1015
Sample #2 (n = 20 players)
Salary (millions)
Fre
quen
cy
0 5 10 15 20
05
1015
Sample #3 (n = 20 players)
Salary (millions)
Fre
quen
cy
0 5 10 15 20
05
1015
Sample #4 (n = 20 players)
Salary (millions)F
requ
ency
0 5 10 15 20
05
1015
-
7/38
Example - Sampling BiasIf Quarterbacks are 10x more likely to be sampled:
Population (N = 2099 players)
Yearly Salary (millions)
Fre
quen
cy
0 5 10 15 20
050
010
0015
00
Sample #1 (n = 20 players)
Salary (millions)
Fre
quen
cy
0 5 10 15 20
05
1015
Sample #2 (n = 20 players)
Salary (millions)
Fre
quen
cy
0 5 10 15 20
05
1015
Sample #3 (n = 20 players)
Salary (millions)
Fre
quen
cy
0 5 10 15 20
05
1015
Sample #4 (n = 20 players)
Salary (millions)F
requ
ency
0 5 10 15 20
05
1015
-
8/38
Statistical Inference
I A fundamental goal of statisticians is to use information from asample to make reliable statements about a populationI This idea is called statistical inference
I Sampling bias and sampling variability are two obstacles tostatistical inference (as are confounding variables)
-
9/38
Statistical Inference - Notation
Statisticians use different mathematical notation to distinguishpopulation parameters (things we want to know) from estimates(things derived from a sample). The table below provides asummary:
Population Parameter Estimate (from sample)Mean µ x̄Standard Deviation σ sProportion p p̂Correlation ρ rRegression β0, β1 b0, b1
For example, µ is the mean of the target population, while x̄ is themean of the cases that ended up in the sample.
-
10/38
Bias and VariabilityIn summary, we’ve now discussed two reasons why an estimatemight not accurately reflect a population parameter, bias andvariability:
-
11/38
The Role of Sample Size
I Taking a larger sample will reduce sampling variability, but itwill not reduce sampling bias
-
12/38
Sampling Words from The Gettysburg Address
I This fall, I showed by MATH-256 students the text of theGettysburg Address (target population), asking each of them tocome up with a representative sample of 5 wordsI I then had them summarize their sample using the sample mean
(ie: the average word length of their 5 words)
I If their samples were actually representative, their samplemeans would cluster randomly around the population mean
-
12/38
Sampling Words from The Gettysburg Address
I This fall, I showed by MATH-256 students the text of theGettysburg Address (target population), asking each of them tocome up with a representative sample of 5 wordsI I then had them summarize their sample using the sample mean
(ie: the average word length of their 5 words)
I If their samples were actually representative, their samplemeans would cluster randomly around the population mean
-
13/38
The Gettysburg Address
-
14/38
Results
I The population’s mean word length, in statistical notation, isµ = 4.295I Why were the MATH-256 students’ samples so far off?
0.00
0.25
0.50
0.75
1.00
4.0 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0 8.5 9.0
wordavgs
coun
t
-
15/38
Sampling Bias
I The answer is sampling bias, or a systematic tendency forsome cases from the population to be more likely to make itinto a sample than othersI In the Gettysburg Address example, students tended to pick
longer wordsI For example, perhaps the longer words stood out more
prominently, or the students felt awkward selecting multipletwo-letter words in their sample
-
16/38
Simple Random Sampling
I The ideal sampling procedure is simple random sampling, aprotocol where each case in the target population has anidentical chance of ending up in the sample
I Do you think it would be easy or hard to collect a simplerandom sample of Xavier students?
I It would actually be quite hard since the University is unlikely togive you a list of all enrolled students
I Often statisticians will attempt to get representative samples inother ways
-
16/38
Simple Random Sampling
I The ideal sampling procedure is simple random sampling, aprotocol where each case in the target population has anidentical chance of ending up in the sample
I Do you think it would be easy or hard to collect a simplerandom sample of Xavier students?I It would actually be quite hard since the University is unlikely to
give you a list of all enrolled studentsI Often statisticians will attempt to get representative samples in
other ways
-
17/38
Other Sampling Protocols
I A convenience sample is exactly what the name suggests, asample that is easily collected (ie: low monetary or time costs)I Convenience samples are not random, but they can be
representative if carefully selectedI You might be able to get a representative sample by standing
near the center of campus on a typical day and stoppingeveryone who walks by
I A stratified sample is a more complex scheme where thepopulation is broken into similar subcategories, which aresampled separately (typically simple random sampling)I The statistical methods we cover in this course won’t be
sufficient for this sampling schemeI Nevertheless, we should be able to recognize stratified sampling
for precisely that reason
-
17/38
Other Sampling Protocols
I A convenience sample is exactly what the name suggests, asample that is easily collected (ie: low monetary or time costs)I Convenience samples are not random, but they can be
representative if carefully selectedI You might be able to get a representative sample by standing
near the center of campus on a typical day and stoppingeveryone who walks by
I A stratified sample is a more complex scheme where thepopulation is broken into similar subcategories, which aresampled separately (typically simple random sampling)I The statistical methods we cover in this course won’t be
sufficient for this sampling schemeI Nevertheless, we should be able to recognize stratified sampling
for precisely that reason
-
18/38
Practice
For each scenario, determine whether it describes a population or asample, as well as whether or not the sample is biased.
1. To estimate the size of trout in a lake, an angler records theweight of the 12 trout he catches over a weekend
2. A subscription-based music website tracks the listening historyof its active users
3. The Department of Transportation announces that of the 250million registered cars in the US, 2.1% are hybrids
4. A car rental company installs an experimental data collectiondevice on the first 20 vehicles from an alphabetized list oflicense plates
-
19/38
Practice (solution)
1. This is a sample and it’s biased, there is no way that the anglerhas an equal chance of catching every trout in the lake
2. This is a population because it includes all of the website’susers
3. This is a population because it’s safe to assume the DOT hasregistration on the overwhelming majority of cars in the US
4. This is a sample and it’s unbiased, there is no reason to believethat license plate numbers are related to anything meaningfulabout each vehicle
-
20/38
Introduction (Study Design)
I So far, we’ve seen how sampling can influence the trends seenin our data, but there are other aspects of data collection thatwe need to consider
I Suppose researchers want to test a new COVID-19 treatment,how would you design a study to determine if it is effective ornot?
I A meaningful study must compare the new treatment withsomething elseI Therefore, we’ll either need to collect two samples, or
proactively split a single sample into two
-
20/38
Introduction (Study Design)
I So far, we’ve seen how sampling can influence the trends seenin our data, but there are other aspects of data collection thatwe need to consider
I Suppose researchers want to test a new COVID-19 treatment,how would you design a study to determine if it is effective ornot?
I A meaningful study must compare the new treatment withsomething elseI Therefore, we’ll either need to collect two samples, or
proactively split a single sample into two
-
21/38
Two Types of Studies
These two possibilities lead us to distinguish between two types ofstudies:
I Observational studies: the explanatory and responsevariables are observed by the researchers (separate samples)
I Experimental studies: the explanatory variable is assigned bythe researchers (the researchers split up a single sample)
-
22/38
Observational Studies
I We’ve already seen an example observational study in theFlorida Death Penalty Sentencing case study
I Recall the researchers recorded the race of the offender, as wellas whether the offender was sentenced to the death penalty ornotI Did the offender’s race appear associated with their sentence?
death notblack 38 142white 46 152
-
23/38
Confounding Variables
Overall, white offenders received the death penalty slightly moreoften, but this ignored the influence of the victim’s race:
Black Victim White Victim
Black Offender White Offender Black Offender White Offender
0.00
0.25
0.50
0.75
1.00
Sentence
death
not
-
24/38
Confounding Variables
Because offenders disproportionately committed crimes againstvictims of their own race, the overall death penalty rates wereskewed in a way that obscured the racially biased sentencing:
Black Victim White Victim
Black Offender White Offender Black Offender White Offender
0
50
100
150
Sentence
death
not
-
25/38
Balance
I We can view the problems caused by confounding variables asan issue of imbalanced groupsI Offenders were more likely to victimize their own race, and
crimes against whites tended to be punished more severelyI The groups white offenders and black offenders were
systematically different in an important way (victims race)
I Going back to the COVID-19 example, if study participants canchoose whether they receive the vaccine, then the vaccinegroup will likely be disproportionately older, sicker, workingriskier jobs, etc.I However, these factors would all occur in equal proportions in
the vaccinated and control groups if we randomly assignedwhich participants received the vaccine
-
25/38
Balance
I We can view the problems caused by confounding variables asan issue of imbalanced groupsI Offenders were more likely to victimize their own race, and
crimes against whites tended to be punished more severelyI The groups white offenders and black offenders were
systematically different in an important way (victims race)
I Going back to the COVID-19 example, if study participants canchoose whether they receive the vaccine, then the vaccinegroup will likely be disproportionately older, sicker, workingriskier jobs, etc.I However, these factors would all occur in equal proportions in
the vaccinated and control groups if we randomly assignedwhich participants received the vaccine
-
26/38
Random Assignment
-
27/38
Barriers
I Obviously random assignment isn’t always feasible, someexplanatory variables are too unethical or costly to randomlyassignI For example, we couldn’t assign cases to consume toxic
chemicals or expose themselves to harmI We also cannot randomly assign explanatory variables that
universally pre-date the study like genetics, etc.
I Despite their flaws, observational studies are still very valuableI But they will always fall short of randomized experiments
-
27/38
Barriers
I Obviously random assignment isn’t always feasible, someexplanatory variables are too unethical or costly to randomlyassignI For example, we couldn’t assign cases to consume toxic
chemicals or expose themselves to harmI We also cannot randomly assign explanatory variables that
universally pre-date the study like genetics, etc.
I Despite their flaws, observational studies are still very valuableI But they will always fall short of randomized experiments
-
28/38
Example - Study Design
I Suppose we want to know: “Is arthroscopic surgery is effectivein treating arthritis of the knee?” Describe both anobservational study and a randomized experiment that youcould conduct to answer this question. Be sure to address thefollowing during your discussion:
1. How costly will it be for the researchers to collect data witheach design?
2. Are there any feasibility problems or ethical issues with eachdesign?
-
29/38
Possible Responses
I An observational study might use medical records to findpatients with knee arthritis and identify whether they everreceived surgery. It then might compare various measures ofrecovery (recurrent visits, etc.) among those who had or didn’thave surgery.
I A randomized experiment would take a sample of patients witharthritis and randomly assign half to have surgery and the otherhalf to not have surgery. It then might compare variousmeasures of recovery.
1. The observational study costs almost nothing, while therandomized experiment likely will cost quite a bit
2. The observational study doesn’t present any ethical barriers(assuming sufficient data privacy), but withholding kneesurgery seems like it could problematic
-
30/38
Sham Knee Surgery
In the 1990s a study was conducted in 10 men with arthriticknees that were scheduled for surgery. They were all treatedidentically expect for one key distinction: only half ofthem actually got surgery! Once each subject was in theoperating room and anesthetized, the surgeon looked at arandomly generated code indicating whether he should dothe full surgery or just make three small incisions in theknee and stitch up the patient to leave a scar. All patientsreceived the same post-operative care, rehabilitation, andwere later evaluated by staff who didn’t know whether theyhad actually received the surgery or not. The result? Boththe sham knee surgery and the real knee surgery showedindistinguishable levels of improvement
Source: https://www.nytimes.com/2000/01/09/magazine/the-placebo-prescription.html
https://www.nytimes.com/2000/01/09/magazine/the-placebo-prescription.html
-
31/38
Control Groups, Placebos, and Blinding
The Sham Knee Surgery example illustrates several importantaspects of a well-designed experiment that we’ve yet to discuss:
I Control Group - Some patients were randomly assigned not toreceive the knee surgery, providing a comparison group that is,on average, balanced with surgery group in all baselinecharacteristics
I Placebo - Patients in the control group received a fake surgeryI Blinding - Using a placebo is not helpful if patients know the
group they’re in. Similarly, the staff interacting with thepatients might treat them differently if they knew the patient’sgroupI Single-blind - the participants don’t know the treatment
assignmentsI Double-blind - the participants and everyone interacting with
the participants don’t know the treatment assignments
-
32/38
Control Groups, Placebos, and Blinding
I The goal of each of these design elements is to prevent biasingthe measurement of the outcome variable in a particulardirection
I Other types of studies are susceptible to other types of biases,which we should also carefully consider
1. Social Desirability Bias - Respondents tend to answer questionsin ways that portray themselves in a positive light Link
2. Habituation Bias - Respondents tend to provide similar answersfor similarly worded or structured questions (the brain going onautopilot) Link
3. Leading Questions - The wording of a question impacts howpeople respond, great examples in the textbook
4. Cultural Bias - Questions are often to be constructed withone’s own culture in mind, they might not even make sense topeople from other cultures.
https://www.iser.essex.ac.uk/research/publications/working-papers/iser/2013-04.pdfhttp://www.ncbi.nlm.nih.gov/pmc/articles/PMC2605404/
-
32/38
Control Groups, Placebos, and Blinding
I The goal of each of these design elements is to prevent biasingthe measurement of the outcome variable in a particulardirection
I Other types of studies are susceptible to other types of biases,which we should also carefully consider
1. Social Desirability Bias - Respondents tend to answer questionsin ways that portray themselves in a positive light Link
2. Habituation Bias - Respondents tend to provide similar answersfor similarly worded or structured questions (the brain going onautopilot) Link
3. Leading Questions - The wording of a question impacts howpeople respond, great examples in the textbook
4. Cultural Bias - Questions are often to be constructed withone’s own culture in mind, they might not even make sense topeople from other cultures.
https://www.iser.essex.ac.uk/research/publications/working-papers/iser/2013-04.pdfhttp://www.ncbi.nlm.nih.gov/pmc/articles/PMC2605404/
-
33/38
Discussion - Can Randomization Fail?
I A University of Iowa researcher was conducting an experimenton lab monkeys
I Lab monkeys are expensive, so his experiment had n = 8I Having taken a statistics course, he randomly assigned
treatment/control groupsI After conducting the experiment and seeing surprising results,
the researcher recognizes that the 4 monkeys in the controlgroup were also the oldest 4 monkeys
I The researcher knew that the age of the monkey had animportant on the outcome variable, but he expectedrandomization to handle that
Should he report his results? What could he have done differently?
-
34/38
Can Randomization Fail?
I Randomization is not guaranteed to properly balance thetreatment and control groups unless the sample size isrelatively large
I At smaller sample sizes, strategies such as blocking can beusedI In this design, cases are first split using a blocking variable,
then random assignment is done within each blockI This ensures the blocking variable is balanced in each group
-
35/38
Blocking
-
36/38
Blocking
-
37/38
Study Design and Statistical Inference
I The overarching goal of a statistician is to rule out as manypossible explanations for an observed association as possible
I So far we’ve considered the following design-relatedexplanations, as well as methods for addressing for themI Sampling bias - Simple random samplingI Confounding variables - Random assignment of the explanatory
variable, or stratificationI Other biases - Using placebo, double-blinding, proper
instruments, etc.
-
37/38
Study Design and Statistical Inference
I The overarching goal of a statistician is to rule out as manypossible explanations for an observed association as possible
I So far we’ve considered the following design-relatedexplanations, as well as methods for addressing for themI Sampling bias - Simple random samplingI Confounding variables - Random assignment of the explanatory
variable, or stratificationI Other biases - Using placebo, double-blinding, proper
instruments, etc.
-
38/38
Study Design and Statistical Inference
I As we discussed in Week 1 (babies choosing the helper orhinderer toys), when a study is well-designed the only viableexplanations for trends that are observed are:I Random chanceI The association is real
I We’ll soon learn how statisticians use probability theory todecide between these explanations
-
38/38
Study Design and Statistical Inference
I As we discussed in Week 1 (babies choosing the helper orhinderer toys), when a study is well-designed the only viableexplanations for trends that are observed are:I Random chanceI The association is real
I We’ll soon learn how statisticians use probability theory todecide between these explanations