Chapter 4: Gathering Data
Agresti/Franklin Statistics, 1 of 56
Chapter 4: Gathering Data
Learn ….
How to gather “good” data
About Experiments and Observational Studies
Section 4.1
Should We Experiment or Should We Merely Observe?
Population, Sample and Variables
Population: all the subjects of interest
Sample: a subset of the population; data are collected on the sample
Response variable: measures the outcome of interest
Explanatory variable: the variable that explains the response variable
Types of Studies
Experiments
Observational Studies
Experiment
A researcher conducts an experiment by assigning subjects to certain experimental conditions and then observing outcomes on the response variable
The experimental conditions, which correspond to assigned values of the explanatory variable, are called treatments
Observational Study
In an observational study, the researcher observes values of the response variable and explanatory variables for the sampled subjects, without anything being done to the subjects (such as imposing a treatment)
Example: Does Drug Testing Reduce Students’ Drug Use?
Headline: “Student Drug Testing Not Effective in Reducing Drug Use”
Facts about the study:
• 76,000 students nationwide
• Schools selected for the study included schools that tested for drugs and schools that did not test for drugs
• Each student filled out a questionnaire asking about his/her drug use
Conclusion: Drug use was similar in schools that tested for drugs and schools that did not test for drugs
What were the response and explanatory variables?
Was this an observational study or an experiment?
Advantages of Experiments over Observational Studies
We can study the effect of an explanatory variable on a response variable more accurately with an experiment than with an observational study
An experiment reduces the potential for lurking variables to affect the result
Experiments vs Observational Studies
When the goal of a study is to establish cause and effect, an experiment is needed
There are many situations (time constraints, ethical issues, ...) in which an experiment is not practical
Good Practices for Using Data
Beware of anecdotal data
Rely on data collected in reputable research studies
Example of a Dataset
General Social Survey (GSS):
• Observational database
• Tracks opinions and behaviors of the American public
• A good example of a sample survey
• Gathers information by interviewing a sample of subjects from the U.S. adult population
• Provides a snapshot of the population
Section 4.2
What Are Good Ways and Poor Ways to Sample?
Setting Up a Sample Survey
Step 1: Identify the Population
Step 2: Compile a list of subjects in the population from which the sample will be taken. This is called the sampling frame.
Step 3: Specify a method for selecting subjects from the sampling frame. This is called the sampling design.
Random Sampling
Best way of obtaining a representative sample
The sampling design should give each subject in the sampling frame an equal chance of being selected to be in the sample
Simple Random Sampling
A simple random sample of ‘n’ subjects from a population is one in which each possible sample of that size has the same chance of being selected
Example: Sampling Club Officers for a New Orleans Trip
The five offices: President, Vice-President, Secretary, Treasurer and Activity Coordinator
The possible samples are:
(P,V) (P,S) (P,T) (P,A) (V,S)
(V,T) (V,A) (S,T) (S,A) (T,A)
The possible samples are: (P,V) (P,S) (P,T) (P,A) (V,S) (V,T) (V,A) (S,T) (S,A) (T,A)
What are the chances the President and Activity Coordinator are selected?
a. 1 in 5
b. 1 in 10
c. 1 in 2
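The list of possible samples above can be checked by enumeration. The following is a minimal sketch, assuming the sample size is 2 (as the listed pairs imply) and that every pair is equally likely, which is what simple random sampling requires; the single-letter labels are the abbreviations used on the slide.

```python
from itertools import combinations

# The five club officers, abbreviated as on the slide.
officers = ["P", "V", "S", "T", "A"]

# Every possible sample of size 2 (order does not matter).
samples = list(combinations(officers, 2))
n_samples = len(samples)

# Under simple random sampling each sample is equally likely,
# so any one specific pair has chance 1 / n_samples.
chance_president_and_activity = 1 / n_samples

print(samples)
print(f"Number of possible samples: {n_samples}")
print(f"Chance that (P, A) is selected: 1 in {n_samples}")
```

With 10 equally likely pairs, the pair (P, A) has a 1 in 10 chance of being the selected sample.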
Selecting a Simple Random Sample
Use a Random Number Table
Use a Random Number Generator
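As an illustration of the random number generator approach, here is a minimal sketch using Python's standard `random` module. The sampling frame of 60 labeled students, the function name `simple_random_sample`, and the seed are hypothetical and chosen only for the example.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct subjects from the sampling frame.

    random.Random.sample selects without replacement, so every
    possible sample of size n is equally likely.
    """
    rng = random.Random(seed)
    return rng.sample(frame, n)

# Hypothetical sampling frame: 60 students labeled student_001 ... student_060.
frame = [f"student_{i:03d}" for i in range(1, 61)]

# A fixed seed makes the draw reproducible; omit it for a fresh draw.
sample = simple_random_sample(frame, 5, seed=42)
print(sample)
```

Fixing the seed is useful for documenting how a sample was drawn; in practice the seed itself should be chosen without looking at the resulting sample.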
Methods of Collecting Data in Sample Surveys
Personal Interview
Telephone Interview
Self-administered Questionnaire
How Accurate Are Results from Surveys with Random Sampling?
Sample surveys are commonly used to estimate population percentages
These estimates include a margin of error
Example: Margin of Error
A survey result states: “The margin of error is plus or minus 3 percentage points”
This means: “It is very likely that the reported sample percentage is no more than 3 percentage points lower or higher than the population percentage”
Margin of error is approximately: $\frac{1}{\sqrt{n}} \times 100\%$, where $n$ is the sample size
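The approximation 1/√n × 100% is easy to evaluate for different sample sizes. This sketch assumes a simple random sample of size n; the function name `approx_margin_of_error` is made up for the example.

```python
import math

def approx_margin_of_error(n):
    """Approximate margin of error, in percentage points, for a
    simple random sample of size n: (1 / sqrt(n)) * 100%."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return 100.0 / math.sqrt(n)

for n in (100, 1000, 10000):
    print(f"n = {n:>5}: about {approx_margin_of_error(n):.1f} percentage points")
```

Because the margin shrinks with the square root of n, cutting it in half requires roughly four times as many subjects.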
Be Wary of Sources of Potential Bias in Sample Surveys
A variety of problems can cause responses from a sample to tend to favor some parts of the population over others
Types of Bias in Sample Surveys
Sampling bias: results from using nonrandom samples or having undercoverage
Nonresponse bias: occurs when some sampled subjects cannot be reached or refuse to participate or fail to answer some questions
Response bias: occurs when the subject gives an incorrect response or the question is misleading
Poor Ways to Sample
Convenience Sample: a sample that is easy to obtain
• Unlikely to be representative of the population
• Severe biases may result from the time and location of the interview and from the interviewer's judgment about whom to interview
Volunteer Sample: most common form of convenience sample
• Subjects volunteer for the sample
• Volunteers are not representative of the entire population
Warning: A Large Sample Does Not Guarantee an Unbiased Sample
Section 4.3
What Are Good Ways and Poor Ways to Experiment?
An Experiment
Assign each subject (called an experimental unit) to an experimental condition, called a treatment
Observe the outcome on the response variable
Investigate the association – how the treatment affects the response
Elements of a Good Experiment
Primary treatment of interest
Secondary treatment for comparison
Comparing the primary treatment results to the secondary treatment results helps to analyze the effectiveness of the primary treatment
Control Group
Subjects assigned to the secondary treatment are called the control group
The secondary treatment could be a placebo or it could be an actual treatment
Randomization in an Experiment
It is important to randomly assign subjects to the primary treatment and to the secondary (control) treatment
Goals of randomization:
• Prevent bias
• Balance the groups on variables that you know affect the response
• Balance the groups on lurking variables that may be unknown to you
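Random assignment can be sketched in a few lines. This is only an illustration, assuming two groups of (nearly) equal size; the subject labels, the group names, and the function name `randomly_assign` are hypothetical.

```python
import random

def randomly_assign(subjects, seed=None):
    """Shuffle the subjects and split them into two groups.

    Because the split depends only on the random shuffle, neither
    the researcher nor the subjects influence who gets which treatment.
    """
    rng = random.Random(seed)
    shuffled = list(subjects)  # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}

# Hypothetical subjects labeled subject_1 ... subject_10.
subjects = [f"subject_{i}" for i in range(1, 11)]
groups = randomly_assign(subjects, seed=1)
print(groups["treatment"])
print(groups["control"])
```

Randomization does not guarantee perfectly balanced groups in any single experiment, but it makes systematic imbalance on known or lurking variables unlikely, especially with larger groups.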
Blinding the Study
Subjects should not know which group they have been assigned to – the primary treatment group or the control group
Data collectors and experimenters should also be blind to treatment information
Example: A Study to Assess Antidepressants for Quitting Smoking
Design:
• 429 men and women
• Subjects had smoked 15 cigarettes or more per day for the previous year
• Subjects were highly motivated to quit
Subjects were randomly assigned to one of two groups:
• One group took an antidepressant daily
• Second group did not take the antidepressant (this group is called the placebo group)
The study ran for one year
At the end of the year, the study observed whether each subject had successfully abstained from smoking or had relapsed
Results after 1 year:
• Antidepressant Group: 55.1% not smoking
• Placebo Group: 42.3% not smoking
Results after 18 months:
• Antidepressant Group: 47.7% not smoking
• Placebo Group: 37.7% not smoking
Results after 2 years:
• Antidepressant Group: 41.6% not smoking
• Placebo Group: 40% not smoking
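To see how the gap between the groups changes over time, the reported percentages can be subtracted directly. This sketch uses only the figures listed above; the variable names are illustrative. It does not test statistical significance, which would require the number of subjects in each group, and that is not given here.

```python
# Percent not smoking: (antidepressant group, placebo group), as reported.
results = {
    "1 year": (55.1, 42.3),
    "18 months": (47.7, 37.7),
    "2 years": (41.6, 40.0),
}

# Difference in percentage points (antidepressant minus placebo).
differences = {
    period: round(antidepressant - placebo, 1)
    for period, (antidepressant, placebo) in results.items()
}

for period, diff in differences.items():
    print(f"{period}: {diff} percentage points")
```

The difference narrows from 12.8 percentage points at 1 year to 10.0 at 18 months and 1.6 at 2 years.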
Question to Think About: Are the differences between the two groups statistically significant or are these differences due to ordinary variation?
Section 4.4
What Are Other Ways to Conduct Experimental and Observational Studies?
Multifactor Experiments
Multifactor Experiments: have more than one categorical explanatory variable (called a factor).
Example: Do Antidepressants and/or Nicotine Patches Help Smokers Quit?
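In a multifactor experiment, the treatments are the combinations of the factor levels. The sketch below assumes two factors with two levels each (antidepressant yes/no, nicotine patch yes/no); these levels are an assumption made for illustration and are not stated on the slide.

```python
from itertools import product

# Assumed factors and levels (illustrative only).
factors = {
    "antidepressant": ["yes", "no"],
    "nicotine_patch": ["yes", "no"],
}

# Each treatment is one combination of levels, one level per factor.
treatments = [
    dict(zip(factors, levels))
    for levels in product(*factors.values())
]

for treatment in treatments:
    print(treatment)
```

Two factors with two levels each give 2 × 2 = 4 treatments, including the combination in which a subject receives neither the antidepressant nor the patch.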
Matched-Pairs Design
Each subject serves as a block
Both treatments are observed for each subject
Example: A Study to Compare an Oral Drug with a Placebo for Treating Migraine Headaches
| Subject | Drug | Placebo | Note |
|---|---|---|---|
| 1 | Relief | No Relief | First matched pair |
| 2 | Relief | Relief | |
| 3 | No Relief | No Relief | |
Blocks and Block Designs
Block: collection of experimental units that have the same (or similar) values on a key variable
Block Design: identifies blocks before the start of the experiment and assigns subjects to treatments within those blocks
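One common way to carry out a block design is to randomize separately inside each block. The following is a minimal sketch under that assumption; the blocks, subject labels, treatment names, and the function name `randomize_within_blocks` are all hypothetical.

```python
import random

def randomize_within_blocks(blocks, treatments=("treatment", "control"), seed=None):
    """Randomly assign subjects to treatments separately in each block.

    Subjects in a block are shuffled, then treatments are dealt out in
    rotation so each treatment appears (nearly) equally often per block.
    """
    rng = random.Random(seed)
    assignment = {}
    for subjects in blocks.values():
        shuffled = list(subjects)
        rng.shuffle(shuffled)
        for i, subject in enumerate(shuffled):
            assignment[subject] = treatments[i % len(treatments)]
    return assignment

# Hypothetical blocks formed on a key variable before the experiment starts.
blocks = {
    "men": ["m1", "m2", "m3", "m4"],
    "women": ["w1", "w2", "w3", "w4"],
}
assignment = randomize_within_blocks(blocks, seed=3)
print(assignment)
```

Because each block contributes equally to every treatment, the key variable used to form the blocks cannot be unevenly distributed across treatment groups.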
Experiments vs Observational Studies
An experiment can establish cause and effect
An observational study can yield useful information when an experiment is not practical
An observational study is a practical way of answering questions that do not involve trying to establish causality
Observational Studies
A well-designed and informative observational study can give the researcher very useful data.
Sample surveys that select subjects randomly are good examples of observational studies.
Random Sampling Schemes
Simple Random Sample: every possible sample of a given size has the same chance of selection
Cluster Random Sample:
• Divide the population into a large number of clusters
• Select a simple random sample of the clusters
• Use the subjects in those clusters as the sample
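The cluster sampling steps can be sketched as follows. The clusters (schools) and subject labels are hypothetical, and the function name `cluster_sample` is made up for the example.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Select a simple random sample of clusters, then keep every
    subject in the selected clusters."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [subject for name in chosen for subject in clusters[name]]

# Hypothetical population already divided into clusters (here, schools).
clusters = {
    "school_A": ["a1", "a2"],
    "school_B": ["b1", "b2", "b3"],
    "school_C": ["c1"],
    "school_D": ["d1", "d2"],
}
sample = cluster_sample(clusters, 2, seed=7)
print(sample)
```

Only the clusters are chosen at random; once a cluster is selected, all of its subjects enter the sample.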
Stratified Random Sample:
• Divide the population into separate groups, called strata
• Select a simple random sample from each stratum
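Stratified sampling can be sketched in the same style. The strata (class years), the subject labels, the equal sample size per stratum, and the function name `stratified_sample` are assumptions made for illustration.

```python
import random

def stratified_sample(strata, n_per_stratum, seed=None):
    """Draw a simple random sample of the same size from every stratum."""
    rng = random.Random(seed)
    return {
        name: rng.sample(members, n_per_stratum)
        for name, members in strata.items()
    }

# Hypothetical population divided into strata by class year.
strata = {
    "freshman": [f"fr_{i}" for i in range(1, 11)],
    "senior": [f"sr_{i}" for i in range(1, 11)],
}
stratified = stratified_sample(strata, 3, seed=11)
print(stratified)
```

Unlike cluster sampling, every stratum is represented in the sample, and only some subjects within each stratum are selected.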
Observational Studies
Well-designed observational studies use random sampling schemes
Retrospective and Prospective Studies
Retrospective study: looks into the past
Prospective study: follows its subjects into the future
Case-Control Study
A case-control study is an observational study in which subjects who have a response outcome of interest (the cases) and subjects who have the other response outcome (the controls) are compared on an explanatory variable
Example: Case-Control Study
Response outcome of interest: Lung cancer
• The cases have lung cancer
• The controls do not have lung cancer
The two groups were compared on the explanatory variable:
• Whether the subject had been a smoker