Statistics
The science of collecting, analyzing, and interpreting data.
Planning A Study Using The Statistical Problem Solving Process:
1. Ask a question of interest
2. Collect some data
3. Analyze and describe the data
4. Make a conclusion, answering the question of interest
2 Types of Studies
Observational Study
-Record data observed or surveyed
-No treatments imposed
-Used to describe a group or situation
Experimental Study
-Impose treatments on subjects
-Record results and compare groups
-Used to see if the treatments cause a change in the response
Measuring Data from Study Subjects or Experimental Units
Various Variables
Explanatory (independent, x) variable: the treatment in an experiment or the group label in an observational study (may not exist in observational studies)
Response (dependent, y) variable: the result measured at the end of every experimental and observational study
Confounding variable: a variable that might exist in a study that influences the response but can’t be separated from the explanatory variable
Example of confounding
A study reports that a group of children who had certain vaccinations were more likely to develop autism than a group of children who did not receive those same vaccinations.
Does this mean that vaccinations cause autism?
Explanatory: Whether or not they were vaccinated
Response: Whether or not they developed autism
Possible confounding: The vaccination group could also have been given some new diet or supplement that the non-vaccination group was not given
Effect of confounding: The vaccination group's higher rate of autism may be tied to the diet or supplement rather than to vaccination
Census
The systematic collection of data on every single subject in the population.
When the population is large, a census will be time-consuming and expensive.
Video on census/American Community Survey use at Target:
http://www.census.gov/multimedia/www/videos/stats_in_action.php?intcmp=sldr4
Difference between ACS and Current Population Survey:
http://www.census.gov/people/laborforce/publications/ACS-CPS_Comparison_Report.pdf
Observational Studies
Subjects are randomly selected and asked questions or observed in a particular setting.
Subjects are not influenced in how they respond.
Good Survey Questions
Avoid unnecessary complexity in questions
Avoid misleading questions
Randomize ordering of questions
Ensure confidentiality
Avoid influencing the subject by tone, appearance, or suggestion
http://www.learner.org/vod/vod_window.html?pid=152
Video 17, start at 4:46, 2.5 min
Sources of bias in surveys
If a selection process consistently obtains values too high or too low, then bias exists. Some group may be under- (or over-) represented.
Response Bias: influencing the response in some way
Non-response Bias: a group is left out because they feel uncomfortable, are too busy, etc.
Selection Bias: not randomly selected from the entire population of interest
Sampling Vocabulary
• Population of interest: the set of people or things you wish to know something about
• Sampling frame: a list of all subjects from which the sample is taken
– What is the difference between the sampling frame and the population of interest?
• Sample: a portion of the population that is selected to represent the population of interest
• Random sampling: a way of getting a sample that reduces selection bias
– How could we ensure a sample is randomly selected?
Population → Random Selection → Sample
Sampling Methods
Simple Random Sample (SRS)
Stratified Random Sampling
Cluster Sampling
Systematic Sampling
Multi-Stage Sampling
Random Digit Dialing
Self-Selected Sample
Convenience Sample
Judgment Sample
“Quickie Polls”
Simple Random Sampling
From the entire population, every unit has the same chance of belonging to the sample, and every possible grouping of the specified size has the same chance of being selected.
Like drawing names out of a hat
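The "names out of a hat" idea can be sketched in a few lines of Python. This is only an illustration: the roster of 30 hypothetical student labels, the function name `simple_random_sample`, and the seed are assumptions made for the example, not part of the original slides.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct units so that every group of size n is equally likely."""
    rng = random.Random(seed)          # seed only makes the demo repeatable
    return rng.sample(population, n)   # sampling without replacement

# Hypothetical population: 30 labeled students (the "slips in the hat").
names = [f"student_{i:02d}" for i in range(1, 31)]
sample = simple_random_sample(names, 5, seed=1)
print(sample)
```

Sampling without replacement is what makes this an SRS: no name can be drawn twice, and no subset of five is favored over another.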
Stratified Sample vs. Cluster Sample
Stratified Sample: "some from all"
1st divide population into groups (strata), then take a Simple Random Sample from each stratum
(one or more slips from each hat)
Cluster Sample: "all from some"
1st divide population into groups (clusters), then randomly select some clusters and sample everyone in those clusters
(all slips from one or two hats)
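The contrast between "some from all" and "all from some" may be easier to see in code. The sketch below assumes a hypothetical population of four hats (A to D) holding six slips each; the function names and the seed are illustrative choices, not something given in the slides.

```python
import random

def stratified_sample(groups, n_per_group, rng):
    """Some from all: take an SRS of n_per_group units from every group."""
    return {name: rng.sample(units, n_per_group) for name, units in groups.items()}

def cluster_sample(groups, n_clusters, rng):
    """All from some: pick n_clusters groups at random and keep every unit in them."""
    picked = rng.sample(sorted(groups), n_clusters)
    return {name: list(groups[name]) for name in picked}

rng = random.Random(0)  # fixed seed so the demo is repeatable
# Hypothetical population: four hats, six slips per hat.
hats = {f"hat_{h}": [f"{h}{i}" for i in range(1, 7)] for h in "ABCD"}

strat = stratified_sample(hats, 2, rng)   # two slips from each of the four hats
clust = cluster_sample(hats, 2, rng)      # every slip from two of the hats
print(strat)
print(clust)
```

Both calls here return the same number of slips, but the stratified sample touches every hat while the cluster sample touches only two.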
Systematic Sampling
From a list, randomly choose a starting point (for example, the 4th entry), divide the list into consecutive segments (for example, every 10 names), then sample at that same point in each segment (4, 14, 24, 34, …)
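A minimal sketch of this procedure, assuming a hypothetical list of 50 numbered entries and a segment length of 10 (the function name and its parameters are made up for the example):

```python
import random

def systematic_sample(items, interval, start=None, seed=None):
    """Take one item from each consecutive segment of length `interval`.

    `start` is a 0-based position within the first segment; if it is not
    given, it is chosen at random.
    """
    if start is None:
        start = random.Random(seed).randrange(interval)
    return items[start::interval]

entries = list(range(1, 51))                      # hypothetical list of 50 entries
picked = systematic_sample(entries, 10, start=3)  # index 3 is the 4th entry
print(picked)  # [4, 14, 24, 34, 44]
```

Only the starting point is random; once it is chosen, the rest of the sample is fixed by the interval.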
Random Digit Dialing
Sample that approximates a SRS of all households that have telephones with a specific exchange
(512-266-)
Pew Research: http://www.people-press.org/methodology/sampling/random-digit-dialing-our-standard-method/
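One way to picture random digit dialing within a single exchange is to fix the prefix and draw the last four digits at random. The sketch below is a simplified illustration under that assumption; it is not Pew's actual procedure, and the function name and seed are hypothetical.

```python
import random

def random_digit_dial(prefix, n, seed=None):
    """Generate n distinct phone numbers by randomizing the last four digits."""
    rng = random.Random(seed)
    suffixes = rng.sample(range(10_000), n)       # 0000 to 9999, no repeats
    return [f"{prefix}{s:04d}" for s in suffixes]

# The prefix is the exchange shown on the slide; the suffixes are random.
numbers = random_digit_dial("512-266-", 5, seed=0)
print(numbers)
```

Because the digits are generated rather than read from a directory, unlisted numbers have the same chance of being dialed as listed ones.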
Samples typically resulting in biased results
Self-Selected Sample -- radio station call-in
Convenience Sample -- surveying folks in a mall who appear willing to talk to you
Judgment Sample -- surveying those you pick as an "expert" selector
"Quickie Polls" -- hastily designed, poorly pre-tested, one-night survey sample for an evening news show
Random Number Table
19223 95034 05756 28713 96409
12531 42544 82853 73676 47150
• Assign a number label to each unit in the population
• Read numbers from the table from left to right, starting anywhere. The subjects selected for the sample are those read from the table.
• Repeats or those not a part of the list are ignored.
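The table-reading steps can be traced in code. This sketch assumes a hypothetical population of 30 units labeled 01 to 30, reads the two rows of digits shown on the slide two digits at a time from the very start, and stops once 5 units are chosen; the population size, sample size, and function name are assumptions for illustration.

```python
TABLE = "19223 95034 05756 28713 96409 12531 42544 82853 73676 47150"

def sample_from_table(table, population_size, sample_size, label_width=2):
    """Read fixed-width labels left to right, skipping repeats and unused labels."""
    digits = table.replace(" ", "")
    chosen = []
    for i in range(0, len(digits) - label_width + 1, label_width):
        label = int(digits[i:i + label_width])
        # Ignore labels outside 1..population_size and labels already chosen.
        if 1 <= label <= population_size and label not in chosen:
            chosen.append(label)
        if len(chosen) == sample_size:
            break
    return chosen

picked = sample_from_table(TABLE, population_size=30, sample_size=5)
print(picked)  # [19, 22, 5, 13, 25]
```

Reading the pairs 19, 22, 39, 50, 34, 05, 75, 62, 87, 13, 96, 40, 91, 25, the labels above 30 are skipped, which leaves units 19, 22, 5, 13, and 25.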
Sampling & Lay's potato chips
• http://www.learner.org/vod/vod_window.html?pid=152
• Video 16, start at 6:35, about 2 minutes
• Nielsen TV ratings
• http://www.nielsen.com/us/en/nielsen-solutions/nielsen-measurement/nielsen-tv-measurement.html