Chapters 12 & 13 – Sampling & Designs of Surveys & Experiments
Objective: To understand the various types of experimental designs and techniques
Warm-up
Various claims are often made for surveys. Why is each of the following claims incorrect?
Stopping college students on their way out of the campus cafeteria is a good way to sample if we want to know about the quality of the food there.
We drew a sample of 100 from 3000 students in a school. To get the same level of precision for a town of 30,000 residents, we’ll need a sample of 1000.
A poll taken at a statistics support website garnered 12,357 responses. The majority said they enjoy doing statistics homework. With a sample size that large, we can be pretty sure that most Statistics students feel this way.
In the last example, the true percentage of all Statistics students who enjoy homework is called a “population statistic.”
4 Ways to Collect Data
Observational study – observe and measure specific characteristics, but we don’t attempt to modify the subjects being studied
Experiment – a treatment is applied to observe its effect on the subjects
Simulation – mathematical or physical model used to reproduce a situation
Survey – investigation of characteristics of a population
4 Ways to Collect Data Examples
A study of the effect of changing flight patterns on the number of airplane accidents
A study of the effect of eating oatmeal on lowering blood pressure
A study showing how fourth grade students solve a puzzle
A study of US residents’ approval rating of the US president
Basic Steps to Designing Experiments
Identify the objective
Collect sample data
Use a random procedure that avoids bias
Analyze the data and form conclusions
Ways to Control Treatments
Placebo – a faux treatment that looks like the real treatment (e.g. a sugar pill)
Placebo Effect – occurs when an untreated subject incorrectly believes that he/she is receiving a treatment and reports an improvement in symptoms.
Ways to Control Treatments (continued)
Blinding – a technique in which the subject doesn’t know whether he/she is receiving a treatment or a placebo.
This is used so we can determine whether the treatment effect is significantly greater than the placebo effect.
Single-Blind – the researcher knows which subject received which treatment, but the subjects do not know
Double-Blind – neither the researcher nor the subjects know who received a placebo or treatment
Ways to Control Treatments (continued)
Block – a group of subjects (or experimental units) that are similar to one another, used to test the effectiveness of one or more treatments
The groups need to be similar in the ways that might affect the outcome.
Randomized design – subjects are assigned to treatment groups through random selection
Randomly assigning treatments or placebos to groups
Controlled design – experimental units are carefully chosen so that the subjects in each block are similar in the ways that are important.
Example – Testing the use of a nicotine patch on groups of similar age and gender. A heavy smoker in her 40s gets the treatment, and a similar heavy smoker in her 40s gets a placebo.
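The randomized and controlled designs above can be combined: group similar subjects into blocks, then let chance decide who in each block gets the treatment. The following is a minimal Python sketch of that idea, not a prescribed procedure. The subject IDs, block labels, and the function name `assign_within_blocks` are all made up for illustration.

```python
import random
from collections import defaultdict

def assign_within_blocks(subjects, seed=None):
    """Randomly split each block into a treatment half and a placebo half.

    subjects: list of (subject_id, block_label) pairs.
    Returns a dict mapping subject_id -> "treatment" or "placebo".
    """
    rng = random.Random(seed)
    blocks = defaultdict(list)
    for subject_id, block_label in subjects:
        blocks[block_label].append(subject_id)

    assignment = {}
    for members in blocks.values():
        rng.shuffle(members)  # chance, not the researcher, decides the order
        half = len(members) // 2
        for position, subject_id in enumerate(members):
            assignment[subject_id] = "treatment" if position < half else "placebo"
    return assignment

# Hypothetical subjects, blocked by gender and age group.
subjects = [
    ("S1", "women, 40s"), ("S2", "women, 40s"),
    ("S3", "men, 20s"), ("S4", "men, 20s"),
    ("S5", "men, 20s"), ("S6", "men, 20s"),
]
assignment = assign_within_blocks(subjects, seed=1)
for subject_id, block_label in subjects:
    print(subject_id, block_label, assignment[subject_id])
```

Because the shuffle happens inside each block, every block contributes equally to both groups, so the blocking variable cannot be confounded with the treatment.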
Ways to Control Treatments (continued)
Confounding – occurs in an experiment when the effects from two or more variables cannot be distinguished from each other
Example – A professor in Vermont experiments with a new attendance policy (your course average drops one letter grade for each class cut), but an exceptionally mild winter moderates the discomforts that have reduced attendance in the past. If attendance improves, we can’t tell whether it was because of the new attendance policy or because of the weather conditions.
Ways to Control Treatments (continued)
Sample Size
Make sure your sample is large enough; however, an extremely large sample is not necessarily a good sample.
Make sure the sample is large enough to see the true nature of the effects
Replication
Replication helps to confirm results by repeating the experiment
Ways to Control Treatments (continued)
Randomization
Collect data in an appropriate way; otherwise, your data are useless.
Random Sample – members of the population are selected in a way that each has an equal chance of being selected
Sampling Techniques
To select a sample at random, we first need to define where the sample will come from.
The sampling frame is a list of individuals from which the sample is drawn.
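Once we have our sampling frame, the easiest way to choose a simple random sample (SRS) is to assign a random number to each individual in the sampling frame and then select only those whose random numbers satisfy some rule (for example, the n smallest).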
Sampling Techniques
Simple Random Sample (SRS) – n subjects are selected in a way that every possible sample of size n has the same chance of being chosen.
Steps in simple random sampling
1. Identify and define the population.
2. Determine the sample size.
3. List all members of the population.
4. Assign each member of the population a consecutive number from zero to one less than the population size (e.g. 00 to 35 for a population of 36 – each member needs to have a number with the same number of digits).
5. Select an arbitrary starting number from the random number table.
6. Look for the subject who was assigned that number. If there is a subject with that assigned number, they are in the sample. If no subject has that number, or the subject has already been chosen, skip it.
7. Look at the next number in the random number table and repeat steps 6 and 7 until the appropriate number of participants has been selected.
SRS Example:
Randomly assign 5 members to participate in Ms. Halliday’s experiment using Line 21 of Appendix G:
Kesley Jake Sean
John Anne Derek
Joe Dan Alyssa
Sarah Katie Tara
Tandi Hallie Karen
Bella Corey Jim
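The random number table procedure can be imitated in software. The sketch below is a Python illustration, assuming the roster is read row by row and using Python's pseudo-random generator as a stand-in for the table, so its picks will not match Line 21 of Appendix G. The function name `table_style_srs` and the seed are arbitrary choices.

```python
import random

# Roster from the slide, read row by row.
roster = [
    "Kesley", "Jake", "Sean", "John", "Anne", "Derek",
    "Joe", "Dan", "Alyssa", "Sarah", "Katie", "Tara",
    "Tandi", "Hallie", "Karen", "Bella", "Corey", "Jim",
]

def table_style_srs(population, n, seed=None):
    """Pick n members by reading fixed-width random numbers, table style."""
    if n > len(population):
        raise ValueError("sample size cannot exceed population size")
    rng = random.Random(seed)
    # Every label gets the same number of digits: 00, 01, ..., 17 here.
    width = len(str(len(population) - 1))
    labels = {f"{i:0{width}d}": member for i, member in enumerate(population)}

    chosen = []
    while len(chosen) < n:
        # Read the next "table entry": a random number with `width` digits.
        number = "".join(str(rng.randrange(10)) for _ in range(width))
        member = labels.get(number)
        # Skip numbers nobody was assigned and members already chosen.
        if member is not None and member not in chosen:
            chosen.append(member)
    return chosen

participants = table_style_srs(roster, 5, seed=21)
print(participants)
```

In practice `random.sample(roster, 5)` draws an SRS in one call; the longer version is shown only because it mirrors the labeling and skipping steps of the table method.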
Sampling Techniques
Systematic Sampling – randomly select a starting point using a random number generator (calculator or software), and take every kth subject of the population
Steps in systematic sampling
1. Identify and define the population.
2. Determine the sample size.
3. List all members of the population.
4. Determine k by dividing the number of members in the population by the desired sample size.
5. Choose a random starting point in the population list.
6. Starting at that point in the population, select every kth name on the list until the desired sample size is met.
7. If the end of the list is reached before the desired sample size is drawn, go to the top of the list and continue.
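The systematic sampling steps can be sketched in a few lines of Python. This is an illustration under stated assumptions: the population is a hypothetical list of 100 student ID numbers, k is rounded down when the division is not exact, and the function name `systematic_sample` is made up.

```python
import random

def systematic_sample(population, n, seed=None):
    """Take every kth member from a random start, wrapping around the list."""
    if not 0 < n <= len(population):
        raise ValueError("sample size must be between 1 and the population size")
    rng = random.Random(seed)
    size = len(population)
    k = size // n                    # step 4: population size / sample size
    start = rng.randrange(size)      # step 5: random starting point
    # Steps 6 and 7: every kth member, returning to the top of the list if needed.
    return [population[(start + i * k) % size] for i in range(n)]

students = list(range(1, 101))       # hypothetical ID numbers 1..100
sample = systematic_sample(students, 10, seed=3)
print(sample)
```

Since n times k never exceeds the population size, wrapping around the list cannot select the same member twice.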
Sampling Techniques
Stratified Sampling – we subdivide the population into at least two different subgroups (or strata) whose members share the same characteristics (such as age or gender), then draw a sample from each stratum
Steps in stratified random sampling
1. Identify and define the population.
2. Determine the sample size.
3. Identify the variable and strata for which equal representation is desired.
4. Classify each member of the population as a member of one stratum.
5. Choose the desired number of subjects from each stratum using the simple random sampling technique.
Day 2
Stratified Example:
A math club has 30 students and 10 faculty members.
The students are:
Abel, Ellis, Huber, Miranda, Carson, Ghosh, Reinman, Jimenez, Chen, Moskowitz, Santos, Griswold, David, Jones, Neyman, Deming, Hein, Kim, Fisher, Thompson, Hernandez, Pearl, Utz, Holland, Potter, Verani, Shaw, O’Brien, Klotz, Liu
The faculty members are:
Krauland, Karkaria, Graham, Semega, Walker, Keffalas, Magill, Magnani, Halliday, Kotula
The club can send 4 students and 2 faculty members to the convention. Use a random # generator.
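One way to carry out this selection with software is to treat students and faculty as two strata and draw a simple random sample from each. The Python sketch below is an illustration only. The function name `stratified_sample` and the seed are arbitrary, and a calculator's random number generator will generally give different names.

```python
import random

students = [
    "Abel", "Ellis", "Huber", "Miranda", "Carson", "Ghosh", "Reinman",
    "Jimenez", "Chen", "Moskowitz", "Santos", "Griswold", "David", "Jones",
    "Neyman", "Deming", "Hein", "Kim", "Fisher", "Thompson", "Hernandez",
    "Pearl", "Utz", "Holland", "Potter", "Verani", "Shaw", "O'Brien",
    "Klotz", "Liu",
]
faculty = [
    "Krauland", "Karkaria", "Graham", "Semega", "Walker",
    "Keffalas", "Magill", "Magnani", "Halliday", "Kotula",
]

def stratified_sample(strata, sizes, seed=None):
    """Draw an SRS of sizes[name] members from each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, sizes[name])
            for name, members in strata.items()}

delegation = stratified_sample(
    {"students": students, "faculty": faculty},
    {"students": 4, "faculty": 2},
    seed=12,
)
print(delegation)
```

Sampling each stratum separately guarantees exactly 4 students and 2 faculty members, which a single SRS of 6 from all 40 club members would not.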
Sampling Techniques
Cluster Sampling – first divide the population area into sections (or clusters), then randomly select some of those clusters, and then choose ALL members from those selected clusters.
Steps in cluster sampling
1. Identify and define the population.
2. Determine the sample size.
3. Identify and define a cluster (neighborhood, classroom, city block).
4. List all clusters.
5. Estimate the average number of population members per cluster.
6. Determine the number of clusters needed.
7. Choose the desired number of clusters using the simple random sampling technique.
8. All population members in the included clusters are part of the sample.
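A short Python sketch of steps 7 and 8, assuming the clusters have already been defined and listed. The classroom names, member IDs, and the function name `cluster_sample` are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters and keep every member of each one."""
    rng = random.Random(seed)
    # Step 7: simple random sample of cluster names.
    chosen = rng.sample(sorted(clusters), n_clusters)
    # Step 8: ALL members of the chosen clusters are in the sample.
    sample = [member for name in chosen for member in clusters[name]]
    return chosen, sample

# Hypothetical classrooms acting as clusters.
classrooms = {
    "Room 101": ["A1", "A2", "A3"],
    "Room 102": ["B1", "B2", "B3", "B4"],
    "Room 103": ["C1", "C2"],
    "Room 104": ["D1", "D2", "D3"],
}
chosen_rooms, sample = cluster_sample(classrooms, 2, seed=4)
print(chosen_rooms, sample)
```

Note the contrast with stratified sampling: here only some groups are used, but everyone in those groups is included.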
Sampling Techniques
Convenience Sampling – a researcher chooses a sample that is convenient or easy for them to access. Convenience sampling is a BAD sampling technique, as
the sample is rarely representative of the population of interest.
Sadly, it is commonly used among inexperienced researchers.
Example – Ms. Halliday is conducting research at Pitt. She needs a sample of students and chooses all of her CHS Statistics students to participate in her study.
Sampling Techniques Examples:
You select a class at random and question each student in the class.
You divide the student population with respect to majors and randomly select and question some students in each major.
You question every 20th student you see in the hall.
You assign each student a number and generate random numbers. You then question each student whose number is randomly selected.
Multistage Sampling
Sometimes we use a variety of sampling methods together.
Sampling schemes that combine several methods are called multistage samples.
EXAMPLE: Most surveys conducted by professional polling organizations use some combination of stratified and cluster sampling as well as simple random sampling.
Sampling Error
Sampling Error – the difference between a sample result and the true population result; such an error results from chance sample fluctuations
Non-Sampling Error – occurs when the sample data are incorrectly collected, recorded, or analyzed (such as selecting a biased sample, using a defective measurement instrument, or copying the data incorrectly)
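Sampling error can be seen directly by simulation: draw many simple random samples from a population whose true proportion is known and compare each sample result with it. The Python sketch below uses a made-up population of 3000 students, 1200 of whom would answer "yes". The function name `sampling_errors` and all numbers are illustrative assumptions.

```python
import random

def sampling_errors(population, n, repetitions, seed=None):
    """Return (sample proportion - true proportion) for repeated SRSs of size n."""
    rng = random.Random(seed)
    true_proportion = sum(population) / len(population)
    errors = []
    for _ in range(repetitions):
        sample = rng.sample(population, n)
        errors.append(sum(sample) / n - true_proportion)
    return errors

# Hypothetical population: 1 = "yes", 0 = "no"; the true proportion is 0.40.
population = [1] * 1200 + [0] * 1800

for n in (25, 400):
    errors = sampling_errors(population, n, repetitions=200, seed=0)
    mean_abs_error = sum(abs(e) for e in errors) / len(errors)
    print(f"n = {n}: average size of sampling error = {mean_abs_error:.3f}")
```

Every sample here is collected correctly, so the differences are pure chance fluctuation, and they tend to shrink as n grows. Non-sampling errors, by contrast, do not shrink with a larger sample.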
The Valid Survey
It isn’t sufficient to just draw a sample and start asking questions. A valid survey yields the information we are seeking about the population we are interested in.
Before you set out to survey, ask yourself:
What do I want to know?
Am I asking the right respondents?
Am I asking the right questions?
What would I do with the answers if I had them; would they address the things I want to know?
The Valid Survey (cont.)
These questions may sound obvious, but there are a number of pitfalls to avoid.
Know what you want to know.
Understand what you hope to learn and from whom you hope to learn it.
Use the right frame.
Be sure you have a suitable sampling frame.
Tune your instrument.
The survey instrument itself can be the source of errors; for example, a survey that is too long yields fewer responses.
The Valid Survey (cont.)
Ask specific rather than general questions.
Ask for quantitative results when possible.
Be careful in phrasing questions.
A respondent may not understand the question or may understand the question differently than the way the researcher intended it.
Does family mean immediate or extended?
Even subtle differences in phrasing can make a difference. See example on page 281 about 9/11
The Valid Survey (cont.)
Be careful in phrasing answers.
It’s often a better idea to offer choices rather than inviting a free response.
Think about how difficult it would be to organize the various responses!
The Valid Survey (cont.)
The best way to protect a survey from unanticipated measurement errors is to perform a pilot survey.
A pilot is a trial run of a survey you eventually plan to give to a larger group.
Bad Sampling!
Sample Badly with Volunteers:
In a voluntary response sample, a large group of individuals is invited to respond, and all who do respond are counted.
Voluntary response samples are almost always biased, and so conclusions drawn from them are almost always wrong.
Voluntary response samples are often biased toward those with strong opinions or those who are strongly motivated.
Since the sample is not representative, the resulting voluntary response bias invalidates the survey.
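A small simulation shows how voluntary response can distort a result even when thousands of people answer. The Python sketch below is purely illustrative: the population size, the true share of 30%, and the response rates (people with a favorable opinion are assumed to be ten times as likely to respond) are invented numbers, and `voluntary_response_poll` is a made-up function name.

```python
import random

def voluntary_response_poll(opinions, response_rate, seed=None):
    """Keep only the people who choose to respond, given a rate per opinion."""
    rng = random.Random(seed)
    return [opinion for opinion in opinions
            if rng.random() < response_rate[opinion]]

# Hypothetical population of 10,000: 30% enjoy the activity being polled.
opinions = ["enjoys"] * 3000 + ["does not enjoy"] * 7000
# Assumed response rates: those who enjoy it are far more motivated to answer.
response_rate = {"enjoys": 0.50, "does not enjoy": 0.05}

responses = voluntary_response_poll(opinions, response_rate, seed=5)
true_share = opinions.count("enjoys") / len(opinions)
poll_share = responses.count("enjoys") / len(responses)
print(f"true share: {true_share:.2f}, poll share: {poll_share:.2f}, "
      f"responses: {len(responses)}")
```

Under these assumptions the poll reports a clear majority even though the true share is a minority, and collecting more voluntary responses would not fix it.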
Bad Sampling! (cont.)
Sample Badly, but Conveniently:
In convenience sampling, we simply include the individuals who are convenient.
Unfortunately, this group may not be representative of the population.
Sample from a Bad Sampling Frame:
An SRS from an incomplete sampling frame introduces bias because the individuals included may differ from the ones not in the frame.
In telephone surveys, people who have only cell phones or internet phones are often missing from the sampling frame.
Undercoverage:
Many of these bad survey designs suffer from undercoverage, in which some portion of the population is not sampled at all or has a smaller representation in the sample than it has in the population.
Undercoverage can arise for a number of reasons, but it’s always a potential source of bias.
Bad Sampling! (cont.)
Watch out for nonrespondents.
A common and serious potential source of bias for most surveys is nonresponse bias.
No survey succeeds in getting responses from everyone.
The problem is that those who don’t respond may differ from those who do.
And they may differ on just the variables we care about.
Work hard to avoid influencing responses.
Response bias refers to anything in the survey design that influences the responses.
For example, the wording of a question can influence the responses:
Book Assignment
Day 1 Chapter 13: pp. 312-316
# 1, 2, 4
# 5-20 – answer ONLY question “a” for each with either of these responses: experiment, observational study
Day 2 Chapter 12: pp. 288-291
# 1, 3-6, 8, 9, 15-17, 23, 25
Please check your answers against the solutions manual posted online.