Measuring
4251
0011 0010 1010 1101 0001 0100 1011
Before we begin . . .
• On a piece of paper give your best estimates:
– How many airplanes will be in the sky over the United States today?
– How many school buses are operational in the U.S.?
Measurement essentials
• Measurement is the value of a variable for a single element
– Systolic blood pressure is the variable
– 120 mmHg (millimeters of mercury) is the measurement
– Millimeters of mercury are the units
• Validity
– Predictive validity
• Accuracy
– Bias
• Reliability (or Precision)
– Random error
Validity
• A measurement is valid if it is an appropriate representation of the property of interest
• Suppose you learned that U of M graduated more students who eventually became millionaires than either Carleton or St. Olaf. Would that be a fair comparison? How should the numbers be presented in order to make it a fair comparison?
• Often the rate (or percent) is more valid than a count of occurrences
– August 4, 1998: Dow Jones drops 300 points, “the third biggest drop ever” (Associated Press)
– In fact the decline was 3.7%
– There have been 215 bigger one-day percentage drops
– Dow Jones is now high (about 9,000) because of many factors
– Investors are more concerned with percentages, not points
Sport                Injuries   People (1,000s)   Rate (per 1K)
Basketball            646,678            26,200            24.7
Bicycle riding        600,649            54,000            11.1
Baseball/softball     459,542            36,100            12.7
Football              453,684            13,300            34.1
Soccer                150,449            10,000            15.0
Swimming              130,362            66,200             2.0
Volleyball            129,839            22,600             5.7
Roller skating        113,150            26,500             4.3
Weightlifting          86,398            39,200             2.2
Fishing                84,115            47,000             1.8
Horseback riding       71,490            10,100             7.1
Skateboarding          56,435             8,000             7.1
Ice hockey             54,601             1,800            30.3
Golf                   38,626            24,700             1.6
Tennis                 29,936            16,700             1.8
Ice skating            29,047             7,900             3.7
Water skiing           26,633             9,000             3.0
Bowling                25,417            40,400             0.6
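The rate column in the sports table is the injury count divided by the number of participants (in thousands). The sketch below recomputes it for a few rows copied from the table and ranks the sports both ways; the names `SPORTS` and `rate_per_thousand` are illustrative choices, not part of the original slides.

```python
# A few rows from the sports table: (sport, injuries, participants in thousands).
SPORTS = [
    ("Basketball", 646_678, 26_200),
    ("Bicycle riding", 600_649, 54_000),
    ("Football", 453_684, 13_300),
    ("Swimming", 130_362, 66_200),
    ("Ice hockey", 54_601, 1_800),
    ("Bowling", 25_417, 40_400),
]


def rate_per_thousand(injuries, people_thousands):
    """Injuries per 1,000 participants."""
    return injuries / people_thousands


# Ranking by raw count and ranking by rate give different orders.
by_count = sorted(SPORTS, key=lambda row: row[1], reverse=True)
by_rate = sorted(SPORTS, key=lambda row: rate_per_thousand(row[1], row[2]), reverse=True)

print("Ranked by injury count:")
for sport, injuries, _ in by_count:
    print(f"  {sport:<15} {injuries:>8,}")

print("Ranked by injury rate:")
for sport, injuries, people in by_rate:
    print(f"  {sport:<15} {rate_per_thousand(injuries, people):5.1f} per 1,000")
```

Ice hockey sits near the bottom of the count ranking but near the top of the rate ranking, which is the sense in which a rate can be a more valid measure of risk than a count.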
Predictive validity
• In the social sciences it is often difficult to decide if a measurement is valid
– Are SATs a valid measure of college achievement?
– Are IQs a valid measure of intelligence?
• A measurement has predictive validity if it can be used to adequately predict some outcome related to the property of interest
• How well do SATs predict college grades?
– Actually, not so well! Studies show a weak correlation.
– “Restricted range” problem: Most Carleton students have high SATs, so the correlation with college grades is lower than it would be if there were more low SAT scores!
• How would you measure the “well-being” of society?
– Traverse City, Michigan’s “quality of life” index counts bird and frog species as a (partial) measure of the health of the environment
– Tucson, Arizona counts pedestrians in its neighborhoods because people feel safer when other people are on the streets
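The “restricted range” problem from the SAT bullet can be illustrated with a small simulation. Everything here is an assumption made for illustration: the scores are standardized random numbers, grades are modeled as half the score plus independent noise, and “admitted” means scoring more than one standard deviation above the mean. None of these values come from real SAT or Carleton data.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: standardized test scores and noisy grades.
scores = [rng.gauss(0, 1) for _ in range(20_000)]
grades = [0.5 * s + rng.gauss(0, 1) for s in scores]

r_all = pearson(scores, grades)

# Restrict the range: keep only students with high scores.
admitted = [(s, g) for s, g in zip(scores, grades) if s > 1.0]
r_admitted = pearson([s for s, _ in admitted], [g for _, g in admitted])

print(f"correlation, full range of scores: {r_all:.2f}")
print(f"correlation, high scorers only:    {r_admitted:.2f}")
```

The relationship between score and grade is the same in both groups; only the spread of scores differs, and that alone is enough to lower the correlation among the high scorers.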
Accuracy and bias
• Measuring weight on a bathroom scale
– Valid, but is it accurate?
• My scale is always off by 5 pounds
– Monday’s weight = true weight + 5 lbs + 0.25 lbs
– Tuesday’s weight = true weight + 5 lbs – 0.5 lbs
– Wednesday’s weight = true weight + 5 lbs + 0.75 lbs
• Two kinds of error: bias and random error
• Measured value = true value + bias + random error
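The bathroom-scale example can be written out as a short calculation. The bias and the three daily random errors are taken from the slide; the true weight of 150 lbs is a made-up value, since the slide does not give one.

```python
TRUE_WEIGHT = 150.0  # hypothetical true weight in lbs (not given on the slide)
BIAS = 5.0           # the scale's constant offset, from the slide
RANDOM_ERRORS = {"Monday": 0.25, "Tuesday": -0.5, "Wednesday": 0.75}

# Measured value = true value + bias + random error
readings = {day: TRUE_WEIGHT + BIAS + err for day, err in RANDOM_ERRORS.items()}

# The error in each reading is the part that is not the true value.
errors = [reading - TRUE_WEIGHT for reading in readings.values()]
mean_error = sum(errors) / len(errors)

for day, reading in readings.items():
    print(f"{day:<10} {reading:.2f} lbs")
print(f"average error over the three days: {mean_error:.2f} lbs")
```

The average error stays close to the 5 lb bias because the random errors partly cancel, while the bias is present in every reading and does not cancel.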
Accuracy of Measurement
• Measure the length of your (paperback) textbook to the nearest tenth of a unit.
• Do it on your own and don’t look at your neighbor’s answer
• Write down the result and hand it in
• Is the measure valid, biased, reliable?
• What do the data show? measurements.sav
Reliability/Precision
• A BIG idea: To improve reliability take averages of several measurements
• The average of several repeated measurements is less variable than a single measurement.
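A quick simulation can make the averaging idea concrete. This is a sketch under stated assumptions: the true value (100), the size of the random error (standard deviation 2), and the choice of 4 repeats per average are all hypothetical, and the random errors are assumed to be independent and normally distributed.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable

TRUE_VALUE = 100.0  # hypothetical true value
SIGMA = 2.0         # hypothetical standard deviation of the random error
N_REPEATS = 4       # measurements combined into each average
TRIALS = 10_000


def measure():
    """One unbiased measurement: true value plus random error."""
    return TRUE_VALUE + rng.gauss(0, SIGMA)


singles = [measure() for _ in range(TRIALS)]
averages = [
    statistics.mean([measure() for _ in range(N_REPEATS)])
    for _ in range(TRIALS)
]

sd_single = statistics.stdev(singles)
sd_average = statistics.stdev(averages)

print(f"spread of single measurements: {sd_single:.2f}")
print(f"spread of averages of {N_REPEATS}:       {sd_average:.2f}")
```

For independent random errors, the standard deviation of an average of n measurements is the single-measurement standard deviation divided by the square root of n, so averaging 4 readings should roughly halve the spread. Averaging does nothing to a bias, which shifts every reading by the same amount.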
Apgar scores are a measurement of an infant’s overall health taken a few minutes after birth. The score ranges from 0 (dead) to 10 (“perfect health”) and is based on tests of the baby’s heart and breathing rate, muscle tone, etc. (APGAR stands for Activity, Pulse, Grimace, Appearance, Respiration.)
A critic gives three reasons why the Apgar score isn’t a perfect measurement: Reason I – There are important facets of health that aren’t measured by the score. Reason II – A doctor’s rating may be affected by being present at the birth; often giving unwarranted low values to babies whose birth was difficult. Reason III – Two different doctors may give different Apgar scores, even when measuring the same baby at the same time.
Which of these criticisms argue about the validity of the Apgar score? Which of these criticisms argue about the reliability? Which of these criticisms argue about the bias in the measurement?
Suppose two doctors both judge an infant’s health using the Apgar system and the average of their two values is taken as the “official” Apgar score.
Will this improve the validity, reliability, and/or bias of the measurement?
• The diameter of the moon is measured four times independently by a process that is free of bias. The measurements came out 2157, 2166, 2162, and 2155 miles, which average out to 2160 miles. One more measurement is about to be taken using the same process. When compared with the estimate of 2160 miles, you would expect this next measurement to be [ more, just as, less ] accurate as a measure of the true diameter of the moon.
• The age of a pine tree was measured five times using a new electronic probe inserted in the tree’s trunk. The measured values were 43, 40, 45, 44, and 41 years old. Later this tree was cut down and by counting the growth rings, it was determined that the tree was really 34 years old. Does this new device for measuring the age of trees have a greater problem with bias or with precision?
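One way to put numbers on the pine tree question is to compare the average error of the probe with the spread of its readings. The sketch below uses the five readings and the ring count from the slide, and treats the ring count as the true age; the variable names are illustrative.

```python
import statistics

readings = [43, 40, 45, 44, 41]  # probe measurements in years, from the slide
ring_count_age = 34              # age found by counting growth rings

mean_reading = statistics.mean(readings)
estimated_bias = mean_reading - ring_count_age  # systematic over- or under-estimate
spread = statistics.stdev(readings)             # sample SD: size of the random error

print(f"mean of probe readings: {mean_reading:.1f} years")
print(f"estimated bias:         {estimated_bias:+.1f} years")
print(f"spread of readings:     {spread:.1f} years")
```

Comparing the two printed quantities shows which kind of error dominates for this device.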
What’s a valid measure of the effectiveness of cancer treatment?
• Total deaths from cancer
– 1970: 331,000
– 1990: 505,000
– 1998: 539,000
• Percent of all Americans who die from cancer
– 1970: 17.2%
– 1990: 23.5%
– 1998: 23.0%
• Percent of cancer patients who survive for 5 years from the time the disease was discovered (5-year survival rate)
– 1974-76: 50.3%
– 1989-95: 60.9%