Statistic for the day: Portion of all international arms sales since 1980 that went to the
Middle East: 2 out of 5
Assignment:
Read Chapter 12 pp. 195-203
Do the exercises at the end of this lecture.
(Answers will be given in Fri. lecture.)
Source: US Arms Control and Disarmament Agency
These slides were created by Tom Hettmansperger and in some cases modified by David Hunter
Arby’s

      sandwich                      serving (oz)   calories
 1    reg roast beef                 5.5            383
 2    beef and cheddar               6.9            508
 3    junior roast beef              3.1            233
 4    super roast beef               9.0            552
 5    giant roast beef               8.5            544
 6    chicken breast fillet          7.2            445
 7    grilled chicken deluxe         8.1            430
 8    French dip                     6.9            467
 9    Italian sub                   10.1            660
10    roast beef sub                10.8            672
11    turkey sub                     9.7            533
12    light roast beef deluxe        6.4            294
13    light roast turkey deluxe      6.8            260
14    light roast chicken deluxe     6.8            276
Research Question: At Arby’s are calories related to the size of the sandwich?
Observational study
• Response = calories
• Explanatory variable = small or large sandwich
Small sandwich means less than 7 oz (n = 7)
Large sandwich means more than 7 oz (n = 7)
THE RESPONSE VARIABLE IS A MEASUREMENT VARIABLE.
THE EXPLANATORY VARIABLE IS A CATEGORICAL VARIABLE.
[Figure: Arby's Sandwiches. Calories (vertical axis, 200 to 700) plotted by sandwich size: large = greater than 7 oz, small = less than 7 oz.]
There seems to be a difference. (Is it statistically significant?)
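As a rough check on the small versus large comparison, the sketch below splits the 14 sandwiches at 7 oz and compares the average calories in each group. It uses the serving sizes and calories listed earlier. Using the group mean as the summary is an assumption made here for illustration, and the variable names are hypothetical.

```python
# Serving size (oz) and calories for the 14 Arby's sandwiches in the data.
sandwiches = [
    (5.5, 383), (6.9, 508), (3.1, 233), (9.0, 552), (8.5, 544),
    (7.2, 445), (8.1, 430), (6.9, 467), (10.1, 660), (10.8, 672),
    (9.7, 533), (6.4, 294), (6.8, 260), (6.8, 276),
]

# Split at 7 oz, which turns serving size into a categorical variable.
small = [cal for oz, cal in sandwiches if oz < 7]
large = [cal for oz, cal in sandwiches if oz > 7]

# Illustrative summary (an assumption, not from the slides): mean calories per group.
mean_small = sum(small) / len(small)
mean_large = sum(large) / len(large)

print("n small:", len(small), "n large:", len(large))
print("mean calories, small:", round(mean_small, 1))
print("mean calories, large:", round(mean_large, 1))
```

Whether the gap between the two averages is statistically significant is a separate question that this sketch does not answer.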
We can refine the explanatory variable and get more information about the relationship between calories and serving (sandwich) size.
Rather than split it into small and large, keep the numerical values of the explanatory variable.
Observational study
Response = calories
Explanatory variable = size of the sandwich (in oz.)
BOTH RESPONSE AND EXPLANATORY VARIABLES ARE MEASUREMENT VARIABLES.
(NEITHER IS A CATEGORICAL VARIABLE)
[Figure: scatterplot of calories (vertical axis, 200 to 700) against serving (horizontal axis, 3 to 11 oz) with the fitted line.]
calories = -10.2 + 60.5 × (serving)
S = 78.5202   R-Sq = 72.2%   R-Sq(adj) = 69.8%
Arby's
Correlation = .85 (the square root of R-Sq = .722)
Best fitting line through the data: called the REGRESSION LINE
Strength of relationship: measured by CORRELATION
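The regression line and the correlation can be computed directly from the 14 data points. The sketch below uses the usual least-squares formulas in plain Python. The variable names are illustrative, and this is not necessarily how the software output on the slide was produced.

```python
from math import sqrt

# Serving size (oz) and calories for the 14 sandwiches, in the order listed.
serving = [5.5, 6.9, 3.1, 9.0, 8.5, 7.2, 8.1, 6.9, 10.1, 10.8, 9.7, 6.4, 6.8, 6.8]
calories = [383, 508, 233, 552, 544, 445, 430, 467, 660, 672, 533, 294, 260, 276]

n = len(serving)
mean_x = sum(serving) / n
mean_y = sum(calories) / n

# Sums of squares and cross-products around the means.
sxx = sum((x - mean_x) ** 2 for x in serving)
syy = sum((y - mean_y) ** 2 for y in calories)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(serving, calories))

slope = sxy / sxx                      # change in calories per extra ounce
intercept = mean_y - slope * mean_x    # fitted calories at serving = 0
r = sxy / sqrt(sxx * syy)              # correlation coefficient

print("slope:", round(slope, 1))
print("intercept:", round(intercept, 1))
print("r squared (%):", round(100 * r ** 2, 1))
```

The slope and intercept should match the fitted line reported above, and the squared correlation corresponds to the R-Sq value.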
Can we have two categorical variables?
Recall we split the explanatory variable at 7 oz. So we defined small as less than 7 oz and large to be greater than 7 oz.
Next we split the response variable at 456 calories. Then we define low calories as less than 456 and high calories to be greater than 456.
Data:
                      Response: Calories
Explanatory: Size     Low    High   All
Small                 5      2      7
Large                 2      5      7
All                   7      7      14
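The counts in this table can be reproduced by applying both splits (7 oz and 456 calories) to the sandwich data. This is a minimal sketch; the labels and the `counts` name are illustrative.

```python
from collections import Counter

# Serving size (oz) and calories for the 14 sandwiches.
sandwiches = [
    (5.5, 383), (6.9, 508), (3.1, 233), (9.0, 552), (8.5, 544),
    (7.2, 445), (8.1, 430), (6.9, 467), (10.1, 660), (10.8, 672),
    (9.7, 533), (6.4, 294), (6.8, 260), (6.8, 276),
]

counts = Counter()
for oz, cal in sandwiches:
    size = "small" if oz < 7 else "large"    # no sandwich is exactly 7 oz
    level = "low" if cal < 456 else "high"   # no sandwich is exactly 456 calories
    counts[(size, level)] += 1

for size in ("small", "large"):
    print(size, counts[(size, "low")], counts[(size, "high")])
```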
Response: calories
Explanatory: size

Proportions
         Low    High
Small    .71    .29
Large    .29    .71

Percentages
         Low    High
Small    71%    29%
Large    29%    71%

Question: Is the difference between 71% (5/7) and 29% (2/7), about 43%, significant?
Two categorical variables:
Explanatory variable: Gender
Response variable: Body Pierced or Not
Survey question: Have you pierced any other part of your body? (Except for ears)
Research Question: Is there a significant difference between women and men in terms of body pierces?
Data:
                         Pierced? (Response)
Gender? (Explanatory)    No     Yes    All
Women                    84     51     135
Men                      96     3      99
All                      180    54     234
From Stat 100.2, spring 2004 (missing responses omitted)
Percentages
Response: body pierced?
          no       yes      All
female    62.22    37.78    100.00
male      96.97    3.03     100.00
All       76.92    23.08    100.00
Research question: Is there a significant difference between women and men? (i.e., between 62.22% and 96.97%)
62.22% = 84/135
96.97% = 96/99
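The row percentages are each cell count divided by its row total. A small sketch that recomputes them from the counts follows; the dictionary layout is just one convenient choice.

```python
# Observed counts from the Stat 100.2 survey: gender by "body pierced?".
counts = {
    "female": {"no": 84, "yes": 51},
    "male": {"no": 96, "yes": 3},
}

row_pct = {}
for gender, cells in counts.items():
    total = sum(cells.values())  # row total, e.g. 135 for female
    row_pct[gender] = {answer: 100 * c / total for answer, c in cells.items()}

for gender, cells in row_pct.items():
    print(gender, {answer: round(p, 2) for answer, p in cells.items()})
```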
Rows: gender   Columns: body pierces
(top lines are counts, bottom lines are row percentages)
          no       yes      All
female    84       51       135
          62.22    37.78    100.00
male      96       3        99
          96.97    3.03     100.00
All       180      54       234
          76.92    23.08    100.00
Are the differences between females and males significant? (i.e., between 62.22% and 96.97%)
Counts and percentages
The Debate:
The research advocate claims that there is a significant difference.
The skeptic claims there is no real difference. The data differences simply happen by chance.
The strategy for determining statistical significance:
First, figure out what you expect to see if there is no difference between females and males.
Second, figure out how far the data is from what is expected.
Third, decide if the distance in the second step is large.
Fourth, if large then claim there is a statistically significant difference.
Research Advocate: OK. Suppose there is really no difference in the population as you, the Skeptic, claim. We will compare what you, the Skeptic, expect to see and what you actually do see in the data.
Skeptic: How do we figure out what we expect to see?
Actual data:
Rows: gender   Columns: body pierces
          no     yes    All
female    84     51     135
male      96     3      99
All       180    54     234
Rows: gender   Columns: body pierces
(top lines of numbers are observed, bottom lines are expected)
          no        yes      All
female    84        51       135
          103.85    31.15    135.00
male      96        3        99
          76.15     22.85    99.00
All       180       54       234
          180.00    54.00    234.00
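One way to answer the Skeptic's question: if there is no difference, both genders share the overall split (180 "no" and 54 "yes" out of 234), so each expected count is (row total × column total) / grand total. The sketch below applies that standard rule to the observed table; the names are illustrative.

```python
# Observed counts: gender by body pierces.
observed = {
    "female": {"no": 84, "yes": 51},
    "male": {"no": 96, "yes": 3},
}

row_totals = {g: sum(cells.values()) for g, cells in observed.items()}
col_totals = {a: sum(observed[g][a] for g in observed) for a in ("no", "yes")}
grand_total = sum(row_totals.values())

# Expected count under "no difference": row total * column total / grand total.
expected = {
    g: {a: row_totals[g] * col_totals[a] / grand_total for a in col_totals}
    for g in observed
}

for g, cells in expected.items():
    print(g, {a: round(v, 2) for a, v in cells.items()})
```

For example, the expected number of females answering "no" is 135 × 180 / 234.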
How to measure the distance between what the research advocate observes in the table and what the skeptic expects:
Add up the following for each cell:

$$\frac{(\text{obs} - \text{exp})^2}{\text{exp}}$$

$$\chi^2 = \frac{(84 - 103.85)^2}{103.85} + \frac{(51 - 31.15)^2}{31.15} + \frac{(96 - 76.15)^2}{76.15} + \frac{(3 - 22.85)^2}{22.85} = 38.85$$

Now how do we decide if 38.85 is large or not? If it is large enough the skeptic concedes to the research advocate and agrees there is a statistically significant difference. How large is enough?
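The chi-squared sum can be checked with a few lines of code. This sketch recomputes the expected counts from the row and column totals and then adds up (obs - exp)^2 / exp over the four cells; the list-of-lists layout is just one convenient choice.

```python
# Observed counts: rows are female, male; columns are no, yes.
observed = [[84, 51], [96, 3]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_squared = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        # Expected count if there is no difference between females and males.
        exp = row_totals[i] * col_totals[j] / grand_total
        chi_squared += (obs - exp) ** 2 / exp

print("chi-squared:", round(chi_squared, 2))
```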
Chi-squared distribution with 1 degree of freedom:
[Figure: chi-squared density with 1 degree of freedom, horizontal axis from 0 to 6. Cutoff = 3.84, with 95% of the area to the left of the cutoff and 5% to the right.]
If the chi-squared statistic is larger than 3.84, it is declared large and the research advocate wins.
But our chi-squared is 38.85, so the research advocate easily wins! There is a statistically significant difference between men and women.
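The 3.84 cutoff can be recovered without a chi-squared table. A chi-squared variable with 1 degree of freedom is the square of a standard normal variable, so its 95% point is the square of the normal 97.5% point (about 1.96). The sketch below uses Python's standard-library `statistics.NormalDist`. The value 38.85 is taken from the slides, and the variable names are illustrative.

```python
from statistics import NormalDist

# 97.5th percentile of the standard normal, so that P(|Z| <= z) = 0.95.
z = NormalDist().inv_cdf(0.975)

# Chi-squared with 1 degree of freedom is Z squared, so its 95% cutoff is z squared.
cutoff = z ** 2

chi_squared = 38.85  # statistic for the body piercing table, from the slides

print("cutoff:", round(cutoff, 2))
print("statistically significant:", chi_squared > cutoff)
```

This shortcut applies only to 1 degree of freedom, which is the case for a table with two rows and two columns.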
Exercise: Follow the 4 steps and answer the research question: Is there a relationship between gender and ownership of cell phones in Stat 100.2?

Data
Rows: gender   Columns: cell phone
          no     yes    All
female    12     124    136
male      14     87     101
All       26     211    237
Exercise:
Follow the 4 steps and answer the research question: Is there a statistically significant difference in calories between small and large sandwiches? Data on slide #12.