![Page 1: Stat 217 – Day 25 Regression. Last Time - ANOVA When? Comparing 2 or means (one categorical and one quantitative variable) Research question Null](https://reader035.vdocuments.site/reader035/viewer/2022062423/56649d6a5503460f94a47da2/html5/thumbnails/1.jpg)
Stat 217 – Day 25
Regression
Last Time - ANOVA
When?
Comparing 2 or more means (one categorical and one quantitative variable)
Research question
Null hypothesis: μ1 = μ2 = … = μI (no association between the two variables)
Alternative hypothesis: at least one μ differs (there is an association between the two variables)
Example (with 3 groups…)
Not significant Significant
How?
Compare differences in means vs. the natural variability in the data (s)
Compare test statistic to F distribution, p-value
Output: test statistic, p-value, ratio of variability between groups to variability within groups
Demo
Strong evidence (p-value = .03 < .05) that the type of disability affected the ratings, on average, of these 70 students
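The "variability between groups vs. variability within groups" ratio can be sketched in a few lines of Python. This is only an illustration: the three groups of ratings are made up (they are not the disability study data), and `f_statistic` is a hypothetical helper name.

```python
from statistics import mean

def f_statistic(groups):
    """One-way ANOVA F statistic: between-group variability / within-group variability."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)        # number of groups
    n = len(all_values)    # total number of observations
    # Between groups: how far each group mean sits from the overall mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within groups: how far observations sit from their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within

# Hypothetical ratings for three groups (assumed data, for illustration only).
groups = [[4, 5, 6], [5, 6, 7], [8, 9, 10]]
f_stat = f_statistic(groups)
print(round(f_stat, 2))
```

A large ratio means the group means differ by more than the natural variability would explain; the p-value then comes from comparing this ratio to the F distribution.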
Technical Conditions
Technical conditions:
Randomness: random sampling or random assignment
Sample sizes: Normal populations
Equal standard deviations: Check ratio of sample standard deviations
Kinda need same shape and spread for a comparison of just means to be reasonable
Technical Conditions
1) Randomness: random assignment
2) Each population follows a normal distribution
3) Each population has the same standard deviation: 1.794/1.482 < 2
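The equal-standard-deviation check is simple arithmetic. A minimal sketch, assuming 1.794 and 1.482 are the largest and smallest sample standard deviations among the groups (the variable names are illustrative):

```python
# Sample standard deviations from the slide; assumed to be the largest and smallest.
largest_sd = 1.794
smallest_sd = 1.482

ratio = largest_sd / smallest_sd
condition_ok = ratio < 2   # rule of thumb used on the slide

print(round(ratio, 2))   # 1.21
print(condition_ok)      # True
```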
Summary: Comparing several groups
Categorical response
H0: π1 = π2 = … = πI (no association between variables)
Ha: at least one differs (is an association between variables)
Is test statistic large? Chi-square test
Expands 2 sample z-test
Quantitative response
H0: μ1 = μ2 = … = μI (no association between variables)
Ha: at least one differs (is an association between variables)
Is test statistic large? ANOVA
Expands 2 sample t-test
Exam 2 comments
Pet owners and CPR
(a) Make sure to interpret the calculated interval
“55% of pet owners” – sample or population?
(b) Technical conditions: use the ones for categorical data
(c) See whether .5 is inside CI
(d) Interpretation of p-value: chance of data at least this extreme if null hypothesis is true
(e) Why is sample size information important? Sampling variability
Exam 2 comments
Anchoring
(a) Make sure it is clear which is which
(b) “TC met”, TOS applet with 2 means
(c) Chicago average estimate is 51K to 1.6 million higher than Green Bay average (direction!)
(d) What does it mean to say it’s significant? What is the actual conclusion to the research question?
Exam 2 comments
Lab 6
Exam 2 comments
Multiple choice
1. B
2. C – either is possible
3. B – small p-value eliminates “random chance” as a plausible explanation
4. B – it’s only unusual if she’s guessing (7s and 11s are only unusual for fair dice)
Extra Credit
More likely to get a value far from mean with smaller sample size (e.g., n = 1)
Next Topic: Two quantitative variables
Graphical summary
Numerical summary
Model to allow predictions
Inference beyond sample data
Activity 26-1 (p. 532)
Have a sample of 20 homes for sale in Arroyo Grande in 2007
Variable 1 = house price
Variable 2 = house size
Is there a relationship between these 2 variables?
Does knowing the house size help us predict its price?
1) Graphical summary: scatterplot
Price vs. size
1. Direction: Positive or negative?
2. Strength: How closely do the points follow the pattern?
3. Form: Linear?
Describing Scatterplots
Activity 26-3 (p. 536)
Positive None Negative
Strong Weak Strong
Direction
Strength
Form: Linear or not
2) Numerical summary: Correlation coefficient (Act 27-1)
.994 .889 .510 -.081 -.450 -.721 -.907
Temperatures vs. Month
Direction: positive then negative
Form: nonlinear
Strength: very strong
r = .257
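The temperature example shows that r measures only the linear part of a relationship. A minimal sketch with made-up data (not the temperature data; `correlation` is a hand-written illustrative helper) compares a perfectly linear pattern with a perfectly regular up-then-down pattern:

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation coefficient r, computed from its definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y_line = [3, 5, 7, 9, 11]   # rises steadily: strong and linear
y_curve = [1, 4, 5, 4, 1]   # rises then falls: strong but nonlinear

r_line = correlation(x, y_line)
r_curve = correlation(x, y_curve)
print(round(r_line, 3))    # 1.0
print(round(r_curve, 3))   # 0.0
```

The curved data follow their pattern exactly, yet r is 0, which is why the scatterplot should be checked before reading strength off r.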
Example 1: Price vs. Size
r = .780
What do you learn from these numerical and graphical summaries?
Turn in, with partner
Activity 26-6 parts b, c, and e
For Thursday
Pre-lab for Lab 9
For Monday
Activity 26-7
HW 7
2) Guess the correlation
Applet
3) Model
IF it is linear, what line best summarizes the relationship?
Demo
Moral: The “least squares regression line” minimizes the sum of the squared residuals
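To see what "minimizes the sum of the squared residuals" means, here is a small sketch on made-up data (not the house data). The helper names `least_squares` and `sum_squared_residuals` are illustrative; the slope and intercept come from the standard least squares formulas.

```python
def least_squares(xs, ys):
    """Return (a, b) for the least squares line y-hat = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # intercept
    return a, b

def sum_squared_residuals(a, b, xs, ys):
    """Sum of squared vertical distances between the points and the line."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4]
ys = [2, 4, 5, 8]

a, b = least_squares(xs, ys)
best = sum_squared_residuals(a, b, xs, ys)
steeper = sum_squared_residuals(a, b + 0.1, xs, ys)   # nudge the slope
higher = sum_squared_residuals(a + 0.5, b, xs, ys)    # nudge the intercept
print(round(b, 2), round(best, 2), round(steeper, 2), round(higher, 2))
```

Any other line, such as the two nudged ones here, gives a larger sum of squared residuals than the least squares line.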
Interpreting the equation (p. 577)
ŷ = a + bx (x = explanatory variable, ŷ = predicted response variable)
a = intercept, b = slope
Slope = predicted change in response associated with a one-unit increase in the explanatory variable
Intercept = predicted value of response when explanatory variable = 0
3) Model?
Price-hat = 265222 + 169 size
Slope = each additional square foot in house size is associated with a $169 increase in predicted price (price per foot)
Be a little careful here, don’t sound too “causal”
I really do like the “predicted” in here
Intercept = a house of size zero (empty lot?) is predicted to cost $265,222
Be a little careful here, don’t have any houses in data set with size near 0…
Using the model
Price-hat = 265222 + 169 size
Predicted price for a 1250 square foot house?
Predicted price for a 3000 square foot house?
Extrapolation: Very risky to use regression equation to predict values far outside the range of x values used to derive the line!
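Plugging a size into the fitted equation gives the predicted price. A minimal sketch using the rounded coefficients from the slide (`predicted_price` is an illustrative helper name):

```python
def predicted_price(size_sqft):
    """Predicted price from the fitted line: Price-hat = 265222 + 169 * size."""
    return 265222 + 169 * size_sqft

price_1250 = predicted_price(1250)
price_3000 = predicted_price(3000)
print(price_1250)   # 476472
print(price_3000)   # 772222
```

The arithmetic works for any size, but a prediction is only trustworthy when the size falls within the range of sizes in the sample; the range of the 20 homes is not given here, so check it before relying on either value.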
4) Is this relationship statistically significant?
Is it possible there is no relationship between house price and size in the population of all homes for sale at that time, and we just happened to coincidentally obtain this relationship in our random sample?
Or is this relationship strong enough to convince us it didn’t happen just by chance but reflects a genuine relationship in the population?
p. 605
Let β represent the slope of the population regression line
H0: β = 0; no relationship between price and size in population
Ha: β ≠ 0; is a relationship (β < 0 negative; β > 0 positive)
Idea: Want to compare the observed sample slope to zero: does it differ more than we would expect by chance?
Assume β = 0
[Figure: variation in sample slopes, with standard error SE(b); our slope, 169, marked]
How many standard deviations away is our slope?
Minitab
The regression equation is
Price = 265222 + 169 Size (sq ft)
Predictor Coef SE Coef T P
Constant 265222 42642 6.22 0.000
Size (sq ft) 168.59 31.88 5.29 0.000
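Annotations on the output:
Regression equation (add hat to Price)
a = 265222 (Constant), b = 168.59 (Size)
SE(b) = 31.88
P = two-sided p-value
t = (observed slope − hypothesized slope) / (standard error of slope) = (b − 0)/SE(b) = (168.59 − 0)/31.88 = 5.29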
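The T column in the Minitab output can be reproduced by hand. A minimal sketch using the Coef and SE Coef values for Size (the function name `slope_t_statistic` is illustrative):

```python
def slope_t_statistic(b, se_b, hypothesized_slope=0.0):
    """Number of standard errors between the sample slope and the hypothesized slope."""
    return (b - hypothesized_slope) / se_b

# Values read from the Size (sq ft) row of the Minitab output.
t_size = slope_t_statistic(168.59, 31.88)
print(round(t_size, 2))   # 5.29
```

A sample slope more than 5 standard errors from zero would be very unusual if β = 0, which matches the reported p-value of 0.000.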
Turn in, with partner
Price vs. pages: Interpret slope/evaluate p-value
For Tuesday
Activities 26-7, 28-5
Be working on Lab 9 and HW 7
The regression equation is
Price = - 3.4 + 0.147 Pages
Predictor Coef SE Coef T P
Constant -3.42 10.46 -0.33 0.746
Pages 0.14733 0.01925 7.65 0.000
Describing Scatterplots
Activity 26-6 (p. 539)
Positive, nonlinear, fairly strong
Causation?
Strength: How closely do the points follow the pattern?
Direction
Strength
Form: Linear or not
For Monday
Activities 26-7, 28-5
Be working on Lab 9 and HW 7