population mean. problem. notation - michigan state … mean. problem. notation . populati ... ti...
RECALL: In the last class, we learned statistical inference for the population mean.
Notation:
Notation     Meaning
$\mu$        The population mean
$\bar{X}$    The sample mean
$\sigma$     The population standard deviation
$s$          The sample standard deviation
$n$          The sample size
RECALL:
Point estimation (sample mean $\bar{X}$). Distribution of $\bar{X}$.
Confidence interval:
One-sample z-interval (population SD is known)
One-sample t-interval (only sample SD is known): $\bar{X} \pm t^*_{n-1} \dfrac{s}{\sqrt{n}}$
Remarks:
1. The t-interval needs the normality assumption.
2. $t^*_{n-1}$, which depends on $n-1$ and C%, can be obtained from the t-table.
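The t-interval formula can be sketched in a few lines of Python. This is only an illustration: the summary statistics below are hypothetical (they do not come from the slides), and the critical value 2.262 is the t-table entry for df = 9 at 95% confidence, typed in by hand just as one would read it from the table.

```python
import math


def t_interval(xbar, s, n, t_star):
    """Return (lower, upper) for the interval xbar +/- t_star * s / sqrt(n)."""
    margin = t_star * s / math.sqrt(n)
    return xbar - margin, xbar + margin


# Hypothetical summary statistics, chosen only for illustration.
xbar, s, n = 50.0, 5.0, 10
t_star = 2.262  # t-table value for df = n - 1 = 9 and 95% confidence

lower, upper = t_interval(xbar, s, n, t_star)
print(f"95% t-interval: ({lower:.2f}, {upper:.2f})")
```

The function takes the critical value as an argument because the standard library has no t distribution; on the calculator, TInterval does the table lookup for you.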
RECALL: Hypothesis Testing about $\mu$
Z-Test (population SD is known)
Test statistic: $z = \dfrac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$
Null Hypothesis H0 vs. Alternative Hypothesis HA:
H0: $\mu = \mu_0$ vs.
HA: $\mu \neq \mu_0$ (two-sided)
HA: $\mu > \mu_0$ (one-sided)
HA: $\mu < \mu_0$ (one-sided)
P-value:
Alternative Hypothesis HA            P-value formula
HA: $\mu \neq \mu_0$ (two-sided)     P-value = 2P(Z > |z|)
HA: $\mu > \mu_0$ (one-sided)        P-value = P(Z > z)
HA: $\mu < \mu_0$ (one-sided)        P-value = P(Z < z)
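As a rough sketch, the three P-value formulas in the table can be computed with Python's standard library. The function name `z_test` and the numbers passed to it are hypothetical, picked only so that the test statistic comes out to a round value.

```python
import math
from statistics import NormalDist


def z_test(xbar, mu0, sigma, n):
    """Return the z statistic and the P-value for each alternative hypothesis."""
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    std_normal = NormalDist()  # standard normal: mean 0, SD 1
    return {
        "z": z,
        "two_sided": 2 * (1 - std_normal.cdf(abs(z))),  # 2P(Z > |z|)
        "greater": 1 - std_normal.cdf(z),                # P(Z > z)
        "less": std_normal.cdf(z),                       # P(Z < z)
    }


# Hypothetical inputs, not taken from the slides.
result = z_test(xbar=52.0, mu0=50.0, sigma=5.0, n=25)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

Note that the two one-sided P-values always add up to 1, so choosing the alternative before looking at the data matters.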
RECALL: Hypothesis Testing about $\mu$
T-Test (sample SD s is known)
Test statistic: $t = \dfrac{\bar{X} - \mu_0}{s/\sqrt{n}}$
Null Hypothesis H0 vs. Alternative Hypothesis HA:
H0: $\mu = \mu_0$ vs.
HA: $\mu \neq \mu_0$ (two-sided)
HA: $\mu > \mu_0$ (one-sided)
HA: $\mu < \mu_0$ (one-sided)
P-value (df = n-1):
Alternative Hypothesis HA            P-value formula
HA: $\mu \neq \mu_0$ (two-sided)     Two-tail prob. of |t|
HA: $\mu > \mu_0$ (one-sided)        One-tail prob. of |t|
HA: $\mu < \mu_0$ (one-sided)        One-tail prob. of |t|
RECALL:
TI commands (under STAT TESTS):
T-interval: use 8:TInterval
T-Test: use 2:T-Test
Decisions:
If p-value < alpha level, reject H0, and we say the test is statistically significant at this alpha level.
If p-value > alpha level, fail to reject H0, and we say the test is not statistically significant at this alpha level.
Errors:
Type I error: decide to reject H0, but actually H0 is true.
Type II error: decide to retain H0, but actually H0 is false.
P(Type I error) = alpha level.
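The statement P(Type I error) = alpha level can be checked with a small simulation. The sketch below is an illustration under assumed settings, not part of the course material: it draws samples for which H0 is true (mean 0, known SD 1), runs a two-sided z-test on each, and counts how often H0 is rejected. The sample size, number of trials, and seed are arbitrary choices.

```python
import math
import random
from statistics import NormalDist, fmean


def two_sided_p_value(sample, mu0, sigma):
    """Two-sided z-test P-value, 2P(Z > |z|), with known population SD."""
    z = (fmean(sample) - mu0) / (sigma / math.sqrt(len(sample)))
    return 2 * (1 - NormalDist().cdf(abs(z)))


def type_i_error_rate(alpha=0.05, n=25, trials=4000, seed=0):
    """Fraction of simulated samples (H0 true) in which H0 is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]  # H0: mu = 0 holds
        if two_sided_p_value(sample, mu0=0.0, sigma=1.0) < alpha:
            rejections += 1
    return rejections / trials


rate = type_i_error_rate()
print(f"Observed Type I error rate: {rate:.3f} (alpha = 0.05)")
```

The observed rate should land close to 0.05, though not exactly on it, because it is itself an estimate from a finite number of trials.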
Exploring Relationships Between Variables
Chapter 7: Scatterplots, Association, and Correlation
Chapter 8: Linear Regression
WHERE ARE WE GOING?
People might ask the following questions in real life:
1. Is the price of sneakers related to how long they last?
2. Is smoking related to lung cancer?
3. Do baseball teams that score more runs sell more tickets to their games?
Chapter 7 will look at relationships between two quantitative variables X and Y:
Scatterplot
Correlation
TERM 1: SCATTERPLOTS
Is the price of sneakers related to how long they last?
The following table shows some data collected for sneakers:
Years    Price ($)
1        20.00
2        21.99
3        23.29
4        25.99
5        29.99
6        34.99
7        39.99
8        44.99
9        49.99
10       59.99
[Figure: scatterplot of Price ($) against Years]
This is an example of a scatterplot. The x-axis represents the variable Years and the y-axis represents Price.
TERM 1: SCATTERPLOT
Scatterplots may be the most common and most effective display for paired data.
Scatterplots are the best way to start observing the relationship and the ideal way to picture associations between two quantitative variables.
[Figure: scatterplot of Price against Years]
X-axis: Years, the explanatory variable, which explains or influences changes in the other variable.
Y-axis: Price, the response variable, which measures an outcome of a study.
TERM 1: SCATTERPLOTS
How do we describe the scatterplot? Or, what information about the relationship between the two variables can we get by looking at the scatterplot?
Please look at the scatterplot of the sneakers example, and think about what you can tell about the relationship between years and price.
[Figure: scatterplot of Price against Years]
We are going to describe the relationship from four different aspects:
1) Direction
2) Form
3) Strength
4) Unusual features
TERM 1: SCATTERPLOT
Look for direction. What's my sign: positive, negative or neither?
Negative: A pattern that runs from the upper left to the lower right is said to be negative. The Y variable decreases as the X variable increases.
Positive: A pattern running the other way is called positive. The Y variable increases as the X variable increases.
[Figure: two example scatterplots of Y against X]
TERM 1: SCATTERPLOT
The example in the text shows a negative association between central pressure and maximum wind speed.
As the central pressure increases, the maximum wind speed decreases.
TERM 1: SCATTERPLOTS
Look for form: straight, curved, something exotic, or no pattern?
[Figure: three scatterplots of Y against X, captioned "Straight line, linear", "Curved" and "No pattern"]
In this part, we are more interested in the linear pattern.
TERM 1: SCATTERPLOTS
Look for strength: how much scatter? Or, how strong is the relationship?
Strong: the points appear tightly clustered in a single stream.
Weak: the swarm of points seems to form a vague cloud through which we can barely discern any trend or pattern.
[Figure: four example scatterplots of Y against X]
TERM 1: SCATTERPLOTS
Look for unusual features: are there outliers or subgroups?
[Figure: two scatterplots of Y against X. In the first, the circled point is a potential outlier. In the second, there are two clusters.]
TERM 1: SCATTERPLOT-ROLES FOR VARIABLES
It is important to determine which of the two quantitative variables goes on the x-axis and which on the y-axis.
This determination is made based on the roles played by the variables.
When the roles are clear, the explanatory or predictor variable goes on the x-axis, and the response variable goes on the y-axis.
TERM 1: SCATTERPLOTS Summary
A Scatterplot shows the relationship between two quantitative variables measured on the same individual.
The variable that is designated the X variable is called the explanatory variable
The variable that is designated the Y variable is called the response variable
Always plot the explanatory variable on the horizontal (x) axis
Always plot the response variable on the vertical (y) axis
In examining scatterplots, look for an overall pattern showing the form, direction and strength of the relationship
Look also for outliers or other deviations from this pattern
TERM 1: SCATTERPLOT
Example: Fast food is often considered unhealthy because much of it is high in fat. Are fat and calories related? Here are the fat and calorie contents of several brands of burgers. Analyze the association between fat content and calories.
Fat (g)     20    30    35    36    40    40    44
Calories    410   580   590   570   640   680   660
[Figure: scatterplot of Calories against Fat]
Comment on the scatterplot:
1) Direction: positive
2) Form: roughly linear
3) Strength: moderately strong
4) Unusual features: none
TERM 2: CORRELATION
From scatterplots, we can look for the relationship between two quantitative variables and whether the relationship is strong or weak. But how strong is it?
The correlation coefficient (or simply correlation) is a quantitative measure of the linear relationship (association) between two quantitative variables.
Finding the correlation coefficient, denoted by r, by hand:
$r = \dfrac{\sum (x - \bar{x})(y - \bar{y})}{(n-1)\, s_x s_y}$
where $s_x$ and $s_y$ are the standard deviations of X and Y respectively.
Remarks: Before you use correlation, you must check several conditions:
Quantitative Variables Condition
Straight Enough Condition
Outlier Condition
TERM 2: CORRELATION (Revisit the calories example)
Here are the fat and calorie contents of several brands of burgers.
X: Fat (g)     20    30    35    36    40    40    44
Y: Calories    410   580   590   570   640   680   660
What is the correlation coefficient of x (fat) and y (calories)?
Solution: The means are $\bar{x} = 35$ and $\bar{y} = 590$, and the standard deviations are $s_x = 7.98$ and $s_y = 89.81$.
Deviations in x    Deviations in y    Product
20-35=-15          410-590=-180       (-15)*(-180)=2700
30-35=-5           580-590=-10        (-5)*(-10)=50
35-35=0            590-590=0          0*0=0
36-35=1            570-590=-20        1*(-20)=-20
40-35=5            640-590=50         5*50=250
40-35=5            680-590=90         5*90=450
44-35=9            660-590=70         9*70=630
Add up the products: 2700+50+0+(-20)+250+450+630=4060
Correlation: r=4060/{(7-1)*7.98*89.81}=0.9442
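The hand calculation for the burger data can be reproduced with a short Python sketch. It follows the same steps as the solution (deviations, products, sum, then division by (n-1) times the two standard deviations); the variable names are just illustrative choices.

```python
import math

fat = [20, 30, 35, 36, 40, 40, 44]              # x, in grams
calories = [410, 580, 590, 570, 640, 680, 660]  # y

n = len(fat)
x_bar = sum(fat) / n
y_bar = sum(calories) / n

# Product of the deviations from the means, one per burger.
products = [(x - x_bar) * (y - y_bar) for x, y in zip(fat, calories)]
sum_products = sum(products)

# Sample standard deviations (divide by n - 1).
s_x = math.sqrt(sum((x - x_bar) ** 2 for x in fat) / (n - 1))
s_y = math.sqrt(sum((y - y_bar) ** 2 for y in calories) / (n - 1))

r = sum_products / ((n - 1) * s_x * s_y)
print(f"sum of products = {sum_products:.0f}")
print(f"s_x = {s_x:.2f}, s_y = {s_y:.2f}, r = {r:.4f}")
```

Working with unrounded standard deviations gives the same r to four decimal places as the hand calculation with 7.98 and 89.81.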
CORRELATION PROPERTIES
The sign of a correlation coefficient gives the direction of the linear association.
Positive sign: positive linear association
Negative sign: negative linear association
Correlation is always between -1 and +1.
Correlation can be exactly equal to -1 or +1, but these values are unusual in real data because they mean that all the data points fall exactly on a single straight line.
A correlation near zero corresponds to a weak linear association.
Example: The correlation between fat and calories, 0.9442, indicates a strong positive linear association between them.
TERM 2: CORRELATION
Cautions about correlation:
Quantitative Variables Condition: Correlation applies only to quantitative variables.
Straight Enough Condition: Correlation measures the strength only of the linear association.
Outlier Condition: Outliers can distort the correlation dramatically.
[Figure: three scatterplots of y against x. The first two are labeled r=0.92 and r=0.098. The third contains an outlier: with the outlier r=0.795, without the outlier r=0.938.]
TERM 2: CORRELATION
Correlation≠Causation
Fast food is often considered unhealthy because much of it is high in fat. Are fat and calories related? Based on the fat and calorie contents of several brands of burgers, the correlation between them is r=0.9442. Which conclusion is most accurate?
A. More fat in the burgers causes higher calories.
B. The burgers containing more fat tend to have higher calories.
Comment: Even though A sounds all right, it is not a conclusion that can be derived from or explained by the correlation. Correlation is an objective storyteller of the linear association between two variables. It can’t tell us about causation.
CORRELATION PROPERTIES (CONT.)
Correlation treats x and y symmetrically: the correlation of x with y is the same as the correlation of y with x.
Correlation has no units.
Correlation is not affected by shifting and rescaling of either variable. Correlation depends only on the z-scores, and they are unaffected by changes in center or scale, i.e. corr(aX+b, cY+d)=corr(X,Y), where a, b, c, d are constants with a>0 and c>0. (If exactly one of a and c is negative, the sign of the correlation flips.)
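These properties can be checked numerically on the burger data. The sketch below is illustrative only: `pearson_r` is a helper written for this example, and the constants used for shifting and rescaling are arbitrary. It also shows that the rule corr(aX+b, cY+d)=corr(X,Y) relies on a and c having the same sign, since multiplying one variable by a negative constant flips the sign of r.

```python
import math
from statistics import fmean


def pearson_r(xs, ys):
    """Correlation coefficient r for paired data."""
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


fat = [20, 30, 35, 36, 40, 40, 44]
calories = [410, 580, 590, 570, 640, 680, 660]

r_original = pearson_r(fat, calories)
r_swapped = pearson_r(calories, fat)  # symmetry: roles of x and y exchanged
r_rescaled = pearson_r([2 * x + 3 for x in fat],
                       [0.5 * y - 10 for y in calories])  # a, c > 0
r_flipped = pearson_r([-x for x in fat], calories)  # a < 0 flips the sign

print(f"original: {r_original:.4f}")
print(f"swapped:  {r_swapped:.4f}")
print(f"rescaled: {r_rescaled:.4f}")
print(f"flipped:  {r_flipped:.4f}")
```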
TERM 2: CORRELATION
Example: Here are several scatterplots. The calculated correlations are -0.923, -0.487, 0.006 and 0.777. Which is which?
[Figure: four scatterplots of Y against X, panels (a) to (d), each labeled on the slide with its correlation]
QUESTION: CAN WE DO MORE?
Scatterplots and correlation are useful tools that help us learn about the (linear) association between two quantitative variables.
Can we answer the following question: Fast food is often considered unhealthy because much of it is high in fat. What is the calorie content of a kind of fast food with 28g fat?
[Figure: scatterplot of Calories against Fat]
If we want to estimate an unknown value based on known values, this is called a prediction. One way to do the prediction is by constructing a linear model.
TERM 3: LINEAR MODEL
Let’s look at the burger example again.
Fat (g)     20    30    35    36    40    40    44
Calories    410   580   590   570   640   680   660
[Figure: BURGERS, scatterplot of CALORIES against FAT with a fitted red line]
The red line does not go through all the points, but it can summarize the general pattern with only a couple of parameters: Calories = a+b*fat. This model can be used to predict the calories based on the fat content.
Explanatory variable: Fat
Response variable: Calories
TERM 3: LINEAR MODEL
[Figure: BURGERS, scatterplot of CALORIES against FAT with the fitted line; the prediction and the residual are marked for one point]
Predicted value: we call the estimate made from a model the predicted value, denoted as $\hat{y}$.
Residual: the difference between the observed value and its associated predicted value is called the residual, that is, residual $= y - \hat{y}$.
The line of best fit is the line for which the sum of the squared residuals is smallest. It is called the least squares line.
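TERM 3: LINEAR MODEL
X: Fat (g)     20    30    35    36    40    40    44
Y: Calories    410   580   590   570   640   680   660
[Figure: BURGERS, scatterplot of CALORIES against FAT with the fitted line]
Q1: Please construct a linear regression model to predict the calories based on fat.
Fat: $\bar{x} = 35$, $s_x = 7.98$
Calories: $\bar{y} = 590$, $s_y = 89.81$
Correlation: r=0.9442
Slope: $b = r\, s_y / s_x = 0.9442 \times 89.81 / 7.98 \approx 10.63$
Intercept: $a = \bar{y} - b\bar{x} = 590 - 10.63 \times 35 = 217.95$
Linear model: $\hat{y} = 217.95 + 10.63x$
Q2: What is the predicted calorie value when the fat is 30g?
When x=30, $\hat{y} = 217.95 + 10.63 \times 30 \approx 536.9$.
Q3: What is the residual for the burger with 30g fat?
When x=30, the residual is $y - \hat{y} = 580 - 536.9 = 43.1$.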
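A least squares line for the burger data can be sketched in Python as follows. The helper `least_squares_line` is written for this illustration; it uses the usual closed-form solution (slope = sum of deviation products divided by sum of squared x-deviations, intercept = mean of y minus slope times mean of x). Because it keeps the slope unrounded, the intercept comes out near 218.0 rather than the 217.95 obtained when the slope is first rounded to 10.63.

```python
from statistics import fmean


def least_squares_line(xs, ys):
    """Return (intercept a, slope b) of the least squares line y-hat = a + b*x."""
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


fat = [20, 30, 35, 36, 40, 40, 44]
calories = [410, 580, 590, 570, 640, 680, 660]

a, b = least_squares_line(fat, calories)
predicted_30 = a + b * 30         # predicted calories at 30 g of fat
residual_30 = 580 - predicted_30  # observed minus predicted for that burger

print(f"intercept a = {a:.2f}, slope b = {b:.2f}")
print(f"prediction at 30 g: {predicted_30:.1f}, residual: {residual_30:.1f}")
```

On the calculator, LinReg(a+bx) carries out the same computation.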
TERM 3: LINEAR MODEL
Remarks: Since regression and correlation are closely related, we need to check the same conditions for regression as we did for correlation:
Quantitative Variables Condition
Straight Enough Condition
Outlier Condition
TERM 3: LINEAR MODEL (PARAMETERS)
We write a and b for the intercept and slope of the line. They are called the coefficients of the linear model.
The coefficient b is the slope, which tells us how rapidly the predicted value ($\hat{y}$) changes with respect to x. As the value of x increases by 1 unit, the predicted value of y changes by b units.
The coefficient a is the intercept, which tells where the line hits (intercepts) the y-axis. In other words, the intercept a is the predicted value of y when x=0.
Intercept and Slope (examples)
Fast food is often considered unhealthy because much of it is high in fat. Are fat and calories related? Here are the fat and calorie contents of several brands of burgers. To analyze the association between fat content and calories, the equation of the regression model is:
Predicted calories=217.95+10.63*fat
For this linear equation, slope=10.63, intercept=217.95.
Q1: What does the slope 10.63 mean?
A1: An increase in fat of 1 gram is associated with an increase in predicted calories of 10.63.
Q2: If the fat increases by 2 grams, how many more calories are expected to be contained in the burger?
A2: 2*10.63=21.26
Q3: What does the intercept 217.95 mean here?
A3: Theoretically, it means that when the burger contains no fat at all, the predicted amount of calories is 217.95.
TERM 4: RESIDUAL PLOT
After you construct the linear model, you have to check whether the linear model makes sense or not.
A residual plot can be used to check the appropriateness of the linear model.
A residual plot is the scatterplot of the residuals versus the x-values.
If a linear model is appropriate, then the residual plot shouldn’t have any interesting features, like a direction or shape.
It should stretch horizontally, with about the same amount of scatter throughout.
It should show no bends, and it should have no outliers.
[Figure: residual plot of Residuals against X]
TERM 4: RESIDUAL SCATTERPLOT
Now, let’s try to diagnose the model for the calorie and fat example.
Fat (g): x            20      30      35     36      40      40      44
Calories: y           410     580     590    570     640     680     660
Predicted calories:   430.6   536.9   590    600.6   643.2   643.2   685.7
Residual:             -20.6   43.1    0      -30.6   -3.2    36.8    -25.7
[Figure: residual plot of residuals against fat]
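The predicted values and residuals in the table can be recomputed with a short sketch. It uses the rounded model stated for the burger example, Predicted calories=217.95+10.63*fat, and the definition residual = observed - predicted; the function and variable names are illustrative only. Small differences in the last digit compared with the table come from rounding.

```python
fat = [20, 30, 35, 36, 40, 40, 44]
calories = [410, 580, 590, 570, 640, 680, 660]

INTERCEPT = 217.95  # a, from the stated regression equation
SLOPE = 10.63       # b, from the stated regression equation


def predict(x):
    """Predicted calories for a burger with x grams of fat."""
    return INTERCEPT + SLOPE * x


predicted = [predict(x) for x in fat]
residuals = [y - y_hat for y, y_hat in zip(calories, predicted)]

for x, y, y_hat, res in zip(fat, calories, predicted, residuals):
    print(f"fat={x:2d}  calories={y}  predicted={y_hat:6.1f}  residual={res:6.1f}")

# For a least squares line the residuals sum to (about) zero.
print(f"sum of residuals = {sum(residuals):.2f}")
```

Plotting `residuals` against `fat` gives the residual plot described above; with only seven points, any pattern should be read cautiously.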
TERM 4: RESIDUAL PLOT
Example: Tell what each of the residual plots below indicates about the appropriateness of the linear model that was fit to the data.
[Figure: three residual plots, panels (a), (b) and (c)]
(a)
(b)
(c)
TI for correlation and regression equation
The first time you do this:
Press 2nd, CATALOG (above 0)
Scroll down to DiagnosticOn
Press ENTER, ENTER
Read “Done”
Your calculator will remember this setting even when turned off
Enter predictor (x) values in L1
Enter response (y) values in L2
Pairs must line up
There must be the same number of predictor and response values
Press STAT, > (to CALC)
Scroll down to 8:LinReg(a+bx), press ENTER, ENTER
Read intercept a, slope b and correlation r on the screen
IMPORTANT NOTES:
Take-home quiz is due on Monday. No late submission will be accepted.
Keep the ID assignment and bring it to class on Monday.
Sample exam will be handed out on Monday. We will discuss the questions on Wednesday.
Suggested Problem Set 4 will be collected next Thursday.
Final exam will be next Thursday: 2 hours in class. Please prepare a one-page, A4-size cheat sheet (one-sided) on your own. A formula sheet will not be provided in the final exam. The cheat sheet will be collected together with the final exam.