chapter 3 examining relationships
![Page 1: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/1.jpg)
Chapter 3 Examining Relationships
Section 3.1 Scatterplots
![Page 2: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/2.jpg)
Terms to Know
A response variable measures an outcome of a study. An explanatory
variable attempts to explain the observed outcomes.
![Page 3: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/3.jpg)
Example of an Explanatory and Response Variable
One degree day is accumulated for each degree a day’s average temp falls below or rises above 65 degrees.
![Page 4: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/4.jpg)
Key Concept
The statistical techniques used to study relations among variables are more complex than one-variable methods.
Fortunately we build on the tools used for examining individual variables. The principles that guide
examination are the same.
1. Start with a graph
2. Look for an overall pattern and deviations from the pattern
3. Add numerical descriptions of specific aspects of the data
4. Sometimes there is a way to describe that
![Page 5: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/5.jpg)
Term to Know
The most effective way to display the relation between two quantitative variables
is a scatterplot. Plot the explanatory variable, if there is one, on the x-axis, and the response variable on the y-axis. Each individual in the data appears as a point.
![Page 6: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/6.jpg)
ScatterPlot
![Page 7: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/7.jpg)
Interpreting Scatterplots
To interpret a scatterplot, look first for a pattern. The pattern should reveal direction, form and strength of the relationship between two variables.
Refer to Figure 3.1 on page 175.
Form: two clusters
Direction: negatively associated
Strength: moderate
![Page 8: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/8.jpg)
Strong, Positive Association with Linear Form
![Page 9: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/9.jpg)
Some Relationships Have No Direction or Pattern
![Page 10: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/10.jpg)
Not All Relationships are Linear
[Scatterplot "Collection 1": Mileage (vertical axis, 4 to 22) versus Speed (horizontal axis, 0 to 160)]
![Page 11: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/11.jpg)
Yes, you can have outliers on ScatterPlots
An outlier in any graph of data is an individual observation that falls outside the overall pattern of the graph (e.g., WV).
![Page 12: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/12.jpg)
Add a Third Variable (Categorical) of Southern and non-Southern by Using Different Symbols
![Page 13: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/13.jpg)
Scatter Plot Heads Up
When several individuals have exactly the same data, they occupy the same point on the scatter plot. Some software packages address the issue by using different symbols for multiple individuals with the same data. You can do the same by hand. However, your calculator does not. So be careful. Use trace to identify such cases.
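If the data are also available on a computer, overlapping points can be found by counting how often each (x, y) pair occurs. The sketch below is only an illustration; the `points` list is made-up data, not taken from the textbook.

```python
from collections import Counter

# Hypothetical (x, y) observations; two individuals share the point (3, 7)
points = [(1, 4), (3, 7), (2, 5), (3, 7), (5, 9)]

# Count how many individuals land on each point
counts = Counter(points)

# Keep only the points occupied by more than one individual
overlaps = {pt: n for pt, n in counts.items() if n > 1}
print(overlaps)
```

Any point listed in `overlaps` would appear as a single dot on a calculator scatterplot even though it represents several individuals.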
![Page 14: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/14.jpg)
Scatterplots display the direction, form, and strength of the relationship between two variables. However, our eyes are not a good judge of the strength of the relationship.
![Page 15: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/15.jpg)
Key Concept
Correlation measures the direction and strength of the linear relationship between two quantitative variables. Correlation is usually written as r.
$$r = \frac{1}{n-1} \sum \left( \frac{x_i - \bar{x}}{s_x} \right) \left( \frac{y_i - \bar{y}}{s_y} \right)$$
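The correlation formula can be read as "standardize each x and y, multiply the pairs, and average with n − 1". A minimal Python sketch of that reading follows; the function name `correlation` and the `hours`/`score` data are assumptions made up for illustration.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Correlation r: sum of products of standardized values, divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations (n - 1 divisor)
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys))
    return total / (n - 1)

# Hypothetical data, invented for this example only
hours = [1, 2, 3, 4, 5]
score = [52, 60, 61, 70, 75]

r = correlation(hours, score)
print(round(r, 3))
```

A variable correlated with itself gives r = 1, which is a quick sanity check on the function.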
![Page 16: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/16.jpg)
Facts About Correlation
• No distinction between explanatory and response variable
• Requires two quantitative variables
• Unit change of observation does not change correlation
• Positive r indicates positive association, negative r indicates negative association
• Range: -1 ≤ r ≤ 1
• Measures strength of linear relationships of two variables only
• Is not resistant to outliers
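Two of these facts are easy to check numerically: swapping x and y leaves r unchanged, and so does a change of units. The sketch below assumes a hand-written `correlation` helper and made-up height and weight values; none of it comes from the textbook.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Correlation r computed from standardized values."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    return sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
               for x, y in zip(xs, ys)) / (n - 1)

# Hypothetical measurements for five people
height_in = [60, 62, 65, 68, 71]        # inches
weight_lb = [110, 120, 140, 155, 170]   # pounds

# Unit change: the same heights expressed in centimeters
height_cm = [2.54 * h for h in height_in]

r_in = correlation(height_in, weight_lb)
r_cm = correlation(height_cm, weight_lb)       # same r after the unit change
r_swapped = correlation(weight_lb, height_in)  # same r with the roles swapped
print(r_in, r_cm, r_swapped)
```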
![Page 17: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/17.jpg)
Correlation Exercise
• Technology Toolbox, page 186
• Yes, the process is long and convoluted, but there is a shortcut using the LinReg command
![Page 18: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/18.jpg)
Key Concept
1) A key thing to remember when working with correlations is never to assume a correlation means that a change in one variable causes a change in another. Sales of personal computers and athletic shoes have both risen strongly in the last several years and there is a high correlation between them, but you cannot assume that buying computers causes people to buy athletic shoes (or vice versa).
![Page 19: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/19.jpg)
Key Concept
2) Correlation describes linear relationships only, no matter how strong the curved relationship may be.
3) Like the mean and standard deviation, the correlation r is not resistant to outliers.
4) Correlation is not a complete summary of a two-variable relationship. You should also give the means of x and y.
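Point 2 can be made concrete with a perfectly curved relationship whose correlation is zero. In this sketch, the data (y = x² on a symmetric set of x values) and the helper name `correlation` are chosen purely for illustration.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Correlation r computed from standardized values."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    return sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
               for x, y in zip(xs, ys)) / (n - 1)

# y is completely determined by x, but the relationship is curved, not linear
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]

r = correlation(xs, ys)
print(r)
```

Here y depends exactly on x, yet r comes out as zero because the positive and negative halves of the curve cancel.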
![Page 20: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/20.jpg)
Homework
• Read 3.2
• Complete problems 1, 2, 6, 7, 8, 13, 15, 19, 21, 23
![Page 21: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/21.jpg)
Chapter 3 Examining Relationships
Section 3.2 Least-Squares Regression
![Page 22: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/22.jpg)
Key Term
Least-squares regression is a method for finding a line that summarizes the relationship between two variables that show a linear trend.
We often use a regression line to predict the value of y for a given value of x. Regression, unlike correlation, requires that we have an explanatory variable and a response variable.
![Page 23: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/23.jpg)
Regression Line for Predicting Gas Consumption from Degree Days
![Page 24: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/24.jpg)
Why is it Called A Least-Squares Regression Line (“LSRL”)?
![Page 25: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/25.jpg)
Why is it Called A Least-Squares Regression Line (“LSRL”)?
![Page 26: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/26.jpg)
Correlation and Regression Applet
www.whfreeman.com
![Page 27: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/27.jpg)
LSRL – Using TI84
NEA Δ (cal): -94, -57, -29, 135, 143, 151, 245, 355, 392, 573, 486, 535, 571, 580, 620, 690
Fat Δ (kg): 4.2, 3.0, 3.7, 2.7, 3.2, 3.6, 2.4, 1.3, 3.8, 1.7, 1.6, 2.2, 1.0, 0.4, 2.3, 1.1
Enter NEA data in L1 and Fat data in L2
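For readers without a TI-84, the same fit can be sketched in Python. This is an assumption-laden alternative to the calculator steps, not the textbook's procedure: the function name `least_squares_line` is invented here, and the data are copied exactly as they appear on the slide. The slope is computed as Sxy / Sxx, which is algebraically the same as r · (s_y / s_x).

```python
from statistics import mean

# Data as listed on the slide (NEA change in calories, fat gain in kg)
nea = [-94, -57, -29, 135, 143, 151, 245, 355, 392, 573, 486, 535, 571, 580, 620, 690]
fat = [4.2, 3.0, 3.7, 2.7, 3.2, 3.6, 2.4, 1.3, 3.8, 1.7, 1.6, 2.2, 1.0, 0.4, 2.3, 1.1]

def least_squares_line(xs, ys):
    """Return (intercept a, slope b) of the least-squares line y-hat = a + b*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx          # slope
    a = y_bar - b * x_bar  # intercept, so the line passes through (x-bar, y-bar)
    return a, b

a, b = least_squares_line(nea, fat)
print(f"fat gain = {a:.3f} + ({b:.5f}) * NEA change")
```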
![Page 28: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/28.jpg)
NEA/Fat Least-Squares Regression Line Exercise
Complete Technology Toolbox on page 210
![Page 29: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/29.jpg)
Interpret your regression equation in terms of your variables
(i.e., fat gain = a + b(NEA change))
![Page 30: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/30.jpg)
Use your Model to predict weight gain given an NEA of 400
(interpolation)
Use your Model to predict weight gain given an NEA of 1000
(extrapolation)
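One way to keep interpolation and extrapolation apart is to have the prediction step report whether x lies inside the observed range. In the sketch below, `predict` is a hypothetical helper and the coefficients are placeholders to be replaced by the a and b from your own fit; only the range -94 to 690 comes from the NEA data on the earlier slide.

```python
def predict(a, b, x, x_min, x_max):
    """Return the predicted y and a flag that is True when x is outside [x_min, x_max]."""
    y_hat = a + b * x
    is_extrapolation = not (x_min <= x <= x_max)
    return y_hat, is_extrapolation

# Placeholder coefficients for illustration only; substitute your fitted values
a, b = 3.5, -0.0034

# Smallest and largest NEA changes in the slide's data
x_min, x_max = -94, 690

pred_400, extrap_400 = predict(a, b, 400, x_min, x_max)      # inside the data range
pred_1000, extrap_1000 = predict(a, b, 1000, x_min, x_max)   # beyond the data range
print(pred_400, extrap_400)
print(pred_1000, extrap_1000)
```

The flag does not make the extrapolated number wrong by itself; it is a reminder that the data give no support for the line that far out.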
![Page 31: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/31.jpg)
Equation of the Least-Squares Regression Line
• You can manually calculate the equation of the Least-Squares Regression Line
$$\hat{y} = a + bx$$
with slope
$$b = r \frac{s_y}{s_x}$$
and intercept
$$a = \bar{y} - b\bar{x}$$
![Page 32: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/32.jpg)
Homework
• Exercises 3.29 – 32, 35, 36
• Read Section 3.3
![Page 33: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/33.jpg)
Chapter 3 Examining Relationships
Section 3.2 Least-Squares Regression (Continued)
Section 3.3 Correlation and Regression Wisdom
![Page 34: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/34.jpg)
Key Concept
• A residual is the difference between an observed value of the response variable and the value predicted by the regression line. That is,
residual = observed y – predicted y
$$\text{residual} = y - \hat{y}$$
![Page 35: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/35.jpg)
Residual is the distance between actual and predicted y
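Residuals are straightforward to compute once a line has been fitted. The following sketch uses made-up data and an assumed helper `least_squares_line`; it also illustrates the known property that least-squares residuals sum to zero (up to rounding).

```python
from statistics import mean

def least_squares_line(xs, ys):
    """Return (intercept a, slope b) of the least-squares line."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    return y_bar - b * x_bar, b

# Hypothetical data, invented for this example
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = least_squares_line(xs, ys)

# residual = observed y - predicted y, one per observation
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
print([round(e, 3) for e in residuals])
print(round(sum(residuals), 10))
```

Plotting `residuals` against `xs` would give the residual plot discussed on the next slides.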
![Page 36: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/36.jpg)
Example residual plot
![Page 37: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/37.jpg)
Interpreting a Residual Plot
• The uniform scatter of points indicates the regression line fits the data well, so the line is a good model.
![Page 38: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/38.jpg)
Interpreting a Residual Plot
• The residuals have a curved pattern, so a straight line is an inappropriate model
![Page 39: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/39.jpg)
Interpreting a Residual Plot
• The response variable y has more spread for larger values of the explanatory variable x, so prediction will be less accurate when x is large.
![Page 40: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/40.jpg)
Create a Residual Plot with Hand-Span Data
Follow procedures detailed in Technology toolbox on page 219
![Page 41: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/41.jpg)
Key Concept
The coefficient of determination, $r^2$, is the fraction of the variation in the values of y that is explained by least-squares regression of y on x.
Eg: ____% of the variation in Height is accounted for by the linear relationship between hand size and Height.
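The "fraction of variation explained" reading can be checked directly: 1 − SSE/SST from the fitted line equals the square of the correlation. The sketch below assumes made-up hand-span and height values and illustrative variable names; it is not the textbook's data.

```python
from math import sqrt
from statistics import mean

# Hypothetical hand spans (cm) and heights (cm), invented for illustration
xs = [18.0, 19.5, 20.0, 21.5, 23.0]
ys = [160.0, 166.0, 171.0, 174.0, 183.0]

x_bar, y_bar = mean(xs), mean(ys)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)   # total variation in y (SST)

b = sxy / sxx            # least-squares slope
a = y_bar - b * x_bar    # least-squares intercept

# Variation left over after fitting the line (sum of squared residuals)
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

r_squared = 1 - sse / syy    # fraction of variation explained by the line
r = sxy / sqrt(sxx * syy)    # correlation
print(round(r_squared, 4), round(r ** 2, 4))
```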
![Page 42: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/42.jpg)
Facts about Least-Squares Regression
• Fact 1 – The distinction between explanatory and response variables is essential in regression
![Page 43: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/43.jpg)
Facts about Least-Squares Regression
• Fact 2 – There is a close connection between correlation and the slope of the least squares line. The slope is:
$$b = r \frac{s_y}{s_x}$$
![Page 44: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/44.jpg)
Facts about Least-Squares Regression
• Fact 3 – The least squares line always passes through the point
$(\bar{x}, \bar{y})$
![Page 45: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/45.jpg)
Constructing the Least-Squares Example
Suppose we have explanatory and response variables and we know that the mean of x = 17.222, the mean of y = 161.111, sx = 19.696, sy = 33.479, and the correlation r = 0.997. Even though we don’t know the actual data, we can still construct the equation for the least-squares line and use it to make predictions.
![Page 46: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/46.jpg)
Constructing the Least-Squares Example
$$b = r \frac{s_y}{s_x} = (0.997)\frac{33.479}{19.696} = 1.695$$
$$a = \bar{y} - b\bar{x} = 161.111 - (1.695)(17.222) = 131.920$$
So the least-squares line has the equation
$$\hat{y} = 131.920 + 1.695x$$
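The same arithmetic can be reproduced from the summary statistics alone. In this sketch the function name `line_from_summary` is an assumption; the five inputs are the values given in the example. Because the code does not round b before computing a, the intercept may differ from the slide's value in the third decimal place.

```python
def line_from_summary(r, x_bar, y_bar, s_x, s_y):
    """Return (intercept a, slope b) using b = r * s_y / s_x and a = y_bar - b * x_bar."""
    b = r * s_y / s_x
    a = y_bar - b * x_bar
    return a, b

# Summary statistics from the example
a, b = line_from_summary(r=0.997, x_bar=17.222, y_bar=161.111, s_x=19.696, s_y=33.479)
print(f"y-hat = {a:.3f} + {b:.3f} x")

# Fact 3: the prediction at x-bar is y-bar
y_hat_at_mean = a + b * 17.222
```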
![Page 47: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/47.jpg)
Facts about Least-Squares Regression
• Fact 4 – The correlation r describes the strength of a straight-line relationship. In the regression setting, this description takes a specific form: The square of the correlation, $r^2$, is the fraction of the variation in the values of y that is explained by the least-squares regression of y on x.
![Page 48: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/48.jpg)
Key Concept
• Correlation and regression describe only linear relationships
• Extrapolation (using a model outside of the range of the data) often produces unreliable predictions
![Page 49: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/49.jpg)
Outliers and Influential Observations in Regression
• An outlier is an observation that lies outside the overall pattern of the other observations
• An observation is influential for a statistical calculation if removing it would markedly change the result of the calculation. Points that are outliers in the x direction of a scatter plot are often influential for the least-squares regression line. (Example: Revisit correlation applet)
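Influence can be tested by refitting the line without the suspect point and comparing slopes. The data below are invented so that one observation sits far out in the x direction; `slope` is an assumed helper name.

```python
from statistics import mean

def slope(xs, ys):
    """Least-squares slope b = Sxy / Sxx."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical data; the last point (20, 3) is an outlier in the x direction
xs = [1, 2, 3, 4, 5, 20]
ys = [2, 4, 5, 4, 5, 3]

b_with = slope(xs, ys)               # all six points
b_without = slope(xs[:-1], ys[:-1])  # the far-right point removed
print(round(b_with, 3), round(b_without, 3))
```

In this made-up data set, dropping the single far-right point turns a nearly flat line into a clearly rising one, which is what "influential" means.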
![Page 50: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/50.jpg)
Child 19 and Child 18 are both outliers. Child 18 is more influential.
![Page 51: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/51.jpg)
Beware the Lurking Variable
• A lurking variable is a variable that is not among the explanatory or response variables in a study and yet may influence the interpretation of relationships among those variables.
• Examples:
– 1 A strong positive correlation exists between the weight and reading skills of elementary school children.
– 2 Methodist Preacher and Whisky.
• What are the lurking variables?
![Page 52: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/52.jpg)
Beware of Correlations Based on Averaged Data
• Correlations based on averaged data are usually too high when applied to individuals.
• Example: age vs height of individual young children and average age vs average height of young children.
• Variation decreases with averaged data
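A small constructed example shows why averaging inflates correlation: the spread among children of the same age disappears once only the age-group means are kept. All numbers below are made up for illustration, and `correlation` is an assumed helper.

```python
from math import sqrt
from statistics import mean

def correlation(xs, ys):
    """Correlation r = Sxy / sqrt(Sxx * Syy)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical children: three per age, heights scattered around 70 + 7 * age (cm)
ages, heights = [], []
for age in [1, 2, 3, 4, 5]:
    for offset in [-6, 0, 6]:
        ages.append(age)
        heights.append(70 + 7 * age + offset)

# Averaged data: one mean height per age
avg_ages = [1, 2, 3, 4, 5]
avg_heights = [mean(h for a, h in zip(ages, heights) if a == age) for age in avg_ages]

r_individual = correlation(ages, heights)
r_averaged = correlation(avg_ages, avg_heights)
print(round(r_individual, 3), round(r_averaged, 3))
```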
![Page 53: Chapter 3 Examining Relationships](https://reader035.vdocuments.site/reader035/viewer/2022062502/56814c50550346895db95ff4/html5/thumbnails/53.jpg)
Homework
• Exercises 37, 38, 39, 40, 48, 50, 64, 66, 67, 71
• Take Home Quiz