relationships


Upload: rose-tate

Post on 01-Jan-2016


Page 1: Relationships

Relationships

• If we are doing a study that involves more than one variable, how can we tell if there is a relationship between two (or more) of the variables?

• Association Between Variables : Two variables measured on the same individuals are associated if some values of one variable tend to occur more often with some values of the second variable than with other values of that variable.

• Response Variable : A response variable measures an outcome of a study.

• Explanatory Variable : An explanatory variable explains or causes changes in the response variable.

Page 2: Relationships

Scatterplots

• A scatterplot shows the relationship between two variables.

• The values of one variable appear on the horizontal axis, and the values of the other variable appear on the vertical axis.

• Always plot the explanatory variable on the horizontal axis and the response variable on the vertical axis.

Example: If we are going to try to predict someone’s weight from their height, then the height is the explanatory variable and the weight is the response variable.

• The explanatory variable is often denoted by the variable x, and is sometimes called the independent variable.

• The response variable is often denoted by the variable y, and is sometimes called the dependent variable.

Page 3: Relationships

Scatterplots

Example: Do you think that a father’s height would affect a son’s height?

We are asking: given a father’s height, can we make any determinations about the son’s height?

The explanatory variable is : The father’s height

The response variable is : The son’s height

Data Set :

Father’s Height    Son’s Height
64                 65
68                 67
68                 70
70                 72
72                 75
74                 70
75                 73
75                 76
76                 77
77                 76
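As a rough sketch of how this data set becomes a scatterplot, the snippet below draws a character-grid plot with the explanatory variable (father) running left to right and the response variable (son) running bottom to top. The function name `text_scatterplot` and the grid layout are illustrative assumptions, not part of the original slides.

```python
# Father/son height pairs from the data set above.
fathers = [64, 68, 68, 70, 72, 74, 75, 75, 76, 77]
sons = [65, 67, 70, 72, 75, 70, 73, 76, 77, 76]


def text_scatterplot(x, y):
    """Return a character-grid scatterplot (hypothetical helper).

    x values run left to right, y values run bottom to top;
    each data point is drawn as '*', each empty cell as '.'.
    """
    x_min, x_max = min(x), max(x)
    y_min, y_max = min(y), max(y)
    rows = []
    for y_val in range(y_max, y_min - 1, -1):  # top row holds the largest y
        xs_here = {xi for xi, yi in zip(x, y) if yi == y_val}
        cells = "".join(
            "*" if x_val in xs_here else "."
            for x_val in range(x_min, x_max + 1)
        )
        rows.append(f"{y_val:>3} | {cells}")
    return "\n".join(rows)


plot = text_scatterplot(fathers, sons)
print(plot)
```

The points should drift from the lower left to the upper right of the grid, which is the pattern the later slides describe.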

Page 4: Relationships

[Scatterplot of the data set: Explanatory Variable (Father’s Height) on the horizontal axis and Response Variable (Son’s Height) on the vertical axis, each marked at 64, 68, 72, and 76.]

Page 5: Relationships

[Scatterplot of the father and son height data: Father on the horizontal axis and Son on the vertical axis, each marked at 64, 68, 72, and 76.]

Page 6: Relationships

Examining A Scatterplot

• In any graph of data, look for the overall pattern and for striking deviations from that pattern.

• You can describe the overall pattern of a scatterplot by the form, direction, and strength of the relationship.

• An important kind of deviation is an outlier, an individual that falls outside the overall pattern of the relationship.

• Two variables are positively associated when above-average values of one tend to accompany above-average values of the other, and below-average values also tend to occur together.

• Two variables are negatively associated when above-average values of one tend to accompany below-average values of the other, and vice versa.

• Strength : How closely the points follow a clear form.
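One way to make the definition of positive and negative association concrete is to count how many points fall on the same side of both averages and how many fall on opposite sides. The sketch below does this for the father and son heights. The function name `association_counts` is an assumption made for illustration, and the count is only a rough check of direction, not a measure of strength.

```python
from statistics import mean

# Father/son height pairs from the earlier data set.
fathers = [64, 68, 68, 70, 72, 74, 75, 75, 76, 77]
sons = [65, 67, 70, 72, 75, 70, 73, 76, 77, 76]


def association_counts(x, y):
    """Count points on the same side vs. opposite sides of the two means.

    Hypothetical helper: 'same' counts points where x and y are both above
    average or both below average; 'opposite' counts points where one is
    above average and the other is below.
    """
    x_bar, y_bar = mean(x), mean(y)
    same = opposite = 0
    for xi, yi in zip(x, y):
        product = (xi - x_bar) * (yi - y_bar)
        if product > 0:
            same += 1
        elif product < 0:
            opposite += 1
    return same, opposite


same, opposite = association_counts(fathers, sons)
print(f"same side: {same}, opposite sides: {opposite}")
```

When most points land in the "same side" count, the association is positive; when most land in the "opposite sides" count, it is negative.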

Page 7: Relationships

Examining A Scatterplot

Consider the previous scatterplot :

[Scatterplot: Father on the horizontal axis and Son on the vertical axis, each marked at 64, 68, 72, and 76.]

Direction : Going up

Form : Linear

Association : Positive

Strength : Strong

Outliers : None

Page 8: Relationships

Example : The following is a scatterplot of data collected from states about students taking the SAT. The question is whether the percentage of students in a state who take the test influences the state’s average scores.

For instance, in California, 45% of high school graduates took the SAT and the mean verbal score was 495.

Direction : Downward

Form : Curved

Association : Negative

Strength : Strong

Outliers : Maybe

Page 9: Relationships

Categorical Variables

• To add a categorical variable to a scatterplot, use a different plot color or symbol for each category.

Example : Take the last scatterplot and mark the northeastern states with an “e” and the midwestern states with an “m” :

Notice the grouping :

Outliers ?

Page 10: Relationships

Notes

• When we draw a line through the data set, we are drawing the model we want to use for the data set. We would like to find the equation of this line to help us understand the data. This is called “smoothing”.

(Figure 2.5) (Figure 2.6)
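The notes do not say how the line through the data is chosen. As a sketch under one common assumption, the snippet below uses the least-squares line, computed here for the father and son heights from the earlier example. The function name `least_squares_line` is illustrative, and least squares is a stand-in technique rather than a method stated in the slides.

```python
from statistics import mean

# Father/son height pairs from the earlier data set.
fathers = [64, 68, 68, 70, 72, 74, 75, 75, 76, 77]
sons = [65, 67, 70, 72, 75, 70, 73, 76, 77, 76]


def least_squares_line(x, y):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept.

    Assumption: least squares is used to pick the line; the slides do not
    name a fitting method.
    """
    x_bar, y_bar = mean(x), mean(y)
    # Sum of products of deviations, and sum of squared x deviations.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar  # the line passes through (x_bar, y_bar)
    return slope, intercept


slope, intercept = least_squares_line(fathers, sons)
print(f"son = {slope:.2f} * father + {intercept:.2f}")
```

A positive slope agrees with the upward direction noted for this scatterplot.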

Page 11: Relationships

Notes

• How can we display a relationship between a categorical explanatory variable and a quantitative response variable?

• Use a back-to-back stemplot to compare the distributions.

• Use side-by-side boxplots to compare any number of distributions.
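Side-by-side boxplots are drawn from each group's five-number summary (minimum, first quartile, median, third quartile, maximum). The sketch below computes those summaries for each category. The group labels and values are made up purely for illustration and are not Census Bureau data. The quartile rule used (median of the lower and upper halves, leaving out the middle value when the count is odd) is one common textbook convention and is assumed here.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of at least two numbers.

    Assumed convention: Q1 and Q3 are the medians of the lower and upper
    halves of the sorted data, excluding the middle value when n is odd.
    """
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    upper = data[len(data) - half:]
    return (data[0], median(lower), median(data), median(upper), data[-1])


# Hypothetical values for three categories (illustration only).
groups = {
    "A": [8, 10, 12, 15, 20],
    "B": [14, 18, 22, 25, 30],
    "C": [20, 28, 35, 40, 55],
}

summaries = {label: five_number_summary(vals) for label, vals in groups.items()}
for label, summary in summaries.items():
    print(label, summary)
```

Placing the summaries next to each other shows whether the response variable tends to shift from one category to the next, which is what the side-by-side boxplots display graphically.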

Page 12: Relationships

Example : It would make sense that working more hours in a week would lead to higher wages. The Census Bureau publishes relevant data. Unfortunately, “how much a person works” appears as a categorical variable :

A = 26 weeks or less
B = 27 to 39 weeks
C = 50 weeks or more

Notice also that wages is a quantitative variable.

Page 13: Relationships

Homework

1, 3, 4, 6, 7, 10, 13, 17