relationships between measurements variables


Upload: medea

Post on 05-Jan-2016


Page 1: Relationships Between Measurements Variables

Thought Questions

1. Judging from the scatterplot, there is a positive correlation between verbal SAT score and GPA. For used cars, there is a negative correlation between the age of the car and the selling price.

Explain what it means for two variables to have a positive correlation or a negative correlation.

2. Do you think each of the following pairs of variables would have a positive correlation, a negative correlation, or no correlation?

a. Calories eaten per day and weight
b. Calories eaten per day and IQ
c. Amount of alcohol consumed and accuracy on a manual dexterity test
d. Number of ministers and number of liquor stores in cities in Pennsylvania
e. Height of husband and height of wife

Page 2: Relationships Between Measurements Variables

Thought Questions

3. An article in the Sacramento Bee (29 May 1998, p. A17) noted: “Americans are just too fat, researchers say, with 54 percent of all adults heavier than is healthy. If the trend continues, experts say that within a few generations virtually every U.S. adult will be overweight.”

This prediction is based on “extrapolating,” which assumes the current rate of increase will continue indefinitely.

Is that a reasonable assumption? Do you agree with the prediction? Explain.

Page 3: Relationships Between Measurements Variables

Looking at Scatterplots

We start looking at the relationship between two quantitative variables by looking at scatterplots and describing the essence of what we see.

Percentage who say they would vote for a woman president against the year of the survey

Page 4: Relationships Between Measurements Variables

Looking at Scatterplots

Max Wind Speed (in mph) vs. Central Pressure (in mb) for 163 hurricanes that hit the United States since 1851

Form of the relationship

If there is a straight line (linear) relationship, it will appear as a cloud or swarm of points stretched out in a generally consistent, straight form.

Page 5: Relationships Between Measurements Variables

Looking at Scatterplots

Strength of the relationship

At one extreme, the points appear to follow a single stream

At the other extreme, the points appear as a vague cloud with no discernable trend or pattern:

Other forms of the relationship

The relationship isn’t straight, but curves gently, while still increasing or decreasing steadily

The relationship curves sharply

Page 6: Relationships Between Measurements Variables

Measuring Strength Through Correlation

Correlation measures the strength of the linear association between two quantitative variables.

Correlation Conditions

•Correlation applies only to quantitative variables

•Correlation measures the strength only of the linear association, and will be misleading if the relationship is not linear

•Outliers can distort the correlation dramatically

Correlation Properties

The sign of a correlation coefficient gives the direction of the association.

Characterizing the relationship

•Strong

•Moderate

•Weak
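
As a rough illustration of how the correlation coefficient is computed and how its sign gives the direction of an association, here is a minimal Python sketch. The function name `pearson_r` and the car-age and price figures are assumptions made up for this example; they are not data from the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2 points)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical used-car data, invented for illustration only.
car_age = [1, 2, 3, 4, 5, 6]          # years
price = [20, 18, 15, 13, 10, 9]       # thousands of dollars

r = pearson_r(car_age, price)
print(f"r = {r:.3f}")  # the sign is negative: price tends to fall as age rises
```

The result always lies between -1 and 1. Because the formula uses deviations from the means, a single extreme point can change `r` substantially, which is why the outlier condition above matters.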

Page 7: Relationships Between Measurements Variables

Measuring Strength Through Correlation

Husbands’ and Wives’ Ages and Heights

What can you say about the strength of the relationships?

Scatterplot of British husbands’ and wives’ heights (in millimeters); r = .36

Scatterplot of British husbands’ and wives’ ages; r = .94

Page 8: Relationships Between Measurements Variables

Measuring Strength Through Correlation

A scatterplot of percentage of teens who have used other drugs vs. percentage who have used marijuana in the U.S. and 10 Western European countries is at the right.

The correlation is r = 0.934.

Describe the association between the percent of teens who have used marijuana and the percent of teens who have used other drugs.

Do these results confirm that marijuana is a “gateway drug”?

Page 9: Relationships Between Measurements Variables

Relationships Can Be Deceiving

Non-linear relationships

Example: New York City Marathon Time

Page 10: Relationships Between Measurements Variables

Relationships Can Be Deceiving

Non-linear relationships

Scatterplot of average weight by height

The correlation is r = 0.995, which suggests a very strong linear relationship, even though the scatterplot shows that the relationship is not linear.
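
To see how a clearly curved relationship can still produce a correlation close to 1, consider the small Python sketch below. The data are synthetic (y is exactly x squared), not the height and weight data from the slide, and `pearson_r` is a helper written for this example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Synthetic, perfectly curved data: y = x squared for x = 1..10.
xs = list(range(1, 11))
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print(f"r = {r:.3f}")  # close to 1 even though no straight line fits exactly
```

A high value of r therefore does not show that a relationship is linear; the scatterplot still has to be checked.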

Page 11: Relationships Between Measurements Variables

Specifying Linear Relationships with Linear Regression

Fat Versus Protein: An Example

The following is a scatterplot of total fat versus protein for 30 items on the Burger King menu

Burger King: Fat Versus Protein

Page 12: Relationships Between Measurements Variables

Specifying Linear Relationships with Regression

Fat Versus Protein: An Example

The difference between the observed value and its associated predicted value is called the residual.

residual = observed − predicted = y − ŷ

A negative residual means the predicted value is too big (an overestimate).

A positive residual means the predicted value is too small (an underestimate).
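
A short Python sketch of the residual calculation follows. It uses the fitted line quoted elsewhere in these slides (predicted fat = 6.411 + 0.976(protein)); the observed fat values are hypothetical, invented only to show one negative and one positive residual.

```python
def predicted_fat(protein_g):
    """Fitted line quoted in the slides: predicted fat = 6.411 + 0.976(protein)."""
    return 6.411 + 0.976 * protein_g

def residual(observed, predicted):
    """Residual = observed value minus predicted value."""
    return observed - predicted

# Hypothetical menu items as (protein in g, observed fat in g); values are invented.
items = [(30, 25.0), (30, 40.0)]

residuals = [residual(fat, predicted_fat(protein)) for protein, fat in items]
for (protein, fat), res in zip(items, residuals):
    kind = "overestimate" if res < 0 else "underestimate"
    print(f"protein={protein} g, observed fat={fat} g, residual={res:.3f} g ({kind})")
```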

Page 13: Relationships Between Measurements Variables

Specifying Linear Relationships with Regression

Goal: Find a straight line that comes as close as possible to the points in a scatterplot.

• Procedure to find the line is called regression.

• Resulting line is called the regression line.

• Formula that describes the line is called the regression equation.

• Most common procedure used gives the least squares regression line.

The Equation of the Line

y = a + bx

• a = intercept: the value of y where the line crosses the vertical axis, at x = 0.
• b = slope: how much y increases (or decreases, if b is negative) when x increases by one unit.
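
The least squares line can be computed directly from the data. The sketch below uses the standard closed-form formulas for simple linear regression (slope = sum of cross-products divided by sum of squared x-deviations; intercept = mean of y minus slope times mean of x). The function name `least_squares_line` and the data are assumptions for illustration.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least squares line y = a + b*x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2 points)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("slope is undefined when all x values are equal")
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # intercept
    return a, b

# Hypothetical data, invented for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = least_squares_line(xs, ys)
print(f"y = {a:.2f} + {b:.2f}x")
```

Among all straight lines, this choice of a and b makes the sum of squared residuals as small as possible, which is where the name "least squares" comes from.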

Page 14: Relationships Between Measurements Variables

Specifying Linear Relationships with Regression

Fat Versus Protein: An Example

The regression line for the Burger King data fits the data well. The equation is

predicted fat = 6.411 + 0.976(protein)

The predicted fat content for a BK Broiler chicken sandwich with 30g of protein is

6.411 + 0.976(30) = 35.691, or about 35.7 grams of fat.

For 31g of protein, the line predicts

6.411 + 0.976(31) = 36.667, or about 36.7 grams of fat.

The two predictions differ by the slope: 0.976 grams of fat for each additional gram of protein.
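
The arithmetic for the stated equation, predicted fat = 6.411 + 0.976(protein), can be checked with a few lines of Python. This is only a sketch that evaluates the equation as written; the names `INTERCEPT`, `SLOPE`, and `predicted_fat` are chosen for this example.

```python
INTERCEPT = 6.411   # grams of fat predicted at 0 g of protein
SLOPE = 0.976       # additional grams of fat per additional gram of protein

def predicted_fat(protein_g):
    """Evaluate predicted fat = 6.411 + 0.976(protein)."""
    return INTERCEPT + SLOPE * protein_g

for protein in (30, 31):
    print(f"{protein} g protein -> {predicted_fat(protein):.3f} g fat predicted")

# Raising protein by one gram changes the prediction by exactly the slope.
difference = predicted_fat(31) - predicted_fat(30)
print(f"difference = {difference:.3f} g")
```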

Page 15: Relationships Between Measurements Variables

How to Win Friends: Have a Big Amygdala? – Time Health, December 28th, 2010

Got a big social network? Then you probably have a large amygdala, according to a new study that found a connection between the size of this brain region and the number of social relationships a person has.

•The complexity of those relationships — as measured by the number of people who occupied multiple roles in a social network such as being simultaneously a friend and a co-worker — was also linked with amygdala size.

•So what does the amygdala actually do? "[It's] strongly connected with almost every other structure in brain. In the past, people assumed it was really important for fear. Then they discovered it was actually important for all emotions. And it's also important for social interaction and face recognition," Barrett says

•The research, which was published in Nature Neuroscience, found a moderate correlation between amygdala size and the number and complexity of social relationships in 58 healthy adults aged 19 to 83.

•Prior research has shown that people with autistic spectrum disorders have smaller amygdalas, which could help explain their social problems. But these studies cannot determine cause or effect — whether having a small amygdala makes socializing difficult, or whether lack of social interaction shrinks the amygdala

Page 16: Relationships Between Measurements Variables

Study: Amygdala volume and social network size in humans

We found that amygdala volume correlates with the size and complexity of social networks in adult humans.

•These findings indicate that the amygdala is important in social behavior.

•Linear regression analyses revealed that individuals with larger and more complex social networks had larger amygdala volumes

•To further investigate the specificity of the relationship between amygdala volume and social network characteristics, we conducted an exploratory analysis assessing the relationship between social network variables and all other subcortical volumes segmented by FreeSurfer.

Note: FreeSurfer is a set of automated tools for reconstruction of the brain's cortical surface from structural MRI data.

•Linear regressions revealed that none of the other subcortical regions significantly correlated with either social network variable when controlling for age and correcting for multiple comparisons.

Page 17: Relationships Between Measurements Variables

Study: Amygdala volume and social network size in humans

Figure 1a – A Closer Look

y: total number of people in social network (psn)
x: amygdala volume (av)

psn = 9 + 0.38 (av), where intercept = 9 and slope = 0.38

av = 3: psn = 9 + 0.38 (3) = 10.14

av = 4: psn = 9 + 0.38 (4) = 10.52

Page 18: Relationships Between Measurements Variables

Husbands’ and Wives’ Ages, Revisited

Scatterplot of British husbands’ and wives’ ages with regression equation: y = 3.6 + 0.97x

husband’s age = 3.6 + (.97)(wife’s age)

Intercept: has no practical meaning here, because it is the predicted age of a husband whose wife is 0 years old.

Slope: for every year of difference in two wives’ ages, there is a difference of about 0.97 years in their husbands’ ages.

Page 19: Relationships Between Measurements Variables

Danger of Extrapolation: Reaching beyond the Data

Linear models give a predicted value for each case in the data. We cannot assume that a linear relationship in the data exists beyond the range of the data.

Once we venture beyond the range of the data into unknown territory, any prediction we make is called an extrapolation.

Example

A regression of mean age at first marriage for men vs. year fit to the years from 1890 - 1998 does not hold for later years:

After 1950, linearity did not hold
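
The risk of extrapolating can be sketched numerically. In the Python example below, a line is fitted to synthetic data that actually follow a curve (y = x squared, observed only for x from 0 to 5). The data and the helper `least_squares_line` are assumptions for illustration and are unrelated to the marriage-age data on the slide.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    return mean_y - b * mean_x, b

def true_y(x):
    """The (synthetic) real relationship, which is curved."""
    return x ** 2

# Observed range of the data: x = 0..5 only.
xs = list(range(6))
ys = [true_y(x) for x in xs]
a, b = least_squares_line(xs, ys)

inside_error = abs((a + b * 4) - true_y(4))     # x = 4 lies within the data
outside_error = abs((a + b * 10) - true_y(10))  # x = 10 is an extrapolation

print(f"error at x=4:  {inside_error:.2f}")
print(f"error at x=10: {outside_error:.2f}")
```

Within the observed range the line is a tolerable summary, but at x = 10 the prediction misses badly, because nothing in the fitted data revealed how the pattern continues.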

Page 20: Relationships Between Measurements Variables

Linear Regression in Excel

Correlation: r = 0.829
Coefficient of Determination: r² = 0.683

The Coefficient of Determination, r², is the proportion of variation in Total Fat (y) that can be explained by a linear relationship with Protein (x)
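
The "proportion of variation explained" reading of r² can be verified with a small Python sketch: compute 1 minus the ratio of residual variation to total variation in y, and compare it with the square of the correlation. The protein and fat numbers below are hypothetical, not the Burger King data, and the helper names are chosen for this example.

```python
import math

def fit_and_summarize(xs, ys):
    """Return (r, r_squared) where r_squared is computed as 1 - SSE/SST."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sst = sum((y - mean_y) ** 2 for y in ys)    # total variation in y
    b = sxy / sxx                               # least squares slope
    a = mean_y - b * mean_x                     # least squares intercept
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))  # unexplained variation
    r = sxy / math.sqrt(sxx * sst)
    return r, 1 - sse / sst

# Hypothetical data, invented for illustration only.
protein = [10, 15, 20, 25, 30]   # grams (x)
fat = [14, 25, 22, 35, 33]       # grams (y)

r, r_squared = fit_and_summarize(protein, fat)
print(f"r = {r:.3f}, r squared = {r_squared:.3f}")
```

For a least squares line with one predictor, the two calculations agree: the value from 1 - SSE/SST equals r multiplied by itself.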

Page 21: Relationships Between Measurements Variables

Text Questions

Scatterplot of British husbands’ and wives’ heights (in millimeters); r = .36

Page 22: Relationships Between Measurements Variables

Text Questions