Chapter Fourteen Examining Associations: Correlation and Regression

Post on 20-Dec-2015


Page 1: Chapter Fourteen Examining Associations: Correlation and Regression

Chapter Fourteen

Examining Associations: Correlation and Regression

Page 2: Chapter Fourteen Examining Associations: Correlation and Regression

Did You Know that Degree, Color and Race Make a Difference in Home Refinancing?

• A study found that borrowers without a college degree paid $1,472 more in broker fees than those with a college degree

– Not only did a degree matter, but race was also a factor. African Americans on average paid $500 more than whites, and Hispanics $275 more than whites.

– Regression analysis was used to determine whether various borrower characteristics had a bearing on the amount of broker fees and closing costs paid.

Page 3: Chapter Fourteen Examining Associations: Correlation and Regression

Did You Know that Disaster Area Declarations are Related to Electoral Votes?

• A study revealed states that had been declared disaster areas are crucial to presidential elections

• Regression analysis revealed that states that are likely to be declared disaster areas were the states that were highest in electoral votes

Page 4: Chapter Fourteen Examining Associations: Correlation and Regression

Did You Know that the Presence of an NFL Team Boosts Rental Costs?

• Regression analysis has revealed that in cities with an NFL team, rental costs for apartments in the central city area were 8 percent higher than in cities without an NFL team

• Property tax receipts were also found to be higher in cities with NFL teams

Page 5: Chapter Fourteen Examining Associations: Correlation and Regression

Overview of Techniques for Examining Associations

• Spearman Correlation Coefficient Technique

• The technique is appropriate when

– The degree of association between two sets of ranks (pertaining to two variables) is to be examined

• Illustrative research question(s) this technique can answer

– Is there a significant relationship between motivation levels of salespeople and the quality of their performance?

• Assume that the data on motivation and quality of performance are in the form of ranks, say, 1 through 20, for 20 salespeople who were evaluated subjectively by their supervisor on each variable

Page 6: Chapter Fourteen Examining Associations: Correlation and Regression

Overview of Techniques for Examining Associations (Cont’d)

• Pearson Correlation Coefficient Technique

• This technique is appropriate when

– The degree of association between two metric-scaled (interval or ratio) variables is to be examined

• Illustrative research question(s) this technique can answer

– Is there a significant relationship between customers' age (measured in actual years) and their perceptions of our company's image (measured on a scale of 1 to 7)?

Page 7: Chapter Fourteen Examining Associations: Correlation and Regression

Overview of Techniques for Examining Associations (Cont’d)

• Simple Regression Analysis Technique

– This technique is appropriate when

• A mathematical function or equation linking two metric-scaled (interval or ratio) variables is to be constructed, under the assumption that values of one of the two variables are dependent on the values of the other

Page 8: Chapter Fourteen Examining Associations: Correlation and Regression

Overview of Techniques for Examining Associations – Simple Regression Analysis (Cont’d)

• Illustrative Research Question(s) this Technique Can Answer

– Are sales (measured in dollars) significantly affected by advertising expenditures (measured in dollars)?

– What proportion of the variation in sales is accounted for by variation in advertising expenditures? How sensitive are sales to changes in advertising expenditures?

Page 9: Chapter Fourteen Examining Associations: Correlation and Regression

Overview of Techniques for Examining Associations (Cont’d)

• Multiple Regression Analysis Technique

– This technique is appropriate

• Under the same conditions as simple regression analysis, except that more than two variables are involved, wherein one variable is assumed to be dependent on the others

Page 10: Chapter Fourteen Examining Associations: Correlation and Regression

Overview of Techniques for Examining Associations (Cont’d)

• Illustrative Research Question(s) this Technique Can Answer

– Are sales significantly affected by advertising expenditures and price (where all three variables are measured in dollars)?

– What proportion of the variation in sales is accounted for by advertising and price? How sensitive are sales to changes in advertising and price?

Page 11: Chapter Fourteen Examining Associations: Correlation and Regression

Spearman Correlation Coefficient

A Spearman correlation coefficient is a measure of association between two sets of ranks

$$r_s = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}$$

d_i = the difference between the ith sample unit's ranks on the two variables

n = the total sample size
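The rank-difference formula above can be sketched in a few lines of Python. This is a minimal illustration, not the chapter's own procedure: the function name and the five-person rank lists are hypothetical, and the shortcut formula is assumed to be applied to ranks without ties.

```python
def spearman_from_ranks(ranks_x, ranks_y):
    """Spearman r_s from two equal-length lists of ranks.

    Uses the rank-difference formula r_s = 1 - 6*sum(d_i^2) / (n*(n^2 - 1)),
    which assumes there are no tied ranks.
    """
    if len(ranks_x) != len(ranks_y):
        raise ValueError("both rank lists must have the same length")
    n = len(ranks_x)
    if n < 2:
        raise ValueError("need at least two sample units")
    # d_i is the difference between unit i's ranks on the two variables
    sum_d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks_x, ranks_y))
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))


# Hypothetical ranks for five salespeople (not data from the chapter)
motivation_ranks = [1, 2, 3, 4, 5]
performance_ranks = [2, 1, 4, 3, 5]
r_s = spearman_from_ranks(motivation_ranks, performance_ranks)
print(f"r_s = {r_s:.3f}")
```

Identical rankings give r_s = 1 and exactly reversed rankings give r_s = -1, which is a quick sanity check on any implementation.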

Page 12: Chapter Fourteen Examining Associations: Correlation and Regression

Example: Industrial Marketing Firm

• An industrial marketing firm has been hiring all its salespeople from among the graduates of 10 business schools in the vicinity of its headquarters

• The firm developed a subjective ranking of the perceived prestige levels of the 10 schools and the performance levels of the groups of graduates recruited from these schools

• Question

– What is the degree of association between the prestige levels of the schools and the sales performance levels of their graduates hired by this company?

Page 13: Chapter Fourteen Examining Associations: Correlation and Regression

Table 14.2 Association Between School Prestige and Performance of Graduates

Page 14: Chapter Fourteen Examining Associations: Correlation and Regression

Results

• First step is to calculate the Spearman correlation coefficient:

$$r_s = 1 - \frac{(6)(56)}{10(100 - 1)} = 1 - .339 = .661$$

• The result is r_s = .661

• The next step is to calculate the t-statistic

Page 15: Chapter Fourteen Examining Associations: Correlation and Regression

Spearman Correlation Coefficient

$$r_s = 1 - \frac{(6)(56)}{10(100 - 1)} = .661$$

Hypotheses

H0: ρ_s = 0

Ha: ρ_s ≠ 0

Page 16: Chapter Fourteen Examining Associations: Correlation and Regression

t-Distribution

$$t = r_s \sqrt{\frac{n - 2}{1 - r_s^2}} = 2.49$$

• For α = .05,

– t for 8 degrees of freedom (d.f. = n - 2 = 10 - 2 = 8)

– t_c = +2.31 and -2.31

• Decision Rule:

– “Reject H0 if t ≥ 2.31 or if t ≤ -2.31.”

– Since t > 2.31, we reject H0 and conclude that there is a true association between the prestige of business schools and the job performance of their graduates. In other words, the sample correlation of .661 is unlikely to have occurred because of chance.
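The test on this slide can be reproduced with a short Python sketch. The values r_s = .661, n = 10, and the critical value 2.31 come from the slides; the function names are illustrative assumptions, and the critical value is typed in by hand rather than looked up from a t-table.

```python
import math


def spearman_t(r_s, n):
    """t-statistic for a Spearman correlation: r_s * sqrt((n - 2) / (1 - r_s^2))."""
    return r_s * math.sqrt((n - 2) / (1 - r_s ** 2))


def reject_h0(t, t_critical):
    """Two-tailed decision rule: reject H0 if t >= t_c or t <= -t_c."""
    return abs(t) >= t_critical


# Values reported on the slides: r_s = .661, n = 10, t_c = 2.31 at alpha = .05
t = spearman_t(0.661, 10)
print(f"t = {t:.2f}, reject H0: {reject_h0(t, 2.31)}")
```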

Page 17: Chapter Fourteen Examining Associations: Correlation and Regression

Pearson Correlation Coefficient

The Pearson correlation coefficient measures the degree of association between variables that are interval- or ratio-scaled.

For two variables X and Y, the Pearson correlation coefficient (r_xy) between them is given by

$$r_{xy} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{(n - 1)\, s_x s_y}$$

n = sample size (total number of data points)

X̄ and Ȳ = means

X_i and Y_i = values for any sample unit i

s_x and s_y = standard deviations
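As a rough illustration of the Pearson formula, the sketch below computes r_xy with Python's standard library. The function name and the age and image-rating values are made up for the example; `statistics.stdev` is used because it returns the sample standard deviation (n - 1 denominator) that the formula calls for.

```python
import statistics


def pearson_r(xs, ys):
    """Pearson r_xy = sum((X_i - mean_x)(Y_i - mean_y)) / ((n - 1) * s_x * s_y)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least two points")
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    s_x = statistics.stdev(xs)  # sample standard deviation (n - 1 denominator)
    s_y = statistics.stdev(ys)
    cross_products = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cross_products / ((n - 1) * s_x * s_y)


# Hypothetical data: customer ages and image ratings on a 1-to-7 scale
ages = [23, 35, 41, 52, 60]
image_ratings = [3, 4, 4, 6, 7]
print(f"r_xy = {pearson_r(ages, image_ratings):.3f}")
```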

Page 18: Chapter Fourteen Examining Associations: Correlation and Regression

Table 14.3 Bright Detergent Data

Page 19: Chapter Fourteen Examining Associations: Correlation and Regression

Scatter Diagram

• Plot in a two-dimensional graph

• Indicates how closely and in what fashion the variables are associated

Page 20: Chapter Fourteen Examining Associations: Correlation and Regression

Exhibit 14.1 Scatter Diagram of Sales and Advertising Data

• What is the relationship between dollar sales and advertising expenditure?

Page 21: Chapter Fourteen Examining Associations: Correlation and Regression

Exhibit 14.2 Scatter Diagram of Sales and Number of Competing Brands

• What is the relationship between dollar sales and number of competing detergents?

Page 22: Chapter Fourteen Examining Associations: Correlation and Regression

Pearson Correlation

• Correlation between sales and advertising is .927

• Correlation between sales and number of competing brands is .910

Page 23: Chapter Fourteen Examining Associations: Correlation and Regression

Two-Tailed Hypothesis Test For Correlations

• H0: ρ = 0;

• Ha: ρ ≠ 0,

• For α = .05,

– 19 degrees of freedom (d.f. = n - 1 = 19)

– r_c = +.433 and r_c = -.433

• Decision rule is:

– “Reject H0 if r ≥ .433 or if r ≤ -.433.”

• Reject H0 in both cases

Page 24: Chapter Fourteen Examining Associations: Correlation and Regression

Exhibit 14.3 Scatter Diagram Showing a Nonlinear Association Between Variables

Page 25: Chapter Fourteen Examining Associations: Correlation and Regression

National Insurance Company – Computing Pearson Correlation Among Service Quality Constructs

• National Insurance Company was interested in the correlations between respondents’ overall service-quality perceptions (on the 10-point scale) and their average ratings along each of the five dimensions of service quality

Page 26: Chapter Fourteen Examining Associations: Correlation and Regression

National Insurance Company – Computing Pearson Correlation Among Service Quality Constructs (Cont’d)

1. Click ANALYZE

2. Select CORRELATE

3. Select BIVARIATE

4. Move “oq, reliable, empathy, tangible, response, and assure” to VARIABLES box

5. Click OK

Page 27: Chapter Fourteen Examining Associations: Correlation and Regression

National Insurance Company– Computing Pearson Correlation Among Service Quality Constructs (Cont’d)

Page 28: Chapter Fourteen Examining Associations: Correlation and Regression

National Insurance Company– Computing Pearson Correlation Among Service Quality Constructs Using SPSS

Page 29: Chapter Fourteen Examining Associations: Correlation and Regression

Interpreting Pearson Correlation Coefficients

• Each of the five service-quality measures (reliability, empathy, tangibles, responsiveness, and assurance) is significantly related to the overall quality (OQ) at the .001 level of significance

• Responsiveness has the strongest correlation (.8625)

• Tangibles have the weakest correlation (.5038)

• All the correlations are strong enough to be meaningful

Page 30: Chapter Fourteen Examining Associations: Correlation and Regression

Simple Regression Analysis

• Generates a mathematical relationship (called the regression equation) between one variable designated as the dependent variable (Y) and another designated as the independent variable (X)

Page 31: Chapter Fourteen Examining Associations: Correlation and Regression

Independent Variable vs. Dependent Variable

• Independent variable

– Explanatory or predictor variable

– Often presumed to be a cause of the other

• Dependent variable

– Criterion variable

– Influenced by the independent variable

Page 32: Chapter Fourteen Examining Associations: Correlation and Regression

Scenario: Curtis Construction Industry Lobbyist

• Curtis, a construction industry lobbyist, is in an area of the country that has a high unemployment rate and a number of economically depressed construction projects

• His current charge is to convince local government officials to vote in favor of several tax concessions for the construction industry

• He is wondering whether he can generate any concrete evidence to show that increased construction activity (presumably spurred by the proposed tax concessions) would greatly benefit the state

Page 33: Chapter Fourteen Examining Associations: Correlation and Regression

Scenario: Curtis Construction Industry Lobbyist (Cont’d)

• Possible Dependent Variable

– Number of people unemployed or the unemployment rate

– Data on this variable may be gathered from a sample of areas from around the country

• Possible Independent Variable

– Number of construction permits issued or number of ongoing construction projects

– Data on this variable should be gathered from the same sample

Page 34: Chapter Fourteen Examining Associations: Correlation and Regression

Scenario: Carol, Chief Librarian

• Carol, chief librarian in a major university, is eager to increase the number of students borrowing books from the library as well as the number of books borrowed per student

• She needs some persuasive evidence to show how increased borrowing of books might benefit students

Page 35: Chapter Fourteen Examining Associations: Correlation and Regression

Scenario: Carol, Chief Librarian (Cont’d)

• Possible Dependent Variable

– Cumulative grade point ratio

– Data on this variable should be gathered for a sample of students who have borrowed books in the past

• Possible Independent Variable

– Number of books borrowed

– Assuming that the library has records of the books borrowed by students, data on this variable can be obtained from those records for the same sample of students

Page 36: Chapter Fourteen Examining Associations: Correlation and Regression

Scenario: Jack, Trade Show Officer

• Jack, an officer in an association in charge of putting together and promoting industrial trade shows, is wondering about the impact of the number of exhibitors in a trade show on trade show attendance

Page 37: Chapter Fourteen Examining Associations: Correlation and Regression

Scenario: Jack, Trade Show Officer (Cont’d)

• Possible Dependent Variable

– Number of people visiting a trade show

– Data on this variable can be obtained for a representative sample of trade shows from the association’s past records

• Possible Independent Variable

– Number of exhibitors in a trade show

– Necessary data can be obtained from the past records

Page 38: Chapter Fourteen Examining Associations: Correlation and Regression

Deriving a Regression Equation

• Y = a + bX, where a and b are constants

• Y -> Dependent Variable

• X -> Independent Variable
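One common way to obtain the constants a and b is the least-squares criterion, which picks the line that minimizes the sum of squared vertical distances from the data points. The slide does not name the estimation method, so least squares is an assumption here, and the function name and data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares estimates (a, b) for the line Y = a + bX."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of X, and sum of cross-products of deviations
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if s_xx == 0:
        raise ValueError("X has no variation, so the slope is undefined")
    b = s_xy / s_xx          # slope (regression coefficient)
    a = mean_y - b * mean_x  # Y-intercept
    return a, b


# Hypothetical advertising (X) and sales (Y) figures
advertising = [10, 12, 15, 18, 20]
sales = [25, 27, 31, 34, 37]
a, b = fit_line(advertising, sales)
print(f"Y = {a:.3f} + {b:.3f} X")
```

The slope b is the quantity that later slides interpret as the sensitivity of the dependent variable to the independent variable.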

Page 39: Chapter Fourteen Examining Associations: Correlation and Regression

Exhibit 14.4 Several Subjectively Constructed Regression Lines

Page 40: Chapter Fourteen Examining Associations: Correlation and Regression

Regression Using SPSS –Sales and Advertising Data

1. Click ANALYZE

2. Select REGRESSION

3. Click LINEAR

4. Move “Dollar Sales for Bright” to DEPENDENT box

5. Move “advertising expenditures for Bright” to INDEPENDENT(S) box

6. Click OK

Page 41: Chapter Fourteen Examining Associations: Correlation and Regression

Exhibit 14.5 SPSS Computer Output for Simple Regression Analysis of Sales and Advertising Data

Page 42: Chapter Fourteen Examining Associations: Correlation and Regression

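Standard Error

$$s_{y/x} = \sqrt{\frac{SSE}{n - k - 1}}$$

• SSE is the error (residual) sum of squares, n is the sample size, and k is the number of independent variables

• The value of the standard error (s_{y/x}) is shown in the computer output as 2.277, which is the square root of the error mean square value of 5.186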
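The standard error calculation can be checked with a small Python sketch. The error mean square of 5.186 is from the slide. The sample size n = 20 and the SSE value are assumptions made for illustration: n is inferred from the 19 degrees of freedom reported elsewhere in the chapter, and SSE is back-calculated from the error mean square rather than read from the SPSS output.

```python
import math


def standard_error(sse, n, k):
    """Standard error of the estimate: sqrt(SSE / (n - k - 1))."""
    degrees_of_freedom = n - k - 1
    if degrees_of_freedom <= 0:
        raise ValueError("need more data points than estimated parameters")
    return math.sqrt(sse / degrees_of_freedom)


# Assumed inputs: n = 20 data points, k = 1 independent variable.
n, k = 20, 1
error_mean_square = 5.186            # reported on the slide
sse = error_mean_square * (n - k - 1)  # back-calculated, not from the output
print(f"s_y/x = {standard_error(sse, n, k):.3f}")
```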

Page 43: Chapter Fourteen Examining Associations: Correlation and Regression

Practical Applications of Regression Equations

• The regression coefficient, or slope, can indicate how sensitive the dependent variable is to changes in the independent variable

• The regression equation is a forecasting tool for predicting the value of the dependent variable for a given value of the independent variable

Page 44: Chapter Fourteen Examining Associations: Correlation and Regression

Precautions In Using Regression Analysis

• Only capable of capturing linear associations between dependent and independent variables

• A significant R2-value does not necessarily imply a cause-and-effect association between the independent and dependent variables

• A regression equation may not yield a trustworthy prediction of the dependent variable when the value of the independent variable at which the prediction is desired is outside the range of values used in constructing the equation

Page 45: Chapter Fourteen Examining Associations: Correlation and Regression

Precautions In Using Regression Analysis (Cont’d)

• A regression equation based on relatively few data points cannot be trusted

• The ranges of data on the dependent and independent variables can affect the meaningfulness of a regression equation

Page 46: Chapter Fourteen Examining Associations: Correlation and Regression

Multiple Regression Analysis

• Yi = a + b1X1i + b2X2i + … + bkXki

• Yi is the predicted value of the dependent variable for some unit i;

• X1i, X2i, …, Xki are values on the independent variables for unit i;

• b1, b2, . . . , bk are the regression coefficients;

• a is the Y-intercept representing the prediction for Y when all independent variables are set to zero

Page 47: Chapter Fourteen Examining Associations: Correlation and Regression

National Insurance Company – Multiple Regression Using SPSS

• Jill and Tom were interested in conducting a multiple regression analysis wherein overall service quality perceptions are the dependent variable and the average ratings along the five dimensions are the independent variables

Page 48: Chapter Fourteen Examining Associations: Correlation and Regression

National Insurance Company – Multiple Regression Using SPSS (Cont’d)

1. Click ANALYZE

2. Select REGRESSION

3. Click LINEAR

4. Move “OQ” to DEPENDENT box

5. Move “reliable, empathy, tangible, response, and assure” to INDEPENDENT(S) box

6. Click OK

Page 49: Chapter Fourteen Examining Associations: Correlation and Regression

National Insurance Company– Multiple Regression Using SPSS (Cont’d)

Page 50: Chapter Fourteen Examining Associations: Correlation and Regression

The R-square of .810 indicates a strong relationship: the five service-quality dimensions together account for 81 percent of the variation in overall quality.

National Insurance Company– Multiple Regression Using SPSS (Cont’d)

Page 51: Chapter Fourteen Examining Associations: Correlation and Regression

National Insurance Company– Multiple Regression Using SPSS (Cont’d)

All variables except empathy are significantly related to overall service quality (as indicated by the t-test of significance in the far right column)

Page 52: Chapter Fourteen Examining Associations: Correlation and Regression

Bright Detergent Case – Multiple Regression Using SPSS

1. Click ANALYZE

2. Select REGRESSION

3. Click LINEAR

4. Move “Dollar Sales for Bright” to DEPENDENT box

5. Move “advertising expenditures for Bright and Number of competing Brands” to INDEPENDENT(S) box

6. Click OK

Page 53: Chapter Fourteen Examining Associations: Correlation and Regression

Bright Detergent Case – Multiple Regression Using SPSS (Cont’d)

Model Summary

Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .934a   .873       .858                2.23

a. Predictors: (Constant), Number of Competing Detergents, Advertising Expenditures for Bright ($ in 100)

ANOVAb

Model 1      Sum of Squares   df   Mean Square   F        Sig.
Regression   580.373          2    290.187       58.293   .000a
Residual     84.627           17   4.978
Total        665.000          19

a. Predictors: (Constant), Number of Competing Detergents, Advertising Expenditures for Bright ($ in 100)

b. Dependent Variable: Dollar Sales of Bright ($ in Thousands)
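The summary figures in this output are linked by a few standard identities, which the sketch below uses to recompute them from the sums of squares. The sums of squares come from the ANOVA table; n = 20 follows from the total degrees of freedom (n - 1 = 19) and k = 2 from the two predictors. The function name is a hypothetical helper, not part of SPSS.

```python
import math


def regression_summary(ss_regression, ss_residual, n, k):
    """Recompute R-square, adjusted R-square, F, and the standard error."""
    ss_total = ss_regression + ss_residual
    r_square = ss_regression / ss_total
    adjusted_r_square = 1 - (1 - r_square) * (n - 1) / (n - k - 1)
    mean_square_regression = ss_regression / k
    mean_square_residual = ss_residual / (n - k - 1)
    return {
        "r_square": r_square,
        "adjusted_r_square": adjusted_r_square,
        "f": mean_square_regression / mean_square_residual,
        "std_error": math.sqrt(mean_square_residual),
    }


# Sums of squares from the ANOVA table; n and k as described above
summary = regression_summary(580.373, 84.627, n=20, k=2)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Rounded to the precision shown in the output, the recomputed values agree with the Model Summary and ANOVA tables.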

Page 54: Chapter Fourteen Examining Associations: Correlation and Regression

Coefficientsa

                                                 Unstandardized Coefficients   Standardized Coefficients
Model 1                                          B        Std. Error           Beta                        t        Sig.
(Constant)                                       8.854    6.717                                            1.318    .205
Advertising Expenditures for Bright ($ in 100)   .808     .324                 .619                        2.496    .023
Number of Competing Detergents                   -.498    .376                 -.328                       -1.324   .203

a. Dependent Variable: Dollar Sales of Bright ($ in Thousands)

Bright Detergent Case – Multiple Regression Using SPSS (Cont’d)
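To show how the coefficients translate into a forecast, the sketch below plugs the unstandardized B values from the Coefficients table (8.854, .808, -.498) into the multiple regression equation. The input values, 10 units of advertising (in $100) and 5 competing detergents, are hypothetical and chosen only for illustration. The function name is also an assumption.

```python
def predict_sales(advertising, competitors, a=8.854, b1=0.808, b2=-0.498):
    """Predicted sales of Bright ($ in thousands).

    advertising: advertising expenditures in units of $100
    competitors: number of competing detergents
    Default coefficients are the unstandardized B values from the output.
    """
    return a + b1 * advertising + b2 * competitors


# Hypothetical scenario, not a row from Table 14.3
baseline = predict_sales(advertising=10, competitors=5)
more_ads = predict_sales(advertising=11, competitors=5)
print(f"predicted sales: {baseline:.3f}")
print(f"change for one more unit of advertising: {more_ads - baseline:.3f}")
```

The difference between the two predictions equals the advertising coefficient, which is the sense in which a slope measures sensitivity. Note that the competing-detergents coefficient is not significant in this output (Sig. = .203), so its contribution to any forecast should be read with caution.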

Page 55: Chapter Fourteen Examining Associations: Correlation and Regression

Multicollinearity

• Multicollinearity exists when independent variables in a multiple regression equation are highly correlated among themselves
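A simple first check for multicollinearity is to compute the pairwise correlations among the independent variables and flag any pair that is highly correlated. The sketch below does this in plain Python. The predictor values, the helper names, and the 0.8 cutoff are all assumptions for illustration. The chapter does not prescribe a specific threshold.

```python
import statistics
from itertools import combinations


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length samples."""
    n = len(xs)
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cross / ((n - 1) * statistics.stdev(xs) * statistics.stdev(ys))


def flag_collinear_pairs(predictors, threshold=0.8):
    """Return (name_a, name_b, r) for predictor pairs with |r| >= threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(predictors.items(), 2):
        r = pearson_r(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Hypothetical predictor data (not Table 14.3)
predictors = {
    "advertising": [1, 2, 3, 4, 5],
    "competitors": [10, 8, 7, 4, 2],
    "price": [3, 1, 4, 1, 5],
}
for name_a, name_b, r in flag_collinear_pairs(predictors):
    print(f"{name_a} and {name_b} are highly correlated (r = {r:.3f})")
```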

Page 56: Chapter Fourteen Examining Associations: Correlation and Regression

Exhibit 14.9 SPSS Output for a Pair-wise Correlation Analysis of the Data in Table 14.3