Statistics and Quantitative Analysis U4320
Lecture 11: Path Diagrams
Prof. Sharyn O'Halloran
URL: http://www.columbia.edu/itc/sipa/U4320y-003/
Copyright Sharyn O'Halloran



Key Points
  Review
  Regression in Excel
  Slope Coefficient as a Multiplication Factor
  Path Diagram and Causal Models
  Direct and Indirect Effects

Regression in Excel
Example:
  Manatees are large, gentle sea creatures that live along the Florida coast.
  Many manatees are killed or injured by powerboats each year.
  The US Fish and Wildlife Service conducted a study of the relationship between powerboat registration permits and the number of manatees killed.
  Number of Powerboats -> Manatee Deaths

Regression in Excel
These are the data collected:

Year   Powerboat registration (1000)   Manatees killed
1977   447                             13
1978   460                             21
1979   481                             24
1980   498                             16
1981   513                             24
1982   512                             20
1983   526                             15
1984   559                             34
1985   585                             33
1986   614                             33
1987   645                             39
1988   675                             43
1989   711                             50
1990   719                             47
1991   716                             53
1992   716                             38
1993   716                             35
1994   735                             49

Descriptive Statistics     Powerboat registration (1000)   Manatees killed
Mean                       601.56                          32.61
Standard Error             24.46                           3.02
Median                     599.50                          33.50
Mode                       716.00                          24.00
Standard Deviation         103.79                          12.82
Sample Variance            10773.32                        164.25
Range                      288.00                          40.00
Minimum                    447.00                          13.00
Maximum                    735.00                          53.00
Sum                        10828.00                        587.00
Count                      18.00                           18.00
Confidence Level (95.0%)   51.62                           6.37

Does the number of registered powerboats increase the number of manatees killed?

Regression in Excel (cont.)
Graph Data:
[Figure: scatter plot "Relation between Powerboat Registration (1000) and Manatee Deaths"; axes: Registration and Manatees Killed]

                                Coefficients   Standard Error   t Stat   P-value
Intercept                       -35.18         7.70             -4.57    0.000314
Powerboat registration (1000)   0.11           0.01             8.93     0.000000

\[ \hat{Y} = \underset{(-4.57)}{-35.18^{*}} + \underset{(8.93)}{0.11^{*}}\,X \]
Note: t-statistics in parentheses. * indicates p-value < 0.05.

For each additional 1000 powerboats registered, we expect an increase of .11 manatee deaths.
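The Excel coefficients can be checked by hand with the usual least-squares formulas. The following is a minimal sketch in plain Python, not the lecture's own procedure; the function name `fit_line` is an illustrative assumption, and the data are transcribed from the table in the lecture.

```python
# Manatee data transcribed from the lecture's table.
registrations = [447, 460, 481, 498, 513, 512, 526, 559, 585,
                 614, 645, 675, 711, 719, 716, 716, 716, 735]  # thousands of boats
deaths = [13, 21, 24, 16, 24, 20, 15, 34, 33,
          33, 39, 43, 50, 47, 53, 38, 35, 49]

def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x.

    Hypothetical helper for illustration; it uses the deviation-from-mean
    formulas b = sum(x*y) / sum(x^2), a = y_bar - b * x_bar.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope

intercept, slope = fit_line(registrations, deaths)
print(f"intercept = {intercept:.2f}, slope = {slope:.4f}")
```

Up to rounding, the result should agree with the Excel output quoted in the lecture (intercept -35.18, slope 0.11).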

Regression in Excel (cont.)
Hypothesis Testing
  \( H_0: \beta = 0 \)
  \( H_a: \beta \neq 0 \)
Calculate a 95% Confidence Interval
  \[ \beta = b \pm t_{.025} \cdot SE_b \qquad (n - k = 16 \text{ degrees of freedom}) \]
  \[ \beta = 0.11 \pm 2.12 \times 0.01 \]
  \[ \beta = 0.11 \pm 0.0212 \]
  \[ 0.0888 \le \beta \le 0.1312 \]
Reject or Fail to Reject the Null Hypothesis
  The interval does not contain 0. Therefore, we reject the null hypothesis that \( \beta = 0 \) in favor of the alternative that it is not equal to 0.
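As a rough check on the interval, the slope's standard error can be computed directly from the data instead of from the rounded Excel figures. This is a sketch under stated assumptions: the critical value 2.12 is the lecture's \( t_{.025} \) for 16 degrees of freedom, hard-coded here, and the variable names are illustrative only.

```python
import math

# Manatee data transcribed from the lecture's table.
registrations = [447, 460, 481, 498, 513, 512, 526, 559, 585,
                 614, 645, 675, 711, 719, 716, 716, 716, 735]
deaths = [13, 21, 24, 16, 24, 20, 15, 34, 33,
          33, 39, 43, 50, 47, 53, 38, 35, 49]
T_CRITICAL = 2.12  # t_.025 with n - k = 16 degrees of freedom (value used in the lecture)

n = len(registrations)
x_bar = sum(registrations) / n
y_bar = sum(deaths) / n
s_xx = sum((x - x_bar) ** 2 for x in registrations)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(registrations, deaths))
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# Residual variance uses n - 2 degrees of freedom; SE_b = sqrt(s^2 / sum(x^2)).
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(registrations, deaths))
se_slope = math.sqrt(sse / (n - 2) / s_xx)

lower = slope - T_CRITICAL * se_slope
upper = slope + T_CRITICAL * se_slope
reject_null = not (lower <= 0 <= upper)
print(f"95% CI for the slope: ({lower:.4f}, {upper:.4f}); reject H0: {reject_null}")
```

Because the unrounded slope and standard error are used, the endpoints differ slightly from the hand calculation with 0.11 and 0.01, but the conclusion is the same: zero lies outside the interval.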

Regression in Excel (cont.)
Interpretation
  Is the count of registered powerboats related to the number of manatees killed?
    Yes, each additional 1000 powerboat registrations is associated with an additional .11 manatee deaths.
  If the Fish and Wildlife Service limited powerboat registration to 700 (thousand), what would be the expected number of manatee deaths?
    \[ \hat{Y} = -35.18 + 0.11(700) = 41.82 \]
  What if no powerboats were allowed?
    \[ \hat{Y} = -35.18 + 0.11(0) = -35.18 \]
    A negative number of deaths is not meaningful: X = 0 lies far outside the observed range (447 to 735), so this prediction is an extrapolation.

Regression in Excel (cont.)
Policy Prescriptions?
  What are some of the costs associated with limiting powerboat registration?
  What are some of the benefits?
  Should powerboats be prohibited?
  If we want to maintain the current population of manatees, what level of registration would be allowed?
  What additional data would be necessary to make this decision?

Interpretation: Regression Coefficients as Multiplication Factors
Simple Regression
  Basic Equation
    Remember our basic one-variable regression equation is:
    \[ Y = a + bX \]
    b is the slope of the regression line.
    It represents the change in Y corresponding to a unit change in X.

Interpretation: Regression Coefficients as Multiplication Factors (cont.)
Multiplication Factor
  We can also think of b as a multiplication factor:
  \[ \Delta Y = b\,\Delta X \]
Example
  Take the first fertilizer equation:
  \[ Y = 36 + .06X \]
  Say we add 5 more pounds of fertilizer. Then the change in yield according to this equation will be:
  \[ \Delta Y = b\,\Delta X = .06(5) = .30 \text{ bushels} \]

Interpretation: Regression Coefficients as Multiplication Factors (cont.)
Multiple Regression: "Other Things Being Equal"
  Now consider the multiple regression equation:
  \[ Y = b_0 + b_1 X_1 + b_2 X_2 \]
  We can still think of the slopes as multiplication factors.
  But now they are multiplication factors if we change only one variable and keep all others constant.

Interpretation: Regression Coefficients as Multiplication Factors (cont.)
Say we change \( X_1 \) to \( (X_1 + \Delta X_1) \). Then we can write:
\[
\begin{aligned}
Y_{\text{(new)}} &= b_0 + b_1 (X_1 + \Delta X_1) + b_2 X_2 \\
Y_{\text{(initial)}} &= b_0 + b_1 X_1 + b_2 X_2 \\
\Delta Y &= b_1\,\Delta X_1
\end{aligned}
\]
If \( X_1 \) changes while all others remain constant, then
  change in Y = b1 (change in X1)

Interpretation: Regression Coefficients as Multiplication Factors (cont.)
Examples
  Let's try an example. Say we have the following single and multiple regression equations:
    YIELD = 36 + .059 FERTILIZER          (bivariate: gives the total effect of fertilizer)
    YIELD = 30 + 1.50 RAIN                (bivariate: gives the total effect of rain)
    YIELD = 28 + .038 FERT + .83 RAIN     (multivariate: gives the partial effects of fertilizer and rain)

Interpretation: Regression Coefficients as Multiplication Factors (cont.)
What is the change in yield if a farmer adds another 100 pounds of fertilizer?
Answer:
  Only the fertilizer will change, not the rain.
  So use the multiple regression equation:
  \[ \Delta Y = b_1\,\Delta X_1 = 100\,(.038) = 3.8 \text{ bushels} \]

Interpretation: Regression Coefficients as Multiplication Factors (cont.)
What is the expected change in yield if a farmer irrigates his fields with 3 inches of water?
Answer:
  Only the amount of water will change, not the fertilizer.
  So use the multiple regression equation:
  \[ \Delta Y = b_2\,\Delta X_2 = 3\,(.83) \approx 2.5 \text{ bushels} \]

Interpretation: Regression Coefficients as Multiplication Factors (cont.)
Say the farmer adds both 100 pounds of fertilizer and 3 inches of irrigation. Now what will the difference in yield be?
Answer:
  The change in yield will reflect the changes in both independent variables:
  \[
  \begin{aligned}
  \Delta Y &= b_1\,\Delta X_1 + b_2\,\Delta X_2 \\
           &= (0.038)(100) + (0.83)(3) \\
           &= 3.8 + 2.5 \\
           &= 6.3 \text{ bushels}
  \end{aligned}
  \]

Interpretation: Regression Coefficients as Multiplication Factors (cont.)
Now, say rainfall has increased 3 inches and we know that fertilizer is not held constant.
What would your best guess be as to the difference in yield?
Answer:
  Since fertilizer is not held constant, we should use the single regression equation:
  \[ \Delta Y = b\,\Delta X = 3\,(1.5) = 4.5 \text{ bushels} \]
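The four yield questions above all apply the same multiplication-factor rule, only with different coefficients. A small sketch, assuming nothing beyond the coefficients quoted in the lecture (the function and constant names are hypothetical):

```python
# Coefficients quoted in the lecture's yield equations.
PARTIAL_FERT = 0.038   # bushels per pound of fertilizer, rain held constant
PARTIAL_RAIN = 0.83    # bushels per inch of rain, fertilizer held constant
SIMPLE_RAIN = 1.50     # bushels per inch of rain, fertilizer NOT held constant

def change_in_yield(d_fert=0.0, d_rain=0.0):
    """Change in yield from the multiple regression: b1*dX1 + b2*dX2."""
    return PARTIAL_FERT * d_fert + PARTIAL_RAIN * d_rain

fert_only = change_in_yield(d_fert=100)        # 100 more pounds of fertilizer
rain_only = change_in_yield(d_rain=3)          # 3 more inches of water
both = change_in_yield(d_fert=100, d_rain=3)   # both changes together
rain_total = SIMPLE_RAIN * 3                   # rain changes, fertilizer free to vary

print(fert_only, rain_only, both, rain_total)
```

The unrounded irrigation effect is 2.49 bushels, which the lecture rounds to 2.5 (and the combined effect to 6.3).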

Path Analysis
Purpose:
  Develop a technique that allows us to disaggregate the effects caused directly by the increase in rainfall and indirectly by other factors.
  This is known as Path Analysis.
A path analysis is an ordered causal system that describes the effect that a change in X produces on Y.
This can be direct:
  Employment is a cause of earnings: people who get (lose) jobs increase (decrease) their earnings.
  Race is a cause of party identification: Blacks are more likely to become Democrats than are whites.
  [Diagram: Employment -> Earnings]
This can also be indirect:
  Employment causes earnings via education: people with higher education get jobs and therefore have higher earnings.
  Race causes party via income: Blacks tend to have lower income and therefore are more likely to become Democrats than are whites.
  [Diagram: Employment -> Earnings, with Education as the intervening variable]

Path Analysis (cont.)
Example 1: Fiji Women
  Say we have data on 4700 women from Fiji.
Basic Model
  We know for each woman:
    Age
    Years of education
    Number of children

Path Analysis (cont.)
Path Diagram
  We might think that a woman's age and education correlate with how many children she has.
  We can write a causal model that looks like this:
    Age -> Children
    Education -> Children

Path Analysis (cont.)
Estimates
  When we estimate these relationships, we get the results:
    CHILDREN = 3.4 + .059 AGE - .16 EDUC
  We can represent these results as follows:
    Age --(0.059)--> Children
    Education --(-0.16)--> Children

Path Analysis (cont.)
Direct and Indirect Effects
  Now, let's say we think there might also be a relationship between a woman's age and education.
Estimated Equation
  If we estimate this regression, we get the result:
    EDUCATION = 7.6 - .032 AGE
  Older women have less education than younger women.

Path Analysis (cont.)
Path Diagram
  We now add this new information into the causal model:
    Age --(0.059)--> Children
    Education --(-0.16)--> Children
    Age --(-0.032)--> Education
  What is the change in the expected number of children due to 1 extra year of age, holding education constant?
  What is the change in the years of education from this same 1 extra year of age?

Path Analysis (cont.)
Direct and Indirect Effects
  Question: What's the change in number of children from one extra year of age, letting education change too?
  The change in age has two effects: a direct and an indirect effect.
Direct Effect (multiple regression coefficient)
  The direct effect is captured in the coefficient leading from AGE to CHILDREN.
  This is the multiple regression coefficient, and it represents the expected extra number of children from one extra year of age, holding education constant.
    Age --(0.059)--> Children

Path Analysis (cont.)
Indirect Effect
  We know that an extra year of age corresponds with -.032 years of school.
  Each extra year of school corresponds with -.16 extra children.
  We get the indirect effect by multiplying along the arrows leading from AGE to CHILDREN through EDUCATION:
    Age --(-0.032)--> Education --(-0.16)--> Children
    (-.032) * (-.16) = +.005

Path Analysis (cont.)
Total Effect
  So the total effect of AGE on CHILDREN, letting EDUCATION vary too, is the sum of the direct and indirect effects:
    Age --(0.059)--> Children
    Age --(-0.032)--> Education --(-0.16)--> Children
    .059 + .005 = .064
  Question: What do you think would have happened if we ran a simple regression of CHILDREN on AGE? What would the coefficient have been?
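The Fiji arithmetic can be written out as a short sketch. The constant names are illustrative assumptions; the three coefficients are the ones estimated in the lecture.

```python
# Path coefficients from the Fiji example.
AGE_TO_CHILDREN = 0.059    # direct arrow, education held constant
AGE_TO_EDUC = -0.032       # years of school per extra year of age
EDUC_TO_CHILDREN = -0.16   # children per extra year of school

direct = AGE_TO_CHILDREN
indirect = AGE_TO_EDUC * EDUC_TO_CHILDREN   # multiply along the arrows
total = direct + indirect

print(f"direct = {direct:.3f}, indirect = {indirect:.5f}, total = {total:.5f}")
```

The unrounded indirect effect is .00512 and the total .06412, which the lecture reports as .005 and .064.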

Path Analysis: In a Nutshell
A path is a route from Xi to Xj.
  Paths follow one-way arrows.
Paths have signs:
  They may be positive, negative, or zero.
Paths have sizes or magnitudes:
  Numbers that summarize the total impact on Xj of a unit change in Xi after it has rippled through the system.

Path Analysis: In a Nutshell
Illustration
  Assume:
    A year's experience raises income by $500.
    Each $ of income raises one's measure of conservatism by .025.
    A point on the conservatism scale raises the percentage who vote Republican by .137 percentage points.
  What happens if workers gain 5 years of experience?
    Incomes increase by 5 × $500 = $2,500.
    A $2,500 increase in income increases their conservatism by 62.5 points ($2,500 × .025 = 62.5).
    The 62.5-point increase in conservatism increases Republicanism by 8.56 points (62.5 × .137 = 8.56).
Multiplication Rule
  A one-year increase in experience would produce an 8.56/5 = 1.71 point shift,
  which could also be found by multiplying the 3 coefficients together: $500 × .025 × .137 = 1.7125.
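The multiplication rule in this illustration amounts to taking the product of the coefficients along the chain. A minimal sketch (the list layout and labels are hypothetical; the three numbers are the lecture's):

```python
from math import prod

# One arrow per step of the chain: (label, coefficient).
path = [
    ("experience -> income", 500.0),             # dollars per year of experience
    ("income -> conservatism", 0.025),           # scale points per dollar
    ("conservatism -> Republican vote", 0.137),  # percentage points per scale point
]

per_year = prod(coefficient for _, coefficient in path)
five_years = 5 * per_year

print(f"shift per year of experience: {per_year:.4f}")
print(f"shift for five years: {five_years:.4f}")
```

The intermediate units (dollars, scale points) cancel, so the result is in percentage points of Republican vote per year of experience.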

Path Analysis (cont.)
Summary
  The value of a path (the change in Xj per change in Xi) is found by multiplying the coefficients of each arrow in the path.
  Regardless of the measurement units of the intervening variables, the result will come out in Xj units per one unit difference in Xi.
  A path diagram provides insight into the relationship between simple and multiple regression.
    Multiple regression gives us the direct (partial) effects of the independent variables on the dependent variable, holding all else constant.
    Simple regression gives us the total effect, which is the sum of the direct and the indirect effects.

Path Analysis (cont.)
The total relation of Y to X1 is equal to the direct plus indirect relation. Start from the multiple regression
\[ \hat{Y} = b_0 + b_1 X_1 + b_2 X_2 \]
and its normal equation for X1 (lowercase letters denote deviations from the mean):
\[ \sum x_1 y = b_1 \sum x_1^2 + b_2 \sum x_1 x_2 \]
If we divide this equation by \( \sum x_1^2 \):
\[ \frac{\sum x_1 y}{\sum x_1^2} = b_1 + b_2\,\frac{\sum x_1 x_2}{\sum x_1^2} \]
The left-hand side is the regression coefficient of Y against X1, that is, the total relation (the simple regression coefficient).
The ratio \( \sum x_1 x_2 / \sum x_1^2 \) is the regression coefficient of X2 against X1, which we denote b.
Total Effect = Direct + Indirect = \( b_1 + b\,b_2 \)
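The identity "simple coefficient = b1 + b·b2" holds exactly for least-squares estimates, which can be checked numerically. The sketch below uses a small made-up data set (the numbers are hypothetical and chosen only for illustration) and solves the two deviation-form normal equations directly.

```python
# Hypothetical data, invented purely to illustrate the identity.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [3.0, 4.0, 8.0, 7.0, 13.0, 11.0]

def deviations(values):
    """Return each value minus the mean of the list."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def cross(u, v):
    """Sum of products of two equally long lists."""
    return sum(a * b for a, b in zip(u, v))

d1, d2, dy = deviations(x1), deviations(x2), deviations(y)
s11, s22, s12 = cross(d1, d1), cross(d2, d2), cross(d1, d2)
s1y, s2y = cross(d1, dy), cross(d2, dy)

# Solve the two normal equations for the partial (direct) coefficients.
det = s11 * s22 - s12 ** 2
b1 = (s1y * s22 - s2y * s12) / det
b2 = (s2y * s11 - s1y * s12) / det

b = s12 / s11       # regression coefficient of X2 against X1
total = s1y / s11   # simple regression coefficient of Y against X1

print(f"direct b1 = {b1:.4f}, indirect b*b2 = {b * b2:.4f}, total = {total:.4f}")
```

Whatever data are used (as long as X1 and X2 are not perfectly collinear), `total` and `b1 + b * b2` agree up to floating-point error.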

Path Analysis: Example 2
Example 2: Brady, Cooper and Hurley
Defining Unity
  Party unity scores are calculated as: (% voting in the majority - % voting in the minority)
Building the Causal Model
  Two components to party unity: internal and external factors.
    External factors define how homogeneous the constituent base of the party is.
    Internal factors have to do with the strength of party leadership.
  So we can write a causal model like this:
    External -> Unity
    Internal -> Unity

Path Analysis: Example 2 (cont.)
However, it is also thought that external factors influence internal factors.
  That is, when legislators from a party are united on the issues, they are more likely to give their leaders power to get things done.
Thus we add another line to our model:
    External -> Unity
    Internal -> Unity
    External -> Internal

Path Analysis: Example 2 (cont.)
Results
  When this model was estimated, the results were:
    PARTY STRENGTH = .61 INTERNAL + .58 EXTERNAL
    INTERNAL = .66 EXTERNAL
Question
  What is the effect of External factors on Party Unity?
    Direct Effect = 0.58
    Indirect Effect = (.66) * (.61) = 0.40
    Total Effect = .58 + .40 = .98

Path Analysis: Example 3
Example 3: Commie Model from Shapiro
  What determines people's attitudes towards whether communists should be allowed to teach college?
Hypothesis
  Our hypothesis is that attitudes towards teaching depend on attitudes towards communism in general, party ID, education, and age.

Path Analysis: Example 3 (cont.)
Define the Variables
  First define and recode all variables:
    TeachCom is a dichotomous variable, coded 1 if the respondent thought it was OK for communists to teach college.
    Smarts is years of education.
    PartyOn is the respondent's party ID.
      0 stands for strongly Democrat, 6 for strongly Republican.
    ComPhile is how you think about communism as a system of government.
      Higher values mean that it's a good system.
    Years is your age.

Path Analysis: Example 3 (cont.)
Second, Report Descriptive Statistics
  Means gives the mean of each variable.
  Stddev gives their standard deviation.
  N gives the number of valid observations.
  Corr gives the correlations between variables.
  Sig tells us the significance of each correlation.

Variable   Mean     Standard Deviation
Years      45.86    18.021
Smarts     12.843   3.005
PartyOn    2.84     2.119
ComPhile   1.68     0.758
TeachCom   0.541    0.499
Number of Cases = 932

Path Analysis: Example 3 (cont.)
Third, Specify Causal Model
  Attitudes towards teaching are determined by all the other variables.
  Attitude towards the communist system depends on party ID, education, and age.
  Party ID depends on education and age.
  Finally, education depends on age.
[Path diagram with nodes Years, Smarts, PartyOn, ComPhile, and TeachCom]
    Years -> Smarts
    Years, Smarts -> PartyOn
    Years, Smarts, PartyOn -> ComPhile
    Years, Smarts, PartyOn, ComPhile -> TeachCom

Path Analysis: Example 3 (cont.)
Note: How to Build a Causal Model
  First of all, what constitutes a valid causal model?
    For now, the answer is: no cycles.
    That is, you shouldn't be able to start at a point and follow arrows and end up back at the same point.
  Second, once you have a causal model, how do you know which regressions to run?
    For each variable, see what arrows are going into it.
    Then run a regression with those variables as the independent variables.
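Both rules can be expressed as small graph routines. The following is a sketch, not part of the lecture: the arrow list encodes the causal model specified for Example 3, while the function names `has_cycle` and `regressions_to_run` are hypothetical.

```python
# Arrows of the Example 3 causal model: cause -> list of effects.
arrows = {
    "Years": ["Smarts", "PartyOn", "ComPhile", "TeachCom"],
    "Smarts": ["PartyOn", "ComPhile", "TeachCom"],
    "PartyOn": ["ComPhile", "TeachCom"],
    "ComPhile": ["TeachCom"],
    "TeachCom": [],
}

def has_cycle(graph):
    """Return True if following arrows can lead back to the starting variable."""
    state = {}  # variable -> "visiting" (on the current chain) or "done"

    def visit(node):
        if state.get(node) == "visiting":
            return True   # we came back to a variable on the current chain
        if state.get(node) == "done":
            return False
        state[node] = "visiting"
        found = any(visit(nxt) for nxt in graph.get(node, []))
        state[node] = "done"
        return found

    return any(visit(node) for node in graph)

def regressions_to_run(graph):
    """Map each dependent variable to the variables with arrows going into it."""
    predictors = {node: [] for node in graph}
    for cause, effects in graph.items():
        for effect in effects:
            predictors[effect].append(cause)
    return {dep: preds for dep, preds in predictors.items() if preds}

assert not has_cycle(arrows), "a valid causal model has no cycles"
plan = regressions_to_run(arrows)
for dependent, independents in plan.items():
    print(f"regress {dependent} on {', '.join(independents)}")
```

Variables with no incoming arrows (here Years) are treated as given and get no regression of their own.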

Path Analysis: Example 3 (cont.)
Fourth, Estimate the Regression Model
  Regression commands
    How we specify our causal model determines what regressions we run.
    For instance, TeachCom has arrows going into it from all other variables, so we run the regression with all the variables.
    Then we take ComPhile, and regress it on Years, Smarts and PartyOn.
    And so on down the line.

Path Analysis: Example 3 (cont.)
Correlation Matrix
  Smarts is negatively correlated with Years.
    That means that older people tend to have had fewer years of schooling.
  PartyOn is negatively correlated with Years.
    Given the coding (0 = strongly Democrat, 6 = strongly Republican), older people tend to lean Democratic.
  Years is negatively correlated with ComPhile and TeachCom.
    Older people tend to have more negative attitudes toward the communist system and to be against communists teaching college.
  Note:
    One-tailed p-values are reported beneath the correlation coefficients.
    These simply report bivariate relations.

Path Analysis: Example 3 (cont.)
Regression Results
[Path diagram with estimated coefficients on each arrow]
    Years --(-.044)--> Smarts
    Years --(-.0047)--> PartyOn
    Smarts --(.053)--> PartyOn
    Years --(-.006)--> ComPhile
    Smarts --(.032)--> ComPhile
    PartyOn --(-.029)--> ComPhile
    Years --(-.003)--> TeachCom
    Smarts --(.035)--> TeachCom
    PartyOn --(-.015)--> TeachCom
    ComPhile --(.16)--> TeachCom

Path Analysis: Example 3 (cont.)
What is the effect of Years on TeachCom?
Indirect Effect via ComPhile
  ComPhile alone: (-.006)(.16) = -.00096
Indirect Effect via PartyOn
  PartyOn alone: (-.0047)(-.015) = .0000705
  PartyOn and ComPhile: (-.0047)(-.029)(.16) = .0000218
Indirect Effect via Smarts
  Smarts alone: (-.044)(.035) = -.00154
  Smarts & PartyOn: (-.044)(.053)(-.015) = .000035
  Smarts & ComPhile: (-.044)(.032)(.16) = -.000225
  Smarts, PartyOn & ComPhile: (-.044)(.053)(-.029)(.16) = .000011
Total Indirect Effects = -.00259

Path Analysis: Example 3 (cont.)
The Total Effect of Age on People's Attitudes toward Teachers Having Communist Beliefs
  Total Effect = Direct + Indirect
  Total Indirect Effect = -.0026
  Total Effect = -.003 - .0026 = -.0056
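With this many arrows it is easy to miss a path, so it can help to enumerate them mechanically. The sketch below assumes the arrow coefficients used in the lecture's path calculations; the helper names `paths` and `path_value` are hypothetical.

```python
# Estimated coefficient on each arrow (cause, effect) of the Example 3 model.
coef = {
    ("Years", "Smarts"): -0.044,
    ("Years", "PartyOn"): -0.0047,
    ("Years", "ComPhile"): -0.006,
    ("Years", "TeachCom"): -0.003,
    ("Smarts", "PartyOn"): 0.053,
    ("Smarts", "ComPhile"): 0.032,
    ("Smarts", "TeachCom"): 0.035,
    ("PartyOn", "ComPhile"): -0.029,
    ("PartyOn", "TeachCom"): -0.015,
    ("ComPhile", "TeachCom"): 0.16,
}

def paths(source, target, graph):
    """Yield every arrow-following path from source to target as a list of variables."""
    if source == target:
        yield [target]
        return
    for cause, effect in graph:
        if cause == source:
            for rest in paths(effect, target, graph):
                yield [source] + rest

def path_value(path, graph):
    """Multiply the coefficients along the arrows of one path."""
    value = 1.0
    for cause, effect in zip(path, path[1:]):
        value *= graph[(cause, effect)]
    return value

all_paths = list(paths("Years", "TeachCom", coef))
direct = sum(path_value(p, coef) for p in all_paths if len(p) == 2)
indirect = sum(path_value(p, coef) for p in all_paths if len(p) > 2)
total = direct + indirect

for p in all_paths:
    print(" -> ".join(p), f"{path_value(p, coef):+.7f}")
print(f"direct = {direct:.4f}, indirect = {indirect:.5f}, total = {total:.5f}")
```

The enumeration finds one direct path and seven indirect ones, and the sums should agree with the lecture's totals after rounding.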

Path Analysis: Example 3 (cont.)
Fifth, Interpretation:
  For each additional year of age, a person's attitude toward whether it is appropriate for a communist to teach decreases by .0056 units.
    .003 of those units come from the direct impact of older people simply having different views.
    .0026 of those units come from the indirect impact of age via education level, partisan identification, and what one thinks of communism as a system of government.

Homework Recap
Your homework assignment is to specify a Path Diagram.
Issues in the Article
  There is a dispute between American and European researchers on the effectiveness of AZT.
  Americans say that it works, and Europeans say that there's not enough evidence.

Homework: The U.S. View
The U.S. allowed AZT to be distributed to HIV-positive individuals on the basis of a study completed in 1989.
Usually the FDA requires that, to release a drug, the experimenters show a direct link:
    Drug --(positive)--> Health
Instead of a direct link, researchers showed an indirect link:
    AZT --(positive)--> Markers --(positive)--> Health
    AZT --(positive?)--> Health
If both of these correlations are positive, then so should be the total effect from AZT to health.

Homework: The European View
The European researchers said that although it's true that AZT raised the level of CD-4 markers, these markers didn't indicate any long-term improvement in health.
So they say that the model looks like this:
    AZT --(positive)--> Markers --(no effect)--> Health
    AZT --(no effect)--> Health
If there's no link between CD-4 and health, then the overall link between AZT and health is also 0 on the basis of the information presented so far.

Homework: How to Resolve the Dispute
What kind of evidence would they need to resolve this dispute?
  First, they could do studies to show that AZT has a direct effect on health. These studies take longer, but their conclusions are more reliable since they show a direct link.
  Or, they could find another marker. That is, another intermediate substance that AZT affects and that affects health.

Notes on Multiple Regression
General Linear Model
\[ \hat{Y} = b_0 + b_1 X_1 + b_2 X_2 \]
Takes into account not only the impact of X1 on Y but also the interaction between X1 and X2.
We apply the OLS criterion: minimize
\[ \sum d^2 = \sum (Y - \hat{Y})^2 \]
This yields 3 normal equations:
\[ b_0 = \bar{Y} - b_1 \bar{X}_1 - b_2 \bar{X}_2 \]
\[ \sum x_1 y = b_1 \sum x_1^2 + b_2 \sum x_1 x_2 \]
\[ \sum x_2 y = b_1 \sum x_1 x_2 + b_2 \sum x_2^2 \]
where
\[ x_1 = X_1 - \bar{X}_1, \qquad x_2 = X_2 - \bar{X}_2, \qquad y = Y - \bar{Y} \]

Notes on Multiple Regression (cont.)
Deriving the formula for b0:
\[ Y = b_0 + b_1 X_1 + b_2 X_2 \tag{1} \]
Summing all the observations and dividing by n yields:
\[ \frac{\sum Y_i}{n} = \frac{n b_0}{n} + b_1 \frac{\sum X_1}{n} + b_2 \frac{\sum X_2}{n} \]
Which simplifies to:
\[ \bar{Y} = b_0 + b_1 \bar{X}_1 + b_2 \bar{X}_2 \tag{2} \]
\[ b_0 = \bar{Y} - b_1 \bar{X}_1 - b_2 \bar{X}_2 \tag{3} \]

Notes on Multiple Regression (cont.)
Deriving the formula for b1:
1. Multiply both sides of Eq. (1) by X1:
\[ X_1 Y = b_0 X_1 + b_1 X_1^2 + b_2 X_1 X_2 \]
2. Summing all the observations and dividing by n:
\[ \frac{\sum X_1 Y}{n} = b_0 \frac{\sum X_1}{n} + b_1 \frac{\sum X_1^2}{n} + b_2 \frac{\sum X_1 X_2}{n} = b_0 \bar{X}_1 + b_1 \frac{\sum X_1^2}{n} + b_2 \frac{\sum X_1 X_2}{n} \tag{4} \]
3. Multiply Equation (2) by \( \bar{X}_1 \):
\[ \bar{X}_1 \bar{Y} = b_0 \bar{X}_1 + b_1 \bar{X}_1^2 + b_2 \bar{X}_1 \bar{X}_2 \tag{5} \]
4. Subtract Equation (5) from Equation (4):
\[ \frac{\sum X_1 Y}{n} - \bar{X}_1 \bar{Y} = b_1 \left( \frac{\sum X_1^2}{n} - \bar{X}_1^2 \right) + b_2 \left( \frac{\sum X_1 X_2}{n} - \bar{X}_1 \bar{X}_2 \right) \]

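Notes on Multiple Regression (cont.)
Multiplying the difference of Equations (4) and (5) through by n:
\[ \sum X_1 Y - n \bar{X}_1 \bar{Y} = b_1 \left( \sum X_1^2 - n \bar{X}_1^2 \right) + b_2 \left( \sum X_1 X_2 - n \bar{X}_1 \bar{X}_2 \right) \]
In general,
\[
\begin{aligned}
\sum_{i=1}^{n} x_i^2 &= \sum_{i=1}^{n} (X_i - \bar{X})^2 \\
&= \sum_{i=1}^{n} X_i^2 - 2 \bar{X} \sum_{i=1}^{n} X_i + n \bar{X}^2 \\
&= \sum_{i=1}^{n} X_i^2 - 2 n \bar{X}^2 + n \bar{X}^2 \\
&= \sum_{i=1}^{n} X_i^2 - n \bar{X}^2
\end{aligned}
\]
(the same argument gives \( \sum x_1 y = \sum X_1 Y - n \bar{X}_1 \bar{Y} \) and \( \sum x_1 x_2 = \sum X_1 X_2 - n \bar{X}_1 \bar{X}_2 \)).
Therefore we can rewrite:
\[ \sum x_1 y = b_1 \sum x_1^2 + b_2 \sum x_1 x_2 \]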
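As a supplement to these notes (a standard algebraic result, not taken from the slides), the two deviation-form normal equations, \( \sum x_1 y = b_1 \sum x_1^2 + b_2 \sum x_1 x_2 \) and \( \sum x_2 y = b_1 \sum x_1 x_2 + b_2 \sum x_2^2 \), can be solved for the slopes as a 2-by-2 linear system, assuming X1 and X2 are not perfectly collinear so that the denominator is nonzero:

```latex
\[
b_1 = \frac{\sum x_1 y \,\sum x_2^2 \;-\; \sum x_2 y \,\sum x_1 x_2}
           {\sum x_1^2 \,\sum x_2^2 \;-\; \left( \sum x_1 x_2 \right)^2},
\qquad
b_2 = \frac{\sum x_2 y \,\sum x_1^2 \;-\; \sum x_1 y \,\sum x_1 x_2}
           {\sum x_1^2 \,\sum x_2^2 \;-\; \left( \sum x_1 x_2 \right)^2}
\]
```

Once \( b_1 \) and \( b_2 \) are known, the intercept follows from \( b_0 = \bar{Y} - b_1 \bar{X}_1 - b_2 \bar{X}_2 \). When \( \sum x_1 x_2 = 0 \), each slope reduces to its simple regression coefficient, which matches the path-analysis result that the indirect effect vanishes when X1 and X2 are unrelated.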

Path Analysis: Example 3 (cont.)
Via ComPhile
  ComPhile alone: (-.006)(.16) = -.00096
Via PartyOn
  PartyOn alone: (-.0047)(-.015) = .0000705
  PartyOn and ComPhile: (-.0047)(-.029)(.16) = .0000218
Via Smarts
  Smarts alone: (-.044)(.035) = -.00154
  Smarts & PartyOn: (-.044)(.053)(-.015) = .000035
  Smarts & ComPhile: (-.044)(.032)(.16) = -.000225
  Smarts, PartyOn & ComPhile: (-.044)(.053)(-.029)(.16) = .000011
Total Indirect Effect = -.0026
Total Effect = -.003 - .0026 = -.0056