EHA: More On Plots and Interpreting Hazards

Sociology 229A: Event History Analysis
Class 9

Copyright © 2008 by Evan Schofer
Do not copy or distribute without permission

Announcements

• Final paper assignment due next week
  • Questions?
• Class topics:
  • More on interpreting hazard & cumulative hazard functions
  • More multilevel models…

Hazard Plots: Smoothing

• Issue: Stata heavily smooths hazard plots
  • “Raw” hazard plots are very spiky… smoothing can help with interpretation
• Issue: Too much smoothing obscures the detail within your data
– Simplest way to control smoothing:
  • Set the “width” of the kernel smoother in Stata
  • EX: sts graph, haz width(3)
  • Lower width = less smoothing; try different values.

Hazard Smoothing

• Environmental Law Data: Default smoothing

[Figure: Smoothed hazard estimate; x-axis: analysis time, 1970 to 2000; y-axis: 0 to .1]

Hazard Smoothing

• Environmental Law Data: width (1)

[Figure: Smoothed hazard estimate; x-axis: analysis time, 1970 to 2000; y-axis: 0 to .2]

Hazard Smoothing

• Environmental Law Data: width (.2)

[Figure: Smoothed hazard estimate; x-axis: analysis time, 1970 to 2000; y-axis: 0 to .3]

Hazard Smoothing

• Don’t make width too small!: width (.001)

[Figure: Smoothed hazard estimate; x-axis: analysis time, 1970 to 2000; y-axis: 0 to 50]

Stata’s default smoother amplifies peaks in data if width is too small!

Hazard Smoothing: Remarks

• Stata default smoothing is quite aggressive
  • Obscures detail in your data
– Stata default smoothing “width” is ~4 in this case
  • Smoothing of 1-2 works much better
• In addition to removing detail, smoothing also lowers the peaks…
  • Highest peak = .1 (width 4)
  • Highest peak = .3 (width .2)
  • Also: REALLY narrow width exaggerates peaks
– Highest peak = 50 (width .0001)
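
To see why the peak height depends on the width, here is a minimal sketch in Python (not Stata, and not Stata's actual implementation). It spreads each jump in the cumulative hazard over a window of half-width `width` using an Epanechnikov kernel, which is an assumption made for illustration, and it ignores boundary corrections. The data are made up (10 cases, one failure per time unit), and the function names `epanechnikov` and `smoothed_hazard` are hypothetical.

```python
def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def smoothed_hazard(t, event_times, increments, width):
    """Kernel-smoothed hazard at time t.

    Each cumulative-hazard jump at t_j is spread over
    [t_j - width, t_j + width], then scaled by 1 / width.
    """
    return sum(
        epanechnikov((t - t_j) / width) * d_h
        for t_j, d_h in zip(event_times, increments)
    ) / width


# Made-up data: 10 cases at risk from t=0, one failure at each t = 1..10.
event_times = list(range(1, 11))
increments = [1.0 / n for n in range(10, 0, -1)]  # 1/10, 1/9, ..., 1/1

grid = [i / 100 for i in range(0, 1001)]  # evaluate on 0.00, 0.01, ..., 10.00

peaks = {}
for width in (4, 1, 0.2, 0.001):
    peaks[width] = max(
        smoothed_hazard(t, event_times, increments, width) for t in grid
    )
    print(f"width={width}: highest smoothed hazard = {peaks[width]:.3f}")
```

In this sketch the peak grows as the width shrinks. Once the window is narrower than the gap between events, the peak is just the kernel's center value times the jump, divided by the width, so a tiny width produces an enormous "hazard" that reflects the smoother rather than the data.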

Survival Plot Problem: noorigin

• Issue: Stata always likes to include t=0…

[Figure: Kaplan-Meier survival estimate; x-axis: analysis time, 0 to 2000; y-axis: 0 to 1]

Survival Plot Problem: noorigin

• Solution: sts graph, noorigin

[Figure: Kaplan-Meier survival estimate; x-axis: analysis time, 1970 to 2000; y-axis: 0 to 1]

Plots: Confidence Intervals

• Confidence intervals are a good idea
  • Especially useful when comparing groups
– Stata
  • sts graph, ci
  • sts graph, haz ci
– Issue: Adding CIs tends to compress the Y axis to make room for the confidence bands
  • Makes the hazard look less variable over time
  • Watch for that…
– Issue: CIs can make charts “busy” / hard to read.

Hazard Plot with 95% CI

[Figure: Smoothed hazard estimate with 95% CI band; x-axis: analysis time, 1970 to 2000; y-axis: 0 to .3]

Hazard plot with 95% CI

[Figure: Smoothed hazard estimates with 95% CIs, by group (west2 = 0 and west2 = 1); x-axis: analysis time, 1970 to 2000; y-axis: 0 to .4]

Survivor plot with 95% CI

[Figure: Kaplan-Meier survival estimates with 95% CIs, by group (west2 = 0 and west2 = 1); x-axis: analysis time, 1970 to 2000; y-axis: 0 to 1]

Other sts graph options

• Options to show # of lost, entered, or censored cases
  • Lost: puts a number above plots showing cases lost
  • Atrisk: shows # of cases at risk
    – Actually, it shows risk per interval
    – EX: if unit = nation, it shows nation-years in an interval
  • Censored: shows number of cases censored

Sts graph: atrisk

[Figure: Kaplan-Meier survival estimate with at-risk counts printed along the curve; x-axis: analysis time, 1970 to 2000; y-axis: 0 to 1]

Interpreting Hazard & Cumulative Haz

• The survivor plot has a clear interpretation: The proportion of cases that have not experienced the event
  • Assuming non-repeated events
    – If events repeat frequently, survivor falls to 0, stays there…
  • Assuming the risk-set stays more-or-less constant
    – Survivor never goes back up, even if more cases enter the risk set…
• But, hazard rates & cumulative hazard rates are harder to understand intuitively…
  • So, I made some illustrative examples

Hazard Example 1

• Start with 10 people
• Let’s put them in the risk set sequentially
• All cases start at time t=0
• One case fails at each point in time

Start  End  Failed?
0      1    1
0      2    1
0      3    1
0      4    1
0      5    1
0      6    1
0      7    1
0      8    1
0      9    1
0      10   1

Example 1: Survivor Plot

[Figure: Kaplan-Meier survival estimate; x-axis: analysis time, 0 to 10; y-axis: 0 to 1]

Example 1: Hazard Plot

[Figure: Smoothed hazard estimate; x-axis: analysis time, 0 to 10; y-axis: .1 to .5]

Events occur at an even interval… but rate goes up because the risk set dwindles…
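
The dwindling risk set can be worked through by hand. The sketch below (Python rather than Stata, and unsmoothed, so the numbers will not match the smoothed plot exactly) takes the Example 1 data and computes, at each failure time, the hazard increment 1 / (number at risk), the running Nelson-Aalen sum, and the running Kaplan-Meier product. Variable names are illustrative only.

```python
# Example 1: 10 cases, all at risk from t=0, one failure at each t = 1..10.
end_times = list(range(1, 11))

n_at_risk = len(end_times)
survivor = 1.0
cum_hazard = 0.0
rows = []
for t in sorted(end_times):
    increment = 1.0 / n_at_risk   # one event among the cases still at risk
    cum_hazard += increment       # Nelson-Aalen running sum
    survivor *= 1.0 - increment   # Kaplan-Meier running product
    rows.append((t, n_at_risk, increment, cum_hazard, survivor))
    n_at_risk -= 1                # the failed case leaves the risk set

for t, n, inc, cum, surv in rows:
    print(f"t={t:2d}  at risk={n:2d}  increment={inc:.3f}  "
          f"cumulative={cum:.3f}  survivor={surv:.2f}")
```

The increments run from 1/10 at the first failure to 1/1 at the last, which is the rising rate in the plot, and they add up to a little under 3.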

Example 1: Integrated Hazard

[Figure: Nelson-Aalen cumulative hazard estimate; x-axis: analysis time, 0 to 10; y-axis: 0 to 3]

Example 2

• Let’s figure out what’s really going on…
• Again, start with 10 people
• Imagine each enters the risk set sequentially, and fails after 1 time unit
  – So only 1 case at risk in any period of time
  – And, 1 event at each point in time

Start  End  Failed?
0      1    1
1      2    1
2      3    1
3      4    1
4      5    1
5      6    1
6      7    1
7      8    1
8      9    1
9      10   1

Example 2: Survivor Plot

[Figure: Kaplan-Meier survival estimate; x-axis: analysis time, 0 to 10; y-axis: 0 to 1]

Survivor drops to zero when first case fails… doesn’t go back up when additional cases enter

NOT very informative…

Example 2: Hazard Plot

[Figure: Smoothed hazard estimate; x-axis: analysis time, 0 to 10; y-axis: .9 to 1]

Hazard basically sits at 1.0. Variations = due to smoothing issues…

That’s because for every time unit at risk there is one event

Interpreting Hazards

• Let’s run an exponential model
  • We’ll estimate the constant only… the baseline hazard

. streg , dist(exponential) nohr

Exponential regression -- log relative-hazard form

No. of subjects =           10                     Number of obs   =        10
No. of failures =           10
Time at risk    =           10
                                                   LR chi2(0)      =      0.00
Log likelihood  =    5.1044126                     Prob > chi2     =         .

------------------------------------------------------------------------------
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |          0   .3162278     0.00   1.000     -.619795     .619795
------------------------------------------------------------------------------

Why is the base rate zero?

Answer: We need to exponentiate!

Exp(0) = 1

The model estimates the baseline hazard to be 1.0!
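
The constant can be checked by hand. For a constant-only exponential model, the maximum-likelihood hazard is failures divided by time at risk, the reported coefficient is its log, and the asymptotic standard error of the log rate is 1 / sqrt(failures). The sketch below is plain Python arithmetic on the totals printed above, not a call into Stata.

```python
import math

failures = 10        # "No. of failures" in the output above
time_at_risk = 10.0  # "Time at risk" in the output above

rate = failures / time_at_risk       # MLE of a constant hazard
coef = math.log(rate)                # the log-hazard constant (nohr form)
std_err = 1.0 / math.sqrt(failures)  # asymptotic SE of the log rate

print(f"_cons = {coef:.4f}")
print(f"Std. Err. = {std_err:.7f}")
print(f"baseline hazard = exp(_cons) = {math.exp(coef):.2f}")
```

Ten events over ten units of time at risk gives a rate of 1, and log(1) = 0, which is why the coefficient is zero.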

Example 2: Integrated Hazard

[Figure: Nelson-Aalen cumulative hazard estimate; x-axis: analysis time, 0 to 10; y-axis: 0 to 10]

Integrated Hazard reaches 10

Same number of events as previous example… but less time-at-risk… so overall cumulated risk was higher

Example 3

• Let’s keep those same cases but add 10 more
  • Each at risk for 1 time-unit; all of which are censored

Start  End  Failed?
0      1    1
1      2    1
2      3    1
3      4    1
4      5    1
5      6    1
6      7    1
7      8    1
8      9    1
9      10   1
0      1    0
1      2    0
2      3    0
3      4    0
4      5    0
5      6    0
6      7    0
7      8    0
8      9    0
9      10   0

Example 3: Hazard Plot

[Figure: Smoothed hazard estimate; x-axis: analysis time, 0 to 10; y-axis: .45 to .5]

The risk set is doubled, but # events stays the same…

So, hazard drops by half… to .5

Interpreting Hazards

• Let’s run an exponential model
  • We’ll estimate the constant only… the baseline hazard

. streg , dist(exponential) nohr

Exponential regression -- log relative-hazard form

No. of subjects =           20                     Number of obs   =        20
No. of failures =           10
Time at risk    =           20
                                                   LR chi2(0)      =      0.00
Log likelihood  =   -1.8270592                     Prob > chi2     =         .

------------------------------------------------------------------------------
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |  -.6931472   .3162278    -2.19   0.028    -1.312942   -.0733521
------------------------------------------------------------------------------

Exp(-.693) = .5

The baseline hazard rate is .5…
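
The rest of the coefficient row can be reproduced the same way. This Python sketch (again arithmetic on the printed totals, not Stata) derives the z statistic, p-value, and 95% interval from the log rate and its standard error, then exponentiates the interval to put it on the hazard scale. It assumes the usual normal-approximation (Wald) interval.

```python
import math
from statistics import NormalDist

failures = 10        # "No. of failures" in the output above
time_at_risk = 20.0  # "Time at risk" in the output above

coef = math.log(failures / time_at_risk)  # ln(.5)
std_err = 1.0 / math.sqrt(failures)       # asymptotic SE of the log rate

z = coef / std_err
p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))  # two-sided test of coef = 0
z_crit = NormalDist().inv_cdf(0.975)              # about 1.96
ci_low = coef - z_crit * std_err
ci_high = coef + z_crit * std_err

print(f"_cons = {coef:.7f}  z = {z:.2f}  P>|z| = {p_value:.3f}")
print(f"95% CI (log hazard): {ci_low:.6f} to {ci_high:.7f}")
print(f"95% CI (hazard):     {math.exp(ci_low):.3f} to {math.exp(ci_high):.3f}")
```

Note that the test in the output asks whether the log hazard differs from 0, that is, whether the hazard differs from 1.0, not whether the hazard differs from zero.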

Example 3: Integrated Hazard

[Figure: Nelson-Aalen cumulative hazard estimate; x-axis: analysis time, 0 to 10; y-axis: 0 to 5]

Likewise, integrated hazard is only half as big…
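
The halving can be verified with a small Nelson-Aalen calculation that allows for late entry. This is a Python sketch, not Stata's code; the function name `nelson_aalen` is hypothetical, and it assumes a case is at risk at time t when start < t <= end (so a case censored at t still counts in the risk set at t).

```python
def nelson_aalen(records):
    """Nelson-Aalen estimate for (start, end, failed) records.

    A record is at risk at time t when start < t <= end, so cases that
    enter late are counted only after they enter.
    Returns (time, cumulative hazard) pairs, one per distinct event time.
    """
    event_times = sorted({end for _, end, failed in records if failed})
    total = 0.0
    curve = []
    for t in event_times:
        at_risk = sum(1 for start, end, _ in records if start < t <= end)
        events = sum(1 for _, end, failed in records if failed and end == t)
        total += events / at_risk
        curve.append((t, total))
    return curve


# Example 2: each case enters at t-1 and fails at t.
example2 = [(t - 1, t, 1) for t in range(1, 11)]
# Example 3: the same cases plus ten censored cases over the same intervals.
example3 = example2 + [(t - 1, t, 0) for t in range(1, 11)]

h2 = nelson_aalen(example2)[-1][1]
h3 = nelson_aalen(example3)[-1][1]
print(f"Example 2 final cumulative hazard: {h2:.1f}")
print(f"Example 3 final cumulative hazard: {h3:.1f}")
```

In Example 2 every event happens with one case at risk (increment 1/1); in Example 3 every event happens with two cases at risk (increment 1/2), so the total is half as large.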

Example 4

• What about when events occur in clumps?
• Example: two dense clusters of events
  – Between times 1-2 and 4-5

Start  End   Failed?
0      1     1
0      1.25  1
0      1.5   1
0      1.75  1
0      2     1
0      3     1
0      4     1
0      4.25  1
0      4.5   1
0      4.75  1
0      5     1

Example 4: Survivor Plot

[Figure: Kaplan-Meier survival estimate; x-axis: analysis time, 0 to 5; y-axis: 0 to 1]

Here we see the two “clumps” of events…

Example 4: Hazard Plot

[Figure: Smoothed hazard estimate; x-axis: analysis time, 1 to 5; y-axis: .2 to .8]

Second “clump” has much higher hazard because the risk set is much smaller…

Default smoothing pretty much wipes out the first clump

Example 4: Hazard Plot, less smoothing

[Figure: Smoothed hazard estimate; x-axis: analysis time, 1 to 5; y-axis: 0 to 2]

Hazard with “width(.3)”

Now both clumps of events are clearly visible…

Example 4: Integrated Hazard

[Figure: Nelson-Aalen cumulative hazard estimate; x-axis: analysis time, 1 to 5; y-axis: 0 to 3]

Note how events with small risk set affect the cumulative hazard more (2nd clump)…
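
To put numbers on that, the Python sketch below (unsmoothed arithmetic on the Example 4 data, not Stata output) adds up the Nelson-Aalen increments separately for each clump. Both clumps contain five events; only the size of the risk set differs. Variable names are illustrative.

```python
# Example 4: 11 cases, all at risk from t=0, each failing at its end time.
end_times = [1, 1.25, 1.5, 1.75, 2, 3, 4, 4.25, 4.5, 4.75, 5]

n_at_risk = len(end_times)
increments = {}
for t in sorted(end_times):
    increments[t] = 1.0 / n_at_risk  # one event among the cases still at risk
    n_at_risk -= 1                   # the failed case leaves the risk set

first_clump = sum(inc for t, inc in increments.items() if 1 <= t <= 2)
second_clump = sum(inc for t, inc in increments.items() if 4 <= t <= 5)
total = sum(increments.values())

print(f"first clump  (t = 1 to 2): {first_clump:.3f}")
print(f"second clump (t = 4 to 5): {second_clump:.3f}")
print(f"total cumulative hazard:   {total:.3f}")
```

The first five events happen with 11 down to 7 cases at risk; the last five happen with 5 down to 1, so the second clump accounts for most of the cumulative hazard.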

Interpreting Hazards

• The hazard rate reflects the rate of events per unit time at risk
  • A constant hazard of .1 for one time-unit means that roughly 10% of at-risk cases will have events
– But, things are often more complex than that when hazards are computed in continuous time
  • The rate may vary within the interval depending on how the events are concentrated
  • The risk set may change over the interval… esp. if cases leave the risk set.
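
One way to see the continuous-time wrinkle: if the hazard is constant at h and nobody enters or is censored, the share of initially at-risk cases that fail within an interval of length t is 1 - exp(-h * t), not h * t. The Python sketch below works through a few values under exactly those assumptions (constant hazard, closed risk set, non-repeated events); the hazard values are chosen only for illustration.

```python
import math

interval = 1.0  # one time-unit

share_failing = {}
for hazard in (0.01, 0.1, 0.5, 1.0):
    # Probability of failing within the interval under a constant hazard.
    share_failing[hazard] = 1.0 - math.exp(-hazard * interval)
    print(f"hazard={hazard}: share failing in one time-unit = "
          f"{share_failing[hazard]:.4f}")
```

For small hazards the two are close (.1 gives about 9.5%), but they drift apart as the hazard grows: a hazard of 1.0 does not mean every case fails within one time-unit.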

Interpreting Integrated Hazards

• Integrated hazards represent the total amount of risk that has accumulated

• If the hazard is constant at .1, the integrated hazard would reach 10 after one hundred time-units…
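
As a quick check on that arithmetic, the Python sketch below accumulates a constant hazard of .1 over time and also reports the implied survivor proportion, using the standard relationship S(t) = exp(-H(t)) for non-repeated events. The function name `cumulative_hazard` is hypothetical and the time points are chosen only for illustration.

```python
import math


def cumulative_hazard(hazard, duration):
    """Integrated hazard for a hazard that is constant over the duration."""
    return hazard * duration


hazard = 0.1
results = {}
for duration in (1, 10, 50, 100):
    total_risk = cumulative_hazard(hazard, duration)
    survivor = math.exp(-total_risk)  # share that has not yet had the event
    results[duration] = (total_risk, survivor)
    print(f"t={duration:3d}  integrated hazard={total_risk:5.1f}  "
          f"survivor={survivor:.5f}")
```

The integrated hazard is not a probability: it keeps growing past 1, while the survivor proportion it implies only approaches zero.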