Using the Population Attributable Fraction (PAF) to Assess MCH Population Outcomes
Deborah Rosenberg, PhD and Kristin Rankin, PhD
Epidemiology and Biostatistics
School of Public Health
University of Illinois at Chicago
Day One: 8:30-12:00
Background and Overview
Basic Formulas and Initial Computations
Moving Beyond Crude PAFs: Organizing Multiple Factors into a Risk System
Summary, Component and “Adjusted” PAFs
3
Background
Epidemiologists most commonly use ratio measures to estimate the magnitude of an association between a risk factor and an outcome
Impact measures, such as the Population Attributable Fraction (PAF), account for both the magnitude of association and the prevalence of risk in the population
PAFs are underused because of methodological concerns about how to appropriately account for the multifactorial nature of risk factors in the population
4
Background
In a multivariable context, the goal is to generate a PAF for each of multiple factors, taking into account relationships among the factors
Generating mutually exclusive and mutually adjusted PAFs is not straightforward given the overlapping distributions of exposure in the population; therefore methods that go beyond usual adjustment procedures are required
With appropriate methods, the PAF can be a tool for program planning and priority setting in public health since, unlike ratio measures, it permits sorting of risk factors according to their impact on an outcome
5
Historical Highlights
Levin’s PAF (1953)
“Indicated maximum proportion of disease attributable to a specific exposure”
If an exposure is completely eliminated, then the disease experience of all individuals would be the same as that of the “unexposed”

$$\text{crude PAF} = \frac{p_0 - p_2}{p_0} = \frac{P(E)\,(RR - 1)}{1 + P(E)\,(RR - 1)}$$

P(E) = prevalence of the exposure in the population as a whole
p0 = prevalence of the outcome in the population as a whole
p2 = prevalence of the outcome in the unexposed
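The two forms of the crude PAF on this slide are algebraically the same. A short derivation sketch follows. It assumes a symbol the slide does not define, p1 = prevalence of the outcome in the exposed, and takes RR = p1 / p2:

```latex
\begin{align*}
p_0 &= P(E)\,p_1 + \bigl(1 - P(E)\bigr)\,p_2
     = p_2\,\bigl[\,1 + P(E)\,(RR - 1)\,\bigr] \\[4pt]
\frac{p_0 - p_2}{p_0}
    &= \frac{p_2\,P(E)\,(RR - 1)}{p_2\,\bigl[\,1 + P(E)\,(RR - 1)\,\bigr]}
     = \frac{P(E)\,(RR - 1)}{1 + P(E)\,(RR - 1)}
\end{align*}
```

The first line writes the overall prevalence as a weighted average of the exposed and unexposed prevalences; the second substitutes it into the risk-difference form.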
6
Historical Highlights
Miettinen (1974)
Adjusted PAF = Proportion of the disease that could be reduced by eliminating one risk factor, after controlling for other factors and accounting for effect modification
Bruzzi (1985) / Greenland and Drescher (1993)
Summary PAF = Proportion of the disease that could be reduced by simultaneously eliminating multiple risk factors from the population
Method for using regression modeling to generate PAFs
Benichou and Gail (1990)
Variance estimates for the adjusted and summary PAF based on the delta method
7
Example: Summary PAF for Three Risk Factors for a Health Outcome
Components of a risk system: complete crossclassification of factors
[Pie chart: Summary PAF = 0.457; Unknown/Unexplained = 0.543. Slices: Factors A, B and C; Factor A Alone; Factor A and B; Factor A and C; Factor B Alone; Factor B and C; Factor C Alone. Slice values shown: 0.023, 0.068, 0.043, 0.031, 0.234, 0.035, 0.023.]
8
Apportioning the Summary PAF
The complete crossclassification of factors is not satisfactory because it fails to provide an overall estimate of impact for each risk factor.
Methodological work has been and is still being carried out to develop approaches that apportion the Summary PAF in a way that yields estimates of impact for each of a set of risk factors
9
Apportioning the Summary PAF
Eide and Gefeller (1995/1998)
Sequential PAF = Proportion of the disease that could be reduced by eliminating one risk factor from the population after some factors have already been eliminated
First Sequential PAF = the “adjusted PAF” —the particular sequential PAF in which a risk factor is eliminated first before any other factors
10
Apportioning the Summary PAF
Ordering is imposed for eliminating risk factors from the population, while simultaneously controlling for all other factors in the model
EXAMPLE (Sequence #1): Eliminate A, then B, then C
Sequential PAF* (A) = (A|B, C)
Sequential PAF (B) = (A U B|C) – (A|B, C)
Sequential PAF (C) = (A U B U C) – (A U B|C)
*First Sequential or “adjusted” PAF
11
Summary PAF Apportioned into Sequential PAFs for Sequence #1
Eliminate A, then B, then C
[Pie chart: Factor A = 0.323; Factor B = 0.111; Factor C = 0.023; Unknown/Unexplained = 0.543. The underlying component slices (Factors A, B and C; Factor A Alone; Factor A and B; Factor A and C; Factor B Alone; Factor B and C; Factor C Alone) are grouped into the three sequential PAFs.]
12
Apportioning the Summary PAF
Eide and Gefeller (1995/1998)
Average PAF = Simple average of all sequential PAFs
Equal apportionment of risk over every possible sequence (removal orderings), since the order in which risk factors will be eliminated in the “real world” is an unknown
Based on the Shapley solution in game theory: a method of fairly distributing the total profit gained by team members working in coalitions
13
Apportioning the Summary PAF: The Average PAF
Six Sequences for Three Risk Factors
Sequence #1: Eliminate A, then B, then C
Sequence #2: Eliminate A, then C, then B
Sequence #3: Eliminate B, then A, then C
Sequence #4: Eliminate B, then C, then A
Sequence #5: Eliminate C, then A, then B
Sequence #6: Eliminate C, then B, then A
There are a total of 6 sequential PAFs for each of the three risk factors. The Average PAF for each factor, then, is the simple average of all 6.
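The averaging over removal sequences can be sketched in Python. This is a minimal illustration, not the workshop's own worksheet or SAS code. It uses the smoking, cocaine and poverty cell counts from the three-variable example later in this transcript, and it assumes a fully stratified ("assumption-free") calculation in which eliminating a factor gives each cell the observed risk of its counterpart cell without that factor. The function and variable names are hypothetical, and the results need not match the illustrative values for Factors A, B and C on the pie-chart slides.

```python
from itertools import permutations

FACTORS = ("smoke", "cocaine", "poverty")

# (smoke, cocaine, poverty) -> (LBW cases, total births); 1 = exposed, 0 = not
CELLS = {
    (1, 1, 1): (24, 59),
    (1, 1, 0): (28, 91),
    (1, 0, 1): (80, 855),
    (1, 0, 0): (68, 995),
    (0, 1, 1): (19, 69),
    (0, 1, 0): (19, 81),
    (0, 0, 1): (287, 3245),
    (0, 0, 0): (175, 4605),
}
TOTAL_CASES = sum(cases for cases, _ in CELLS.values())


def paf_eliminating(removed):
    """PAF for jointly eliminating the factors named in `removed`."""
    expected = 0.0
    for cell, (_, total) in CELLS.items():
        # Counterpart cell: same levels, but removed factors set to 0.
        target = tuple(0 if name in removed else level
                       for name, level in zip(FACTORS, cell))
        ref_cases, ref_total = CELLS[target]
        expected += total * ref_cases / ref_total
    return (TOTAL_CASES - expected) / TOTAL_CASES


def sequential_pafs(order):
    """Sequential PAF of each factor for one removal ordering."""
    pafs, removed, previous = {}, set(), 0.0
    for name in order:
        removed.add(name)
        current = paf_eliminating(removed)
        pafs[name] = current - previous  # increment gained by this removal
        previous = current
    return pafs


sequences = list(permutations(FACTORS))  # 3! = 6 orderings
average_paf = {
    name: sum(sequential_pafs(seq)[name] for seq in sequences) / len(sequences)
    for name in FACTORS
}
summary_paf = paf_eliminating(set(FACTORS))

for seq in sequences:
    print(" -> ".join(seq), {k: round(v, 3) for k, v in sequential_pafs(seq).items()})
print("average:", {k: round(v, 3) for k, v in average_paf.items()})
print("summary:", round(summary_paf, 3))
```

Because the sequential PAFs within each ordering telescope to the Summary PAF, the average PAFs also sum to the Summary PAF.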
14
Summary PAF Apportioned into Average PAFs for Three Risk Factors
[Pie chart: slices for Factor A, Factor B, Factor C and Unknown/Unexplained. Slice values shown: 0.290, 0.090, 0.078, and 0.543 (Unknown/Unexplained).]
15
The Summary PAF: the Basis for Producing Multifactorial PAFs
The Summary PAF can be apportioned into:
component PAFs reflecting every possible combination of factors being considered
sequential PAFs reflecting pieces of one particular sequence in which risk factors might be eliminated
average PAFs reflecting estimates of the impact of eliminating multiple risk factors regardless of the order in which each is eliminated
16
PAFs from Different Study Designs
Cross-sectional:
Prevalence and measure of effect estimated from same data source
Interpretation: Proportion of prevalent cases that can be attributed to exposure
Cohort:
Prevalence and measure of effect estimated from same data source
Interpretation: Proportion of incident cases that can be attributed to exposure
Case-Control:
Prevalence of exposure among the cases must be used and the OR in place of the RR, using the rare disease assumption
Interpretation: Proportion of incident cases that can be attributed to exposure
17
Methodological Issues for the PAF in a Multivariable Context
In addition to different computational approaches, decisions about how variables will be considered may be different when focusing on the PAF as compared with focusing on the ratio measures of association
Differentiating the handling of modifiable and unmodifiable factors
Confounding and effect modification
Handling factors in a causal pathway
18
Analytic Considerations
Variable Selection
Modifiability
Unmodifiable factors are only used as potential confounders or effect modifiers; PAFs are not calculated for them
Modifiable factors are factors that can possibly be altered with clear intervention strategies
Classification of risk factors as unmodifiable or modifiable depends on perspective and may alter results
19
Analytic Considerations
Model Building
Differential handling of unmodifiable and modifiable factors
Levels of measurement
Coding choices
Effect modification
– within modifiable factors
– across modifiable and unmodifiable factors
– within unmodifiable factors
Selection of a final model may not be based on statistical significance of the ratio measure of effect
Stratified models
Defining the “significance” of PAFs
20
Analytic Considerations
Presentation and Interpretation
Average PAFs allow for the sorting of modifiable risk factors according to the potential impact of risk factor reduction strategies on an outcome in the population; ratio measures only provide the magnitude of the association between a risk factor and a disease
The PAF is the proportion of an outcome that could be reduced if a risk factor is completely eliminated in the population – take care not to over-interpret findings
21
Analytic Considerations
So, why isn’t the multifactorial PAF used more commonly in the analysis of public health data?
No known standard statistical packages to complete all of the steps
Variance estimates for the average PAF are not yet available, either for random samples or for samples from complex designs
Currently, can only report 95% confidence intervals around crude, summary, and first sequential (adjusted) PAFs
While the interpretation of average PAFs is strengthened by evidence of causality, an average PAF cannot itself establish causality
22
Analytic Considerations
As always, having an explicit conceptual framework / logic model is important for multivariable analysis
Conceptualization is particularly critical when producing PAFs because decisions about variable handling and model building will determine the computational steps as well as influencing the substantive interpretation of results.
23
Laying the Groundwork:
An Example with Crude PAFs
24
Overview of Attributable Risk Measures
Measures based on Risk Differences

Attributable Risk: $p_1 - p_2$
Attributable Fraction: $\dfrac{p_1 - p_2}{p_1}$
Population Attributable Risk: $p_0 - p_2$
Population Attributable Fraction (PAF): $\dfrac{p_0 - p_2}{p_0}$

Freq (Row Pct)      OUTCOME
                    Yes        No     Total
Risk Factor Yes     a (p1)     b      n1
Risk Factor No      c (p2)     d      n2
Total               m1 (p0)    m2     N
25
Overview of Attributable Risk Measures
General Interpretation
Attributable Risk: The risk of an outcome attributed to a given risk factor among those with that factor
Attributable Fraction: The proportion of cases of an outcome attributable to a risk factor in those with the given risk factor
Pop. Attributable Risk: The risk of an outcome attributed to a given risk factor in the population as a whole
Pop. Attributable Fraction (PAF): The proportion of cases of an outcome attributable to a risk factor in the population as a whole
26
Overview of Attributable Risk Measures
Equivalent / Alternative Terminology
• Attributable Risk, Risk Difference
• Attributable Fraction, Attributable Risk %, Attributable Proportion, Etiologic Fraction
• Pop. Attributable Risk
• Pop. Attributable Fraction, Population Attributable Risk %, Etiologic Fraction, Attributable Risk
27
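Overview of Attributable Risk Measures
Various Formulas For the Crude PAF

$$\text{crude PAF} = \frac{P(D) - P(D \mid \bar{E})}{P(D)} = \frac{p_0 - p_2}{p_0}$$

$$\text{crude PAF} = \frac{\#\text{ of exposed cases}}{\text{Total cases}} \times \frac{\text{Relative Risk} - 1}{\text{Relative Risk}}$$

$$\text{crude PAF} = \frac{\dfrac{\#\text{ of exposed}}{\text{Total}}\,(\text{Relative Risk} - 1)}{1 + \dfrac{\#\text{ of exposed}}{\text{Total}}\,(\text{Relative Risk} - 1)} = \frac{P(E)\,(RR - 1)}{1 + P(E)\,(RR - 1)}$$

Freq (Row Pct)      OUTCOME
                    Yes        No     Total
Risk Factor Yes     a (p1)     b      n1
Risk Factor No      c (p2)     d      n2
Total               m1 (p0)    m2     N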
28
Example: Smoking and Low Birthweight

Freq (Row Pct)    LOW BIRTHWEIGHT
                  yes             no               Total
smoke yes         200 (10.00)     1800 (90.00)      2000
smoke no          500 ( 6.25)     7500 (93.75)      8000
Total             700             9300             10000

Crude RR = 10.00 / 6.25 = 1.60

$$\text{Crude PAF} = \frac{200}{700} \times \frac{1.6 - 1}{1.6} = 0.107 \qquad \text{PAR \%} = 0.107 \times 100 = 10.7\%$$
29
Example: Smoking and Low Birthweight
Crude Association
Interpretation of the RR v. the PAF
Women who smoke are at 1.6 times the risk of delivering a LBW infant compared to women who do not smoke.
10.7% of LBW births can be attributed to smoking. If smoking were eliminated, we would expect 75 fewer LBW births and the LBW rate would be reduced from 7% to 6.25%
30
Example: Cocaine and Low Birthweight
Crude Association

Freq (Row Pct)    LOW BIRTHWEIGHT
                  yes             no               Total
cocaine yes        90 (30.00)      210 (70.00)       300
cocaine no        610 ( 6.29)     9090 (93.71)      9700
Total             700             9300             10000

Crude RR = 30.00 / 6.29 = 4.77

$$\text{Crude PAF} = \frac{90}{700} \times \frac{4.77 - 1}{4.77} = 0.102$$
31
Example: Cocaine and Low Birthweight
Crude Association
Interpretation of the RR v. the PAF
Women who use cocaine are at 4.77 times the risk of delivering a LBW infant compared to women who do not use cocaine.
10.2% of LBW births can be attributed to cocaine use. If cocaine use were eliminated, we would expect 71 fewer LBW births and the LBW rate would thus be reduced from 7% to 6.29%
32
Smoking and Low Birthweight
Cocaine and Low Birthweight
RR Compared to PAF
Notice that although the relative risk for the association between cocaine and low birthweight (4.77) is much greater than that for smoking and low birthweight (1.60), the PAF for each is quite similar: 10.7% for smoking and 10.2% for cocaine. Smoking is far more prevalent (20% of women) than cocaine use (3%), which offsets its weaker association.
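The two crude examples can be checked with a short Python sketch that evaluates three equivalent forms of the crude PAF from the 2×2 counts. The function name and return layout are illustrative assumptions, not part of the workshop materials.

```python
def crude_paf(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Return (RR, PAF by risk difference, PAF by Levin's form, PAF by case form)."""
    total_cases = exposed_cases + unexposed_cases
    n = exposed_total + unexposed_total
    p1 = exposed_cases / exposed_total      # risk in the exposed
    p2 = unexposed_cases / unexposed_total  # risk in the unexposed
    p0 = total_cases / n                    # risk in the whole population
    rr = p1 / p2
    prev_e = exposed_total / n              # P(E)
    paf_diff = (p0 - p2) / p0
    paf_levin = prev_e * (rr - 1) / (1 + prev_e * (rr - 1))
    paf_cases = (exposed_cases / total_cases) * (rr - 1) / rr
    return rr, paf_diff, paf_levin, paf_cases


smoking = crude_paf(200, 2000, 500, 8000)
cocaine = crude_paf(90, 300, 610, 9700)

print("smoking: RR = %.2f, PAF = %.3f" % (smoking[0], smoking[1]))
print("cocaine: RR = %.2f, PAF = %.3f" % (cocaine[0], cocaine[1]))
```

All three forms agree for each exposure, which is one way to confirm that a hand calculation used the right prevalence (exposure prevalence in the population versus among cases).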
33
Moving Beyond Crude PAFs
Multivariable Approaches:
Organizing Multiple Factors
into a Risk System
34
PAFs Based on Organizing Multiple Factors into a Risk System
Summary PAF: The total PAF for many modifiable factors considered in a single risk system
Component PAF: The separate PAF for each unique combination of exposure levels in a risk system
“Adjusted” PAF: The PAF for eliminating a risk factor first from a risk system
Sequential PAF: The PAF for eliminating a risk factor in a particular order from a risk system; sets of sequential PAFs comprise possible removal sequences
Average PAF: The PAF summarizing all possible sequences for eliminating a risk factor
35
Extension of Basic Formulas for Multifactorial PAFs

Basic formula:
$$\frac{\#\text{ of exposed cases}}{\text{Total cases}} \times \frac{\text{Relative Risk} - 1}{\text{Relative Risk}}$$

Extended to a risk system:
$$\underbrace{\sum_{j=0}^{k} p_j\,\frac{RR_j - 1}{RR_j}}_{\text{Rothman}} \;=\; \underbrace{1 - \sum_{j=0}^{k} \frac{p_j}{RR_j}}_{\text{Bruzzi}}$$

– k = Number of unique exposure categories created with a complete cross-classification of independent variables
– pj = proportion of total cases that are in the “jth” unique exposure category
– RRj = Relative risk for the “jth” exposure level compared with the common reference group
Important: Note that in these formulas, the pjs are column percents
36
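The Simple Case of 2 Binary Variables
Organization into a Risk System

Freq (Row Pct)         OUTCOME
                       Yes             No     Total
Risk Factor 1 and 2    a (rp3)         b      n3
Only Risk Factor 1     c (rp2)         d      n2
Only Risk Factor 2     e (rp1)         f      n1
Neither (Reference)    g (rp0)         h      n0
Total                  m1 (rpTotal)    m2     N

Column percents (proportion of all cases falling in each row):
$$p_3 = \frac{a}{m_1} \qquad p_2 = \frac{c}{m_1} \qquad p_1 = \frac{e}{m_1} \qquad p_0 = \frac{g}{m_1}$$

Relative risks (ratio of each row percent to the reference row percent):
$$RR_3 = \frac{rp_3}{rp_0} \qquad RR_2 = \frac{rp_2}{rp_0} \qquad RR_1 = \frac{rp_1}{rp_0} \qquad RR_0 = \frac{rp_0}{rp_0} = 1$$

The slide labels both quantities p; rp is used here for the row percents to keep them distinct from the column percents pj.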
37
Equivalence of the Rothman and Bruzzi Formulas
For the risk system with three exposed categories and a reference category (RR_0 = 1):

$$\sum_{j=0}^{k} p_j\,\frac{RR_j - 1}{RR_j} = p_3\,\frac{RR_3 - 1}{RR_3} + p_2\,\frac{RR_2 - 1}{RR_2} + p_1\,\frac{RR_1 - 1}{RR_1} + p_0\,\frac{RR_0 - 1}{RR_0}$$

$$= (p_3 + p_2 + p_1 + p_0) - \left(\frac{p_3}{RR_3} + \frac{p_2}{RR_2} + \frac{p_1}{RR_1} + \frac{p_0}{RR_0}\right)$$

$$= 1 - \left(\frac{p_3}{RR_3} + \frac{p_2}{RR_2} + \frac{p_1}{RR_1} + \frac{p_0}{RR_0}\right) = 1 - \sum_{j=0}^{k} \frac{p_j}{RR_j}$$

The last step uses the fact that the column percents sum to 1: p_0 + p_1 + p_2 + p_3 = 1.
38
The Simple Case of 2 Binary Variables
Smoking and Cocaine

Smoking (200 of the 700 LBW births are to smokers): Crude RR = 1.60
$$\text{Crude PAF} = \frac{200}{700} \times \frac{1.6 - 1}{1.6} = 0.107 \qquad \text{PAR \%} = 10.7\%$$

Cocaine (90 of the 700 LBW births are to cocaine users): Crude RR = 4.77
$$\text{Crude PAF} = \frac{90}{700} \times \frac{4.77 - 1}{4.77} = 0.102$$
39
Smoking and Cocaine Organized into a Risk System

Freq (Row Pct)          LOW BIRTHWEIGHT
                        yes             no               Total
smoke and cocaine        52 (34.67)       98 (65.33)       150
cocaine only             38 (25.33)      112 (74.67)       150
smoke only              148 ( 8.00)     1702 (92.00)      1850
neither                 462 ( 5.89)     7388 (94.11)      7850
Total                   700             9300             10000

If smoking and cocaine use were recoded as a single “substance use” variable:

Freq (Row Pct)          LOW BIRTHWEIGHT
                        yes             no               Total
any smoke or cocaine    238 (11.07)     1912 (88.93)      2150
neither                 462 ( 5.89)     7388 (94.11)      7850
Total                   700             9300             10000

$$\text{Summary PAF} = \frac{238}{700} \times \frac{1.88 - 1}{1.88} = 0.16$$
40
Components of each combination of risk factors in the smoking-cocaine risk system: pj*, rpj**, RRj
*pj = column %
**rpj = row %

j   Combination          pj (column %)        rpj (row %)           RRj = rpj / rp0
3   smoke and cocaine    52/700 = 0.074       52/150 = 0.347        0.347/0.059 = 5.89
2   cocaine only         38/700 = 0.054       38/150 = 0.253        0.253/0.059 = 4.30
1   smoke only           148/700 = 0.21       148/1850 = 0.08       0.08/0.059 = 1.36
0   neither              462/700 = 0.66       462/7850 = 0.059      0.059/0.059 = 1

Component PAFs:
$$PAF_3 = \frac{52}{700} \times \frac{5.89 - 1}{5.89} = 0.062 \qquad PAF_2 = \frac{38}{700} \times \frac{4.30 - 1}{4.30} = 0.042 \qquad PAF_1 = \frac{148}{700} \times \frac{1.36 - 1}{1.36} = 0.056$$
41
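Component PAFs and Summary PAF for the Smoking-Cocaine Risk System
Using Rothman’s formula:
The Summary PAF is the sum of component PAFs

$$\text{Summary PAF} = \sum_{j=0}^{k} p_j\,\frac{RR_j - 1}{RR_j} = PAF_3 + PAF_2 + PAF_1 + PAF_0$$

$$PAF_3 = \frac{52}{700} \times \frac{5.89 - 1}{5.89} = 0.062$$
$$PAF_2 = \frac{38}{700} \times \frac{4.30 - 1}{4.30} = 0.042$$
$$PAF_1 = \frac{148}{700} \times \frac{1.36 - 1}{1.36} = 0.056$$
$$PAF_0 = \frac{462}{700} \times \frac{1 - 1}{1} = 0.0$$

$$\text{Summary PAF} = 0.062 + 0.042 + 0.056 + 0.0 = 0.16$$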
42
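Component PAFs and Summary PAF for the Smoking-Cocaine Risk System
Using Bruzzi’s formula:
With Bruzzi’s formula, the Summary PAF is not built from component PAFs

$$\text{Summary PAF} = 1 - \sum_{j=0}^{k} \frac{p_j}{RR_j} = 1 - \left[\frac{0.074}{5.89} + \frac{0.054}{4.30} + \frac{0.21}{1.36} + \frac{0.66}{1}\right]$$

$$= 1 - \left[0.0126 + 0.0126 + 0.1544 + 0.66\right] = 1 - 0.8396 = 0.16$$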
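A minimal Python sketch of the two calculations for the smoking-cocaine risk system, using unrounded proportions so that the agreement between the Rothman and Bruzzi forms is exact. The dictionary layout and variable names are illustrative assumptions.

```python
# Risk-system rows: label -> (LBW cases, total births)
ROWS = {
    "smoke and cocaine": (52, 150),
    "cocaine only": (38, 150),
    "smoke only": (148, 1850),
    "neither": (462, 7850),  # common reference group
}

total_cases = sum(cases for cases, _ in ROWS.values())
ref_cases, ref_total = ROWS["neither"]
ref_risk = ref_cases / ref_total

# p_j: column percent (share of all cases); RR_j: row risk over reference risk
p = {row: cases / total_cases for row, (cases, _) in ROWS.items()}
rr = {row: (cases / total) / ref_risk for row, (cases, total) in ROWS.items()}

# Rothman: sum of component PAFs
component_paf = {row: p[row] * (rr[row] - 1) / rr[row] for row in ROWS}
summary_rothman = sum(component_paf.values())

# Bruzzi: one minus the sum of p_j / RR_j
summary_bruzzi = 1 - sum(p[row] / rr[row] for row in ROWS)

for row in ROWS:
    print(f"{row:18s} p={p[row]:.3f} RR={rr[row]:.2f} PAF={component_paf[row]:.3f}")
print(f"Summary PAF: Rothman {summary_rothman:.3f}, Bruzzi {summary_bruzzi:.3f}")
```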
43
Limitation of Component PAFs from the Smoking-Cocaine Risk System
While the component PAFs of a risk system sum to the Summary PAF for the system as a whole, they do not provide mutually exclusive measures of the PAF for each risk factor
Here, the Summary PAF = 0.16, but the two factors overlap: the component PAFs still do not disentangle smoking and cocaine for those who do both
[Pie chart: component PAFs 0.062 (smoke and cocaine), 0.056 (smoke only), 0.042 (cocaine only); remainder 0.84]
44
The “Adjusted” PAF: Obtaining a Single PAF for a Given Risk Factor
The Stratified Approach: The PAF for eliminating a risk factor after controlling for other risk factors
With the Rothman formula, data are organized into the more traditional strata set-up for adjustment:

Not assuming homogeneity, pj & RRj are stratum-specific:
$$\sum_{j=1}^{\#\text{ of strata}} \frac{\#\text{ of exposed cases}_j}{\text{Total cases, all strata}} \times \frac{\text{Relative Risk}_j - 1}{\text{Relative Risk}_j}$$

Assuming homogeneity, Overall:
$$\frac{\#\text{ of exposed cases, all strata}}{\text{Total cases, all strata}} \times \frac{\text{Adjusted Relative Risk} - 1}{\text{Adjusted Relative Risk}}$$
45
The “Adjusted” PAF: Obtaining a Single PAF for a Given Factor
The Stratified Approach
If there is multiplicative effect modification
in the RR...
As usual, it is inappropriate to average widely varying stratum-specific RRs, say 3.0 and 0.90, because a single average would misrepresent the magnitude of the association, and sometimes, as in this example, misrepresent the direction of the association as well.
46
The “Adjusted” PAF: Obtaining a Single PAF for a Given Factor
The Stratified Approach
If there is not multiplicative effect modification
in the RR...
If there is no evidence of multiplicative effect modification and sample size permits, there is really nothing to be gained by not using stratum-specific estimates. Whichever formula is used, the result is a single “adjusted” PAF.
47
The “Adjusted” PAF: Obtaining a Single PAF for a Given Factor
Reorganizing the data to get an adjusted PAF with Rothman’s formula

Risk system:

Freq (Row Pct)          LOW BIRTHWEIGHT
                        yes             no               Total
smoke and cocaine        52 (34.67)       98 (65.33)       150
cocaine only             38 (25.33)      112 (74.67)       150
smoke only              148 ( 8.00)     1702 (92.00)      1850
neither                 462 ( 5.89)     7388 (94.11)      7850
Total                   700             9300             10000

Reorganized into strata of cocaine use:

COCAINE=YES
Freq (Row Pct)          LOW BIRTHWEIGHT
                        yes             no               Total
smoke yes                52 (34.67)       98 (65.33)       150
smoke no                 38 (25.33)      112 (74.67)       150
Total                    90              210               300

COCAINE=NO
Freq (Row Pct)          LOW BIRTHWEIGHT
                        yes             no               Total
smoke yes               148 ( 8.00)     1702 (92.00)      1850
smoke no                462 ( 5.89)     7388 (94.11)      7850
Total                   610             9090              9700
48
The “Adjusted” PAF: The PAF for Smoking, Controlling for Cocaine Use*
*Using stratum-specific estimates

COCAINE=YES (52 LBW births among 150 smokers, 38 among 150 non-smokers): RR = 1.37
$$PAF = \frac{52}{700} \times \frac{1.37 - 1}{1.37} = 0.020$$

COCAINE=NO (148 LBW births among 1850 smokers, 462 among 7850 non-smokers): RR = 1.36
$$PAF = \frac{148}{700} \times \frac{1.36 - 1}{1.36} = 0.056$$

$$\text{“Adjusted” PAF} = 0.020 + 0.056 = 0.076$$
49
The “Adjusted” PAF: The PAF for Cocaine Controlling for Smoking*
*Using stratum-specific estimates

SMOKE=YES
Freq (Row Pct)          LOW BIRTHWEIGHT
                        yes             no               Total
cocaine yes              52 (34.67)       98 (65.33)       150
cocaine no              148 ( 8.00)     1702 (92.00)      1850
Total                   200             1800              2000

RR = 4.33
$$PAF = \frac{52}{700} \times \frac{4.33 - 1}{4.33} = 0.057$$

SMOKE=NO
Freq (Row Pct)          LOW BIRTHWEIGHT
                        yes             no               Total
cocaine yes              38 (25.33)      112 (74.67)       150
cocaine no              462 ( 5.89)     7388 (94.11)      7850
Total                   500             7500              8000

RR = 4.30
$$PAF = \frac{38}{700} \times \frac{4.30 - 1}{4.30} = 0.042$$

$$\text{“Adjusted” PAF} = 0.057 + 0.042 = 0.099$$
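The stratified calculation can be sketched in Python as a sum of stratum-specific PAF pieces, each weighted by that stratum's exposed cases as a share of all cases. This assumes the non-homogeneous (stratum-specific RR) version of the Rothman formula; the function name and tuple layout are hypothetical.

```python
def stratified_adjusted_paf(strata, total_cases):
    """Sum over strata of (exposed cases / all cases) * (RR_j - 1) / RR_j.

    Each stratum is (exposed_cases, exposed_total, unexposed_cases, unexposed_total).
    """
    paf = 0.0
    for exp_cases, exp_total, unexp_cases, unexp_total in strata:
        rr = (exp_cases / exp_total) / (unexp_cases / unexp_total)
        paf += (exp_cases / total_cases) * (rr - 1) / rr
    return paf


TOTAL_CASES = 700

# Smoking, stratified by cocaine use (cocaine = yes, then cocaine = no)
smoking_by_cocaine = [(52, 150, 38, 150), (148, 1850, 462, 7850)]
# Cocaine, stratified by smoking (smoke = yes, then smoke = no)
cocaine_by_smoking = [(52, 150, 148, 1850), (38, 150, 462, 7850)]

adj_smoking = stratified_adjusted_paf(smoking_by_cocaine, TOTAL_CASES)
adj_cocaine = stratified_adjusted_paf(cocaine_by_smoking, TOTAL_CASES)

print(f"Adjusted PAF, smoking: {adj_smoking:.3f}")
print(f"Adjusted PAF, cocaine: {adj_cocaine:.3f}")
```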
50
The “Adjusted” PAF: Obtaining a Single PAF for a Given Risk Factor
Using the Bruzzi formula, the “strata” are defined as each row of the risk system. In the smoking-cocaine risk system, then, there are 4 “strata”.
For the PAF for smoking, controlling for cocaine use, the 4 ps are the 4 column percents and the 4 RRs are:
rp1/rp2   rp2/rp2   rp3/rp4   rp4/rp4
where rp1 to rp4 are the row percents of the four rows of the risk system, numbered from top to bottom:

Row   Freq (Row Pct)       LOW BIRTHWEIGHT
                           yes             no               Total
1     smoke and cocaine     52 (34.67)       98 (65.33)       150
2     cocaine only          38 (25.33)      112 (74.67)       150
3     smoke only           148 ( 8.00)     1702 (92.00)      1850
4     neither              462 ( 5.89)     7388 (94.11)      7850
      Total                700             9300             10000
51
For the Bruzzi Formula: the RRj* and RRj~
(PAF for smoking, controlling for cocaine use)

Row                  Cases / Total    RRj     RRj*    RRj~
smoke and cocaine     52 / 150        5.89    4.30    1.37
cocaine only          38 / 150        4.30    4.30    1
smoke only           148 / 1850       1.36    1       1.36
neither              462 / 7850       1       1       1

RRj is the component RR, RRj* is the RR for the row's cocaine status in the absence of smoking, and RRj~ is the RR for smoking within the row's cocaine stratum.
52
The “Adjusted” PAF: Obtaining a Single PAF for a Given Risk Factor
In the Bruzzi approach to “adjustment”, there are 3 different versions of the relative risks:
RRj = the component RRs
RRj* = the RRs for combinations of covariates in the absence of the factor being “adjusted”; in this simple example, these are the 2 RRs not involving smoking
RRj~ = the RRs for the factor being “adjusted”, conditioned on combinations of the covariates; in this simple example, these are the 2 RRs for smoking in the presence and in the absence of cocaine use. These are the “stratum-specific” RRs in the classic stratified set-up
53
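The “Adjusted” PAF: Obtaining a Single PAF for a Given Risk Factor
Using the Bruzzi method:

PAF for smoking, controlling for cocaine use:
$$1 - \left[\frac{0.074}{1.37} + \frac{0.054}{1} + \frac{0.21}{1.36} + \frac{0.66}{1}\right] = 1 - \left[0.054 + 0.054 + 0.1544 + 0.66\right] = 1 - 0.922 = 0.078$$

PAF for cocaine, controlling for smoking:
$$1 - \left[\frac{0.074}{4.33} + \frac{0.054}{4.3} + \frac{0.21}{1} + \frac{0.66}{1}\right] = 1 - \left[0.017 + 0.0126 + 0.21 + 0.66\right] = 1 - 0.90 = 0.10$$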
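The Bruzzi version of the same adjustment can be sketched in Python as one minus the sum of each row's column percent divided by its conditional RR (RRj~). With unrounded inputs it returns about 0.076 for smoking and 0.099 for cocaine, the same as the stratified approach; the 0.078 and 0.10 on the slide reflect rounding of the intermediate values. Variable names are illustrative assumptions.

```python
CASES = {"both": 52, "cocaine_only": 38, "smoke_only": 148, "neither": 462}
TOTALS = {"both": 150, "cocaine_only": 150, "smoke_only": 1850, "neither": 7850}

total_cases = sum(CASES.values())
risk = {row: CASES[row] / TOTALS[row] for row in CASES}   # row percents
p = {row: CASES[row] / total_cases for row in CASES}      # column percents

# RRj~ for smoking: smokers vs non-smokers within the same cocaine stratum
rr_smoke = {
    "both": risk["both"] / risk["cocaine_only"],
    "cocaine_only": 1.0,
    "smoke_only": risk["smoke_only"] / risk["neither"],
    "neither": 1.0,
}
# RRj~ for cocaine: users vs non-users within the same smoking stratum
rr_cocaine = {
    "both": risk["both"] / risk["smoke_only"],
    "cocaine_only": risk["cocaine_only"] / risk["neither"],
    "smoke_only": 1.0,
    "neither": 1.0,
}


def bruzzi_paf(p, rr_conditional):
    """One minus the sum over rows of p_j / RRj~."""
    return 1 - sum(p[row] / rr_conditional[row] for row in p)


adj_smoking = bruzzi_paf(p, rr_smoke)
adj_cocaine = bruzzi_paf(p, rr_cocaine)
print(f"Bruzzi adjusted PAF, smoking: {adj_smoking:.3f}")
print(f"Bruzzi adjusted PAF, cocaine: {adj_cocaine:.3f}")
```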
54
The “Adjusted” PAF Obtaining a Single PAF for a Given Factor
The Stratified Approach
Notice that controlling for confounding typically reduces the PAF, just as it typically reduces the relative risk or odds ratio.
Crude v. “Adjusted” PAF for smoking:
0.107 v. 0.076
Crude v. “Adjusted” PAF for cocaine:
0.102 v. 0.099
55
Limitations of the “Adjusted” PAF:
While adjustment methods control for other risk factors, the resulting adjusted PAFs still are not mutually exclusive and they do not meet the criterion of summing to the Summary PAF for all factors combined
Component PAFs: 0.042 + 0.062 + 0.056 = 0.16
“Adjusted” PAFs: 0.076 + 0.099 = 0.175
0.16 ≠ 0.175
[Pie charts: component PAFs 0.062, 0.056 and 0.042 with remainder 0.84; “adjusted” PAFs 0.076 and 0.099 with remainder 0.825]
56
Limitations of the “Adjusted” PAF:
Adjustment procedures result in a PAF that, taken by itself, represents an estimate (perhaps unrealistic) of the impact of eliminating one exposure first in a risk system, controlling for other factors, but not considering that some of those other factors may also be eliminated.
The “adjusted” PAF becomes more useful when it is considered as one element of a set of possible sequences for addressing all of the risk factors in a risk system—HOLD THIS THOUGHT
57
Extension to the Case of 3 Binary Variables
Example: SAS Code for reformatting individual-level data for the outcome and risk factors of interest into k observations

/* Sort by the outcome and the three risk factors */
proc sort data=work.Orig_SampleLBW;
  by lbw smoke cocaine poverty;
run;

/* List the frequency of every outcome-by-exposure combination */
proc freq data=work.Orig_SampleLBW;
  tables lbw*smoke*cocaine*poverty / list;
run;
58
Extension to the Case of 3 Binary Variables
LBW by Smoking, Cocaine use and Poverty

lbw   smoke   cocaine   poverty   Freq   Cumulative Frequency
yes   yes     yes       yes         24       24
yes   yes     yes       no          28       52
yes   yes     no        yes         80      132
yes   yes     no        no          68      200
yes   no      yes       yes         19      219
yes   no      yes       no          19      238
yes   no      no        yes        287      525
yes   no      no        no         175      700
no    yes     yes       yes         35      735
no    yes     yes       no          63      798
no    yes     no        yes        775     1573
no    yes     no        no         927     2500
no    no      yes       yes         50     2550
no    no      yes       no          62     2612
no    no      no        yes       2958     5570
no    no      no        no        4430    10000
59
Extension to the Case of 3 Binary Variables
Data rearranged into “strata” in the Bruzzi sense...

smoke   cocaine   poverty   Casesj   Controlsj   Totalj
yes     yes       yes          24         35        59
yes     yes       no           28         63        91
yes     no        yes          80        775       855
yes     no        no           68        927       995
no      yes       yes          19         50        69
no      yes       no           19         62        81
no      no        yes         287       2958      3245
no      no        no          175       4430      4605
60
Component Prevalences and Relative Risks for a Risk System with Three Variables
Prevalence and RR added

smoke   cocaine   poverty   Casesj   Controlsj   Totalj   pj      RRj
yes     yes       yes          24         35        59    0.034   10.70
yes     yes       no           28         63        91    0.040    8.10
yes     no        yes          80        775       855    0.114    2.46
yes     no        no           68        927       995    0.097    1.80
no      yes       yes          19         50        69    0.027    7.25
no      yes       no           19         62        81    0.027    6.17
no      no        yes         287       2958      3245    0.410    2.33
no      no        no          175       4430      4605    0.250    1.00

Example (first row):
pj = 24 / 700 = 0.034
RRj = (24/59) / (175/4605) = 10.70
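The pj and RRj columns, and the component and Summary PAFs that follow from them, can be reproduced with a short Python sketch. The tuple keys and variable names are illustrative assumptions; the counts are the ones in the table above.

```python
# (smoke, cocaine, poverty) -> (cases, controls)
STRATA = {
    ("yes", "yes", "yes"): (24, 35),
    ("yes", "yes", "no"): (28, 63),
    ("yes", "no", "yes"): (80, 775),
    ("yes", "no", "no"): (68, 927),
    ("no", "yes", "yes"): (19, 50),
    ("no", "yes", "no"): (19, 62),
    ("no", "no", "yes"): (287, 2958),
    ("no", "no", "no"): (175, 4430),  # common reference group
}
REFERENCE = ("no", "no", "no")

total_cases = sum(cases for cases, _ in STRATA.values())
risk = {key: cases / (cases + controls) for key, (cases, controls) in STRATA.items()}

p = {key: cases / total_cases for key, (cases, _) in STRATA.items()}  # column %
rr = {key: risk[key] / risk[REFERENCE] for key in STRATA}             # vs reference

component_paf = {key: p[key] * (rr[key] - 1) / rr[key] for key in STRATA}
summary_paf = sum(component_paf.values())

for key in STRATA:
    print(key, f"p={p[key]:.3f} RR={rr[key]:.2f} PAF={component_paf[key]:.3f}")
print(f"Summary PAF = {summary_paf:.3f}")
```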
61
Unique Cross-Classifications of n Variables
For binary variables, the # of strata k = 2^n, where n = # variables

Example:
Smoke (1=Yes, 0=No),
Cocaine (1=Yes, 0=No),
Poverty (1=Yes, 0=No)
k = 2^3 = 8

smoke   cocaine   poverty
yes     yes       yes
yes     yes       no
yes     no        yes
yes     no        no
no      yes       yes
no      yes       no
no      no        yes
no      no        no

In general, k = the product of the # of levels for each variable;
e.g. in Bruzzi, et al (1985):
k = 2*3*3*4 = 72
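Enumerating the strata is a Cartesian product of the levels of each variable. A minimal Python sketch, with illustrative variable names:

```python
from itertools import product
from math import prod

levels = {
    "smoke": ("yes", "no"),
    "cocaine": ("yes", "no"),
    "poverty": ("yes", "no"),
}

# Every unique cross-classification of the three binary variables
strata = list(product(*levels.values()))
k = prod(len(values) for values in levels.values())

for stratum in strata:
    print(dict(zip(levels, stratum)))
print("k =", k)

# Four variables with 2, 3, 3 and 4 levels give 2*3*3*4 strata
print("k for 2, 3, 3 and 4 levels =", prod([2, 3, 3, 4]))
```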
62
Component PAFs for Entire Risk System
Summary PAF = 0.46
[Pie chart of component PAFs: All exposures 0.031; Smoke and Coke 0.035; Smoke and Poverty 0.068; Smoke Alone 0.043; Coke and Poverty 0.023; Coke Alone 0.023; Poverty Alone 0.234; Unknown 0.543]
63
Summary and Adjusted PAFs for a 3 Factor Risk System
Discuss Worksheet A in Supplementary Excel File
Component, Adjusted and Summary PAF calculations for smoke, cocaine, and poverty
64
Using Modeling to Compute Summary and “Adjusted” PAFs
Advantages of Modeling for Obtaining Intermediate Estimates for PAFs (as usual, in comparison to stratified methods)
Modeling is not as sensitive to sparse data in individual cells when there are many strata
If you choose to consider confounding and effect modification in the same model, estimates are generated more easily
Note: Using an assumption-free approach, all variables are treated as effect modifiers (but this method breaks down quickly as there are more variables in the risk system)
65
Assumption-Free Approach Using Fully Specified Model

/* Binomial Regression – Directly estimate RRs */
proc genmod data=LBW desc;
  model lbw = smoke cocaine poverty
              smoke*cocaine smoke*poverty cocaine*poverty
              smoke*cocaine*poverty / dist=bin link=log;
  weight freq;
run;

/* Logistic Regression – ORs as estimates of RRs */
proc logistic data=LBW desc;
  model lbw = smoke cocaine poverty
              smoke*cocaine smoke*poverty cocaine*poverty
              smoke*cocaine*poverty;
  weight freq;
run;
66
Results from Fully-Specified Binomial Regression Model
PROC GENMOD is modeling the probability that lbw='1'.

Response Profile
Ordered Value   lbw   Total Frequency
1               1      700
2               0     9300

Criteria For Assessing Goodness Of Fit
Criterion             DF   Value        Value/DF
Deviance               8   4816.8235     602.1029
Scaled Deviance        8   4816.8235     602.1029
Pearson Chi-Square     8   9999.9937    1249.9992
Scaled Pearson X2      8   9999.9937    1249.9992
Log Likelihood             -2408.4117
67
Results from Fully-Specified Binomial Regression Model
Analysis of Parameter Estimates
Parameter               DF   Estimate   Standard Error   Chi-Square   Pr > ChiSq
Intercept                1   -3.2701    0.0741           1945.31      <.0001
smoke                    1    0.5869    0.1386             17.94      <.0001
cocaine                  1    1.8201    0.2140             72.36      <.0001
poverty                  1    0.8447    0.0931             82.27      <.0001
smoke*cocaine            1   -0.3155    0.2902              1.18      0.2769
smoke*poverty            1   -0.5306    0.1836              8.35      0.0039
cocaine*poverty          1   -0.6844    0.2951              5.38      0.0204
smoke*cocaine*poverty    1    0.6494    0.4020              2.61      0.1062
Scale                    0    1.0000    0.0000
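Because the model uses a log link, the RR for any exposure combination relative to the unexposed reference is the exponential of the sum of the coefficients that are switched on for that combination. The Python sketch below applies this to the estimates in the table above; the function name and term-parsing approach are illustrative assumptions. With a fully specified model, the resulting RRs reproduce the stratified RRj values (for example 10.70 for all three exposures).

```python
from math import exp
from itertools import product

# Estimates from the fully specified binomial (log-link) model
INTERCEPT = -3.2701
COEF = {
    "smoke": 0.5869,
    "cocaine": 1.8201,
    "poverty": 0.8447,
    "smoke*cocaine": -0.3155,
    "smoke*poverty": -0.5306,
    "cocaine*poverty": -0.6844,
    "smoke*cocaine*poverty": 0.6494,
}


def model_rr(smoke, cocaine, poverty):
    """RR for a 0/1 exposure pattern relative to the all-unexposed group."""
    x = {"smoke": smoke, "cocaine": cocaine, "poverty": poverty}
    log_rr = 0.0
    for term, beta in COEF.items():
        indicator = 1
        for name in term.split("*"):
            indicator *= x[name]  # interaction is on only if every part is 1
        log_rr += beta * indicator
    return exp(log_rr)


for pattern in product((1, 0), repeat=3):
    print(pattern, round(model_rr(*pattern), 2))

# The intercept gives the risk in the reference group (175/4605 in the data)
print("reference risk:", round(exp(INTERCEPT), 4))
```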
68
Component and Summary PAFs from Fully-specified Model
Discuss Worksheet B in Supplementary Excel File: Summary PAFs from Fully Specified Models
69
Re-examining Fully-Specified Model
Non-significant interaction terms could be dropped from model

Analysis of Parameter Estimates
Parameter               DF   Estimate   Standard Error   Chi-Square   Pr > ChiSq
Intercept                1   -3.2701    0.0741           1945.31      <.0001
smoke                    1    0.5869    0.1386             17.94      <.0001
cocaine                  1    1.8201    0.2140             72.36      <.0001
poverty                  1    0.8447    0.0931             82.27      <.0001
smoke*cocaine            1   -0.3155    0.2902              1.18      0.2769
smoke*poverty            1   -0.5306    0.1836              8.35      0.0039
cocaine*poverty          1   -0.6844    0.2951              5.38      0.0204
smoke*cocaine*poverty    1    0.6494    0.4020              2.61      0.1062
70
Reduced Model
proc genmod data=LBW desc;
  model lbw = smoke cocaine poverty
              smoke*poverty cocaine*poverty / dist=bin link=log;
  weight freq;
run;

Analysis of Parameter Estimates
Parameter          DF   Estimate   Standard Error   Chi-Square   Pr > ChiSq
Intercept           1   -3.2506    0.0712           2083.76      <.0001
smoke               1    0.5169    0.1251             17.08      <.0001
cocaine             1    1.6369    0.1482            121.99      <.0001
poverty             1    0.8111    0.0903             80.60      <.0001
smoke*poverty       1   -0.3981    0.1643              5.87      0.0154
cocaine*poverty     1   -0.3407    0.2012              2.87      0.0905

Non-significant interaction term (cocaine*poverty) could be dropped from model
71
Final Model
Analysis of Parameter Estimates
Parameter DF Estimate Standard Error Chi-Square Pr > ChiSq
Intercept 1 -3.2341 0.0704 2110.99 <.0001
smoke 1 0.5741 0.1203 22.79 <.0001
cocaine 1 1.4372 0.0980 214.96 <.0001
poverty 1 0.7787 0.0884 77.69 <.0001
smoke*poverty 1 -0.4778 0.1568 9.28 0.0023
/* Final log-binomial model: only the smoke*poverty interaction is retained */
proc genmod data=LBW desc;
  model lbw = smoke cocaine poverty smoke*poverty / dist=bin link=log;
  weight freq;
run;
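Because the final model uses a log link, each relative risk is the exponential of a sum of coefficients. The sketch below is one way to turn the printed estimates into an RR for every smoke/cocaine/poverty pattern. It is an illustration in Python rather than part of the original SAS workflow or the Excel worksheets, the names `BETA`, `log_risk` and `relative_risk` are hypothetical, and it assumes the jointly unexposed group (0, 0, 0) is the reference.

```python
import math

# Coefficients copied from the "Final Model" parameter estimates above
# (log-binomial model, so exp(beta) is a relative risk).
BETA = {
    "intercept": -3.2341,
    "smoke": 0.5741,
    "cocaine": 1.4372,
    "poverty": 0.7787,
    "smoke*poverty": -0.4778,
}

def log_risk(smoke, cocaine, poverty):
    """Linear predictor (log of predicted LBW risk) for one exposure pattern."""
    return (BETA["intercept"]
            + BETA["smoke"] * smoke
            + BETA["cocaine"] * cocaine
            + BETA["poverty"] * poverty
            + BETA["smoke*poverty"] * smoke * poverty)

def relative_risk(smoke, cocaine, poverty):
    """RR versus the jointly unexposed group (0, 0, 0); an assumed reference."""
    return math.exp(log_risk(smoke, cocaine, poverty) - log_risk(0, 0, 0))

# One RR per (smoke, cocaine, poverty) pattern.
rr_table = {
    (s, c, p): relative_risk(s, c, p)
    for s in (0, 1) for c in (0, 1) for p in (0, 1)
}

for pattern, rr in sorted(rr_table.items()):
    print(pattern, round(rr, 2))
```

The negative smoke*poverty coefficient means the RR for smoking combined with poverty is smaller than the product of the two separate RRs. If poverty is treated as unmodifiable, the comparison group for each pattern would presumably share its poverty level, which changes the RRs that feed the PAF formulas.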
72
Component and Summary PAFs from Final Reduced Model
Discuss Worksheet C in Supplementary Excel File:
Summary PAFs from Final Reduced Models
Day One: 1:00-3:15
Exercise 1
Discussion of Exercise 1
Day One: 3:15-5:00
Overview of Sequential and Average PAFs: Example with 2 modifiable risk factors
Case study with 3 factors:
- 2 modifiable factors, 1 unmodifiable factor
- 3 modifiable factors
Introduction of Exercise 2
75
Sequential PAFs (PAFSEQ) for the Smoking-Cocaine Risk System
For the smoking-cocaine risk system, there are 2 possible sequences:
1. Eliminate smoking first, controlling for cocaine use, then eliminate cocaine use
2. Eliminate cocaine use first, controlling for smoking, then eliminate smoking
And within each sequence, there are two sequential PAFs
76
Sequential PAFs (PAFSEQ) for the Smoking-Cocaine Risk System
1. The PAFSEQ for eliminating smoking, controlling for cocaine use:
PAFSEQ1a (S|C) = 0.076
2. The PAFSEQ for eliminating cocaine use after smoking has already been eliminated is the remainder of the Summary PAF:
PAFSEQ1b = PAFSUM – PAFSEQ1a (S|C) = 0.16 – 0.076 = 0.084
77
Sequential PAFs (PAFSEQ) for the Smoking-Cocaine Risk System
1. The PAFSEQ for eliminating cocaine use, controlling for smoking:
PAFSEQ2a (C|S) = 0.099
2. The PAFSEQ for eliminating smoking after cocaine use has already been eliminated is the remainder of the Summary PAF:
PAFSEQ2b = PAFSUM – PAFSEQ2a (C|S) = 0.16 – 0.099 = 0.061
78
Sequential PAFs (PAFSEQ) for the Smoking-Cocaine Risk System
By definition, the sequential PAFs within each of the two possible sequences sum to the Summary PAF
Smoking First: 0.076 + 0.084 = 0.16
Cocaine Use First: 0.099 + 0.061 = 0.16
[Figure: two pie charts, one per sequence. Smoking First: 0.076, 0.084, remainder 0.84. Cocaine Use First: 0.099, 0.061, remainder 0.84.]
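The remainder logic for a two-factor system can be sketched in a few lines of Python. This is an illustration only, using the PAF values quoted on the slides; the function name `sequential_pafs` is hypothetical and not part of the supplementary worksheets.

```python
def sequential_pafs(paf_summary, paf_first):
    """Split a Summary PAF into the two sequential PAFs of one removal order.

    paf_first is the PAF for the factor removed first (adjusted for the other);
    the factor removed second gets the remainder of the Summary PAF.
    """
    return paf_first, paf_summary - paf_first

PAF_SUM = 0.16          # Summary PAF for smoking and cocaine together
PAF_S_GIVEN_C = 0.076   # smoking removed first, controlling for cocaine
PAF_C_GIVEN_S = 0.099   # cocaine removed first, controlling for smoking

smoking_first = sequential_pafs(PAF_SUM, PAF_S_GIVEN_C)   # (smoking, cocaine)
cocaine_first = sequential_pafs(PAF_SUM, PAF_C_GIVEN_S)   # (cocaine, smoking)

print([round(x, 3) for x in smoking_first])
print([round(x, 3) for x in cocaine_first])
```

Whichever factor goes first, the two pieces add back to the same Summary PAF; only the split between the factors changes.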
79
Average PAF (PAFAVG) for the Smoking-Cocaine Risk System
While the sequential PAFs for each sequence sum to the Summary PAF, they still do not provide an overall comparison of the impact of smoking and cocaine use regardless of the order in which they are eliminated
That is, regardless of when cocaine might be eliminated, what would the impact of eliminating smoking be on average?
80
Average PAF (PAFAVG) for the Smoking-Cocaine Risk System
To calculate an average, the sequential PAFs are rearranged, grouping the two for smoking together and the two for cocaine together:
1. Eliminating smoking first, averaged with eliminating smoking second
2. Eliminating cocaine use first, averaged with eliminating cocaine use second
81
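Average PAF (PAFAVG) for the Smoking-Cocaine Risk System
Averaging Sequential PAFs
Average PAF for Smoking:
$$\mathrm{PAF}_{AVG}(S) = \frac{\mathrm{PAF}_{SEQ}(S|C) + \left[\mathrm{PAF}_{SUM} - \mathrm{PAF}_{SEQ}(C|S)\right]}{2} = \frac{0.076 + 0.061}{2} \approx 0.07$$
Average PAF for Cocaine Use:
$$\mathrm{PAF}_{AVG}(C) = \frac{\mathrm{PAF}_{SEQ}(C|S) + \left[\mathrm{PAF}_{SUM} - \mathrm{PAF}_{SEQ}(S|C)\right]}{2} = \frac{0.099 + 0.084}{2} \approx 0.09$$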
82
Average PAFs for the Smoking-Cocaine Risk System
The Average PAFs for each factor in the risk system are mutually exclusive and their sum equals the Summary PAF:
0.0685 + 0.0915 = 0.16
[Figure: pie chart of average PAFs. Smoking 0.07, Cocaine 0.09, remainder 0.84.]
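The regrouping step, where each factor's sequential PAFs are collected across both removal orders and averaged, can be sketched as follows. This is a minimal Python illustration using the four sequential PAFs from the smoking-cocaine example; the dictionary layout and the name `average_paf` are assumptions made for the sketch.

```python
from statistics import mean

# Sequential PAFs from the two removal orders, keyed by (order, factor).
sequential = {
    ("smoking first", "smoking"): 0.076,
    ("smoking first", "cocaine"): 0.084,
    ("cocaine first", "cocaine"): 0.099,
    ("cocaine first", "smoking"): 0.061,
}

def average_paf(factor):
    """Mean of a factor's sequential PAFs across all removal orders."""
    return mean(v for (_, f), v in sequential.items() if f == factor)

avg_smoking = average_paf("smoking")
avg_cocaine = average_paf("cocaine")
print(round(avg_smoking, 4), round(avg_cocaine, 4), round(avg_smoking + avg_cocaine, 2))
```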
83
Case Study: Example with Three Factors
Scenario: You are asked to prioritize spending for interventions that target the high rate of low birth weight (LBW) in your jurisdiction.
Data: You have a data set with relatively reliable data on smoking during pregnancy, cocaine use during pregnancy and poverty level.
Method: You would like to use one of the methods you just learned for calculating the PAFs for each of these factors.
84
Modifiable and Unmodifiable Risk Factors
Using a Modeling Approach
Within one model, we can differentiate between those factors considered to be modifiable and those factors considered to be unmodifiable
While this does not change the model, this differentiation has an impact on the resulting summary, sequential, and average PAFs due to how relative risks are calculated
85
Decisions for PAF Analysis
Would you consider each of the following variables unmodifiable or modifiable for preventing LBW?
- Smoking (1=Smoking during pregnancy, 0=No smoking)
- Cocaine (1=Cocaine use during pregnancy, 0=No cocaine)
- Poverty (1=Below Federal Poverty Level, 0=Above FPL)
What type(s) of PAF is/are most appropriate?
- Adjusted (focused on only one factor, controlling for others)
- Sequential (specifying one ordering for targeting factors)
- Average (accounting for all possible sequences of eliminating each factor)
86
Descriptive Statistics for Case Study
Frequency and Percent of LBW by Each Covariate
Frequency (Row %)
                      Low Birth Weight
                  yes            no              Total
Smoke     yes     200 (10.0%)    1800 (90.0%)    2000
          no      500 (6.2%)     7500 (93.8%)    8000
Cocaine   yes     90 (30.0%)     210 (70.0%)     300
          no      610 (6.3%)     9090 (93.7%)    9700
Poverty   yes     410 (9.7%)     3818 (90.3%)    4228
          no      290 (5.0%)     5482 (95.0%)    5772
Total             700            9300            10000
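As a starting point before any adjustment, the crude PAF for each factor can be computed directly from this table with Levin's formula, in either of the two forms given at the start of the training: (p0 - p2) / p0, or P(E)(RR - 1) / (1 + P(E)(RR - 1)). The Python sketch below is illustrative only; the function name `crude_paf` and the data layout are hypothetical.

```python
# Cell counts from the descriptive table: (LBW cases, total births) among the exposed.
exposed = {"smoke": (200, 2000), "cocaine": (90, 300), "poverty": (410, 4228)}
TOTAL_CASES, TOTAL_BIRTHS = 700, 10000

def crude_paf(cases_exposed, n_exposed):
    """Return the crude PAF computed two equivalent ways (Levin's formula)."""
    p0 = TOTAL_CASES / TOTAL_BIRTHS                                   # overall LBW prevalence
    p2 = (TOTAL_CASES - cases_exposed) / (TOTAL_BIRTHS - n_exposed)   # LBW prevalence in unexposed
    p_e = n_exposed / TOTAL_BIRTHS                                    # exposure prevalence
    rr = (cases_exposed / n_exposed) / p2                             # crude relative risk
    by_prevalence = (p0 - p2) / p0
    by_rr = p_e * (rr - 1) / (1 + p_e * (rr - 1))
    return by_prevalence, by_rr

for name, (cases, n) in exposed.items():
    a, b = crude_paf(cases, n)
    print(f"{name}: crude PAF = {a:.3f} (check via RR form: {b:.3f})")
```

The three crude values add up to more than the Summary PAF of 0.441 reported for the three-factor system, which reflects the overlapping exposure distributions that the sequential and average PAFs are designed to handle.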
87
Case Study Part I
Calculating Sequential and/or Average PAFs for Smoking and Cocaine Use
Considering Poverty as Unmodifiable
88
Sequential PAFs for the Smoking-Cocaine-Poverty Risk System, Considering Poverty as Unmodifiable
With 3 factors, but only 2 of them modifiable, there are 2 possible sequences:
1. Eliminate smoking first, controlling for cocaine use and poverty, then eliminate cocaine use
2. Eliminate cocaine use first, controlling for smoking and poverty, then eliminate smoking
And within each sequence, there are two sequential PAFs
89
SAS Code: Obtaining Prevalences and Beta Estimates for Smoke, Cocaine and Poverty
/* Create a listing of the frequencies for each possible combination of
   smoke, coke, poverty for LBW cases to calculate proportions */
proc freq order=formatted;
  tables poverty*smoke*cocaine / list nopercent;
  where lbw=1;
run;
/* Binomial regression to obtain RRs.
   Add the DESC option, as in the earlier models, if lbw=1 is not already the modeled level. */
proc genmod;
  title2 "RRs for Smoke and Coke with LBW, controlling for Poverty";
  model lbw = smoke cocaine poverty smoke*poverty
        / dist=bin link=log obstats; /* Binomial distribution, log link */
run;
90
Sequential PAFs for the Smoking-Cocaine-Poverty Risk System, Considering Poverty as Unmodifiable
Discuss Worksheets D and E in Supplementary Excel File:
Calculations for 1st Sequential PAFs, Summary PAFs, and Average PAFs for Smoking and Cocaine, Controlling for Poverty
91
PAFSEQ for Smoking and Cocaine, Considering Poverty as Unmodifiable
Sequence 1: Smoking, THEN Cocaine
PAFSEQ1a: (S | C U P) = 0.074
PAFSEQ1b: (C U S | P) – (S | C U P) = 0.156 – 0.074 = 0.082
Sequence 2: Cocaine, THEN Smoking
PAFSEQ2a: (C | S U P) = 0.098
PAFSEQ2b: (S U C | P) – (C | S U P) = 0.156 – 0.098 = 0.058
The Summary PAF includes only smoking and cocaine, since poverty is unmodifiable.
92
PAFSEQ for Smoking and Cocaine, Considering Poverty as Unmodifiable
[Figure: two pie charts of sequential PAFs, each with PAFSUM=0.156.
Smoking THEN Cocaine, Controlling for Poverty: Seq PAF Smoke 0.074, Seq PAF Coke 0.082, Unknown 0.844.
Cocaine THEN Smoking, Controlling for Poverty: Seq PAF Coke 0.098, Seq PAF Smoke 0.058, Unknown 0.844.]
93
Average PAF (PAFAVG)
Eide (1995): Based on Game Theory according to Cox’s Theorem (1984) for risk allocation (attributable risk among the exposed)
$$\mathrm{PAF}_{AVG,i} = \frac{1}{n!} \sum_{w=1}^{n!} \mathrm{PAF}_{SEQ,i,w}$$
where “n” is the number of modifiable risk factors in the risk system, “w” indexes the n! unique removal sequences for the variables in the risk system, and “i” represents a specific variable in the system
Note: Average PAF is sometimes called the “partial” attributable fraction
94
Average PAFs for Smoking and Cocaine, Controlling for Poverty
Average PAF for Smoking
PAFAVG = (PAFSEQ1a + PAFSEQ2b) / 2
PAFAVG = (0.074 + 0.058) / 2 = 0.066
Average PAF for Cocaine
PAFAVG = (PAFSEQ1b + PAFSEQ2a) / 2
PAFAVG = (0.082 + 0.098) / 2 = 0.090
95
Case Study Part II
Calculating Sequential and/or Average PAFs for Smoking, Cocaine Use, and Poverty
Considering Poverty as Modifiable
96
Sequential PAFs (PAFSEQ) for the Smoking-Cocaine-Poverty Risk System
For the smoking-cocaine-poverty risk system, there are 6 possible sequences:
1. Smoking, cocaine use, poverty
2. Smoking, poverty, cocaine use
3. Cocaine use, smoking, poverty
4. Cocaine use, poverty, smoking
5. Poverty, smoking, cocaine use
6. Poverty, cocaine use, smoking
And within each sequence, there are three sequential PAFs
97
SAS Code: Obtaining Prevalences and Beta Estimates for Smoke, Cocaine and Poverty
/* Create a listing of the frequencies for each possible combination of
   smoke, coke, poverty for LBW cases to calculate proportions */
proc freq order=formatted;
  tables poverty*smoke*cocaine / list nopercent;
  where lbw=1;
run;
/* Binomial regression to obtain RRs.
   Add the DESC option, as in the earlier models, if lbw=1 is not already the modeled level. */
proc genmod;
  title2 "RRs for Smoke and Coke with LBW, controlling for Poverty";
  model lbw = smoke cocaine poverty smoke*poverty
        / dist=bin link=log obstats; /* Binomial distribution, log link */
run;
98
Sequential PAFs
Q: How many unique sequences will there be for removing risk factors from the risk system?
A: n!, where n = # of modifiable risk factors in system
Ex: n=3, n! = 3x2x1 = 6 unique sequences
n=4, n! = 4x3x2x1 = 24 unique sequences
n=5, n! = 5x4x3x2x1 = 120 unique sequences
etc.
To calculate the PAFSEQ for factors removed second and third in a 3 variable risk system, it is necessary to compute the PAF for every pair of two factors combined, adjusting for the third factor. These are intermediate Summary PAFs.
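To see which intermediate Summary PAFs a three-factor system requires, it can help to enumerate the removal sequences and the cumulative sets of factors removed at each step. The Python sketch below is only an illustration of that bookkeeping; the names `factors`, `removal_steps` and `needed` are hypothetical.

```python
from itertools import permutations

factors = ("smoking", "cocaine", "poverty")

def removal_steps(order):
    """For one removal order, list the cumulative sets whose Summary PAFs are needed."""
    return [frozenset(order[:k]) for k in range(1, len(order) + 1)]

orders = list(permutations(factors))
# Every distinct set of removed factors that appears in any sequence.
needed = {subset for order in orders for subset in removal_steps(order)}

print(len(orders), "removal sequences")
for order in orders:
    print(" -> ".join(order))
print(len(needed), "distinct Summary PAFs to compute")
```

For three factors this gives three single-factor ("adjusted") PAFs, three pairwise intermediate Summary PAFs, and the total Summary PAF.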
99
Sequential PAFs for the Smoking-Cocaine-Poverty Risk System, Considering Poverty as Modifiable
Discuss Worksheets F and G in Supplementary Excel File:
Calculations for 1st Sequential PAFs, Summary PAFs, and Average PAFs for Smoking, Cocaine, and Poverty
100
PAFSEQ for Smoking Removed First
Sequence 1: Smoking, THEN Cocaine, THEN Poverty
PAFSEQ1a: (S | C U P) = 0.074
PAFSEQ1b: (S U C | P) – (S | C U P) = 0.156 – 0.074 = 0.082
PAFSEQ1c: (S U C U P) – (S U C | P) = 0.441 – 0.156 = 0.286
Sequence 2: Smoking, THEN Poverty, THEN Cocaine
PAFSEQ2a: (S | P U C) = 0.074
PAFSEQ2b: (S U P | C) – (S | P U C) = 0.383 – 0.074 = 0.310
PAFSEQ2c: (S U P U C) – (S U P | C) = 0.441 – 0.383 = 0.058
101
PAFSEQ for Smoking Removed First
[Figure: two pie charts of sequential PAFs.
Smoking THEN Cocaine, THEN Poverty: Seq PAF Smoke 0.074, Seq PAF Coke 0.082, Seq PAF Poverty 0.286, Unknown 0.559.
Smoking THEN Poverty, THEN Cocaine: Seq PAF Smoke 0.074, Seq PAF Poverty 0.31, Seq PAF Coke 0.058, Unknown 0.559.]
102
PAFSEQ for Cocaine Removed First
Sequence 3: Cocaine, THEN Smoking, THEN Poverty
PAFSEQ3a: (C | S U P) = 0.098
PAFSEQ3b: (C U S | P) – (C | S U P) = 0.156 – 0.098 = 0.058
PAFSEQ3c: (C U S U P) – (C U S | P) = 0.441 – 0.156 = 0.286
Sequence 4: Cocaine, THEN Poverty, THEN Smoking
PAFSEQ4a: (C | P U S) = 0.098
PAFSEQ4b: (C U P | S) – (C | P U S) = 0.355 – 0.098 = 0.257
PAFSEQ4c: (C U P U S) – (C U P | S) = 0.441 – 0.355 = 0.086
103
PAFSEQ for Cocaine Removed First
[Figure: two pie charts of sequential PAFs.
Cocaine THEN Smoking, THEN Poverty: Seq PAF Coke 0.098, Seq PAF Smoke 0.058, Seq PAF Poverty 0.286, Unknown 0.559.
Cocaine THEN Poverty, THEN Smoking: Seq PAF Coke 0.098, Seq PAF Poverty 0.257, Seq PAF Smoke 0.086, Unknown 0.559.]
104
PAFSEQ for Poverty Removed First
Sequence 5: Poverty, THEN Smoking, THEN Cocaine
PAFSEQ5a: (P | S U C) = 0.275
PAFSEQ5b: (P U S | C) – (P | S U C) = 0.383 – 0.275 = 0.108
PAFSEQ5c: (P U S U C) – (P U S | C) = 0.441 – 0.383 = 0.058
Sequence 6: Poverty, THEN Cocaine, THEN Smoking
PAFSEQ6a: (P | C U S) = 0.275
PAFSEQ6b: (P U C | S) – (P | C U S) = 0.355 – 0.275 = 0.080
PAFSEQ6c: (P U C U S) – (P U C | S) = 0.441 – 0.355 = 0.086
105
PAFSEQ for Poverty Removed First
[Figure: two pie charts of sequential PAFs.
Poverty THEN Smoking, THEN Cocaine: Seq PAF Poverty 0.275, Seq PAF Smoke 0.108, Seq PAF Coke 0.058, Unknown 0.559.
Poverty THEN Cocaine, THEN Smoking: Seq PAF Poverty 0.275, Seq PAF Coke 0.08, Seq PAF Smoke 0.086, Unknown 0.559.]
106
PAFAVG for Smoking, Cocaine and Poverty
(6 Sequential PAFs in each Average, 4 are Unique)
Average PAF for Smoking
PAFAVG = (PAFSEQ1a + PAFSEQ2a + PAFSEQ3b + PAFSEQ4c + PAFSEQ5b + PAFSEQ6c) / 6
PAFAVG = (2(0.074) + 0.058 + 0.108 + 2(0.086)) / 6 = 0.081
Average PAF for Cocaine
PAFAVG = (PAFSEQ1b + PAFSEQ2c + PAFSEQ3a + PAFSEQ4a + PAFSEQ5c + PAFSEQ6b) / 6
PAFAVG = (2(0.098) + 0.082 + 0.080 + 2(0.058)) / 6 = 0.079
Average PAF for Poverty
PAFAVG = (PAFSEQ1c + PAFSEQ2b + PAFSEQ3c + PAFSEQ4b + PAFSEQ5a + PAFSEQ6a) / 6
PAFAVG = (2(0.275) + 0.310 + 0.257 + 2(0.286)) / 6 = 0.281
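The same averages can be reproduced by looping over all n! removal orders and taking differences of Summary PAFs. The sketch below is an illustrative Python version, not the worksheet implementation: the `SUMMARY` table holds the rounded single-factor, pairwise and total Summary PAFs quoted on the preceding slides, and the name `average_pafs` is hypothetical. Because it starts from rounded inputs, individual sequential PAFs can differ from the slide values in the third decimal.

```python
from itertools import permutations

# Summary PAFs for each set of removed factors, as reported on the slides above
# (each adjusted for the factors not in the set). S=smoking, C=cocaine, P=poverty.
SUMMARY = {
    frozenset(): 0.0,
    frozenset("S"): 0.074,
    frozenset("C"): 0.098,
    frozenset("P"): 0.275,
    frozenset("SC"): 0.156,
    frozenset("SP"): 0.383,
    frozenset("CP"): 0.355,
    frozenset("SCP"): 0.441,
}

def average_pafs(factors, summary):
    """Average each factor's sequential PAF over all n! removal orders."""
    orders = list(permutations(factors))
    totals = {f: 0.0 for f in factors}
    for order in orders:
        removed = frozenset()
        for f in order:
            after = removed | {f}
            # Sequential PAF: gain in Summary PAF from additionally removing f.
            totals[f] += summary[after] - summary[removed]
            removed = after
    return {f: totals[f] / len(orders) for f in factors}

avg = average_pafs("SCP", SUMMARY)
print({f: round(v, 3) for f, v in avg.items()})
```

Since every sequence telescopes to the total Summary PAF, the three averages necessarily sum to 0.441.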
107
Average PAFs for all possible models
[Figure: three pie charts of average PAFs.
Smoke and Coke (PAFSUM=0.16): Avg PAF Smoke 0.07, Avg PAF Coke 0.09, Unknown 0.84.
Smoke and Coke, Controlling for Poverty (PAFSUM=0.156): Avg PAF Smoke 0.066, Avg PAF Coke 0.090, Unknown 0.85.
Smoke, Coke and Poverty (PAFSUM=0.441): Avg PAF Smoke 0.081, Avg PAF Coke 0.079, Avg PAF Poverty 0.281, Unknown 0.559.]
108
Average PAFs for all possible models, with no interaction term for smoke*poverty
[Figure: three pie charts of average PAFs (legend: Avg PAF Smoke, Avg PAF Coke, Avg PAF Poverty, Unknown).
Smoke and Coke (PAFSUM=0.160): slices 0.068 and 0.092, Unknown 0.840.
Smoke and Coke, Controlling for Poverty (PAFSUM=0.155): slices 0.064 and 0.091, Unknown 0.845.
Smoke, Coke and Poverty (PAFSUM=0.393): slices 0.057, 0.081 and 0.255, Unknown 0.607.]
109
Average PAFs stratified by poverty
[Figure: two pie charts of average PAFs for smoking and cocaine (legend: Avg PAF Smoke, Avg PAF Coke, Unknown).
Poverty = Yes (PAFSUM=0.088): slices 0.067 and 0.021, Unknown 0.912.
Poverty = No (PAFSUM=0.245): slices 0.123 and 0.120, Unknown 0.757.]
110
Introduction of Exercise 2
Day Two: 8:00-12:00
Exercise 2 and Discussion of Exercise 2
Brief Review
Model Building Issues
Exercise 3
112
Review
The Population Attributable Fraction (PAF) could be a useful tool to inform priority-setting and development of targeted interventions in public health since it estimates the potential impact of risk reduction in the population on the occurrence of a health outcome
The PAF incorporates both a measure of association between a risk factor and an outcome and the prevalence of the risk factor in the population as a whole.
113
Review
The Summary PAF is the proportion of an outcome that could be reduced by simultaneously eliminating from the population all modifiable factors in a risk system.
The Summary PAF can be partitioned into:
- Component PAFs
- Sequential PAFs corresponding to a particular removal sequence
- Average PAFs
The modifiable factors in the risk system can be “adjusted” both for each other and for other unmodifiable factors
114
Review
Partitioning of the Summary PAF for a Risk System
[Figure: three panels showing the Summary PAF partitioned into Component PAFs, Sequential PAFs for One Possible Sequence, and Average PAFs]
115
Review
The component PAFs reflect every combination of the modifiable factors in the risk system and do not yield any factor-specific PAF
Sequential PAFs yield factor-specific PAFs, but these factor-specific PAFs vary across the possible removal sequences; the first sequential PAF in any sequence is what is commonly called the “adjusted” PAF
Component PAFs and Sequential PAFs for a given sequence are not mutually exclusive estimates of the impact of eliminating modifiable factors regardless of whether and when other modifiable factors are also eliminated.
116
Review
The number of possible sequences is a function of the number of variables in the risk system and becomes large quickly as the number of variables increases.
Number of Risk Factors   Number of Possible Removal Orderings / Sequences   Number of Unique Sequential PAFs
2                        2! = 2                                             2
3                        3! = 6                                             4
4                        4! = 24                                            8
5                        5! = 120                                           16
6                        6! = 720                                           32
7                        7! = 5,040                                         64
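The two columns of this table follow simple rules: the number of orderings is n!, and the number of unique sequential PAFs for a given factor is 2^(n-1), since a factor's sequential PAF depends only on which of the other n-1 factors were removed before it. The second rule is an inference from the table values and from the "4 are Unique" note for three factors. A short Python sketch (the function name `sequence_counts` is hypothetical) reproduces the table:

```python
from math import factorial

def sequence_counts(n):
    """Removal orderings (n!) and unique sequential PAFs per factor (2**(n-1))."""
    return factorial(n), 2 ** (n - 1)

for n in range(2, 8):
    orderings, unique = sequence_counts(n)
    print(f"{n} factors: {orderings:>5,} orderings, {unique:>2} unique sequential PAFs per factor")
```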
117
Review
The number of average PAFs equals the number of variables in a risk system.
Average PAFs, by considering every possible sequence, yield mutually exclusive estimates, making comparisons of the potential impact of risk reduction intervention strategies possible
The average PAF may be a better measure of impact than the first sequential (“adjusted”) PAF since typically there are multiple interventions operating simultaneously—risk reduction activities are unordered and often intersect
118
Review
Sequence X: Factors M1 through Mn, controlling for UM1 through UMz
PAFSEQXa: (M1 | M2 U ... U Mn U UM1 U ... U UMz)
(“adjusted” PAF for M1)
PAFSEQXb: (M1 U M2 | M3 U ... U Mn U UM1 U ... U UMz) – (M1 | M2 U ... U Mn U UM1 U ... U UMz)
...
PAFSEQXn: (M1 U ... U Mn | UM1 U ... U UMz) – (M1 U ... U Mn-1 | Mn U UM1 U ... U UMz)
The 2nd, 3rd, through (n-1)th sequential PAFs are the remainders from intermediate Summary PAFs; the nth sequential PAF is the remainder from the total Summary PAF
119
Review
Computation of the sequential PAFs within particular removal sequences becomes cumbersome as the number of variables, both modifiable and unmodifiable, increases
Intermediate Summary PAFs are required for differing subsets of modifiable variables in a risk system
120
Review
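Whether computing crude, “adjusted”, summary, or sequential PAFs, and whether using a stratified or modeling approach, some form of either the Rothman or Bruzzi formulas can be used.
Rothman:
$$\mathrm{PAF} = \sum_{j=0}^{k} p_j \left(\frac{RR_j - 1}{RR_j}\right)$$
Bruzzi:
$$\mathrm{PAF} = 1 - \sum_{j=0}^{k} \frac{p_j}{RR_j}$$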
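The two formulas, the sum over exposure levels of p_j (RR_j - 1) / RR_j and 1 minus the sum of p_j / RR_j, give the same answer whenever the p_j sum to 1. The Python sketch below checks this on the crude smoking data from the case study. It takes p_j to be the proportion of cases at exposure level j, which is the usual definition for these formulas but is an assumption here; the function names are hypothetical.

```python
def paf_rothman(case_props, rrs):
    """Sum over exposure levels of p_j * (RR_j - 1) / RR_j."""
    return sum(p * (rr - 1) / rr for p, rr in zip(case_props, rrs))

def paf_bruzzi(case_props, rrs):
    """1 minus the sum over exposure levels of p_j / RR_j."""
    return 1 - sum(p / rr for p, rr in zip(case_props, rrs))

# Crude smoking example from the case-study table:
# level 0 = non-smokers (reference, RR = 1), level 1 = smokers.
case_props = [500 / 700, 200 / 700]            # share of LBW cases in each level
rrs = [1.0, (200 / 2000) / (500 / 8000)]       # crude RR for smokers = 1.6

print(round(paf_rothman(case_props, rrs), 3), round(paf_bruzzi(case_props, rrs), 3))
```

In a multivariable setting the levels j would be the joint exposure patterns, with the RRs taken from the fitted model and the case proportions from the PROC FREQ listing restricted to cases.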
121
Model Building Issues and Strategies
in the Context of Estimating PAFs
Reporting PAFs
122
Model Building Issues and Strategies
Within one model, we can differentiate between those factors considered to be modifiable and those factors considered to be unmodifiable
The differentiation between modifiable and unmodifiable variables may change the final model since this differentiation has an impact on decisions as to whether the variable is included in a final model
In addition, the resulting summary, sequential, and average PAFs will vary depending on which variables are designated as modifiable because of how relative risks are calculated
123
Model Building Issues and Strategies
Variable Selection
Modifiability
- Unmodifiable factors are only used as potential confounders or effect modifiers; PAFs are not calculated for them
- Modifiable factors are factors that can possibly be altered with clear intervention strategies
Being in the pool of modifiable factors not only influences final PAF estimates, but also may change level of measurement, choice of reference level, and handling of confounding and effect modification
124
Model Building Issues and Strategies
Differential handling of unmodifiable and modifiable factors
Levels of measurement:
- Modifiable variables cannot be continuous
- Modifiable variables can be ordinal or nominal
- Sets of dummy variables can be used, but for modifiable factors it means there will be a separate PAF for each dummy variable
- Unmodifiable variables can be at any level of measurement, although if there is effect modification with a modifiable factor, recoding into categories will be necessary for continuous variables
125
Model Building Issues and Strategies
Choice of Reference Level for Comparison
Since PAFs quantify the impact of complete elimination of a risk factor, it may be more realistic to define reference groups that pull back from this maximum:
Some Examples: >= 2 days exercise, rather than >= 5 days exercise
<=1 medical risk factor rather than 0 medical risks
126
Model Building Issues and Strategies
Reference Groups for Modifiable Factors
More restrictive level of the reference group could lead to both a higher prevalence of exposure and stronger measure of effect, resulting in an inflated PAF
Importance of distinguishing between never exposed and formerly exposed
Use conceptual framework and balance evidence with realistic goals
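A small numerical illustration may help show why a more restrictive reference group inflates the PAF. The Python sketch below applies Levin's formula to two made-up scenarios; the prevalences and relative risks are hypothetical values chosen only to show the direction of the effect, not estimates from any data, and the function name `levin_paf` is an assumption.

```python
def levin_paf(prevalence, rr):
    """Levin's formula: P(E)(RR - 1) / (1 + P(E)(RR - 1))."""
    excess = prevalence * (rr - 1)
    return excess / (1 + excess)

# Hypothetical values, chosen only to illustrate the direction of the effect.
# Lenient reference group (e.g. >= 2 days exercise): fewer people count as exposed.
lenient = levin_paf(prevalence=0.30, rr=1.5)
# Restrictive reference group (e.g. >= 5 days exercise): more exposed, larger contrast.
strict = levin_paf(prevalence=0.60, rr=2.0)

print(round(lenient, 3), round(strict, 3))
```

Both the exposure prevalence and the RR rise under the restrictive definition, and the PAF rises with each of them.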
127
Model Building Issues and Strategies
Effect modification
– within modifiable factors—use either a product term or could use common reference coding to create a set of dummy variables
– across modifiable and unmodifiable factors—this might point to doing modeling stratified by the unmodifiable factor involved in the interaction; if the unmodifiable variable is continuous, it would have to be recoded into categories for stratification
– within unmodifiable factors—use a product term or ignore the interaction if it does not have an impact on the measures of association for the modifiable factors
128
Model Building Issues and Strategies
Parsimony is not as important when building a model as a step toward obtaining average PAFs; that is, variables with insignificant RRs / ORs may be included in a final model if the resulting PAFs based on them are meaningfully large.
129
Model Building Issues and Strategies
Criteria for selection of variables for a final model:
The prevalence estimates themselves might also be used to inform decisions about which variables will stay in a model
Criteria for Modifiable Risk Factors Staying in a Model
                                  1st Sequential PAF
                          95% CI Does Not Include 0   95% CI Includes 0
Significant RR / OR
  Close to the null       ?                           ?
  Far from the null       ?                           ?
Not Significant RR / OR
  Close to the null       ?                           ?
  Far from the null       ?                           ?
130
Model Building Issues and Strategies
For unmodifiable factors, statistical significance may be more important as it is one component of indicating the presence of confounding of the effects of the modifiable factors
The prevalence of the unmodifiable factors in the population is not of interest since they are not part of the risk system for which PAFs are being estimated
131
Model Building Issues and Strategies
Possible model building strategies
Build models with one modifiable factor at a time plus the unmodifiable factors
Build models with subsets of modifiable factors that are within a domain (substantively related) plus the unmodifiable factors
Build models starting with all modifiable and unmodifiable factors, and then use a manual backward elimination approach
132
Moving from Modeling to Reporting of PAFs
For any model building strategy:
Choose final pool of modifiable factors based on the significance of the first sequential PAFs and 95% CIs, or some other explicitly decided upon criteria
Calculate average PAFs for all modifiable factors in the final model, but report only those with values above some threshold, e.g. 2%, 5%, 10%?
133
Moving from Modeling to Reporting of PAFs
Even with careful choice of reference levels, average PAFs are probably over-estimates of the expected reduction in an outcome since they assume that all of the factors in a risk system can be completely eliminated from the population
134
Moving from Modeling to Reporting of PAFs
Average PAFs can be refined by differentially weighting removal sequences to reflect issues such as funding streams or political will, since in reality not all removal sequences are equally likely, or by incorporating measures of uptake and efficacy of public health interventions
(this is beyond the scope of this training)
135
Moving from Modeling to Reporting of PAFs
Variance estimates for Average PAFs need to be developed and then a consensus needs to be reached for the interpretation of resulting confidence intervals.
As always, narrower CIs will mean increased reliability
The CIs across multiple PAFs will undoubtedly overlap. What will this mean for informing the prioritization process across modifiable factors?
Will a CI with a lower bound < 0 mean a factor is not significant and therefore not a priority?
136
Presentation and Interpretation
137
Component PAFs for Smoke and Coke, Stratified by Poverty
[Figure: bar chart of Component PAF (y-axis, 0 to 0.3) for the four Smoke/Coke combinations (Smoke=Y Coke=Y; Smoke=Y Coke=N; Smoke=N Coke=Y; Smoke=N Coke=N as the reference), with separate bars for Poverty=No and Poverty=Yes. Total PAF=0.457]
138
Presentation of Sequential PAFs for the Smoke, Coke and Poverty Risk System
[Figure: pie charts of sequential PAFs, with the removal order numbered 1, 2, 3.
Smoke, Poverty, Coke: Seq PAF Smoke 0.07, Seq PAF Poverty 0.31, Seq PAF Coke 0.08, Unknown 0.54.
Coke, Poverty, Smoke: Seq PAF Coke 0.10, Seq PAF Poverty 0.27, Seq PAF Smoke 0.09, Unknown 0.54.]
139
Interpretations of Sequential PAFs from the Smoke-Coke-Poverty Risk System
PAFSEQ (smoking 3rd, after coke and poverty) = 0.09
An additional nine percent of LBW cases can be attributed to smoking after cocaine use and poverty have already been eliminated from the population of pregnant women
The expectation is that an additional 63 cases (0.09*700) of LBW in this sample of pregnant women would have been prevented had smoking been eliminated from the population after the elimination of cocaine and poverty
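The conversion from a PAF to an expected number of cases is a single multiplication. The Python sketch below simply restates the arithmetic from the slide (0.09 x 700); the function name `cases_attributed` is hypothetical.

```python
def cases_attributed(paf, total_cases):
    """Expected number of cases attributed to a factor: PAF x observed cases."""
    return paf * total_cases

TOTAL_LBW_CASES = 700          # LBW cases in the case-study sample
PAF_SEQ_SMOKING_THIRD = 0.09   # smoking removed after cocaine and poverty

print(round(cases_attributed(PAF_SEQ_SMOKING_THIRD, TOTAL_LBW_CASES)))
```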
140
Interpretation of Average PAFs from the Smoke, Coke, and Poverty Risk System
PAFAVG (Smoking) = 0.06
On average, regardless of the order in which risk factors are removed from the risk system, the expectation is that six percent of LBW cases would be prevented if smoking is eliminated from the population, while also considering the impact of cocaine and poverty
PAFAVG (Cocaine) = 0.09
On average, nine percent of LBW cases would be prevented by the additional elimination of cocaine exposure from the population after a random collection of exposures has already been eliminated.
141
Presentation Issues to consider
Is there any time when displaying stratified PAFs would be appropriate?
Targeting an intervention to a particular risk group
Displaying the interaction effects between variables
Others?
142
Interpretation Issues to Consider
PAF should not be misinterpreted as the percent of diseased who have the risk factor of interest or the percent of cases for which an identifiable risk factor can be found.
Example: PAF for impact of 10 factors on breast cancer = 0.25.
Incorrect: Although various risk factors have been identified as causes of breast cancer, the fact remains that in 75% of all breast cancer no identifiable risk factor can be found.
Incorrect: Only 25 percent of breast cancer cases can be attributed to one or more risk factors, meaning that the majority of cancers occur in women with no risk factors.
Rockhill, et al., 1998
143
Interpretation Issues to Consider
Rothman: With a PAF of 25%, the following interpretation is not completely true: 25% of disease would be reduced if X risk factor were eliminated.
1) Assumes all biases are absent
2) Assumes that absence of risk factor would not expand person-years at risk, which could subsequently lead to more cases (in the case of competing risks)
Rothman, & Greenland, 1998
144
Interpretation Issues to Consider
Rothman Example 1: PAF=0.25 for smoking in relation to coronary deaths.
Elimination of smoking could lead to fewer lung cancer deaths, which would lead to more people living long enough to die of coronary heart disease. Therefore, “25% fewer coronary deaths would have occurred had these doctors not smoked” is a little misleading.
Rothman Example 2: PAF=0.20 for spermicide in relation to Down’s syndrome.
Elimination of spermicide use could lead to more pregnancies, which would lead to more Down’s syndrome cases. Therefore, “20% fewer Down’s syndrome cases would have occurred had the couple not used spermicide” is a little misleading.
145
Exercise 3
Day Two: 1:00-2:00
Discuss Exercise 3
Day Two: 2:00-4:30
Interactive Model Building:
Demonstration and Exercise