local model uncertainty and incomplete-data bias
1
Local model uncertainty and
Incomplete-data bias
S. Eguchi, ISM & GUAS
This talk is part of joint work with J. Copas, University of Warwick
2
Hidden Bias
Publication bias - not all studies are reviewed
Confounding - causal effect only partly explained
Measurement error - errors in measure of exposure
3
Lung cancer & passive smoking
[Figure: odds ratio for each of the 30 studies. Horizontal axis: odds ratio (0.3 to 10.0); vertical axis: study.]
4
Passive smoke and lung cancer
Log relative risk estimates $\hat\theta_j$ ($j = 1, \ldots, 30$) from 30 2x2 tables
$\hat\theta = \sum_j w_j \hat\theta_j / \sum_j w_j$ ($w_j$ is the inverse variance weight)
The estimated relative risk 1.24 with 95% confidence interval (1.13, 1.36)
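The inverse-variance pooling behind such an estimate can be sketched as follows. This is a minimal illustration, not the analysis used in the talk: the function name `pool_inverse_variance` and the three input studies are hypothetical, and the interval uses the usual normal approximation on the log scale.

```python
import math

def pool_inverse_variance(estimates, variances, z=1.96):
    """Fixed-effect pooled estimate of log relative risks.

    estimates: per-study log relative risks
    variances: their sampling variances
    Returns (pooled log RR, its standard error, CI on the RR scale).
    """
    weights = [1.0 / v for v in variances]            # w_j = 1 / var_j
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    se = math.sqrt(1.0 / total)                       # SE of the pooled log RR
    ci = (math.exp(pooled - z * se), math.exp(pooled + z * se))
    return pooled, se, ci

# Hypothetical inputs (NOT the 30 passive-smoking studies from the talk).
log_rr = [0.20, 0.40, 0.10]
var = [0.01, 0.04, 0.02]
pooled, se, (low, high) = pool_inverse_variance(log_rr, var)
print(f"pooled RR = {math.exp(pooled):.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```

Precise studies (small variance) dominate the pooled value, which is why the conventional interval is narrow when many studies are combined.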
Conventional analysis
[Figure: odds ratio for each of the 30 studies, with the pooled estimate 1.24 marked. Horizontal axis: odds ratio (0.3 to 10.0); vertical axis: study.]
6
Incomplete Data
z = (data on all studies, selection indicators)
y = (data on selected studies)
z = (response, treatment, potential confounders)
y = (response, treatment)
z = (disease status, true exposure, error)
y = (disease status, observed exposure)
y = h(z)
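A many-to-one map y = h(z) can be made concrete for the missing-data case. The sketch below is only an illustration: the type `Complete` and the function `h` are hypothetical names, and a single scalar value stands in for the full data record.

```python
from typing import NamedTuple, Optional, Tuple

class Complete(NamedTuple):
    """Complete data z: a value plus its selection indicator."""
    value: float
    observed: bool

def h(z: Complete) -> Tuple[Optional[float], bool]:
    """Incomplete data y = h(z): the value is hidden when not observed."""
    return (z.value if z.observed else None, z.observed)

# Two different complete records collapse to the same incomplete record,
# so h is many-to-one on the unobserved cases.
print(h(Complete(1.0, False)) == h(Complete(2.0, False)))
print(h(Complete(1.0, True)))
```

All z that share the same y form one level set of h; for an unobserved case the level set contains every possible hidden value.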
7
Level Sets of h(z)
1. One-to-one
2. Missing
3. Measurement error
4. Interval censor
5. Competing risk
6. Hidden confounder
8
Ignorable incompleteness
Let Y = h(Z) be a many-to-one mapping.
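If $Z$ has density $f_Z(z, \theta)$, then $Y$ has density $f_Y(y, \theta) = \int_{h^{-1}(y)} f_Z(z, \theta)\,dz$
$Z$ is complete; $Y$ is incomplete
MLE $\hat\theta_Z$: data on $Z$
MLE $\hat\theta_Y$: data on $Y$
$f_Z$ is true: $E(\hat\theta_Y) = E(\hat\theta_Z)$
$f_Z$ is wrong: $E(\hat\theta_Y) \ne E(\hat\theta_Z)$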
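A small simulation can illustrate the contrast between the two cases for the simplest parameter, a mean. This is a sketch under stated assumptions: the data are standard normal, the MLE is the sample mean, and the two selection mechanisms `mcar` and `mnar` (hypothetical names and probabilities) stand for a correct and an incorrect missingness assumption.

```python
import random
import statistics

def mean_estimates(n, selection, seed=0):
    """Return (theta_hat_Z, theta_hat_Y) for one simulated sample.

    Z: all n values; Y: only the values kept by selection(t, rng).
    """
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [t for t in z if selection(t, rng)]
    return statistics.fmean(z), statistics.fmean(y)

def mcar(t, rng):
    """Selection ignores the value: missing completely at random."""
    return rng.random() < 0.5

def mnar(t, rng):
    """Selection depends on the value itself: not ignorable."""
    return rng.random() < (0.8 if t > 0 else 0.2)

for name, rule in [("MCAR", mcar), ("MNAR", mnar)]:
    full, observed = mean_estimates(20000, rule)
    print(f"{name}: complete-data mean {full:.3f}, observed-data mean {observed:.3f}")
```

Under the first rule the two estimates agree up to sampling noise; under the second the observed-data mean is shifted upwards, even though an analyst who sees only the selected values cannot tell the two situations apart.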
9
Tubular Neighborhood
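Model $M = \{f_Y(y, \theta) : \theta \in \Theta\}$
Near-model $M_\epsilon = \{g_Y(y, \theta, \epsilon) : \theta \in \Theta\}$, with $\mathrm{KL}(M, M_\epsilon) = \epsilon^2/2$
Tubular neighborhood $N_\epsilon = \{g_Y(y, \theta, \epsilon) : \min_{\theta \in \Theta} \mathrm{KL}(f_Y(\cdot, \theta), g_Y) \le \epsilon^2/2\}$
Copas, Eguchi (2001)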
10
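Mis-specification
$g_Z(z, \theta, \epsilon) = f_Z(z, \theta) \exp\{\epsilon\, u_Z(z, \theta)\}$
$E_f(u_Z) = 0$, $E_f(u_Z^2) = 1$, $E_f(u_Z s_Z) = 0$
$\epsilon = \{2\,\mathrm{KL}(f_Z, g_Z)\}^{1/2}$
$u_Z$: "misspecification direction"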
11
Near model
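$g_Y(y, \theta, \epsilon) = f_Y(y, \theta) \exp\{\epsilon\, u_Y(y, \theta)\}$
where $u_Y(y, \theta) = E_\theta[u_Z(z, \theta) \mid y]$
By $h: z \mapsto y = h(z)$
Model: $f_Z(z, \theta) \xrightarrow{h} f_Y(y, \theta)$
Near-model: $g_Z(z, \theta, \epsilon) \xrightarrow{h} g_Y(y, \theta, \epsilon)$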
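The link between the size of an exponential tilt and the Kullback-Leibler divergence, KL approximately equal to half the squared tilt size, can be checked numerically in a simple case. The sketch assumes a fixed density f = N(0, 1) and direction u(z) = z, which has mean 0 and variance 1 under f. The normalised tilt f(z) exp(eps z - eps^2/2) is then the N(eps, 1) density. The function names and integration settings are illustrative, and the example does not impose the orthogonality of u to a score function.

```python
import math

def normal_pdf(z, mean=0.0):
    """Density of N(mean, 1)."""
    return math.exp(-0.5 * (z - mean) ** 2) / math.sqrt(2.0 * math.pi)

def kl_f_to_g(eps, lo=-12.0, hi=12.0, steps=24000):
    """KL(f, g) = integral of f log(f / g), by the midpoint rule,
    with f = N(0, 1) and g = N(eps, 1), the tilt of f by eps * z."""
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        z = lo + (i + 0.5) * width
        f, g = normal_pdf(z), normal_pdf(z, eps)
        total += f * math.log(f / g) * width
    return total

for eps in (0.1, 0.3, 0.5):
    kl = kl_f_to_g(eps)
    print(f"eps = {eps}: KL = {kl:.6f}, sqrt(2 KL) = {math.sqrt(2.0 * kl):.4f}")
```

In this Gaussian case the relation is exact; for a general density and direction it holds to second order in the tilt size.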
12
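Asymptotic bias
$b = E(\hat\theta_Y) - E(\hat\theta_Z)$
$\max_{u_Z} \|b\|^2 = \epsilon^2 (1 - \lambda_{\min})$
The bound holds if and only if $u_Y(y, \theta) \propto s_Y(y, \theta)$.
$\lambda_{\min}$ is the smallest eigenvalue of information loss $I_Y^{1/2} I_Z^{-1} I_Y^{1/2}$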
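For a single parameter the bias bound reduces to simple arithmetic. The sketch below assumes the bound reads: maximum squared bias = eps^2 (1 - lambda_min), with lambda_min the smallest eigenvalue of I_Y^{1/2} I_Z^{-1} I_Y^{1/2}. In the scalar case that eigenvalue is just the ratio I_Y / I_Z. The function name `max_bias_norm` and the numbers are hypothetical.

```python
import math

def max_bias_norm(eps, info_y, info_z):
    """Scalar version of the bound: eps * sqrt(1 - lambda),
    where lambda = I_Y / I_Z is the fraction of information retained."""
    if not 0.0 < info_y <= info_z:
        raise ValueError("need 0 < I_Y <= I_Z")
    lam = info_y / info_z
    return eps * math.sqrt(1.0 - lam)

# No information lost: the bound is zero, so no first-order bias.
print(max_bias_norm(0.5, 4.0, 4.0))
# A quarter of the information lost (hypothetical values).
print(max_bias_norm(0.5, 3.0, 4.0))
```

The more information the map h discards, the smaller lambda becomes and the larger the bias that a misspecification of fixed size can produce.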
13
From pure misspecification
Unbiased perturbed $\xrightarrow{h}$ biased perturbed
14
The worst case
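Near-model $g_Y(y, \theta, \epsilon, \omega)$ and model $f_Y(y, \theta)$, with $u_Y(y, \theta) = \omega^T s_Y(y, \theta)$.
$\theta^* = \arg\min_{\theta' \in \Theta} \mathrm{KL}(g_Y(\cdot, \theta, \epsilon, \omega), f_Y(\cdot, \theta'))$
[closed form of $\theta^*$, involving $I_Y$, illegible in the transcript]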
15
Nonignorable missingness
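$f_Z(t, r; \theta) = f_T(t; \theta)\, \pi^r (1 - \pi)^{1 - r}$
$z = (t, r)$, $y = (t_{(r)}, r)$
where $t_{(r)} = t$ if $r = 1$, and $t_{(r)} = \mathbb{R}^p$ if $r = 0$
$\|b_Y\|^2$ is expressed through $(1 - \pi)\,\mathrm{var}_g\{\log[P(r = 1 \mid t) / P(r = 0 \mid t)]\}$ [relation and constants illegible in the transcript]
The model ($f_Z$, $f_Y$) assumes MCAR or MAR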
16
Potential confounder
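$z = (t, x, c)$, $y = (t, x)$,
where $t$ = response, $x$ = exposure, $c$ = confounder
$f_Y$: $t \mid x \sim N(\theta^T x, \sigma^2)$
$f_Z$: $E(t \mid x, c) = \theta^T x + \beta^T c$, $c \perp x$
$\|b\|^2_{I_Y}$ is expressed through $\mathrm{cor}^2(t, c \mid x)$ and $\mathrm{cor}^2(c, x)$ [relation illegible in the transcript]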
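The effect of leaving out a confounder can be seen in a small simulation. This sketch uses the textbook omitted-variable result rather than the bound on the slide: with standardised scalar x and c and correlation rho between them, regressing t on x alone estimates theta + beta * rho instead of theta. All names and parameter values are hypothetical.

```python
import random

def slope(x, t):
    """Least-squares slope of t on x (with intercept)."""
    n = len(x)
    mx, mt = sum(x) / n, sum(t) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxt = sum((a - mx) * (b - mt) for a, b in zip(x, t))
    return sxt / sxx

def simulate(n, theta, beta, rho, seed=0):
    """Generate t = theta * x + beta * c + noise with cor(c, x) = rho,
    then return the slope of t on x when c is left out."""
    rng = random.Random(seed)
    x, t = [], []
    for _ in range(n):
        xi = rng.gauss(0.0, 1.0)
        ci = rho * xi + (1.0 - rho ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
        x.append(xi)
        t.append(theta * xi + beta * ci + rng.gauss(0.0, 1.0))
    return slope(x, t)

print(simulate(20000, theta=1.0, beta=0.5, rho=0.0))  # c unrelated to x
print(simulate(20000, theta=1.0, beta=0.5, rho=0.6))  # c correlated with x
```

When c is unrelated to x, omitting it costs precision but not bias; the bias appears only when the confounder is related to both the response and the exposure.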
17
Problem in estimation of bias
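The nonignorable model
$g_Y(y, \theta, \epsilon) = f_Y(y, \theta) \exp\{\epsilon\, u_Y(y, \theta) - \tfrac{1}{2} \epsilon^2\, \mathrm{var}_\theta(u_Y)\}$
gives the worst case if $u_Y(y, \theta) = \omega^T s_Y(y, \theta)$.
However, $\omega$ is inestimable and untestable:
The profile likelihood
$\mathrm{PL}(\epsilon, \omega) = \max_{\theta \in \Theta} \sum_{i=1}^n \log\{g_Y(y_i, \theta, \epsilon, \omega)\}$
is flat at $\omega = 0$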
18
Heckman model for MNAR
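[Selection model $g_{R \mid Y, X}(r \mid y, x)$, written in terms of $\psi^T x$ and $y - \beta^T x$; formula illegible in the transcript]
$z = (t, x, r)$, $y = h(z) = (t_{(r)}, x, r)$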
19
Sensitivity analysis
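The most sensitive model
$g_Y(y, \theta, \omega) = f_Y(y, \theta) \exp\{\omega^T s_Y(y, \theta) - \tfrac{1}{2} \omega^T I_Y \omega\}$
Estimating function of $\theta$ with $\omega$ fixed
$\frac{\partial}{\partial\theta} \log\{g_Y(y, \theta, \omega)\} = s_Y(y, \theta) + \omega^T \frac{\partial}{\partial\theta} s_Y(y, \theta)$ [a factor $I_Y^{1/2}$ on the slide cannot be placed from the transcript]
$\{\hat\theta_{Y,\omega} : \omega^T I_Y \omega = \mathrm{const.}\}$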
20
Scenarios A, B, C
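Inference from $y_1, \ldots, y_n$ using $f_Y$
$C(k) = \{\theta : (\hat\theta_Y - \theta)^T I_Y (\hat\theta_Y - \theta) \le k r^2\}$.
Scenario A: $\epsilon = 0$, $k_A = 1$
Scenario B: $\epsilon > 0$, if we had observed $z_1, \ldots, z_n$ and had found $\epsilon$ acceptable
Scenario C: $\epsilon > 0$ unknown, $k_C > 1$
$k_A < k_B < k_C$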
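Membership in a region of the form C(k) = {theta : (theta_hat - theta)^T I_Y (theta_hat - theta) <= k r^2} is a single quadratic-form check. The sketch below assumes that reading of the definition; the function name `in_region` and the numbers are hypothetical, and the information matrix is passed as a plain nested list.

```python
def in_region(theta, theta_hat, info, k, r):
    """True if theta lies in C(k): the quadratic form of (theta_hat - theta)
    in the information matrix `info` is at most k * r**2."""
    d = [a - b for a, b in zip(theta_hat, theta)]
    size = len(d)
    quad = sum(d[i] * info[i][j] * d[j] for i in range(size) for j in range(size))
    return quad <= k * r ** 2

identity = [[1.0, 0.0], [0.0, 1.0]]
point = [1.5, 0.0]       # candidate parameter value
estimate = [0.0, 0.0]    # hypothetical MLE
print(in_region(point, estimate, identity, k=1, r=1.2))  # conventional region
print(in_region(point, estimate, identity, k=2, r=1.2))  # inflated region
```

Increasing k from 1 to 2 multiplies the squared radius by two, so the region grows by a factor of root 2 in every direction.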
21
Scenarios A and C
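$C(k) = \{\theta : (\hat\theta_Y - \theta)^T I_Y (\hat\theta_Y - \theta) \le k r^2\}$.
Scenario A: $\epsilon = 0$
$I_Y^{1/2} (\hat\theta_Y - \theta) \sim_{f_Y} N(0, I)$
$k_A = 1$, $C(k_A)$ ?!
Scenario C: $\epsilon > 0$ unknown
$I_Y^{1/2} (\hat\theta_Y - \theta - b) \sim_{g_Y} N(0, I)$
$k_C$ [expression illegible in the transcript], $C(k_C)$ ?!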
22
Scenario B
If we could observe $z_1, \ldots, z_n$, we could have the MLE $\hat\theta_Z$ and
),0(~)ˆˆ()( 2/12/1 INII fYZYY θθU
),0(~|)(* INgYUSUS
)}ˆˆ()ˆ{()ˆ( 2/12/1ZYYZZZ θθθθθθS II
Conditional confidence interval
}||)(||:{)( 22*rC uSθu
23
Theorem
Let $C(k) = \{\theta : (\hat\theta_Y - \theta)^T I_Y (\hat\theta_Y - \theta) \le k r^2\}$.
Then $C(1) \subset \bigcup_{\|u\|^2 \le r^2} C^*(u) \subset C(2)$.
[Figure: three panels, each with horizontal and vertical axes running from -1.5 to 1.5.]
24
Risk from passive smoke
25
Passive smoke and lung cancer
The estimated relative risk 1.24 with 95% confidence interval (1.13, 1.36)
Square root rule 95% confidence interval (1.08, 1.41)
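The widened interval can be approximately reproduced from the conventional one. The sketch assumes the square root rule means doubling the variance, that is, multiplying the half-width of the interval on the log scale by root 2 while keeping its log-scale midpoint. The function name `widen_ci` is hypothetical, and because the inputs are the rounded values from the slide, the result agrees with the quoted interval only up to rounding.

```python
import math

def widen_ci(lower, upper, factor=math.sqrt(2.0)):
    """Widen a confidence interval for a ratio measure by `factor`
    on the log scale, keeping its log-scale midpoint fixed."""
    lo_log, hi_log = math.log(lower), math.log(upper)
    centre = 0.5 * (lo_log + hi_log)
    half = 0.5 * (hi_log - lo_log) * factor
    return math.exp(centre - half), math.exp(centre + half)

# Conventional 95% interval quoted on the slide.
low, high = widen_ci(1.13, 1.36)
print(f"({low:.2f}, {high:.2f})")
```

The point estimate is unchanged by the rule; only the stated uncertainty grows, and here the widened interval still excludes a relative risk of 1.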
Root-2-rule
[Figure: odds ratio for each of the 30 studies, with the pooled estimate 1.24 marked. Horizontal axis: odds ratio (0.3 to 10.0); vertical axis: study.]
27
Present and Future
Does all this matter?
Statistics (missing data, response bias, censoring)
Biostatistics (drop-outs, compliance)
Epidemiology (confounding, measurement error)
Econometrics (identifiability, instruments)
Psychometrics (publication bias, SEM)
causality, counter-factuals, ...