Page 1
The multiple faces of shrinkage
Georg Heinze
Center for Medical Statistics, Informatics and Intelligent Systems
Section for Clinical [email protected]
Partly supported by Austrian Science Fund FWF, Project I2276-N33
Page 2
The multiple faces of shrinkage
•
•
Dunkler, Sauerbrei and Heinze, JStatSoft 2016
•
Puhr, Heinze, Nold, Lusa and Geroldinger, StatMed 2017
Page 3
Historical outline
•
•
•
•
•
•
•
•
•
Page 4
Purposes of shrinkage estimators
•
•
•
•
Page 5
Post-estimation shrinkage methods
Joint work with Michael Kammer, Daniela Dunkler, Willi Sauerbrei
Page 10
Post-estimation shrinkage methods
•
• 𝛽
•
•
• 𝑏
• $\beta$, $\beta^{(-i)}$
• $\eta_i = \sum_j x_{ij} \beta_j^{(-i)}$
• 𝑏
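The leave-one-out construction above can be sketched in a few lines. This is a minimal illustration, assuming a linear model fitted by ordinary least squares (the slides treat regression models more generally); the helper names `solve`, `ols` and `global_shrinkage_factor` are hypothetical and the data are made up. The global shrinkage factor $b$ is taken as the slope from regressing the outcome on the cross-validated linear predictor $\eta_i$.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(X, y):
    """Least-squares coefficients (intercept first) for covariate rows X and outcome y."""
    Z = [[1.0] + list(row) for row in X]
    p = len(Z[0])
    xtx = [[sum(z[j] * z[k] for z in Z) for k in range(p)] for j in range(p)]
    xty = [sum(z[j] * yi for z, yi in zip(Z, y)) for j in range(p)]
    return solve(xtx, xty)


def global_shrinkage_factor(X, y):
    """Slope of y on the leave-one-out predictor eta_i = sum_j x_ij * beta_j^(-i)."""
    eta = []
    for i in range(len(y)):
        # Refit without observation i, then predict observation i (intercept excluded).
        beta_loo = ols(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        eta.append(sum(xij * bj for xij, bj in zip(X[i], beta_loo[1:])))
    # Regress y on eta (with intercept); the slope is the shrinkage factor b.
    return ols([[e] for e in eta], y)[1]


# Made-up illustrative data with two covariates.
X = [[0.0, 1.0], [1.0, 0.0], [2.0, 2.0], [3.0, 1.0], [4.0, 3.0], [5.0, 0.0], [6.0, 2.0]]
y = [0.3, 3.4, 2.6, 6.1, 6.5, 10.2, 11.8]
print("global shrinkage factor b:", global_shrinkage_factor(X, y))
```

When the data follow the model without noise, every leave-one-out fit reproduces the full fit and the factor equals 1, i.e. no shrinkage is applied.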
Page 11
Use of the shrinkage factors
•
•
•
•
•
•
$y_{\mathrm{new}} = \beta_0 + b\, x_{i,\mathrm{new}}\, \beta$
•
Page 12
Sauerbrei's (1999) 'parameterwise shrinkage factors'
• $\beta$, $\beta^{(-i)}$
•
partial $\eta_{ij} = x_{ij} \beta_j^{(-i)}$
• $b_j$
•
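A parameterwise version can be sketched along the same lines. Again this is only an illustration, assuming a linear model fitted by least squares; the helper names are hypothetical and the data are made up. Each covariate gets its own partial leave-one-out predictor $\eta_{ij} = x_{ij} \beta_j^{(-i)}$, and the factors $b_j$ are the slopes from regressing the outcome on all partial predictors jointly.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(X, y):
    """Least-squares coefficients (intercept first) for covariate rows X and outcome y."""
    Z = [[1.0] + list(row) for row in X]
    p = len(Z[0])
    xtx = [[sum(z[j] * z[k] for z in Z) for k in range(p)] for j in range(p)]
    xty = [sum(z[j] * yi for z, yi in zip(Z, y)) for j in range(p)]
    return solve(xtx, xty)


def parameterwise_shrinkage_factors(X, y):
    """One shrinkage factor b_j per covariate from partial leave-one-out predictors."""
    partial = []
    for i in range(len(y)):
        beta_loo = ols(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        # eta_ij = x_ij * beta_j^(-i), kept separately for each covariate j.
        partial.append([xij * bj for xij, bj in zip(X[i], beta_loo[1:])])
    # Joint regression of y on all partial predictors; drop the intercept.
    return ols(partial, y)[1:]


# Made-up illustrative data with two covariates.
X = [[0.0, 1.0], [1.0, 0.0], [2.0, 2.0], [3.0, 1.0], [4.0, 3.0], [5.0, 0.0], [6.0, 2.0]]
y = [0.3, 3.4, 2.6, 6.1, 6.5, 10.2, 11.8]
print("parameterwise shrinkage factors b_j:", parameterwise_shrinkage_factors(X, y))
```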
Page 13
Dunkler's (2016) extension of parameterwise shrinkage
• 𝑏𝑗
•
•
•
•
• $G$: $\eta_{ig} = \sum_{j \in J_g} x_{ij} \beta_j^{(-i)}$, $g = 1, \dots, G$
• $\eta_{ig}$, $b_g$, $g = 1, \dots, G$
• $\beta^{(-i)} \approx \beta - \mathrm{DFBETA}_i$
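The DFBETA shortcut avoids refitting the model once per observation. A minimal sketch, assuming simple linear regression, where the relation holds exactly rather than approximately (for generalized linear models it is only an approximation); the function names and data are hypothetical.

```python
def fit_line(x, y):
    """Intercept and slope of the least-squares line."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - slope * mx, slope


def loo_slopes_by_refit(x, y):
    """Leave-one-out slopes obtained by refitting n times."""
    return [fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])[1] for i in range(len(x))]


def loo_slopes_by_dfbeta(x, y):
    """Leave-one-out slopes from a single fit: beta^(-i) = beta - DFBETA_i."""
    n = len(x)
    mx = sum(x) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b0, b1 = fit_line(x, y)
    slopes = []
    for xi, yi in zip(x, y):
        resid = yi - b0 - b1 * xi
        leverage = 1.0 / n + (xi - mx) ** 2 / sxx
        # Slope component of (X'X)^{-1} x_i e_i / (1 - h_i).
        dfbeta = (xi - mx) / sxx * resid / (1.0 - leverage)
        slopes.append(b1 - dfbeta)
    return slopes


# Made-up illustrative data.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 7.0]
y = [1.2, 1.9, 3.4, 3.8, 5.1, 6.5]
max_diff = max(abs(a - b) for a, b in zip(loo_slopes_by_refit(x, y), loo_slopes_by_dfbeta(x, y)))
print("largest difference between refit and DFBETA slopes:", max_diff)
```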
Page 14
Example: deep vein thrombosis study
Page 15
How do shrinkage effects of different methods compare?
•
•
•
• 𝜆
• 𝜆
•
•
Page 24
too pessimistic
too optimistic
Page 26
From bias reduction to shrinkage and beyond
Joint work with Rainer Puhr, Angelika Geroldinger, Sander Greenland
Page 27
Setting the scene
𝛽
𝛽
𝜋
𝜋
𝛽 𝜋
Page 28
Firth's penalization for logistic regression
$L^*(\beta) = L(\beta)\, \det(I(\beta))^{1/2}$,
where $I(\beta)$ is the Fisher information matrix and $L(\beta)$ is the likelihood
• 𝛽,
•
•
Page 29
Firth's penalization for logistic regression
$L^*(\beta) = L(\beta)\, \det(X^t W X)^{1/2}$
$W = \mathrm{diag}\left(\mathrm{expit}(X_i\beta)\,(1 - \mathrm{expit}(X_i\beta))\right) = \mathrm{diag}(\pi_i (1 - \pi_i))$
• The diagonal entries of $W$ are largest at $\pi_i = \frac{1}{2}$, i.e. at $\beta = 0$
• $\frac{1}{2}$,
•
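To make the penalized likelihood concrete, here is a minimal sketch of a Firth-type fit for a logistic model with an intercept and one covariate. It assumes the usual modified score equations $\sum_i [(y_i - \pi_i) + h_i(\frac{1}{2} - \pi_i)]\, x_{ir} = 0$, with $h_i$ the diagonal of the hat matrix, and plain Fisher-scoring steps without step-halving. The function name `firth_logistic` and the toy data are hypothetical.

```python
import math


def expit(t):
    return 1.0 / (1.0 + math.exp(-t))


def firth_logistic(x, y, n_iter=200, tol=1e-10):
    """Firth-penalized fit of logit(pi_i) = b0 + b1 * x_i via the modified score."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        pi = [expit(b0 + b1 * xi) for xi in x]
        w = [p * (1.0 - p) for p in pi]
        # Entries of the 2x2 information matrix X'WX and its determinant.
        a = sum(w)
        b = sum(wi * xi for wi, xi in zip(w, x))
        c = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = a * c - b * b
        # Hat-matrix diagonal h_i = w_i * x_i' (X'WX)^{-1} x_i with x_i = (1, x_i).
        h = [wi * (c - 2.0 * b * xi + a * xi * xi) / det for wi, xi in zip(w, x)]
        # Modified score contributions (y_i - pi_i) + h_i (1/2 - pi_i).
        r = [(yi - p) + hi * (0.5 - p) for yi, p, hi in zip(y, pi, h)]
        u0 = sum(r)
        u1 = sum(ri * xi for ri, xi in zip(r, x))
        # Fisher-scoring step (X'WX)^{-1} U*.
        d0 = (c * u0 - b * u1) / det
        d1 = (a * u1 - b * u0) / det
        b0, b1 = b0 + d0, b1 + d1
        if max(abs(d0), abs(d1)) < tol:
            break
    return b0, b1


# Toy data with quasi-complete separation: every observation with x = 1 has y = 1,
# so the maximum likelihood estimate of b1 is infinite.
x = [0, 0, 0, 1, 1, 1]
y = [0, 0, 1, 1, 1, 1]
b0_hat, b1_hat = firth_logistic(x, y)
print("Firth estimates:", b0_hat, b1_hat)
```

For a single binary covariate the penalized estimates coincide with maximum likelihood after adding 1/2 to each cell of the 2x2 table, which gives a closed form to check the iteration against.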
Page 30
Firth's penalization for logistic regression
•
•
•
•
Page 31
Firth's penalization for logistic regression
•
•
•
Page 32
Firth's logistic regression
1/2
= 2/50 = 0.04
= 11
= 3/52 ≈ 0.058
= 9.89, = 0.054
Page 33
Example of Greenland 2010
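First table: row totals 320 and 32; column totals 346 and 6; total 352
= 32/352 = 0.091, = 33/354 = 0.093
= 2.03, = 2.73
Second table: row totals 321 and 33; bottom row 346.5, 6.5, 354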
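The figures on this slide can be reproduced with a few lines of arithmetic. The cell counts themselves did not survive on the slide; the sketch below assumes the counts 315, 5, 31 and 1, which match the margins 320, 32, 346, 6 and the odds ratio 2.03, and it uses the fact that for a single binary covariate Firth's penalization amounts to adding 1/2 to each cell. The helper names are hypothetical.

```python
# Assumed cell counts (an assumption, not taken from the slide):
# keys are (outcome Y, exposure A).
cells = {(0, 0): 315, (0, 1): 5, (1, 0): 31, (1, 1): 1}


def odds_ratio(t):
    """Cross-product ratio of a 2x2 table."""
    return (t[(1, 1)] * t[(0, 0)]) / (t[(0, 1)] * t[(1, 0)])


def event_rate(t):
    """Proportion of the table total that falls in the Y = 1 row."""
    return (t[(1, 0)] + t[(1, 1)]) / sum(t.values())


# Firth-type augmentation for a saturated 2x2 model: add 1/2 to every cell.
augmented = {key: count + 0.5 for key, count in cells.items()}

or_ml = odds_ratio(cells)
or_firth = odds_ratio(augmented)
rate_ml = event_rate(cells)
rate_firth = event_rate(augmented)

print("odds ratio, original table:", round(or_ml, 2))
print("odds ratio, augmented table:", round(or_firth, 2))
print("event rate, original table:", round(rate_ml, 3))
print("event rate, augmented table:", round(rate_firth, 3))
```

The odds ratio moves away from 1 after augmentation, which is the anti-shrinkage effect the following slides discuss.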
Page 34
Greenland example: likelihood, prior, posterior
Page 35
Bayesian non-collapsibility: anti-shrinkage from penalization
•
•
•
Page 36
An even more extreme example from Greenland 2010
•
• 𝛽1 = 0)
•
Table margins: row totals 30 and 6; column totals 30 and 6; total 36
Page 37
Simulating the example of Greenland
•
•
•
•
Table margins: row totals 320 and 32; column totals 346 and 6; total 352
Page 38
Simulating the example of Greenland
•
[Figure; surviving labels: $\beta_1$, $-\infty$]
Page 39
Simulating the example of Greenland
•
•
•
Page 40
logF(1,1) prior (Greenland and Mansournia, 2015)
•
$L^*(\beta) = L(\beta) \cdot \prod_j \frac{e^{\beta_j / 2}}{1 + e^{\beta_j}}$.
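The penalty factor can be checked numerically. The sketch below evaluates the log of $e^{\beta_j/2} / (1 + e^{\beta_j})$ and compares it with the log-likelihood contribution of one pseudo-record carrying half a success and half a failure at $x_j = 1$ (all other covariates and the intercept set to 0). The two agree algebraically, which is the basis for fitting such priors by data augmentation; the function names are hypothetical.

```python
import math


def log_f11_penalty(beta):
    """Log of the penalty factor e^{beta/2} / (1 + e^{beta}) for one coefficient."""
    return beta / 2.0 - math.log1p(math.exp(beta))


def pseudo_record_loglik(beta):
    """Log-likelihood of a pseudo-record with 1/2 success and 1/2 failure at x_j = 1."""
    pi = 1.0 / (1.0 + math.exp(-beta))
    return 0.5 * math.log(pi) + 0.5 * math.log(1.0 - pi)


for beta in (-3.0, -1.0, 0.0, 0.5, 2.0):
    print(beta, log_f11_penalty(beta), pseudo_record_loglik(beta))
```

The penalty is symmetric around $\beta_j = 0$ and largest there, so it pulls each coefficient towards zero.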
Page 41
Simulating the example of Greenland
•
[Figure; surviving labels: $\beta_1$, $-\infty$]
Page 43
Other, more subtle occurrences of Bayesian non-collapsibility
•
•
•
•
Page 44
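Simulation of bivariable logistic regression models
• $X_1, X_2 \sim \mathrm{Bin}(0.5)$, $r = 0.8$, $n = 50$
• $\beta_1 = 1.5$, $\beta_2 = 0.1$, $\lambda$
[Figure; surviving labels: $\lambda$, $\beta_1$, $\beta_2$]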
Page 45
Anti-shrinkage from penalization?
•
•
with
• ≠
Page 46
Reason for anti-shrinkage
•
•
•
•
•
Page 47
Example of Greenland 2010 revisited
First table: row totals 320 and 32; column totals 346 and 6; total 352
Second table: row totals 321 and 33; bottom row 347, 7, 352
•
Page 48
FLAC: Firth's Logistic regression with Added Covariate
Page 49
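FLAC: Firth's Logistic regression with Added Covariate
$\sum_{i=1}^{N} \left[ (y_i - \pi_i)\, x_{ir} + h_i \left( \tfrac{1}{2} - \pi_i \right) x_{ir} \right] = 0; \quad r = 0, \dots, p$
where $h_i$ are the diagonal elements of the hat matrix $H = W^{1/2} X (X'WX)^{-1} X' W^{1/2}$
$\sum_{i=1}^{N} (y_i - \pi_i)\, x_{ir} + \sum_{i=1}^{N} h_i \left( \tfrac{1}{2} - \pi_i \right) x_{ir} =$
$= \sum_{i=1}^{N} (y_i - \pi_i)\, x_{ir} + \sum_{i=1}^{N} \frac{h_i}{2} (y_i - \pi_i)\, x_{ir} + \sum_{i=1}^{N} \frac{h_i}{2} (1 - y_i - \pi_i)\, x_{ir} = 0$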
Page 50
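FLAC: Firth's Logistic regression with Added Covariate
•
$\sum_{i=1}^{N} (y_i - \pi_i)\, x_{ir} + \sum_{i=1}^{N} \frac{h_i}{2} (y_i - \pi_i)\, x_{ir} + \sum_{i=1}^{N} \frac{h_i}{2} (1 - y_i - \pi_i)\, x_{ir} = 0$
(weights of the second and third sums: $h_i/2$ and $h_i/2$)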
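The split of the penalty term into two weighted pseudo-data sets suggests a data-augmentation view, which can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes FLAC proceeds as described in Puhr et al. (2017), i.e. (1) a Firth fit supplies the $h_i$, (2) the data are augmented by a copy of the original observations and a copy with reversed outcomes, each weighted $h_i/2$, (3) an indicator covariate marks the pseudo-data and the augmented data are fitted by weighted maximum likelihood, with the indicator's coefficient dropped for prediction. The function names and toy data are hypothetical, and the Newton iterations have no step-halving.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def expit(t):
    return 1.0 / (1.0 + math.exp(-t))


def fit(X, y, wts, firth, n_iter=500, tol=1e-10):
    """Weighted logistic fit; firth=True adds h_i (1/2 - pi_i) to the score. Returns (beta, h)."""
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        pi = [expit(sum(b * v for b, v in zip(beta, r))) for r in X]
        info = [[sum(wt * q * (1 - q) * r[j] * r[k] for r, q, wt in zip(X, pi, wts))
                 for k in range(p)] for j in range(p)]
        # Hat-matrix diagonal h_i = w_i x_i' (X'WX)^{-1} x_i.
        h = [wt * q * (1 - q) * sum(v * s for v, s in zip(r, solve(info, r)))
             for r, q, wt in zip(X, pi, wts)]
        resid = [wt * (yi - q) + (hi * (0.5 - q) if firth else 0.0)
                 for yi, q, wt, hi in zip(y, pi, wts, h)]
        score = [sum(e * r[j] for e, r in zip(resid, X)) for j in range(p)]
        step = solve(info, score)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta, h


def flac(X, y):
    """FLAC sketch: ML on Firth-augmented data with an added indicator covariate."""
    n = len(y)
    _, h = fit(X, y, [1.0] * n, firth=True)
    X_aug = [r + [0.0] for r in X] + [r + [1.0] for r in X] + [r + [1.0] for r in X]
    y_aug = y + y + [1 - yi for yi in y]
    w_aug = [1.0] * n + [hi / 2.0 for hi in h] * 2
    beta, _ = fit(X_aug, y_aug, w_aug, firth=False)
    return beta[:-1]  # drop the coefficient of the added covariate


# Toy data: design rows are (intercept, x); event rate is 3/10.
X = [[1.0, 0.0]] * 6 + [[1.0, 1.0]] * 4
y = [0, 0, 0, 0, 0, 1, 0, 0, 1, 1]
beta_firth, _ = fit(X, y, [1.0] * len(y), firth=True)
beta_flac = flac(X, y)
mean_firth = sum(expit(sum(b * v for b, v in zip(beta_firth, r))) for r in X) / len(y)
mean_flac = sum(expit(sum(b * v for b, v in zip(beta_flac, r))) for r in X) / len(y)
print("observed event rate:", sum(y) / len(y))
print("mean predicted probability, Firth:", mean_firth)
print("mean predicted probability, FLAC:", mean_flac)
```

Because the augmented model contains both an intercept and the indicator, the score equations force the predicted probabilities on the original data to average to the observed event rate, whereas the plain Firth fit pulls that average towards 1/2.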
Page 52
FLAC: Firth's Logistic regression with Added Covariate
Page 54
Simulation study: the set-up
•
•
•
•
Page 55
Other methods for accurate prediction
•
$L^*(\beta) = L(\beta)\, \det(X^t W X)^{\tau}$, $\tau = 0.1$,
•
•
•
Page 56
Cauchy priors (CP)
•
•
•
•
bayesglm in the R package arm.
Page 57
Simulation results
• 𝛽
• 𝛽
•
• 𝜋
Page 58
Predictions: bias and RMSE
Page 66
Comparison
FLAC
•
•
•
•
•
•
•
Bayesian methods (CP, logF)
•
• m m m
m m
• m
•
•
Ridge
Page 67
Confidence intervals
•
•
• a-priori
•
• ($\beta \pm 1.96\, \mathrm{SE}$)
Page 68
Conclusion
Part 1: Prediction under model uncertainty
•
•
•
•
•
•
•
Part 2: Prediction under sparsity (fixed model)