Bayesian Multivariate Logistic Regression by Sean O’Brien and David Dunson (Biometrics, 2004 ) Presented by Lihan He ECE, Duke University May 16, 2008

Page 1

Bayesian Multivariate Logistic Regression

by

Sean O’Brien and David Dunson

(Biometrics, 2004 )

Presented by Lihan He

ECE, Duke University

May 16, 2008

Page 2

Outline

Univariate logistic regression

Multivariate logistic regression

Prior specification and convergence

Posterior computation

Experimental results

Conclusions

Page 3

Univariate Logistic Regression Model

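$\Pr(y_i = 1 \mid x_i, \beta) = \frac{1}{1 + e^{-x_i'\beta}}$

Equivalent:

$y_i = 1(z_i > 0)$

$z_i \sim L(\cdot \mid x_i'\beta)$

$z_i$: latent variable

$L(\cdot)$: logistic density

Logistic density: $L(z \mid \mu) = \frac{e^{-(z-\mu)}}{[1 + e^{-(z-\mu)}]^2}$

CDF: $F_L(z \mid \mu) = \frac{1}{1 + e^{-(z-\mu)}}$

$\Pr(y_i = 1) = 1 - F_L(z_i = 0) = \frac{1}{1 + e^{-x_i'\beta}}$

$\Pr(y_i = 0) = F_L(z_i = 0) = \frac{e^{-x_i'\beta}}{1 + e^{-x_i'\beta}}$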
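The equivalence between the logistic model and the thresholded latent variable can be checked by simulation. The sketch below is illustrative only: it draws z from a logistic density with location x'β by inverting the CDF, then compares the frequency of z > 0 with the closed-form probability. The function names and the value 0.8 for the linear predictor are assumptions made for the example, not part of the slides.

```python
import math
import random


def logistic_prob(eta):
    """Pr(y = 1) under the logistic model with linear predictor eta = x'beta."""
    return 1.0 / (1.0 + math.exp(-eta))


def sample_logistic(mu, rng):
    """Draw z from the logistic density with location mu by inverting its CDF."""
    u = rng.random()
    while u == 0.0:  # random() can return exactly 0.0, which would give log(0)
        u = rng.random()
    return mu + math.log(u / (1.0 - u))


def latent_frequency(eta, n_draws=200_000, seed=0):
    """Monte Carlo estimate of Pr(z > 0) when z is logistic with location eta."""
    rng = random.Random(seed)
    hits = sum(sample_logistic(eta, rng) > 0.0 for _ in range(n_draws))
    return hits / n_draws


if __name__ == "__main__":
    eta = 0.8  # hypothetical value of x'beta
    print(logistic_prob(eta), latent_frequency(eta))
```

The two printed numbers should agree up to Monte Carlo error, which is the content of the "Equivalent" formulation above.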

Page 4

Univariate Logistic Regression Model

Approximation using t distribution

$\sigma^2 = \pi^2(\nu - 2)/(3\nu)$, set $\nu = 7.3$

[Figure: logistic density and t density plotted for z from -8 to 8, with the vertical axis running from 0 to 0.25]
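A rough numerical check of the t approximation is sketched below. It evaluates the standard logistic density and a t density with ν = 7.3 and scale σ² = π²(ν − 2)/(3ν) on a grid and reports the largest pointwise gap. The grid range and resolution are arbitrary choices for illustration, and the helper names are hypothetical.

```python
import math

NU = 7.3  # degrees of freedom quoted on the slide
SIGMA2 = math.pi ** 2 * (NU - 2.0) / (3.0 * NU)  # matches the logistic variance pi^2 / 3


def logistic_pdf(z):
    """Standard logistic density (location 0)."""
    e = math.exp(-abs(z))  # the density is symmetric; abs() avoids overflow
    return e / (1.0 + e) ** 2


def t_pdf(z, nu=NU, sigma2=SIGMA2):
    """Density of a t distribution with nu d.o.f., location 0 and scale sqrt(sigma2)."""
    log_norm = (math.lgamma((nu + 1.0) / 2.0) - math.lgamma(nu / 2.0)
                - 0.5 * math.log(nu * math.pi * sigma2))
    return math.exp(log_norm - (nu + 1.0) / 2.0 * math.log1p(z * z / (nu * sigma2)))


def max_abs_gap(lo=-8.0, hi=8.0, steps=1600):
    """Largest pointwise gap between the two densities on an evenly spaced grid."""
    grid = (lo + (hi - lo) * i / steps for i in range(steps + 1))
    return max(abs(logistic_pdf(z) - t_pdf(z)) for z in grid)


if __name__ == "__main__":
    print(max_abs_gap())
```

With this choice of σ² the t distribution has variance σ²ν/(ν − 2) = π²/3, the same as the logistic distribution.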

Page 5

Multivariate Logistic Regression Model

Binary variable for each output

with

-- marginal pdf has univariate logistic density

, F-1( ) is the inverse CDF of density

Page 6

Multivariate Logistic Regression Model

Property

The marginal univariate densities of $z_j$, for $j = 1, \dots, p$, have univariate logistic form.

For $p = 1$, the density reduces to the univariate logistic density.

R is a correlation matrix (with 1's on the diagonal), reflecting the correlations between the $z_j$, and hence the correlations between the $y_j$.

For R = diag(1, …, 1), the density reduces to a product of univariate logistic densities, and the elements of z are uncorrelated.

Good convergence properties for MCMC sampling.

Page 7

Multivariate Logistic Regression Model

Likelihood

M-ary variable for each output (ordered)

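Assume

$\Pr(y_{ij} \le k \mid X_i, \beta, \alpha) = \frac{1}{1 + e^{-(\alpha_k - x_{ij}'\beta)}}$

Define

$y_{ij} = \sum_{k=1}^{d} k \, 1(\alpha_{k-1} < z_{ij} \le \alpha_k)$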
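The ordered-category construction can be made concrete with a small sketch. It computes the cumulative probabilities Pr(y ≤ k) from a set of cutpoints, differences them to obtain Pr(y = k), and maps a latent z to its category. The outer cutpoints α_0 = −∞ and α_d = +∞ are assumed here (the slide does not state them), and the function names and cutpoint values are hypothetical.

```python
import math


def cumulative_prob(alpha_k, eta):
    """Pr(y <= k) = 1 / (1 + exp(-(alpha_k - eta))), with eta = x'beta."""
    return 1.0 / (1.0 + math.exp(-(alpha_k - eta)))


def category_probs(cutpoints, eta):
    """Pr(y = k) for k = 1..d, given interior cutpoints alpha_1 < ... < alpha_{d-1}.

    Assumption: alpha_0 = -inf and alpha_d = +inf, so the CDF runs from 0 to 1.
    """
    cdf = [0.0] + [cumulative_prob(a, eta) for a in cutpoints] + [1.0]
    return [cdf[k] - cdf[k - 1] for k in range(1, len(cdf))]


def category_from_latent(z, cutpoints):
    """y = sum_k k * 1(alpha_{k-1} < z <= alpha_k): index of the interval holding z."""
    return 1 + sum(z > a for a in cutpoints)


if __name__ == "__main__":
    cuts = [-1.0, 0.5, 2.0]  # hypothetical cutpoints, giving d = 4 categories
    print(category_probs(cuts, eta=0.3))
    print(category_from_latent(0.7, cuts))
```

With a single cutpoint at 0 the construction collapses to the binary case, where the second category plays the role of y = 1.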

Page 8

Prior specification and convergence

or

R: uniform density on [-1, 1] for each off-diagonal element
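One way to picture this prior is to draw each off-diagonal element uniformly on [-1, 1] and keep only draws that form a valid (positive definite) correlation matrix. The slide does not say how positive definiteness is enforced, so the rejection step below is an assumption made for illustration, as are the function names.

```python
import math
import random


def is_positive_definite(matrix):
    """Attempt a Cholesky factorisation; it succeeds only for positive definite input."""
    n = len(matrix)
    chol = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = matrix[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            if i == j:
                if s <= 1e-12:  # non-positive pivot: not positive definite
                    return False
                chol[i][j] = math.sqrt(s)
            else:
                chol[i][j] = s / chol[j][j]
    return True


def draw_correlation_matrix(p, rng, max_tries=10_000):
    """Draw off-diagonals uniformly on [-1, 1]; return the first valid correlation matrix."""
    for _ in range(max_tries):
        r = [[1.0] * p for _ in range(p)]  # diagonal stays at 1
        for i in range(p):
            for j in range(i):
                r[i][j] = r[j][i] = rng.uniform(-1.0, 1.0)
        if is_positive_definite(r):
            return r
    raise RuntimeError("no positive definite draw found")


if __name__ == "__main__":
    print(draw_correlation_matrix(3, random.Random(1)))
```

For p = 2 essentially every draw is accepted; the acceptance rate of this naive scheme falls as p grows.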

Page 9

Posterior Computation

Posterior:

Prior and likelihood are not conjugate

Proposal distribution: use a multivariate t distribution to approximate the multivariate logistic density in the likelihood.

Importance sampling: sample from the proposal distribution to approximate samples from the posterior of (β, R), and use importance weights for exact inference.

Page 10

Posterior Computation

Introduce latent variables and z, the proposal is expressed as

Sample and z from the full conditionals, since the likelihood is conjugate to the prior.

Update R using a Metropolis step (accept/reject):

Set R to the candidate value with the acceptance probability

Keep the current value of R otherwise
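The accept/reject logic of a Metropolis update can be sketched on a deliberately simpler problem. The example below updates a single correlation ρ for unit-variance bivariate normal pairs under a uniform prior on (-1, 1), using a symmetric random-walk proposal. This stand-in target is an assumption chosen to keep the code short; it is not the t-based proposal model of the slides, and all names and tuning values are hypothetical.

```python
import math
import random


def log_target(rho, n, sum_sq, sum_cross):
    """Log posterior of rho, up to a constant, for n unit-variance bivariate
    normal pairs under a uniform prior on (-1, 1). A stand-in target only."""
    if not -1.0 < rho < 1.0:
        return -math.inf  # outside the support of the uniform prior
    one_minus = 1.0 - rho * rho
    quad = sum_sq - 2.0 * rho * sum_cross
    return -0.5 * n * math.log(one_minus) - quad / (2.0 * one_minus)


def metropolis_step(rho, stats, rng, step=0.05):
    """Propose from a symmetric random walk; accept with probability min(1, ratio)."""
    candidate = rho + rng.gauss(0.0, step)
    log_ratio = log_target(candidate, *stats) - log_target(rho, *stats)
    if math.log(1.0 - rng.random()) < log_ratio:
        return candidate  # set rho to the candidate with the acceptance probability
    return rho            # keep the current value otherwise


def run_chain(true_rho=0.6, n=500, iters=3000, burn=500, seed=0):
    """Simulate data, run the chain, and return the post-burn-in mean of rho."""
    rng = random.Random(seed)
    sum_sq = sum_cross = 0.0
    for _ in range(n):
        a = rng.gauss(0.0, 1.0)
        b = true_rho * a + math.sqrt(1.0 - true_rho ** 2) * rng.gauss(0.0, 1.0)
        sum_sq += a * a + b * b
        sum_cross += a * b
    stats = (n, sum_sq, sum_cross)
    rho, draws = 0.0, []
    for t in range(iters):
        rho = metropolis_step(rho, stats, rng)
        if t >= burn:
            draws.append(rho)
    return sum(draws) / len(draws)


if __name__ == "__main__":
    print(run_chain())
```

Because the proposal is symmetric, the acceptance ratio involves only the target density at the candidate and current values. Candidates outside (-1, 1) have zero prior density and are always rejected.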

Page 11

Posterior Computation

Importance weights for inference

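$\int g(\beta, R)\, \frac{\pi(\beta, R \mid y)}{\pi^{*}(\beta, R \mid y)}\, \pi^{*}(\beta, R \mid y)\, d\beta\, dR$

The ratio $\pi(\beta, R \mid y) / \pi^{*}(\beta, R \mid y)$ gives the weights.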
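The role of the weights can be illustrated with a one-dimensional toy example. The sketch below draws from a scaled t distribution (ν = 7.3) as the proposal, weights each draw by the ratio of the standard logistic density to the t density, and forms a self-normalised estimate of an expectation under the logistic target. The univariate target, the self-normalisation and the function names are assumptions made for illustration; the slides apply the idea to the joint posterior of (β, R).

```python
import math
import random

NU = 7.3
SIGMA = math.sqrt(math.pi ** 2 * (NU - 2.0) / (3.0 * NU))


def log_logistic_pdf(z):
    """Log of the standard logistic density (the target in this toy example)."""
    a = abs(z)
    return -a - 2.0 * math.log1p(math.exp(-a))


def log_t_pdf(z):
    """Log density of the scaled t proposal, up to an additive constant."""
    return -(NU + 1.0) / 2.0 * math.log1p((z / SIGMA) ** 2 / NU)


def sample_t(rng):
    """Scaled t draw: a standard normal divided by sqrt(chi-square / nu)."""
    chi2 = rng.gammavariate(NU / 2.0, 2.0)  # chi-square with NU degrees of freedom
    return SIGMA * rng.gauss(0.0, 1.0) / math.sqrt(chi2 / NU)


def importance_estimate(g, n_draws=100_000, seed=0):
    """Self-normalised importance sampling estimate of E[g(z)] under the target."""
    rng = random.Random(seed)
    num = den = 0.0
    for _ in range(n_draws):
        z = sample_t(rng)
        w = math.exp(log_logistic_pdf(z) - log_t_pdf(z))  # weight = target / proposal
        num += w * g(z)
        den += w
    return num / den


if __name__ == "__main__":
    # The logistic variance is pi^2 / 3, so the two numbers should be close.
    print(importance_estimate(lambda z: z * z), math.pi ** 2 / 3.0)
```

Self-normalising means the missing constant in the proposal density cancels. The weights stay bounded here because the t proposal has heavier tails than the logistic target.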

Page 12

Application

Subject: 584 twin pregnancies

Output: small for gestational age (SGA), defined as a birthweight below the 10th percentile for a given gestational age in a reference population.

Binary output: $y_{ij} \in \{0, 1\}$, $i = 1, \dots, 584$, $j = 1, 2$

Covariates: $x_{ij}$ for the $i$th pregnancy and the $j$th infant

Page 13

Application

Obtain nearly identical estimates to the study of AP for the regression coefficients.

Female gender (β1), prior preterm delivery (β4, β5) and smoking (β8) are associated with an increased risk of SGA.

Outcomes for twins are highly correlated, represented by R.

Page 14

Conclusions

Propose a multivariate logistic density for the multivariate logistic regression model.

The proposed multivariate logistic density is closely approximated by a multivariate t distribution.

Has properties that facilitate efficient sampling and guaranteed convergence.

The marginals are univariate logistic densities.

Embed the correlation structure within the model.