Page 1

Information geometry of statistical inference with selective sample

S. Eguchi, ISM & GUAS

This talk is part of joint work with J. Copas, University of Warwick

Page 2

Local Sensitivity Approximation for Selectivity Bias.

J. Copas and S. Eguchi

J. Royal Statist. Soc. B, 63 (2001), 871-895. (http://www.ism.ac.jp/~eguchi/recent_preprint.html)

Page 3

Summary

Ignorability

Selection bias

Sensitivity analysis

Tubular neighborhood of model $M$

Compare the inference for $M$ with that for the near-model of $M$

Get interpretable bounds

Page 4

Statistical model

Statistical model:

$$(X_1, \dots, X_n) \overset{\text{iid}}{\sim} p(x, \theta)$$

Probability model:

$$P(X \in B) = \int_B p(x, \theta)\, \mu(\mathrm{d}x)$$

$$x \in \mathcal{X} \subseteq \mathbb{R}^{d}, \qquad \theta \in \Theta \subseteq \mathbb{R}^{p}$$

Page 5

Near parametric

$$(X_1, \dots, X_n) \sim g(x_1, \dots, x_n)$$

$$\min_{\theta} \mathrm{KL}\big(g, f(\cdot, \theta)\big) = O(\,\cdots)$$

[The order in $n$ on the right-hand side was lost in extraction.]

[Figure: models ranged from exact parametric, through near-parametric, to non-parametric.]

Page 6

Ignorability

$Y$: response variable

$X$: covariate variable

$Z$: observational status

Def. $(X, Y, Z)$ is ignorable $\iff$ $Y \perp\!\!\!\perp Z \mid X$
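As a small illustration of the definition, the sketch below checks the conditional independence of $Y$ and $Z$ given $X$ on a finite joint distribution. The function name `is_ignorable`, the dictionary layout and the two toy distributions are assumptions made for this example only; they do not come from the talk.

```python
from itertools import product


def is_ignorable(joint, tol=1e-12):
    """Check Y independent of Z given X for a finite joint distribution.

    `joint` maps (x, y, z) -> probability (hypothetical layout for this sketch).
    Ignorability holds when p(y, z | x) = p(y | x) * p(z | x) for every cell.
    """
    xs = sorted({x for x, _, _ in joint})
    ys = sorted({y for _, y, _ in joint})
    zs = sorted({z for _, _, z in joint})
    for x in xs:
        px = sum(joint.get((x, y, z), 0.0) for y, z in product(ys, zs))
        if px == 0.0:
            continue  # nothing to check for a covariate value with no mass
        for y, z in product(ys, zs):
            p_yz = joint.get((x, y, z), 0.0) / px
            p_y = sum(joint.get((x, y, zz), 0.0) for zz in zs) / px
            p_z = sum(joint.get((x, yy, z), 0.0) for yy in ys) / px
            if abs(p_yz - p_y * p_z) > tol:
                return False
    return True


# Toy case 1: the chance of being observed (z = 1) does not depend on y.
ignorable = {(0, y, z): (0.3 if y else 0.7) * (0.8 if z else 0.2)
             for y in (0, 1) for z in (0, 1)}

# Toy case 2: the chance of being observed depends on y (selective sample).
p_obs = {1: 0.9, 0: 0.5}
selective = {(0, y, z): (0.3 if y else 0.7) * (p_obs[y] if z else 1 - p_obs[y])
             for y in (0, 1) for z in (0, 1)}

print(is_ignorable(ignorable), is_ignorable(selective))
```

The check is exact only for finite tables; with data one would need a statistical test rather than an equality check.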

Page 7

Observational status

Exm. 1. $Z$ is a missing (censoring) indicator.

$Y$ is observed if $Z = 1$, missing (censored) if $Z = 0$.

Exm. 2. $Z$ is a group label for an allocation process.

$$Y \mid Z = z \sim f_Y(y, \theta_z) \quad \text{for } z = 1, \dots, G$$

Page 8

Binary response with missing

Let $Y$ be a binary response with missing index $Z$.

$\theta = \mathrm{pr}(Y = 1)$

$N$: sample size, $n$: number of observed

$f$: frequency of $Y = 1$

MLE $\hat\theta = f$, variance $s^{2} = \hat\theta (1 - \hat\theta) / n$
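The estimate on this slide can be sketched in a few lines. The sketch assumes that $f$ is the proportion of $Y = 1$ among the $n$ observed cases, so that the MLE is $f$ and its estimated variance is $f(1 - f)/n$; the function name `binary_mle` and the numbers are illustrative only.

```python
def binary_mle(f, n):
    """MLE of pr(Y = 1) and its estimated variance, using observed cases only.

    f: proportion of Y = 1 among the n observed responses (assumed meaning of f).
    """
    if n <= 0 or not 0.0 <= f <= 1.0:
        raise ValueError("need n > 0 and 0 <= f <= 1")
    theta_hat = f
    s2 = theta_hat * (1.0 - theta_hat) / n
    return theta_hat, s2


# Illustrative numbers, not from the talk.
theta_hat, s2 = binary_mle(f=0.3, n=100)
print(theta_hat, s2)
```

Note that the total sample size $N$ does not enter: under ignorability the $N - n$ missing cases carry no information about the parameter.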

Page 9

Non-ignorable cases

$$p_y = P(Z = 1 \mid y)$$

Model:

$$\begin{array}{c|cc}
 & y = 1 & y = 0 \\ \hline
z = 1 & \theta\, p_1 & (1 - \theta)\, p_0 \\
z = 0 & \theta\, (1 - p_1) & (1 - \theta)(1 - p_0)
\end{array}$$

Data:

$$\begin{array}{c|cc}
 & y = 1 & y = 0 \\ \hline
z = 1 & n f & n (1 - f) \\
z = 0 & ? & ?
\end{array}$$

$$\mathrm{pr}(Y = 1 \mid Z = 1) = \frac{\theta\, p_1}{\theta\, p_1 + (1 - \theta)\, p_0}$$
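To see how the observed proportion drifts away from the target when the missingness depends on $y$, here is a minimal sketch of $\mathrm{pr}(Y = 1 \mid Z = 1) = \theta p_1 / \{\theta p_1 + (1 - \theta) p_0\}$ with $p_y = P(Z = 1 \mid y)$. The function names and the numerical values are assumptions for illustration.

```python
def observed_success_prob(theta, p1, p0):
    """pr(Y = 1 | Z = 1) when pr(Z = 1 | Y = y) = p_y and pr(Y = 1) = theta."""
    numerator = theta * p1
    denominator = theta * p1 + (1.0 - theta) * p0
    return numerator / denominator


def selection_bias(theta, p1, p0):
    """Gap between what the observed cases estimate and the target theta."""
    return observed_success_prob(theta, p1, p0) - theta


# Equal observation probabilities (ignorable): no bias.
print(selection_bias(0.3, 0.6, 0.6))
# Successes twice as likely to be observed as failures: upward bias.
print(observed_success_prob(0.5, 0.8, 0.4))
```

With $p_1 = p_0$ the expression reduces to $\theta$, which is the ignorable case of the previous slide.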

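Page 10

Small degree of non-ignorability

How small is "small"?

Compare $\hat\theta$ with $\tilde\theta$!

[Formulas garbled in extraction. Recoverable fragments: a squared quantity defined through $\mathrm{Var}\!\left[\log \dfrac{\mathrm{pr}(Z = 1 \mid y)}{\mathrm{pr}(Z = 0 \mid y)}\right]$; a modified estimate $\tilde\theta$ expressed through $f$, $p_0$ and $p_1$; and a difference between $\tilde\theta$ and $\hat\theta$ involving $s$, $n$ and $N$ with an $O(\cdot)$ remainder.]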

Page 11

Distributional expression

Def. 2. $Y \perp\!\!\!\perp Z \mid X$ $\iff$

$$f_{YZ}(y, z \mid x) = f_Y(y, \theta)\, f_Z(z, \eta),$$

where $p = \dim \theta$, $q = \dim \eta$.

Exm. 2'.

$$f_{YZ}(y, z) = \theta^{y} (1 - \theta)^{1 - y}\, \eta^{z} (1 - \eta)^{1 - z}$$

Page 12

Tangent space and neighborhood

Def. 3.

[Formulas garbled in extraction. Recoverable structure: a neighborhood of the model consisting of densities $g_{YZ}$, defined through $\min \mathrm{KL}(\cdot, \cdot)$ against $f_{YZ}$; and a tangent space of functions of $(y, z)$ that satisfy two zero-expectation conditions under $f$ and have variance 1 under $f_{YZ}$.]

Page 13

Exponential map

$$g_{YZ}(y, z) \propto f_{YZ}(y, z)\, \exp\{\varepsilon\, \psi(y, z)\}$$

$$\mathrm{KL}(f_{YZ}, g_{YZ}) = \tfrac{1}{2} \varepsilon^{2} + O(\varepsilon^{3})$$
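The second-order behaviour of the Kullback-Leibler divergence under an exponential tilt can be checked numerically. The sketch below assumes the tilt has the form $g \propto f \exp(\varepsilon \psi)$ with $\psi$ standardized to mean 0 and variance 1 under $f$ (the exact normalization on the slide is not recoverable), and uses a three-point distribution chosen only for illustration.

```python
import math


def standardize(values, f):
    """Rescale `values` to mean 0 and variance 1 under the probabilities f."""
    mean = sum(p * v for p, v in zip(f, values))
    var = sum(p * (v - mean) ** 2 for p, v in zip(f, values))
    return [(v - mean) / math.sqrt(var) for v in values]


def tilt(f, psi, eps):
    """Exponentially tilted distribution g proportional to f * exp(eps * psi)."""
    weights = [p * math.exp(eps * v) for p, v in zip(f, psi)]
    total = sum(weights)
    return [w / total for w in weights]


def kl(f, g):
    """Kullback-Leibler divergence KL(f, g) = sum f * log(f / g)."""
    return sum(p * math.log(p / q) for p, q in zip(f, g) if p > 0)


f = [0.2, 0.5, 0.3]                      # hypothetical base distribution
psi = standardize([0.0, 1.0, 2.0], f)    # hypothetical direction
for eps in (0.2, 0.1, 0.05):
    g = tilt(f, psi, eps)
    print(eps, kl(f, g), eps ** 2 / 2)
```

As $\varepsilon$ shrinks, the printed divergence and $\varepsilon^2/2$ agree to leading order, which is why $\varepsilon$ can serve as a distance-like measure of departure from the model.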

Page 14

Tubular Neighborhood

[Figure: tubular neighborhood of the model $M$.]

Page 15

Decomposition

[Formulas garbled in extraction. Recoverable structure: a function of $(y, z)$ is decomposed into a sum of products of components $u(y)$ and $v(z)$, where $u$ and $v$ are built from the scores $S$ and information matrices $I$ of the model for $Y$ ($p$ components) and of the model for $Z$ given $x$ ($q$ components).]

Page 16

Tangent and normal

[Formulas garbled in extraction. Recoverable structure: an expansion in $(y, z)$ as sums over $i = 1, \dots, p$ and $j = 1, \dots, q$ of products of the components $u_i$ and $v_j$, split into tangent and normal parts.]

Page 17

Conditional Distribution

[Formulas garbled in extraction. Recoverable structure: the conditional densities $g_{Y|Z}(y \mid z)$ and $g_{Z|Y}(z \mid y)$ are written as $f_Y(y, \cdot)$ and $f_Z(z, \cdot)$ times a factor of the form $1 + (\cdots)$ involving $u$, $v$ and the information $I$; a modified estimate is then defined as the solution of an estimating equation set to 0.]

Page 18

Calibration

Def. 4.

[Formulas garbled in extraction. Recoverable fragments: $\mathrm{Var}\{\log g(z \mid y)\}$ and $\mathrm{KL}(f, g)$.]

Page 19

Rosenbaum's log odds ratio

[Formula garbled in extraction. Recoverable structure: the logarithm of a ratio of probabilities of events defined by the sign of $r$, conditional on $y$ and unconditional, together with a constant term.]

Page 20

Counterfactual

Testing a hypothesis $H_0$.

Score test: a statistic $T$, a normalized sum over $k = 1, \dots, N$ of terms in $(y_k, z_k)$, approximately $N(0, 1)$ under $H_0$.

Reject $H_0$ if $|T| > 2$.

Page 21

Guideline

Testing a hypothesis $H_0$.

[Formula garbled in extraction. Recoverable structure: a local power expression in terms of $n$ and $N$.]

Page 22

Non-ignorable missing

[Formulas garbled in extraction. Recoverable structure: the log-likelihood, a sum over the observed $k = 1, \dots, n$ of $\log f(y_k, \cdot)$ plus correction terms involving $I$, $S$, $u$ and $v$, with an $O(\cdot)$ remainder.]

Page 23

Selectivity region

[Formulas garbled in extraction. Recoverable structure: the MLE (hat) and a modified estimate (tilde) defined as an argmax; their difference with an $O(\cdot)$ remainder; and a region defined through that difference, involving $n$, $N$ and $p$.]

Page 24

Can we estimate $\varepsilon$?

[Formulas garbled in extraction. Recoverable structure: a profile likelihood $L^{*}$ obtained by maximization, expanded to second order with an $O(\cdot)$ remainder, in terms of $u^{*}$ and a variance under $f_Y$.]

Page 25

Unstable or Misspecifying

The estimate is unstable if $u^{*} = 0$.

Otherwise: for $f(y, \theta) = N(\mu, \sigma^{2})$, $u^{*}(y)$ is a Hermite polynomial, and the estimate includes skewness statistics.

[Remaining formulas garbled in extraction.]

Page 26

Regression formulation

$x_k$, $k = 1, \dots, N$: fully observed

[Formulas garbled in extraction. Recoverable structure: the log-likelihood as a sum over $k = 1, \dots, n$ of $\log f(y_k, \cdot)$ with a linear predictor in $x_k$, plus correction terms involving $I$ and $S$, with an $O(\cdot)$ remainder.]

Page 27

Heckman model

$$Y = \beta^{T} X + \sigma e_1, \qquad R = \gamma^{T} X + e_2$$

$$\begin{pmatrix} e_1 \\ e_2 \end{pmatrix} \sim N\!\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix} \right)$$

$R > 0 \iff Z = 1$, $\qquad R \le 0 \iff Z = 0$

$R > 0 \iff Y$ is observed, $\qquad R \le 0 \iff Y$ is missing
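A short simulation can make the selection mechanism concrete. The sketch below assumes the usual form of the Heckman model, $Y = \beta x + \sigma e_1$, $R = \gamma x + e_2$ with $\mathrm{corr}(e_1, e_2) = \rho$ and $Y$ observed only when $R > 0$; the scalar covariate, the parameter values and the function name `simulate_heckman` are all hypothetical choices for this illustration.

```python
import math
import random


def simulate_heckman(n_total, beta, gamma, sigma, rho, seed=0):
    """Draw n_total cases; return (all responses, responses with R > 0)."""
    rng = random.Random(seed)
    all_y, observed_y = [], []
    for _ in range(n_total):
        x = rng.gauss(0.0, 1.0)                   # hypothetical scalar covariate
        e2 = rng.gauss(0.0, 1.0)
        # e1 has unit variance and correlation rho with e2.
        e1 = rho * e2 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        y = beta * x + sigma * e1
        r = gamma * x + e2
        all_y.append(y)
        if r > 0:                                  # Z = 1: response is observed
            observed_y.append(y)
    return all_y, observed_y


all_y, observed_y = simulate_heckman(20000, beta=1.0, gamma=0.0, sigma=1.0, rho=0.8, seed=1)
mean_all = sum(all_y) / len(all_y)
mean_obs = sum(observed_y) / len(observed_y)
print(mean_all, mean_obs)
```

With $\gamma = 0$ about half of the cases are observed, and the observed mean sits near $\rho \sigma \sqrt{2/\pi}$ rather than near 0, which is the selectivity bias the talk is concerned with.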

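Page 28

Likelihood

$$f(y \mid x, r > 0) = \frac{1}{\sigma}\, \phi\!\left( \frac{y - \beta^{T} x}{\sigma} \right) \frac{\Phi\!\left( \dfrac{\gamma^{T} x + \rho\, (y - \beta^{T} x)/\sigma}{\sqrt{1 - \rho^{2}}} \right)}{\Phi(\gamma^{T} x)}$$

$$P(r > 0 \mid x, y) = \Phi\!\left( \frac{\gamma^{T} x + \rho\, (y - \beta^{T} x)/\sigma}{\sqrt{1 - \rho^{2}}} \right)$$

$$E(y \mid x, r > 0) = \beta^{T} x + \rho \sigma\, \lambda(\gamma^{T} x), \qquad \lambda(u) = \phi(u)/\Phi(u)$$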
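The conditional mean of the selected responses can be evaluated directly. The sketch assumes the standard Heckman result $E(y \mid x, r > 0) = \beta^{T} x + \rho \sigma \lambda(\gamma^{T} x)$ with $\lambda = \phi/\Phi$ the inverse Mills ratio; the function names are chosen for this example and the linear predictors are passed in as plain numbers.

```python
import math


def phi(u):
    """Standard normal density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def Phi(u):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))


def inverse_mills(u):
    """lambda(u) = phi(u) / Phi(u)."""
    return phi(u) / Phi(u)


def selected_mean(xb, xg, sigma, rho):
    """E(y | x, r > 0), with xb = beta'x and xg = gamma'x given as numbers."""
    return xb + rho * sigma * inverse_mills(xg)


# Hypothetical values: the shift vanishes when rho = 0 and grows with rho.
print(selected_mean(1.0, 0.0, 2.0, 0.0))
print(selected_mean(1.0, 0.0, 2.0, 0.5))
```

Because $\lambda(u)$ decreases in $u$, cases that are very likely to be selected contribute little bias, while cases with small $\gamma^{T} x$ contribute the most.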

Page 29

Likelihood analysis

$$L(\beta, \sigma, \gamma, \rho) = -n \log \sigma - \frac{1}{2\sigma^{2}} \sum_{i=1}^{n} (y_i - \beta^{T} x_i)^{2} + \sum_{i=1}^{n} \log \Phi(u_i) + \sum_{i=n+1}^{N} \log \Phi(-\gamma^{T} x_i),$$

where

$$u_i = \frac{\gamma^{T} x_i + \rho\, (y_i - \beta^{T} x_i)/\sigma}{\sqrt{1 - \rho^{2}}}$$
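For readers who want to experiment, here is a minimal sketch of a Heckman-type log-likelihood. It assumes the standard form (up to an additive constant): a normal regression term and $\log \Phi(u_i)$ for each observed case, with $u_i = \{\gamma^{T} x_i + \rho (y_i - \beta^{T} x_i)/\sigma\} / \sqrt{1 - \rho^2}$, and $\log \Phi(-\gamma^{T} x_i)$ for each missing case. The function name, the argument layout and the toy data are assumptions, not the authors' code.

```python
import math


def log_Phi(u):
    """Log of the standard normal distribution function."""
    return math.log(0.5 * math.erfc(-u / math.sqrt(2.0)))


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def heckman_loglik(beta, sigma, gamma, rho, observed, missing_x):
    """Log-likelihood up to a constant.

    observed:  list of (x, y) pairs for cases with the response observed.
    missing_x: list of covariate vectors x for cases with the response missing.
    """
    loglik = -len(observed) * math.log(sigma)
    for x, y in observed:
        resid = (y - dot(beta, x)) / sigma
        u = (dot(gamma, x) + rho * resid) / math.sqrt(1.0 - rho ** 2)
        loglik += -0.5 * resid ** 2 + log_Phi(u)
    for x in missing_x:
        loglik += log_Phi(-dot(gamma, x))
    return loglik


# Toy data (hypothetical): two observed cases and one missing case.
observed = [([1.0, 0.5], 1.2), ([1.0, -1.0], 0.3)]
missing_x = [[1.0, 0.0]]
beta, gamma = [0.5, 1.0], [1.0, 0.2]
print(heckman_loglik(beta, 1.0, gamma, 0.0, observed, missing_x))
print(heckman_loglik(beta, 1.0, gamma, 0.3, observed, missing_x))
```

At $\rho = 0$ the function separates into a regression part and a probit part for the selection indicator, which is the ignorable case; evaluating it on a grid of $\rho$ while maximizing over the other parameters would give a profile likelihood.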

Page 30

Profile likelihood of $\rho$

$$L^{*}(\rho) = \max_{\beta, \sigma, \gamma} L(\beta, \sigma, \gamma, \rho)$$

[Remaining formulas garbled in extraction. Recoverable structure: derivatives of $L^{*}$ at 0, two of which vanish, with the others expressed as sums over $i = 1, \dots, n$ of third and fourth powers of the standardized residuals $(y_i - \hat\beta^{T} x_i)/\hat\sigma$.]

Page 31

Coventry work audit data

$y$: income, $x = (1, \text{sex}, \text{age}, \text{age}^{2})$

$N = 1435$, $n = 1323$

Page 32

Skin cancer data

Table 2: Melanoma data (row variable: nevi)

    control        case
    f_1 = 323      f_2 = 259
    f_3 = 130      f_4 = 288

Page 33

Various patterns of bias

Table 3: Simulation results

Settings ($p_1$, $p_2$, $p_3$, $p_4$), rows as extracted:

    .2    .1     .2     .1
    .1    .2     .2     .1
    .1    .15    .15    .1
    .2    .2     .1     .1
    .5    .533   .533   .5

Results (recoverable column labels: average, bound), rows as extracted:

    .011    0       0
    1.580   1.719   .391
    .812    .840    .222
    .002    1.761   .391
    .698    1.02    .063

Page 34

Group comparison

$$Y \mid Z = z \sim f(y, \theta_z) \quad (z = 1, \dots, G)$$

Random effect model:

$$f_{Y}(y, \theta_z) = \int f_{Y|T}(y \mid t, \theta_z)\, f_T(t)\, \mathrm{d}t$$

Dependence model:

[Formula garbled in extraction. Recoverable structure: a joint density $g_{TZ}(t, z)$ built from $f_T(t)$, $f_Z(z)$ and an exponential factor.]

Page 35

Non-random allocation

[Formulas garbled in extraction. Recoverable structure: the conditional density $g(y \mid z)$ written as $f(y, \cdot)$ times a factor of the form $1 + (\cdots)$, with the correction terms defined for the groups $z$ and cases $k$.]

Page 36

Selection bias

[Formulas garbled in extraction. Recoverable structure: the difference between the modified (tilde) and the standard (hat) estimates for the groups $z = 1, \dots, G$, expressed through sums over $k = 1, \dots, N$ and the information $I$; and a plot of the resulting region.]

Page 37

Effect of sentence

$Z = 1$: prison

$Z = 2$: community service

$Z = 3$: probation

$Y$ = ratio of reconviction

Logistic model

$$\mathrm{logit}\{E(Y \mid X, Z)\} = \beta_1 + \beta_2 d_{z2} + \beta_3 d_{z3} + \beta_4 X$$
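As a reading aid for the logistic model, the sketch below evaluates the fitted probability for each sentence type. It assumes the model has an intercept, dummy variables for $Z = 2$ and $Z = 3$ (prison as baseline) and one covariate $X$; the coefficient values are hypothetical and are not estimates from the talk.

```python
import math


def reconviction_prob(z, x, beta):
    """Inverse logit of b1 + b2*[z == 2] + b3*[z == 3] + b4*x.

    z: sentence type (1 prison, 2 community service, 3 probation).
    beta: (b1, b2, b3, b4), hypothetical coefficients for this sketch.
    """
    if z not in (1, 2, 3):
        raise ValueError("z must be 1, 2 or 3")
    d2 = 1.0 if z == 2 else 0.0
    d3 = 1.0 if z == 3 else 0.0
    eta = beta[0] + beta[1] * d2 + beta[2] * d3 + beta[3] * x
    return 1.0 / (1.0 + math.exp(-eta))


beta = (0.0, 1.0, -1.0, 0.5)   # hypothetical values only
for z in (1, 2, 3):
    print(z, reconviction_prob(z, x=0.0, beta=beta))
```

In this parameterization the second and third coefficients are the community service and probation effects relative to prison on the log odds scale.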

Page 38

Selectivity regions

[Figure: selectivity regions and C.I.; horizontal axis: community service effect, vertical axis: probation effect, both from -1 to 1.]

Page 39

Two-group comparison

[Model formula garbled in extraction. Recoverable structure: $y$ is expressed through $\mathrm{sgn}(r)$ and an error $e_1$, with $r$ driven by $e_2$.]

$y_1, \dots, y_n$: $z = 1$ $(r > 0)$

$y_{n+1}, \dots, y_N$: $z = 2$ $(r \le 0)$

$r > 0 \iff z = 1$, $\qquad r \le 0 \iff z = 2$

Page 40

Likelihood

[Formulas garbled in extraction. Recoverable structure: log-likelihood terms for the two groups, with sums over $i = 1, \dots, n$ and $i = n+1, \dots, N$ of squared deviations of $y_i$ and logarithmic correction terms, followed by expressions for the resulting estimates.]

Page 41

Analysis

[Formulas garbled in extraction. Recoverable structure: an estimate equal to half the difference of the group means, $\tfrac{1}{2}(\bar y_1 - \bar y_2)$, and its variance in terms of $N$ with an $O(\cdot)$ remainder.]

Page 42

UK National Hearing Survey

The effect of occupational noise

Case (high level noise) and control groups, with group sizes $n_0 = 67$ and $n_1 = 144$

Response $Y$ is threshold of 3 kHz sound

Page 43

Conventional result

Case and control means: $\bar y_1 = 3.710$, $\bar y_0 = 3.893$

Pooled s.d.: $s = 0.351$

t-statistic: $t = 3.52$ (209 d.f.)

Standard analysis supports high significance
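The conventional two-sample result can be reproduced from the summary figures. The sketch reads the garbled slide values as group means 3.893 and 3.710, pooled standard deviation 0.351 and group sizes 67 and 144 (an interpretation of the extracted text, supported by the 209 degrees of freedom); the function name `pooled_t` is an assumption.

```python
import math


def pooled_t(mean_a, mean_b, s_pooled, n_a, n_b):
    """Two-sample t-statistic with a pooled standard deviation, and its d.f."""
    standard_error = s_pooled * math.sqrt(1.0 / n_a + 1.0 / n_b)
    return (mean_a - mean_b) / standard_error, n_a + n_b - 2


# Values as read from the slide (see the caveat in the text above).
t, df = pooled_t(3.893, 3.710, 0.351, 67, 144)
print(round(t, 2), df)
```

The result agrees with the slide's t-statistic to within rounding of the summary figures.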

Page 44

Non-random allocation

Adjusted t-statistic: $\tilde t \approx 3.52 - 5.39\, \varepsilon$

$\tilde t \ge z_{0.05} = 1.96$ if $\varepsilon \le 0.29$

[Remaining formulas garbled in extraction. Recoverable fragments: a threshold for level 0.05 expressed through $t$, $z_{0.05}$, $n_0$, $n_1$ and $N$, and the values 0.23 and 0.30.]
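The sensitivity statement on this slide amounts to solving a linear inequality. The sketch below assumes the adjusted statistic has the form $3.52 - 5.39\,\varepsilon$ (as read from the garbled slide) and finds the largest $\varepsilon$ for which it still exceeds the critical value 1.96; the function names are hypothetical.

```python
def adjusted_t(eps, t0=3.52, slope=5.39):
    """Adjusted t-statistic under a departure eps from random allocation."""
    return t0 - slope * eps


def largest_eps(z_crit, t0=3.52, slope=5.39):
    """Largest eps for which the adjusted statistic is still >= z_crit."""
    return (t0 - z_crit) / slope


eps_max = largest_eps(1.96)
print(round(eps_max, 2), round(adjusted_t(eps_max), 2))
```

In words: the conventional significance survives only if the degree of non-random allocation is below roughly 0.29 on the $\varepsilon$ scale.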

Page 45

Future problem

Ignorability

Selection bias

Sensitivity analysis

Tubular neighborhood of model $M$

Compare the inference for $M$ with that for the near-model of $M$

Get interpretable bounds

Page 46

References

Arnold, B.C. and Strauss, D.J. (1991) Bivariate distributions with conditionals in prescribed exponential families. J. Roy. Statist. Soc. B, 53, 365-376.

Begg, C.B., Satagopan, J.M. and Berwick, M. (1998) A new strategy for evaluating the impact of epidemiologic risk factors for cancer with application to melanoma. J. Am. Statist. Assoc., 93, 415-426.

Bowater, R.J., Copas, J.B., Machado, O.A. and Davis, A.C. (1996) Hearing impairment and the log-normal distribution. Applied Statistics, 45, 203-217.

Chambers, R.L. and Welsh, A.H. (1993) Log-linear models for survey data with non-ignorable non-response. J. Roy. Statist. Soc. B, 55, 157-170.

Copas, J.B. and Li, H.G. (1997) Inference for non-random samples (with discussion). J. Roy. Statist. Soc. B, 59, 55-95.

Copas, J.B. and Marshall, P. (1998) The offender group reconviction scale: a statistical reconviction score for use by probation officers. Applied Statistics, 47, 159-171.

Page 47

Cornfield, J., Haenszel, W., Hammond, E.C., Lilienfeld, A.M., Shimkin, M.B. and Wynder, E.L. (1959) Smoking and lung cancer: recent evidence and a discussion of some questions. J. Nat. Cancer Institute, 22, 173-203.

Davis, A.C. (1995) Hearing in Adults. London: Whurr.

Forster, J.J. and Smith, P.W.F. (1998) Model based inference for categorical survey data subject to nonignorable nonresponse. J. Roy. Statist. Soc. B, 60, 57-70.

Heckman, J.J. (1976) The common structure of statistical models of truncation, sample selection and limited dependent variables, and a simple estimator for such models. Ann. Economic and Social Measurement, 5, 475-492.

Heckman, J.J. (1979) Sample selection bias as a specification error. Econometrica, 47, 153-161.

Kershaw, C. (1999) Reconvictions of offenders sentenced or discharged from prison in 1994, England and Wales. Home Office Statistical Bulletin, 5/99. London: HMSO.

Page 48

Lin, D.Y., Psaty, B.M. and Kronmal, R.A. (1998) Assessing the sensitivity of regression results to unmeasured confounders in observational studies. Biometrics, 54, 948-963.

Little, R.J.A. (1985) A note about models for selectivity bias. Econometrica, 53, 1469-1474.

Little, R.J.A. (1995) Modelling the dropout mechanism in repeated-measures studies. J. Am. Statist. Assoc., 90, 1112-1121.

Little, R.J.A. and Rubin, D.B. (1987) Statistical Analysis with Missing Data. New York: Wiley.

McCullagh, P. and Nelder, J.A. (1989) Generalized Linear Models. 2nd ed. London: Chapman and Hall.

Rosenbaum, P.R. (1987) Sensitivity analysis for certain permutation inferences in matched observational studies. Biometrika, 74, 13-26.

Rosenbaum, P.R. (1995) Observational Studies. New York: Springer.

Page 49

Rosenbaum, P.R. and Krieger, A.M. (1990) Sensitivity of two-sample permutation inferences in observational studies. J. Am. Statist. Assoc., 85, 493-498.

Rosenbaum, P.R. and Rubin, D.B. (1983) Assessing sensitivity to an unobserved binary covariate in an observational study with binary outcome. J. Roy. Statist. Soc. B, 45, 212-218.

Scharfstein, D.O., Rotnitzky, A. and Robins, J.M. (1999) Adjusting for non-ignorable drop-out using semiparametric nonresponse models (with discussion). J. Am. Statist. Assoc., 94, 1096-1146.

Schlesselman, J.J. (1978) Assessing effects of confounding variables. Am. J. Epidemiology, 108, 3-8.

White, H. (1982) Maximum likelihood estimation of misspecified models. Econometrica, 50, 1-26.