A. Demnati and J. N. K. Rao, Statistics Canada / Carleton University. A Presentation at the Third International Conference on Establishment Surveys, June 18-21, 2007, Montréal, Québec, Canada. June 20, 2007. Linearization Variance Estimators for Survey Data: Some Recent Work


Page 1

A. Demnati and J. N. K. Rao
Statistics Canada / Carleton University

A Presentation at the Third International Conference on Establishment Surveys

June 18-21, 2007

Montréal, Québec, Canada
June 20, 2007

Linearization Variance Estimators for Survey Data: Some Recent Work

Page 2

Situation: looking for a method of variance estimation that

is simple
is widely applicable
has good properties
provides unique choice

for estimators

of nonlinear finite population parameters (SM, 2004)
defined explicitly or implicitly (SM, 2004)
using calibration weights (SM, 2004)
under missing data (JSM, 2002 and JMS, 2002)
using repeated survey (FCSM, 2003)
of model parameters (Symposium, 2005)
of dual frames (JSM, 2007)

Page 3

Demnati-Rao Approach

General formulation

Finite population parameters: $\theta_N$

Model parameters: $\theta$

Estimator $\hat\theta$ for both parameters

Variance estimators associated with $\theta_N$ and $\theta$ are different

Page 4

Demnati-Rao Approach (Survey Methodology, 2004)

Write the estimator of a finite population parameter $\theta_N$ as

$\hat\theta = f(\mathbf{d})$

with $\mathbf{d} = (d_1, \ldots, d_N)^T$ and $d_k = a_k/\pi_k$, where $\pi_k$ is the inclusion probability of element k and

$a_k = 0$ if element k is not in sample s;

$a_k = 1$ if element k is in sample s.

Page 5

Demnati-Rao Approach (Survey Methodology, 2004)

A linearization sampling variance estimator is given by

$\vartheta_L(\hat\theta) = \vartheta(z)$

with

$z_k = \partial f(\mathbf{b})/\partial b_k |_{\mathbf{b}=\mathbf{d}}$, $k \in s$

$\vartheta(u)$: variance estimator of the H-T estimator $\hat U = \sum_k d_k u_k$ of the total $U = \sum_k u_k$

$\mathbf{b} = (b_1, \ldots, b_N)^T$ is an (N×1) vector of arbitrary numbers
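The recipe on this slide can be sketched numerically: treat the estimator as a function of the weight vector, differentiate it at b = d, and pass the resulting z-values to the variance formula for a total. The sketch below is only an illustration under stated assumptions: it approximates the derivative by central differences (the paper works with analytic derivatives), it assumes simple random sampling so that the SRS formula for $\vartheta(u)$ applies, and the helper names `dr_z_values` and `srs_variance` and the toy data are hypothetical.

```python
import numpy as np

def dr_z_values(f, d, sample_idx, rel_step=1e-6):
    """Approximate z_k = df(b)/db_k at b = d by central differences, for k in s."""
    z = np.empty(len(sample_idx))
    for j, k in enumerate(sample_idx):
        h = rel_step * max(1.0, abs(d[k]))
        b_hi, b_lo = d.copy(), d.copy()
        b_hi[k] += h
        b_lo[k] -= h
        z[j] = (f(b_hi) - f(b_lo)) / (2.0 * h)
    return z

def srs_variance(u, N):
    """vartheta(u) = N^2 (1/n - 1/N) s_u^2 under SRS without replacement."""
    n = len(u)
    return N**2 * (1.0 / n - 1.0 / N) * np.var(u, ddof=1)

# Toy population and sample (made-up numbers, for illustration only).
rng = np.random.default_rng(0)
N, n = 50, 10
y = rng.normal(10.0, 2.0, N)
s = np.sort(rng.choice(N, size=n, replace=False))
d = np.zeros(N)
d[s] = N / n                       # d_k = a_k / pi_k with pi_k = n / N

def weighted_mean(b):
    """Weighted mean of y, written as a function of the weight vector b."""
    return (b @ y) / b.sum()

z = dr_z_values(weighted_mean, d, s)
var_L = srs_variance(z, N)         # linearization variance estimate
```

For the weighted mean the derivative is available in closed form, $z_k = (y_k - \bar y_s)/N$, so the numerical values can be checked directly; the resulting `var_L` reduces to the familiar $(1/n - 1/N) s_y^2$.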

Page 6

Demnati-Rao Approach (Survey Methodology, 2004)

Example: Ratio estimator of $\theta_N = Y$

$\hat\theta = X (\sum_k d_k y_k)/(\sum_k d_k x_k) = (X/\hat X)\hat Y$

$f(\mathbf{b}) = X (\sum_k b_k y_k)/(\sum_k b_k x_k)$

$z_k = (X/\hat X)(y_k - \hat R x_k) = (X/\hat X) e_k$, with $\hat R = \hat Y/\hat X$

For SRS and $\vartheta(u) = N^2 (n^{-1} - N^{-1}) s_u^2$, with $s_u^2 = \sum_{k \in s} (u_k - \bar u)^2/(n-1)$,

$\vartheta_L(\hat\theta) = \vartheta(z) = (X/\hat X)^2 \vartheta(e)$
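As a small worked check of the ratio-estimator formulas, the following sketch computes $\hat\theta$, the customary $\vartheta(e)$ and the linearization estimator $\vartheta(z)$ for a four-unit sample. The function name `ratio_estimator_srs` and the numbers are made up for illustration; SRS without replacement is assumed.

```python
import numpy as np

def ratio_estimator_srs(y_s, x_s, X, N):
    """Ratio estimator of Y with vartheta(z) and customary vartheta(e) under SRS."""
    n = len(y_s)
    X_hat = (N / n) * x_s.sum()
    Y_hat = (N / n) * y_s.sum()
    R_hat = Y_hat / X_hat
    theta_hat = X * R_hat
    e = y_s - R_hat * x_s                    # residuals e_k
    z = (X / X_hat) * e                      # linearized values z_k
    scale = N**2 * (1.0 / n - 1.0 / N)
    var_e = scale * np.var(e, ddof=1)        # customary estimator
    var_z = scale * np.var(z, ddof=1)        # estimator with the g-weight X / X_hat
    return theta_hat, var_z, var_e

x_s = np.array([2.0, 4.0, 6.0, 8.0])
y_s = np.array([5.0, 7.0, 13.0, 15.0])
theta_hat, var_z, var_e = ratio_estimator_srs(y_s, x_s, X=60.0, N=10)
# Here X_hat = 50 and R_hat = 2, so theta_hat = 120, var_e = 20 and var_z = 1.2^2 * 20 = 28.8
```

The two variance estimates differ only by the factor $(X/\hat X)^2$, which here is $1.2^2$ because the sample under-represents the x-total.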

Page 7

Demnati-Rao Approach (Survey Methodology, 2004)

Example: Ratio estimator of $\theta_N = Y$

$\vartheta(z)$ is a better choice over customary $\vartheta(e)$:

Royall and Cumberland (1981)
Särndal et al. (1989)
Valliant (1993)
Binder (1996)
Skinner (2004)

Page 8

Demnati-Rao Approach

Also in Survey Methodology, 2004:

Calibration Estimators:
  the GREG Estimator
  the “Optimal” Regression Estimator
  the Generalized Raking Estimator

Two-Phase Sampling

New Extensions:
  Wilcoxon Rank-Sum Test
  Cox Proportional Hazards Model

Page 9

Model parameters (Symposium, 2005)

Finite population assumed to be generated from a superpopulation model

Inference on model parameter $\theta$

Total variance of $\hat\theta$:

$\mathrm{Var}(\hat\theta) = E_m \mathrm{Var}_s(\hat\theta) + \mathrm{Var}_m E_s(\hat\theta)$

$E_m$, $\mathrm{Var}_m$: model expectation and variance

$E_s$, $\mathrm{Var}_s$: design expectation and variance

i) if f ≈ 0 then $\mathrm{Var}(\hat\theta) \approx E_m \mathrm{Var}_s(\hat\theta)$

ii) if f ≈ 1 then $\mathrm{Var}(\hat\theta) \approx \mathrm{Var}_m E_s(\hat\theta)$

where f is the sampling fraction. For multistage sampling, the psu sampling fraction plays the role of f.

In case i), $\vartheta_L(\hat\theta) = \vartheta(z)$

Page 10

Example: Ratio estimator when y is assumed to be random

$\hat\theta = X (\sum_k y_k d_k)/(\sum_k x_k d_k)$ for $\theta = \sum_k E_m(y_k)$

Define $\mathbf{d}_k = (d_{1k}, d_{2k})^T = (d_k, d_k y_k)^T$

We have $\hat\theta = f(\mathbf{A}_d) = X (\sum_k d_{2k})/(\sum_k x_k d_{1k})$

where $\mathbf{A}_d$ is a 2×N matrix of random variables with kth column $\mathbf{d}_k = (d_{1k}, d_{2k})^T$

We get $\vartheta_L(\hat\theta) = \vartheta(\mathbf{z})$ with

$\mathbf{z}_k = \partial f(\mathbf{A}_b)/\partial \mathbf{b}_k |_{\mathbf{A}_b = \mathbf{A}_d} = (X/\hat X)(-\hat R x_k, 1)^T = (z_{1k}, z_{2k})^T$, $k \in s$

where $\mathbf{A}_b$ is a 2×N matrix of arbitrary real numbers with kth column $\mathbf{b}_k = (b_{1k}, b_{2k})^T$

and where $\vartheta(\mathbf{u})$ is an estimator of the total variance of $\hat U = \sum_k \mathbf{u}_k^T \mathbf{d}_k$

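Page 11

Estimator of the total variance of $\hat U = \sum_k \mathbf{u}_k^T \mathbf{d}_k$ when $\mathbf{d}_k = d_k \mathbf{v}_k$ and $\mathbf{v}_k = (1, y_k)^T$

A variance estimator of $\hat U = \sum_k \mathbf{u}_k^T \mathbf{d}_k$ is given by

$\vartheta(\mathbf{u}) = \sum_k \sum_t \mathbf{u}_k^T \mathrm{cov}(\mathbf{d}_k, \mathbf{d}_t) \mathbf{u}_t$

with

$\mathrm{cov}(\mathbf{d}_k, \mathbf{d}_t) = d_{kt} \begin{pmatrix} 0 & 0 \\ 0 & \mathrm{cov}_m(y_k, y_t) \end{pmatrix} + d_{kt} \frac{\pi_{kt} - \pi_k \pi_t}{\pi_k \pi_t} \mathbf{v}_k \mathbf{v}_t^T$

where $d_{kt} = a_k a_t/\pi_{kt}$

Note that $\mathrm{cov}_m(y_k, y_t)$ is an estimator of model covariance $\mathrm{Cov}_m(y_k, y_t)$ when $\mathrm{Cov}_m(y_k, y_t) \neq 0$, and $\mathrm{cov}_m(y_k, y_t) = 0$ when $\mathrm{Cov}_m(y_k, y_t) = 0$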

Page 12

Hence

$\vartheta(\mathbf{z}) = \vartheta_m + \vartheta_s$ = model variance + sampling variance

with

$\vartheta_m = \sum_k \sum_t d_{kt} z_{k;m} \mathrm{cov}_m(y_k, y_t) z_{t;m}$

$\vartheta_s = \sum_k \sum_t d_{kt} \frac{\pi_{kt} - \pi_k \pi_t}{\pi_k \pi_t} z_{k;s} z_{t;s}$

where

$z_{k;m} = z_{2k} = X/\hat X$

and

$z_{k;s} = \mathbf{z}_k^T \mathbf{v}_k = z_{1k} + z_{2k} y_k = (X/\hat X)(y_k - \hat R x_k)$

Under SRS,

$\vartheta_s = N^2 \frac{1 - n/N}{n} (X/\hat X)^2 s_e^2$

where

$s_e^2 = \sum_k a_k (y_k - \hat R x_k)^2/(n-1)$

Page 13

Under ratio model, $E_m(y_k) = \beta x_k$, $\mathrm{Cov}_m(y_k, y_t) = 0$, $k \neq t$, $\theta = \beta X$,

$\vartheta_m = (N/n)(X/\hat X)^2 (n-1) s_e^2$

Note: $\vartheta_m$ remains valid under misspecification of $\mathrm{Var}_m(y_k)$

Hence,

$\vartheta(\mathbf{z}) = \frac{N^2}{n} (X/\hat X)^2 \frac{N-1}{N} s_e^2$

Note: g-weight $(X/\hat X)$ appears automatically in $\vartheta(\mathbf{z})$ and the finite population correction 1-n/N is absent in $\vartheta(\mathbf{z})$
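The step from the two components to the combined expression is short algebra: $(n-1) + N(1 - n/N) = N - 1$. A minimal numerical check, assuming SRS and the ratio model as on this slide, is sketched below; the function name `dr_components_srs` and the four-unit sample are hypothetical.

```python
import numpy as np

def dr_components_srs(y_s, x_s, X, N):
    """Model and sampling components of the DR variance estimator under SRS."""
    n = len(y_s)
    R_hat = y_s.sum() / x_s.sum()
    g = X / ((N / n) * x_s.sum())                        # g-weight X / X_hat
    s2_e = np.sum((y_s - R_hat * x_s) ** 2) / (n - 1)    # s_e^2
    v_s = N**2 * (1.0 - n / N) / n * g**2 * s2_e         # sampling variance component
    v_m = (N / n) * g**2 * (n - 1) * s2_e                # model variance component
    return v_m, v_s, g, s2_e

x_s = np.array([2.0, 4.0, 6.0, 8.0])
y_s = np.array([5.0, 7.0, 13.0, 15.0])
N = 10
v_m, v_s, g, s2_e = dr_components_srs(y_s, x_s, X=60.0, N=N)

# Combined form: (N^2 / n) (X / X_hat)^2 ((N - 1) / N) s_e^2, with no 1 - n/N factor
closed_form = N**2 / len(y_s) * g**2 * (N - 1) / N * s2_e
```

With these numbers the components are 14.4 (model) and 28.8 (sampling), and their sum 43.2 agrees with the combined expression.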

Page 14

Simulation 1: Unconditional performance

We generated R=2,000 finite populations $\{y_k\}$, each of size N=393, from the ratio model

$y_k = 2 x_k + x_k^{1/2} \varepsilon_k$

where $\varepsilon_k$ are independent observations generated from a N(0,1) distribution and $x_k$ are the “number of beds” for the Hospitals population studied in Valliant, Dorfman, and Royall (2000, p. 424-427)

One simple random sample of specified size n is drawn from each generated population

Parameter of interest: $\theta = \sum_k E_m(y_k) = 2X$

Page 15

Simulation 1: Unconditional performance

Ratio estimator: $\hat\theta = X(\bar y/\bar x)$

We calculated:

Simulated $\mathrm{MSE}(\hat\theta)$: $M(\hat\theta) = R^{-1} \sum_{r=1}^{2000} (\hat\theta_r - \theta)^2$

$\bar\vartheta_{DR}(\hat\theta) = R^{-1} \sum_{r=1}^{2000} \vartheta_{DR}(\hat\theta_r)$

and its components $\vartheta_s$ and $\vartheta_m$
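The simulation design can be sketched in a few lines. This is a rough illustration, not the authors' code: the Hospitals "number of beds" values are not reproduced here, so gamma-distributed stand-in values are used for $x_k$, the number of replicates is reduced to keep the run short, and the function name `simulate_ratio_model` is hypothetical.

```python
import numpy as np

def simulate_ratio_model(x, n, R, beta=2.0, seed=0):
    """Simulated MSE of the ratio estimator and average DR variance components."""
    rng = np.random.default_rng(seed)
    N, X = len(x), x.sum()
    theta = beta * X                                   # model parameter of interest
    est, v_s, v_m = np.empty(R), np.empty(R), np.empty(R)
    for r in range(R):
        # New finite population from y_k = beta x_k + x_k^(1/2) eps_k
        y = beta * x + np.sqrt(x) * rng.standard_normal(N)
        s = rng.choice(N, size=n, replace=False)       # one SRS per population
        y_s, x_s = y[s], x[s]
        R_hat = y_s.sum() / x_s.sum()
        g = X / ((N / n) * x_s.sum())
        s2_e = np.sum((y_s - R_hat * x_s) ** 2) / (n - 1)
        est[r] = X * R_hat
        v_s[r] = N**2 * (1.0 - n / N) / n * g**2 * s2_e
        v_m[r] = (N / n) * g**2 * (n - 1) * s2_e
    mse = np.mean((est - theta) ** 2)
    return mse, v_s.mean(), v_m.mean()

# Stand-in x values (assumption): gamma draws replace the real number-of-beds data
x = np.random.default_rng(1).gamma(shape=2.0, scale=137.0, size=393)
mse, avg_s, avg_m = simulate_ratio_model(x, n=100, R=500)
avg_dr = avg_s + avg_m                                 # average DR total variance estimate
```

Comparing `avg_dr` and `avg_s` with `mse` mirrors the comparison described on the slide: the sampling component alone falls short of the simulated MSE, and the shortfall grows with the sampling fraction.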

Page 16

Simulation 1: Unconditional performance

Figure 1: Averages of variance estimates for selected sample sizes compared to simulated MSE of the ratio estimator.

Page 17

Simulation 2: Conditional performance

We generated R=20,000 finite populations $\{y_k\}$, each of size N=393, from the ratio model

$y_k = 2 x_k + x_k^{1/2} \varepsilon_k$

using the number of beds as $x_k$

One simple random sample of size n=100 is drawn from each generated population

Parameter of interest: $\theta = \beta X = 2X$

We arranged the 20,000 samples in ascending order of $\bar x$-values and then grouped them into 20 groups, each of size 1,000
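The grouping step can be sketched as follows: sort the simulated samples by their sample mean of x, split them into equal-sized groups, and summarize each group separately. The function name `conditional_relative_bias` and the tiny arrays are made up for illustration; in the setting above there would be 20,000 entries and 20 groups.

```python
import numpy as np

def conditional_relative_bias(xbar, estimates, target, n_groups=20):
    """Relative bias of the estimates within groups of samples ordered by xbar."""
    order = np.argsort(xbar, kind="stable")          # ascending order of xbar
    groups = np.array_split(order, n_groups)         # consecutive equal-sized groups
    group_xbar = np.array([xbar[g].mean() for g in groups])
    rel_bias = np.array([(estimates[g].mean() - target) / target for g in groups])
    return group_xbar, rel_bias

# Four simulated samples split into two groups (toy numbers)
xbar = np.array([3.0, 1.0, 2.0, 0.0])
estimates = np.array([13.0, 11.0, 12.0, 10.0])
group_xbar, rel_bias = conditional_relative_bias(xbar, estimates, target=10.0, n_groups=2)
# group_xbar is [0.5, 2.5] and rel_bias is [0.05, 0.25]
```

The same helper applies to variance estimates: pass the variance estimates as `estimates` and the group-wise simulated MSE as the target, one group at a time.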

Page 18

Simulation 2: Conditional performance

Figure 2: Conditional relative bias of the expansion and ratio estimators of $2X$

Page 19

Simulation 2: Conditional performance

Figure 3: Conditional relative bias of variance estimators

Page 20

Simulation 2: Conditional performance

Figure 4: Conditional coverage rates of normal theory confidence intervals based on $\vartheta_{cus}$, $\vartheta_s$ and $\vartheta_{DR}$ for nominal level of 95%

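Page 21

g-weighted estimating functions: model parameter

Generalized Linear Model: $\mathbf{l}_k(\boldsymbol\beta) = \mathbf{u}_k (y_k - \mu_k(\boldsymbol\beta))$

Linear Regression Model: $\mu_k(\boldsymbol\beta) = \mathbf{u}_k^T \boldsymbol\beta$

Logistic Regression Model: $\mu_k(\boldsymbol\beta) = \exp(\mathbf{u}_k^T \boldsymbol\beta)/[1 + \exp(\mathbf{u}_k^T \boldsymbol\beta)]$

$\hat{\boldsymbol\beta}$ is the solution of weighted estimating equation:

$\hat{\mathbf{l}}(\boldsymbol\beta) = \sum_k w_k \mathbf{l}_k(\boldsymbol\beta) = \mathbf{0}$

$w_k = d_k H(\mathbf{x}_k^T \hat{\boldsymbol\delta})$: calibration weight, where $\hat{\boldsymbol\delta}$ is solution of $\sum_k w_k \mathbf{x}_k = \mathbf{X}$

Special case: $H(a) = 1 + a$ (GREG)

$\vartheta_L(\hat{\boldsymbol\beta}) = \vartheta_s + \vartheta_m$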
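For the GREG special case $H(a) = 1 + a$ the calibration equation is linear in $\boldsymbol\delta$ and can be solved directly, and for the linear regression model the weighted estimating equation has a closed-form solution. The sketch below illustrates both steps on a five-unit toy sample; the function names `greg_weights` and `solve_linear_ee` and all numbers are assumptions made for the example, and it does not cover the variance estimation itself.

```python
import numpy as np

def greg_weights(d, x, X):
    """w_k = d_k (1 + x_k^T delta), with delta solving sum_k w_k x_k = X."""
    T = (d[:, None] * x).T @ x                 # sum_k d_k x_k x_k^T
    delta = np.linalg.solve(T, X - d @ x)      # linear calibration equation
    return d * (1.0 + x @ delta)

def solve_linear_ee(w, u, y):
    """Solve sum_k w_k u_k (y_k - u_k^T beta) = 0 for beta."""
    A = (w[:, None] * u).T @ u
    return np.linalg.solve(A, (w * y) @ u)

# Toy sample: n = 5 units with design weight d_k = 4 (so N = 20)
x = np.column_stack([np.ones(5), np.array([1.0, 2.0, 3.0, 4.0, 5.0])])
d = np.full(5, 4.0)
X = np.array([20.0, 70.0])                     # known totals: population size and x-total
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

w = greg_weights(d, x, X)                      # calibrated weights, here [2, 3, 4, 5, 6]
beta_hat = solve_linear_ee(w, x, y)            # uses u_k = x_k for simplicity
```

For the logistic model the same weighted equation has no closed form and would be solved iteratively, for example by Newton-Raphson.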

Page 22

Simulation 3: Estimating equations

We generated R=10,000 finite populations $\{y_k\}$, each of size N=393, from the model

$y_k | z_k \sim P(\mu_k)$ with $\mu_k = \exp(\theta_0 + \theta_1 z_k)$

$z_k \sim B(p_k)$ with $p_k = \exp(\beta_0 + \beta_1 x_k)/\{1 + \exp(\beta_0 + \beta_1 x_k)\}$

Using the number of beds as $x_k$, $(\beta_0, \beta_1) = (1, .002)$ leads to an average of about 60% for z

One simple random sample of size n=30 is drawn from each generated population

Parameter of interest: $\boldsymbol\theta = (\theta_0, \theta_1)^T = (2, 1)^T$

Population units are grouped into two classes, with 271 units k having x<350 in class 1 and 122 units k with x>=350 in class 2

Post-stratification: $\mathbf{X} = (271, 122)^T$
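Post-stratification to the class counts (271, 122) amounts to calibrating on the two class indicators, which rescales the design weights within each class so that they sum to the known count. A minimal sketch follows, with several assumptions: the x values are stand-ins constructed to match the class sizes (the real number-of-beds data are not reproduced), a fixed set of indices stands in for the simple random sample so the output is reproducible, and the function name `poststratified_weights` is hypothetical.

```python
import numpy as np

def poststratified_weights(d, stratum, counts):
    """w_k = d_k * N_c / N_hat_c for the post-stratum c that contains unit k."""
    w = np.empty_like(d, dtype=float)
    for c, N_c in enumerate(counts):
        in_c = stratum == c
        w[in_c] = d[in_c] * N_c / d[in_c].sum()    # rescale to the known class count
    return w

N, n = 393, 30
# Stand-in x values: 271 units with x < 350 and 122 units with x >= 350
x = np.concatenate([np.linspace(10.0, 349.0, 271), np.linspace(350.0, 1000.0, 122)])
s = np.arange(0, N, 13)[:n]                        # fixed indices standing in for an SRS
d = np.full(n, N / n)                              # design weights N / n
stratum = (x[s] >= 350).astype(int)                # 0 = class 1, 1 = class 2
w = poststratified_weights(d, stratum, counts=(271, 122))
```

With this index set, 21 sampled units fall in class 1 and 9 in class 2, so the weights become 271/21 and 122/9 respectively. A sample with an empty class would need collapsing of classes before this step.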

Page 23

Simulation 3: Estimating equations

Table 1: Monte Carlo Variances

Parameter     No Calibration   Post-stratification
$\theta_0$    0.0133           0.0139
$\theta_1$    0.0161           0.0167

Table 2: DR variance estimator

Parameter     No Calibration   Post-stratification
$\theta_0$    0.0122           0.0123
$\theta_1$    0.0148           0.0150

Table 3: DR naïve variance estimator

Parameter     No Calibration   Post-stratification
$\theta_0$    0.0120
$\theta_1$    0.0145

Page 24

Multiple Weight Adjustments

Weight adjustments for

Unit (or complete) nonresponse

Calibration

Due to lack of time, this topic was not presented in the talk, but it is included in the proceedings paper

Page 25

Concluding Remarks

We provided a method of variance estimation for estimators:

of nonlinear model parameters
using survey data
defined explicitly or implicitly
using multiple weight adjustments
under missing data

The method

is simple
is widely applicable
has good properties
provides unique choice

Thank you very much