Comments: The Big Picture for Small Areas Alan M. Zaslavsky Harvard Medical School


Post on 03-Jan-2016


Comments: The Big Picture for Small Areas

Alan M. Zaslavsky
Harvard Medical School

Thanks to presenters

• 3 interesting talks
• Raise significant policy issues

Voting rights tabulation

• Generic approach for beta-binomial modeling
  – Shrinkage calculations (R. Little)
  – Approach to quasi-Bayesian estimation for clustered survey data (D. Malec)
• Why jurisdictional classes rather than prior centered on prediction?
  – Use of classes predictably biases up or down just above or below class boundary.
  – Problem of discreteness/thresholds
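To make the class-boundary concern concrete, here is a minimal sketch of beta-binomial shrinkage. It is not the presenters' model. The counts, the prior strength, the two class means, and the 5% boundary are all hypothetical numbers chosen for illustration. Two jurisdictions with identical data and nearly identical predictions land on opposite sides of the boundary and are pulled in opposite directions, while a prior centered on each jurisdiction's own prediction treats them almost identically.

```python
def shrunken_rate(y, n, prior_mean, prior_strength):
    """Posterior mean of a proportion with a Beta prior and binomial data.

    The prior is Beta(a, b) with a + b = prior_strength (hypothetical input).
    Returns the posterior mean and the weight given to the direct estimate y/n.
    """
    a = prior_mean * prior_strength
    b = (1.0 - prior_mean) * prior_strength
    data_weight = n / (n + prior_strength)  # remainder goes to the prior mean
    posterior_mean = (a + y) / (a + b + n)
    return posterior_mean, data_weight


# Hypothetical inputs: same sample (2 of 40) in both jurisdictions.
Y, N, STRENGTH = 2, 40, 60.0
BOUNDARY = 0.05                           # assumed class boundary on the prediction
CLASS_MEANS = {"low": 0.02, "high": 0.10}  # assumed class-level prior means


def class_prior_mean(prediction):
    """Prior mean when jurisdictions are grouped into classes by prediction."""
    return CLASS_MEANS["low"] if prediction < BOUNDARY else CLASS_MEANS["high"]


results = {}
for prediction in (0.049, 0.051):  # just below and just above the boundary
    by_class, weight = shrunken_rate(Y, N, class_prior_mean(prediction), STRENGTH)
    by_prediction, _ = shrunken_rate(Y, N, prediction, STRENGTH)
    results[prediction] = (by_class, by_prediction)
    print(f"prediction={prediction:.3f}  class prior: {by_class:.4f}  "
          f"prediction-centered prior: {by_prediction:.4f}  data weight: {weight:.2f}")
```

With these made-up numbers the class-based estimates differ by almost five percentage points, while the prediction-centered estimates differ by about a tenth of a point.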

Voting rights tabulation

• How ‘general purpose’ is the product?
  – Inference for point estimate of %
  – vs inference for P(>5%).
• Presentation of results
  – Bayes methods → posterior distributions
  – Present results for multiple inferences?
  – SAE of aggregates ≠ aggregate of SAEs
  – Perils of thresholds/discreteness
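The distinction between a point estimate of the percentage and P(>5%) can be illustrated with posterior draws. The sketch below assumes a uniform Beta(1, 1) prior and simple binomial sampling, neither of which comes from the talks, and the two sets of counts are hypothetical. Both areas have the same posterior mean of about 8%, but they support quite different statements about exceeding the 5% threshold.

```python
import random


def posterior_draws(y, n, a=1.0, b=1.0, size=20000, seed=0):
    """Draws from the Beta(a + y, b + n - y) posterior of a proportion.

    The Beta(a, b) prior and the binomial likelihood are assumptions of this sketch.
    """
    rng = random.Random(seed)
    return [rng.betavariate(a + y, b + n - y) for _ in range(size)]


def summarize(draws, threshold=0.05):
    """Posterior mean and posterior probability of exceeding the threshold."""
    mean = sum(draws) / len(draws)
    p_above = sum(d > threshold for d in draws) / len(draws)
    return mean, p_above


# Hypothetical counts: posterior means are both 0.08, precision differs.
small_mean, small_p = summarize(posterior_draws(y=3, n=48, seed=1))
large_mean, large_p = summarize(posterior_draws(y=39, n=498, seed=2))

print(f"small sample: mean={small_mean:.3f}  P(>5%)={small_p:.3f}")
print(f"large sample: mean={large_mean:.3f}  P(>5%)={large_p:.3f}")
```

Keeping the draws rather than a single summary is what lets one product serve several inferences: any function of the rate, including a threshold indicator, can be summarized from the same draws.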

“Context specificity”

• What does it add beyond predictive variance?
  – Model error worse than a sampling error – why?
  – Might be better understood as a measure of model robustness.
• Might not have unambiguous definition
  – In lead example, should precision of NHIS or BRFSS data define ‘specificity’? (NHIS-BRFSS association is a model estimate.)
  – Depends on which inference:
    • Estimate of absolute levels sensitive to calibration
    • Estimate of differences/ranking among areas unaffected by calibration

“Context specificity”

• Highlights value of transparency of methodology
  – Develop heuristic explanations of components contributing to estimation and their ‘weights’
  – “For estimation of XXX …
  – “Total (predictive) SE is …
  – “XX% from sampling in BRFSS …
  – “YY% from estimation of NHIS calibration model …
  – “ZZ% from model error of covariate model …”
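A statement of this kind could be produced mechanically once variance components are available. The sketch below assumes the three components are independent, so their variances add; that assumption, the function name, and the numeric values are all hypothetical and not taken from the talk.

```python
import math


def decompose_predictive_variance(components):
    """Total predictive SE and percentage share of each variance component.

    Assumes the components are independent, so that variances simply add.
    """
    total_var = sum(components.values())
    total_se = math.sqrt(total_var)
    shares = {name: 100.0 * var / total_var for name, var in components.items()}
    return total_se, shares


# Hypothetical variance components for one small-area estimate.
components = {
    "sampling in BRFSS": 0.0004,
    "estimation of NHIS calibration model": 0.0003,
    "model error of covariate model": 0.0003,
}

total_se, shares = decompose_predictive_variance(components)
print(f"Total (predictive) SE is {total_se:.4f}")
for name, share in shares.items():
    print(f"{share:.0f}% from {name}")
```

If the components were correlated, the shares would no longer sum cleanly and the covariance terms would need to be reported as well.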

Outcome screening

• Prioritizing more global SAE program
• Technical concerns
  – Do methods properly account for sampling variance of domain proportions?
    • In this 2-level model, why use ad hoc methods for level-2 variance estimation?
• Strategic concerns
  – Consider costs & benefits as well as variances
    • Posterior ranking ∈ {overkill} ?
  – Consider families of outcomes, not just individual outcomes
    • e.g. 12 binomial variables, likely related, for same Asian population
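The question about sampling variance can be illustrated with a deliberately simple calculation: the observed spread of domain proportions overstates the spread of the true proportions, because part of it is sampling noise. The sketch below is a crude unweighted moment estimate, not the screening method under discussion and not a substitute for fitting the 2-level model. It assumes simple random sampling within domains (no design effect), and the proportions and sample sizes are hypothetical.

```python
def between_domain_variance(p_hat, n):
    """Crude moment estimate of the level-2 (between-domain) variance.

    Subtracts the average binomial sampling variance from the observed
    variance of the direct estimates, truncating at zero. Assumes simple
    random sampling within each domain.
    """
    m = len(p_hat)
    mean_p = sum(p_hat) / m
    observed_var = sum((p - mean_p) ** 2 for p in p_hat) / (m - 1)
    mean_sampling_var = sum(p * (1.0 - p) / k for p, k in zip(p_hat, n)) / m
    between_var = max(0.0, observed_var - mean_sampling_var)
    return observed_var, mean_sampling_var, between_var


# Hypothetical direct estimates and sample sizes for five domains.
p_hat = [0.10, 0.20, 0.15, 0.30, 0.05]
n = [50, 40, 60, 30, 80]

observed_var, mean_sampling_var, between_var = between_domain_variance(p_hat, n)
print(f"observed variance of estimates: {observed_var:.5f}")
print(f"average sampling variance:      {mean_sampling_var:.5f}")
print(f"implied between-domain variance: {between_var:.5f}")
```

In this made-up example about a third of the observed variance is sampling noise, so a screening rule based on the raw spread would overstate how much domains truly differ.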

Current state of SAE

• Typically one variable or a few closely related
  – Relationships only as explicitly selected for models
  – Not higher-order interactions
• Each major SAE a major project
  – High-level statistical expertise involved
  – Takes a long time
• Lack of fully generic methods
  – (… although principles fairly well established)
  – Depends on amount & structure of available data, distributions & relationships, etc.
  – Often new methods required for each project

Path that extends current methods

• More estimation projects
• Elaborate more generic methods
  – Adapt to various data structures
  – More use of multilevel structure
  – Still univariate or low-dimensional
• OK for many…
  – single-purpose surveys
  – health care applications (“profiling”)

Some goals for general-purpose surveys

• Generate SAE for all current products
  – Detailed cross-tabulations
  – Microdata
• Plausible (not “correct”) for all relationships
• Valid presentation of uncertainty
• Consistency of all products
  – Margins and aggregation of estimates

What might this look like?

• Almost certainly requires some form of microdata synthesis
  – Yields consistency
• Units that look ‘enough’ like real units
• Two approaches
  – “Bottom up” synthesis of units (persons, households)
  – “Top down” imposition of constraints on synthetic samples of real units

Advantages of ‘top-down’ approach

• Building from observed units makes high-order interactions realistic
  – Otherwise most difficult to model
• Impose constraints via weighting or constrained resampling
  – Weighting is like predictive mean estimation; properties more readily controllable
  – Constraints may be from direct estimates, SAE, purely predictive estimates
  – Uncertainty via stochastic prediction of constraints and multiple imputation (MI)
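One familiar way to impose marginal constraints on a sample of real units is raking (iterative proportional fitting). It is used here only as a stand-in to show the mechanics of constraint by reweighting; it is not claimed to be the method used in the applications cited in this talk. The units, variables, and control totals below are hypothetical, and the sketch assumes every category appears in the sample and that the control totals share the same grand total.

```python
def weighted_margin(units, weights, variable):
    """Weighted total for each category of one variable."""
    totals = {}
    for unit, w in zip(units, weights):
        totals[unit[variable]] = totals.get(unit[variable], 0.0) + w
    return totals


def rake(units, weights, targets, n_iter=50):
    """Adjust weights so weighted margins match the target totals.

    units   : list of dicts, one per observed unit
    targets : {variable: {category: control total}}
    Assumes each target category occurs in the sample with positive weight.
    """
    w = list(weights)
    for _ in range(n_iter):
        for variable, controls in targets.items():
            current = weighted_margin(units, w, variable)
            # Scale every unit by (control total / current total) for its category.
            w = [wi * controls[u[variable]] / current[u[variable]]
                 for u, wi in zip(units, w)]
    return w


# Hypothetical national sample units and controls for one small area.
units = [
    {"age": "young", "poor": "yes"},
    {"age": "young", "poor": "no"},
    {"age": "old", "poor": "yes"},
    {"age": "old", "poor": "no"},
]
targets = {
    "age": {"young": 60.0, "old": 40.0},
    "poor": {"yes": 30.0, "no": 70.0},
}

area_weights = rake(units, [1.0] * len(units), targets)
print(weighted_margin(units, area_weights, "age"))
print(weighted_margin(units, area_weights, "poor"))
```

Because each synthetic record is a reweighted real unit, joint patterns among variables not named in the controls are carried over from the observed data rather than modeled. Repeating the reweighting with control totals drawn from their predictive distribution would be one way to propagate uncertainty.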

Previous applications

• Reweighting/Imputation of households for census undercount (Zaslavsky 1988, 1989)
• Reweighting for food stamp microsimulations
  – “Large numbers of estimates for small areas” (Schirm & Zaslavsky 1997-2002)
  – High-order interactions crucial to simulation of program provisions
  – Reweight national CPS data to simulate each state in turn (direct and SAE controls)

Synthesis

• Work will proceed on many fronts
  – Develop and integrate new data sources
  – Targeted SAE projects responsive to needs
  – Advances in dissemination & explication
• Integrate improvements in SAE for marginal (single-variable) estimates into overall synthetic framework.

Thank you!