

Page 1

A two-phase life-cycle model of integrated statistical micro data

Li-Chun Zhang

Statistics Norway

lcz@ssb.no

Page 2

Register-based statistics & early years of survey sampling

N. Kiær (1895). The representative method. ISI Session, Bern.

A. Jensen (ISI-committee, 1924): “When ISI discussed the matter twenty-two years ago, it was the question of the recognition of the method in principle that claimed most interest. Now it is otherwise. I think I may venture to say that nowadays there is hardly one statistician, who in principle will contest the legitimacy of the representative method. Nevertheless, I believe that the representative method is capable of being used to a much greater extent than now is the case.”

20??

J. Neyman (1934). On the two different aspects of the representative method: The method of stratified sampling and the method of purposive selection. JRSS 97, 558-606.

(Source: UNECE 2007)

Page 3

Survey life cycle from a quality perspective (Groves et al., 2004, Survey Methodology, Figure 2.5)

Measurement: Construct -> Measurement -> Response -> Edited Response -> Survey Statistic

Errors between successive steps: Validity, Measurement Error, Processing Error

Representation: Target Population -> Sampling frame -> Sample -> Respondents -> Postsurvey Adjustments -> Survey Statistic

Errors between successive steps: Coverage Error, Sampling Error, Nonresponse Error, Adjustment Error

Page 4

A two-phase life-cycle model

- Secondary use

- Combination of sources

Page 5

Single-source primary-phase statistical micro data

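Measurement (Variables): Target Concept -> Measurement -> Response/Registration -> Editing -> Single-source Micro Data (Primary)

Errors between successive steps: Validity, Measurement, Processing

Representation (Objects): Target Set -> Accessible Set -> Accessed Set -> Observed/Validated Set -> Single-source Micro Data (Primary)

Errors between successive steps: Frame, Selection, Missing/Redundancy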

Page 6

Integrated secondary-phase statistical micro data

Measurement (Variables): Target Concept -> Harmonization -> Classification -> Adjustment -> Integrated Micro Data (Secondary)

Errors between successive steps: Relevance, Mapping, Compatibility

Representation (Units): Target Population -> Data Linkage -> Alignment -> Statistical Units -> Integrated Micro Data (Secondary)

Errors between successive steps: Coverage, Identification, Unit

Transformation (Object to Unit)

Unit vs. Object

Measurement vs. Representation

Missing Values vs. Coverage

Diagram: Base Units No. 1, 2, ..., N linked many-to-one (m:1) to Composite Units (No. 1, 2, ..., M; No. 1, 2, ..., K; No. 1, 2, ..., H)

Page 7

An illustration of register-based household data: Kongsvinger at the time point of census 2001

Page 8

Representing unit error by allocation matrix

(Equivalence on row permutation & sequential upper-triangular by definition)
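
The matrix shown on this slide did not survive transcription. As a minimal sketch of the idea, assuming rows stand for households (composite units), columns for persons (base units), and an entry of 1 means the person is allocated to that household: permuting rows only renames households, and ordering rows by each household's first-listed person gives one canonical form with zeros below the diagonal. This is one reading of "sequential upper-triangular". The labels and function names below are hypothetical.

```python
import numpy as np

def allocation_matrix(labels):
    """Build a 0/1 allocation matrix from one household label per person.

    Rows are households (composite units), columns are persons (base units).
    Rows are ordered by each household's first-listed person, which is an
    assumed canonical ordering, not necessarily the author's definition.
    """
    households = list(dict.fromkeys(labels))  # unique labels, first-appearance order
    A = np.zeros((len(households), len(labels)), dtype=int)
    for person, label in enumerate(labels):
        A[households.index(label), person] = 1
    return A

def canonical_form(A):
    """Reorder rows so households are sorted by their first member."""
    first_member = A.argmax(axis=1)  # column index of the first 1 in each row
    return A[np.argsort(first_member)]

# Hypothetical example: five persons allocated to three households.
labels = ["h2", "h2", "h1", "h2", "h3"]
A = allocation_matrix(labels)
print(A)

# A row permutation describes the same allocation; canonical_form undoes it.
shuffled = A[[2, 0, 1]]
canonical = canonical_form(shuffled)
print(np.array_equal(canonical, A))

# In canonical order every entry below the main diagonal is zero.
print(np.array_equal(A, np.triu(A)))

# Each person belongs to exactly one household, so every column sums to 1.
print(A.sum(axis=0))
```

Under this reading, a unit error is a difference between the register-based allocation matrix and the true one.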

Page 9

Value matrix (or vector): X

Statistics: y = A X
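
A small numerical sketch of y = A X, assuming A is a 0/1 allocation matrix with households as rows and persons as columns, and X holds one row of values per person. The income figures and the alternative allocation `A_err` are hypothetical, chosen only to show how a unit error changes household-level statistics while population totals stay the same.

```python
import numpy as np

# Allocation matrix: 3 households (rows) x 5 persons (columns).
A = np.array([[1, 1, 0, 1, 0],
              [0, 0, 1, 0, 0],
              [0, 0, 0, 0, 1]])

# Value matrix X: one row per person.
# Column 0 is a constant 1 (for counting), column 1 a hypothetical income.
X = np.column_stack([np.ones(5), [30.0, 0.0, 45.0, 12.0, 20.0]])

# Household-level statistics: size and total income per household.
y = A @ X
print(y)

# A hypothetical unit error: the fourth person is placed in household 2.
A_err = np.array([[1, 1, 0, 0, 0],
                  [0, 0, 1, 1, 0],
                  [0, 0, 0, 0, 1]])
y_err = A_err @ X
print(y_err)

# Every column of an allocation matrix sums to 1, so population totals
# are unaffected by unit errors; only their distribution over households is.
print(y.sum(axis=0), y_err.sum(axis=0), X.sum(axis=0))
```

With X a vector of ones, y reduces to the household sizes, so the distribution of households by size is itself a function of A.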

Page 10

Two more examples of statistics

Page 11

Results: Statistical uncertainty w.r.t. unit errors

Page 12

The 20th Century = Survey Sampling

The 21st Century = Data Integration

Welcome to a new age!