Canadian Census 2006 Public Use Microdata File Presentation at the SARS Conference Manchester, United Kingdom September 3, 2008 Presented by: Sri Kanagarajah, Chief, Census Client Services Census Operations Division Statistics Canada


Page 1

Canadian Census 2006 Public Use Microdata File

Presentation at the SARS Conference

Manchester, United Kingdom

September 3, 2008

Presented by:

Sri Kanagarajah, Chief, Census Client Services

Census Operations Division

Statistics Canada

Page 2

2006 Census Data (released March 13, 2007 to May 1, 2008)

Geography                   2006 Population  2001 Population  Share  Growth/Decline (%)  2006 Dwellings  2006 Land area (km²)  2006 Density (per km²)
Canada                           31,612,897       30,007,094   100%                 5.4      13,576,855             9,017,699                     3.5
Newfoundland                        505,469          512,930     2%                -1.5         235,958               370,495                     1.4
Prince Edward Island                135,851          135,294     0%                 0.4          62,753                 5,684                    23.9
Nova Scotia                         913,462          908,007     3%                 0.6         425,681                52,917                    17.3
New Brunswick                       729,997          729,498     2%                 0.1         331,619                71,355                    10.2
Quebec                            7,546,131        7,237,479    24%                 4.3       3,452,300             1,356,367                     5.6
Ontario                          12,160,282       11,410,046    38%                 6.6       4,972,869               907,574                    13.4
Manitoba                          1,148,401        1,119,583     4%                 2.6         491,724               552,370                     2.1
Saskatchewan                        968,157          978,933     3%                -1.1         438,621               588,276                     1.6
Alberta                           3,290,350        2,974,807    10%                10.6       1,335,745               640,045                     5.1
British Columbia                  4,113,487        3,907,738    13%                 5.3       1,788,474               924,815                     4.4
Yukon Territory                      30,372           28,674     0%                 5.9          15,296               474,711                     0.1
Northwest Territories                41,464           37,360     0%                11.0          16,774             1,140,835                     0.0
Nunavut                              29,474           26,745     0%                10.2           9,041             1,932,255                     0.0
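
The Share, Growth/Decline and Density figures on this slide can be reproduced from the population and land-area counts. The following is a minimal Python sketch, assuming that growth is the percentage change from 2001 to 2006, that share is taken relative to the Canada total, and that density is persons per unit of land area. The function names are illustrative and do not come from the presentation.

```python
def growth_percent(pop_2006, pop_2001):
    """Percentage change in population from 2001 to 2006, to one decimal."""
    return round((pop_2006 - pop_2001) / pop_2001 * 100, 1)


def share_percent(pop, total):
    """Share of the national population, as a whole-number percentage."""
    return round(pop / total * 100)


def density(pop, land_area):
    """Persons per unit of land area, to one decimal."""
    return round(pop / land_area, 1)


# Figures taken from the slide (Canada and Ontario rows).
canada_2006, canada_2001 = 31_612_897, 30_007_094
print(growth_percent(canada_2006, canada_2001))  # Canada growth
print(share_percent(12_160_282, canada_2006))    # Ontario share
print(density(canada_2006, 9_017_699))           # Canada density
```

A negative result from `growth_percent` corresponds to a decline, as in the Newfoundland and Saskatchewan rows.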

Page 3

2006 Census Data
Largest Metropolitan Areas

Rank  Geography                        2006 Population  Growth/Decline (%)
   1  Toronto (Ont.)                         5,113,149                 9.2
   2  Montréal (Que.)                        3,635,571                 5.3
   3  Vancouver (B.C.)                       2,116,581                 6.5
   4  Ottawa - Gatineau                      1,130,761                 5.9
   5  Calgary (Alta.)                        1,079,310                13.4
   6  Edmonton (Alta.)                       1,034,945                10.4
   7  Québec (Que.)                            715,515                 4.2
   8  Winnipeg (Man.)                          694,668                 2.7
   9  Hamilton (Ont.)                          692,911                 4.6
  10  London (Ont.)                            457,720                 5.1
  11  Kitchener (Ont.)                         451,235                 8.9
  12  St. Catharines - Niagara (Ont.)          390,317                 3.5
  13  Halifax (N.S.)                           372,858                 3.8
  14  Oshawa (Ont.)                            330,594                11.6
  15  Victoria (B.C.)                          330,088                 5.8

Page 4

2006 Census Data
Mother Tongue

    Mother tongue (total)   31,241,030
 1  English                 18,232,200
 2  French                   6,970,405
 3  Italian                    476,905
 4  Chinese                    467,235
 5  German                     466,655
 6  Panjabi (Punjabi)          382,585
 7  Cantonese                  369,645
 8  Spanish                    362,120
 9  Arabic                     286,790
10  Tagalog (Pilipino)         266,445
11  Portuguese                 229,280
12  Polish                     217,605
13  Mandarin                   173,730
14  Urdu                       156,420
15  Vietnamese                 146,410

Page 5

2006 Public Use Microdata File

Agenda

1. Census PUMF 1971 – 2001
2. Consultations Summary
3. Factors promoting new approach
4. Challenges with micro data
5. Addressing confidentiality concerns
6. The 2 New 2006 Census PUMFs
7. Comparing 2006 Census PUMFs
8. Analytic Content: additions and losses
9. International Comparison (macro level)

Page 6

1. Census PUMF 1971 - 2001

1.1 Structure and variables

Characteristics

File type: 3 separate single files with no relationship between them
Geography: Province, Census Metropolitan Area (large urban)
Variables: repeated in the 3 files, and most of them are derived variables

Example: 2001 Census PUMF

File                     % of the population  Variables  Records
INDIVIDUALS              2.7%                 140        801,055
FAMILIES                 2.7%                 163        348,104
HOUSEHOLDS & DWELLINGS   2.7%                 150        312,513
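
A 2.7% sample implies that each record stands for roughly 37 units of its universe. The sketch below shows that arithmetic in Python, assuming a simple equal-probability sample; the weights actually carried on the PUMF may differ, and `design_weight` and `estimate_total` are illustrative names rather than anything defined in the presentation.

```python
def design_weight(sampling_fraction):
    """Number of population units each sampled record represents."""
    return 1.0 / sampling_fraction


def estimate_total(record_count, sampling_fraction):
    """Weighted-up estimate of the universe size from a record count."""
    return record_count * design_weight(sampling_fraction)


# 2001 individuals file: 801,055 records at a 2.7% sampling fraction.
print(round(design_weight(0.027), 1))
print(round(estimate_total(801_055, 0.027)))
```

The weighted-up total is only an approximation of the target population of the file, since the published sampling fraction is rounded.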

Page 7

1. Census PUMF 1971 - 2001

1.2 Variables per universe, 2001 Census PUMF

2001 Universe        Number of variables  Level 1 (very simple derived)  Level 2 (simple derived)  Level 3 (complex derived)  Level 4 (very complex derived)  % of complex variables (levels 3-4) out of total
PUMF of Individuals  140                  55                             70                        4                          5                               6.4%
PUMF of Families     163                  2                              25                        42                         93                              83%
PUMF of Households   150                  25                             53                        18                         54                              48%

Page 8

2. Consultations Summary

• Objectives:
– Present the upcoming changes for the 2006 PUMF
– Get feedback on the proposed changes
– Get information about how often the file is used and about the data needs of PUMF users

• Consultations
– Data Liberation Initiative: program that provides colleges and universities with data produced by Statistics Canada, including the Public Use Microdata File
– Federal departments that paid part of the collection costs for the 2B long-form questionnaire
– Questionnaire for experienced university researchers (Queens/UofT/UofAlb)
– Academics & private sector

• Research
– International comparison – macro level

Page 9

2. Consultations Summary (continued)

2.1 Scenarios presented to users

Possible scenarios presented for 2006 PUMF:

1. Status quo, as in 2001 – 3 universe files

2. Individual Single File

3. Hierarchical file

Page 10

2. Consultations Summary (continued)
Most discussed topics of interest during the consultations

2.2 Geography
• Users wanted provinces, and some wanted Census Metropolitan Areas & Census Subdivisions

2.3 Variables
– The variables taken from the questionnaires and the most common derived variables
– Derived variables, e.g. LICO (low income cut-off), POW (place of work)
– Allow more flexibility to create own derived variables

Page 11

2. Consultations Summary (continued)
Most discussed topics of interest during the consultations (end)

2.4 Type of file:

Most requested file:

a) Individual file with province and CMA geography

b) Hierarchical file (link between universes for better analysis)

Page 12

3. Factors promoting a new approach

• Statistics Canada provides greater accessibility of census data than before

• Improvement in order to provide more analytic content and greater use at the national and international levels

• Speed up release of PUMFs

Page 13

4. Challenges with micro data

Statistics Canada senior managers are concerned about confidentiality

Data confidentiality constraints

Page 14

4. Research Data Centre (RDC) Microdata

• Across Canada: 15 RDC centres, 9 branches & 26 partners
• The entire 2B data and details (100% of 2B questionnaire data: ethnicity, visible minority, labour force, language, place of work, immigration, income, etc.)
• Years available: 2001, 1996 & 1991
• 2006 Census RDC file: December 2008
• Social Sciences and Humanities Research Council (SSHRC) reviews proposals
• Statistics Canada senior managers
• Restricted access, with a committee that provides approval after reviewing each proposal for access

Page 15

5. Addressing confidentiality concerns (initial plan)

How are we addressing these concerns?• Number of Files - 3 files to 2 files

• File size (2006) less than (2001) - Individual (800K) & Hierarchical (150K)

• Limited geography – Province for single file & regions for hierarchical file

• Age “variable” - collapse age variable

• Income “variable” - modify income categorisation
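
Collapsing the age variable and modifying the income categorisation both amount to recoding detailed values into coarser groups. The Python sketch below illustrates the idea under stated assumptions: the five-year band width, the top code of 85 and the income cut points are hypothetical values chosen for illustration, not the categories Statistics Canada used.

```python
def collapse_age(age, band_width=5, top_code=85):
    """Recode a single year of age into a band label such as '30-34' or '85+'.

    band_width and top_code are hypothetical parameters.
    """
    if age >= top_code:
        return f"{top_code}+"
    lower = (age // band_width) * band_width
    return f"{lower}-{lower + band_width - 1}"


def categorise_income(income, cut_points=(20_000, 40_000, 60_000, 100_000)):
    """Return the index of the income category containing `income`.

    Category i covers values below cut_points[i]; the last category is
    open-ended, so very high incomes are not shown in detail.
    The cut points are hypothetical.
    """
    for index, upper in enumerate(cut_points):
        if income < upper:
            return index
    return len(cut_points)


print(collapse_age(32), collapse_age(90))
print(categorise_income(15_000), categorise_income(250_000))
```

Coarser bands and an open-ended top category reduce the number of rare, potentially identifying values, at the cost of analytic detail.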

Page 16

5. Addressing confidentiality concerns

• Independent samples where possible

• Eliminate values with low Canada frequencies

Geography                                                                                2006 Single file  2006 Hierarchical file
Canada                                                                                   X                 X
Provinces                                                                                X
Regions (at least 1 million people: Atlantic, QC, ON, Prairies, BC, North territories)                     X
CMAs                                                                                     X (most?)         X (Top 6?)

Page 17

6. The 2 new 2006 Census PUMFs: Single file and Hierarchical file (initial plan)

• Single file (2.7% of population)
– Keep provinces (legal jurisdiction for education, health, etc.)
– Keep most CMAs for diversity studies
– Variables taken from the questionnaire. Users can create their own derived variables
– Release projected for summer 2009

• Hierarchical file (1% of population)
– Keep some CMAs for diversity studies
– Links the 3 universes (individual, family & household)
– Variables taken from the questionnaire. Users can create their own derived variables
– Release projected for summer 2010
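
Linking the three universes means that a person record can be related to its family and household, so users can build their own derived variables across levels. The Python sketch below shows one such derivation, household size, under stated assumptions: the records and the field names `household_id`, `family_id` and `person_id` are hypothetical and do not reflect the actual PUMF record layout.

```python
from collections import defaultdict

# Hypothetical person records; each carries the identifiers of the
# household and family it belongs to.
persons = [
    {"household_id": 1, "family_id": 1, "person_id": 1, "age": 41},
    {"household_id": 1, "family_id": 1, "person_id": 2, "age": 39},
    {"household_id": 1, "family_id": 1, "person_id": 3, "age": 7},
    {"household_id": 2, "family_id": 2, "person_id": 1, "age": 68},
]


def household_sizes(person_records):
    """Count the persons in each household."""
    sizes = defaultdict(int)
    for record in person_records:
        sizes[record["household_id"]] += 1
    return dict(sizes)


def attach_household_size(person_records):
    """Return copies of the person records with a derived household_size field."""
    sizes = household_sizes(person_records)
    return [
        {**record, "household_size": sizes[record["household_id"]]}
        for record in person_records
    ]


for row in attach_household_size(persons):
    print(row)
```

With three independent files, as in the 2001 PUMF, this kind of derivation is not possible because records in one file cannot be matched to records in another.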

Page 18

8. Analytic Content: additions and losses

PUMF-2001 (Status Quo)
• Content
– Independent samples of the three universes
– Diverse geographies at the province and CMA levels
– Families and households well represented
– Repetition of variables between the 3 universes; complex derived variables
– Analytic content limited to one universe at a time
• Production requirements
– Certification and production projected for end of 2010
• Confidentiality
– Detailed age & income

PUMF-2006 (Single File)
• Content
– Some people represent a family or a household
– Geography limited to provinces and most major CMAs
– Loss of information about families and households
– Variables taken from the questionnaire so that users can create their own derived variables
– Analytic content extended to the Individual Universe
• Production requirements
– Production projected for summer 2009
• Confidentiality
– Same suppression level as in 2001
– Reduced categories of age and income

PUMF-2006 (Hierarchical File)
• Content
– All families and persons in households sampled are included
– Geography more limited, to Regions and CMAs (at least pop. 1 million)
– File representative of households; more varied content including all data
– Variables taken from the questionnaire so that users can create their own derived variables
– Analytic content extended to the three universes
– Greater potential for analysis and international comparison
• Production requirements
– Production projected for summer 2010
• Confidentiality
– Same or lower suppression level as in 2001 (fewer geographies)
– Reduced categories of age and income

Page 19

9. International Comparison: Similarities & Differences (2001)

Type: Single
– Canada: Individual 2.7%; Family 2.7%; Household & Housing 2.7%
– UK: SAM 5%

Type: Hierarchical
– Canada: Not available
– Australia: Basic & Extend 1% (Person, Family, Dwelling)
– UK: SL-HSAR-2001 1%; IL-SAR-2001 3%; H-CAMS-2001 1%; I-CAMS-2001 3%
– U.S.A.: 1% & 5% (Person, Housing)

Geographic Identifier
– Canada: Prov/Terr; CMA >= 100K
– Australia: State/Ter; SR, ASR; Pop >= 124K (ASR), 250K (SR)
– UK: RGR (SAR-H-1); LIDA >= 120K (SAR-I-91); GOR (IL-SAR-01); LA-Eng & Wales, CA-Scot, PC-NI, UK (SAM-CAMS)
– U.S.A.: State; Super-PUMA >= 400K; PUMA >= 100K

Page 20

Comments and/or questions

Thank you!