Chuck Humphrey, Leah Vanderjagt and Anna Bombak University of Alberta The Winter Institute on Statistical Literacy for Librarians Demystifying statistics for the practitioner

Post on 20-Dec-2015



Chuck Humphrey, Leah Vanderjagt and Anna Bombak
University of Alberta

The Winter Institute on Statistical Literacy for Librarians

Demystifying statistics for the practitioner


Outline

• Introductions
• Statistics and data: what are we talking about?
• Definitions and standards
• Metadata and tools
• Official statistics
• Non-official statistics
• Small area statistics


Introductions: your backgrounds

You are equally split between non-academic and academic libraries.

The largest group, with 11, is from universities other than the U of A.

The second largest group, with nine, is from government libraries.

[Bar chart: participants by type of library, horizontal axis 0 to 15. Academic: University of Alberta (4), Other Universities (11). Non-Academic: Public / Special (6), Government (9).]


Introductions: your backgrounds

Geographically, 22 of you are from Alberta and eight are from other provinces.

We have representation from Halifax to Victoria, although 19 are from the Edmonton region.

[Bar chart: participants by location. Alberta (22), Outside Alberta (8).]


Introductions: your backgrounds

Please introduce yourself
• Your name
• Your institutional affiliation
• Your librarian responsibilities
• Is there anything in particular that you are hoping to learn at this workshop?


Statistics: what are we talking about


Statistics are ubiquitous

“Statistics are generated today about nearly every activity on the planet. Never before have we had so much statistical information about the world in which we live. Why is this type of information so abundant? For one thing, statistics have become a form of currency in today’s information society. Through computing technology, society has become very proficient in calculating statistics from the vast quantities of data that are collected. As a result, our lives involve daily transactions revolving around some use of statistical information.”

Data Basics, page 1.1


Numeric information

Statistics
• numeric facts/figures
• created from data, i.e., already processed
• presentation-ready

Data
• numeric files created and organized for analysis/processing
• requires processing
• not display-ready


Numeric information

Six dimensions or variables in this table. The cells in the table are the estimated number of smokers.

• Geography: Region
• Time: Periods
• Unit of Observation Attributes: Smokers, Education, Age, Sex
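As a rough illustration of how the cells of such a table come from data, the sketch below counts smokers in a handful of made-up respondent records. The records, field order and category labels are hypothetical and are not taken from any survey file, and only three of the dimensions are used to keep the example short.

```python
from collections import Counter

# Hypothetical respondent records, invented for illustration only.
# Each tuple is (region, period, sex, is_smoker).
respondents = [
    ("Alberta", "1994-1995", "Male", True),
    ("Alberta", "1994-1995", "Male", True),
    ("Alberta", "1994-1995", "Female", False),
    ("Alberta", "1996-1997", "Female", True),
    ("Ontario", "1994-1995", "Male", False),
    ("Ontario", "1996-1997", "Female", True),
]

def count_smokers(records):
    """Return a Counter mapping (region, period, sex) to the number of smokers."""
    cells = Counter()
    for region, period, sex, is_smoker in records:
        if is_smoker:
            cells[(region, period, sex)] += 1
    return cells

table = count_smokers(respondents)

# Each printed line is one cell of the table: its dimensions and its count.
for cell, n in sorted(table.items()):
    print(cell, n)
```

Each key is one combination of geography, time and attribute; each value is the cell. Adding Education and Age to the key would give the remaining dimensions.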


Statistics are about definitions!

Definitions

• Sex: Total, Male, Female
• Periods: 1994-1995, 1996-1997


Statistics are about definitions!

Some definitions are based on standards while others are based on convention or practice.

For example, Standard Geography classifications


Numeric information


Stories are told through statistics

The National Population Health Survey in the previous example had over 80,000 respondents in its 1996-97 sample, and the Canadian Community Health Survey in 2005 has over 130,000 cases. How do we tell the stories about each of these respondents?

We create summaries of these life experiences using statistics.


Summary

• Statistics are derived from data.
• A table presents a summary or one view of the data.
• Tables are structured around geography, time and attributes of the unit of observation.
• Statistics are dependent on definitions.


Life cycle of statistical information

1 Program objective

2 Survey unit organized

3 Questionnaire & sample

4 Data collection

5 Data production & release

6 Analysis

7 Findings released

8 Popularizing findings

9 Needs & gaps evaluation

[Diagram: life cycle stages 1 to 9; label: Access to Information]


Life cycle of statistical information

1 Program objective

2 Survey unit organized

3 Questionnaire & sample

4 Data collection

5 Data production & release

6 Analysis

7 Official findings released

8 Popularizing findings

9 Needs & gaps evaluation

[Diagram: life cycle stages 1 to 9; label: Preserving Information]


Life cycle applied to health statistics

1 Program objectives

increased emphasis on health promotion and disease prevention;

decentralization of accountability and decision-making;

shift from hospital to community-based services;

integration of agencies, programs and services; and

increased efficiency and effectiveness in service delivery.

[Diagram: life cycle stages 1 to 9; label: Health Information Roadmap Initiative]


Life cycle applied to health statistics

[Diagram: life cycle stages 1 to 9; label: Health Information Roadmap Initiative]

2 Survey unit organized

3 Questionnaire & sample

4 Data collection

5 Data production & release

6 Analysis

7 Official findings released


Reconstructing statistics

One way to see the relationship between statistics and the data from which they were derived is to reconstruct statistics that someone else has produced, using data that are publicly accessible.


Reconstructing statistics

[Diagram: life cycle stages 1 to 9; label: Health Information Roadmap Initiative]

1 Program objective

2 Survey unit organized

3 Questionnaire & sample

4 Data collection

5 Data production & release

6 Analysis

7 Official findings released

8 Popularizing findings

9 Needs & gaps evaluation


Reconstructing statistics

The statistics that we will reconstruct are reported in “Health Facts from the 1994 National Population Health Survey,” Canadian Social Trends, Spring 1996, pp. 24-27.

The steps we will follow are:
• identify the variables and cases in the article;
• identify the data source;
• locate the variables in the data documentation;
• find the original questions;
• retrieve the data; and
• run an analysis to reproduce the statistics.


The findings to be replicated

Page 26 of the article


Summary of variables identified
• Findings apply to Canadian adults: likely need age of respondents
• Men and women: look for the sex of respondents
• Type of drinkers: look for frequency of drinking or a variable categorizing types of drinkers
• Age: look for actual age or age in categories
• Smokers: look for smoking status
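If the file holds actual age rather than age in categories, the age groups in a published table have to be rebuilt by recoding. A minimal sketch follows, assuming an adult cut-off of 18 and three illustrative age bands; these cut points are hypothetical and would need to be replaced with the ones used in the article and the data documentation.

```python
ADULT_AGE = 18  # assumed cut-off for "adults"; check the article's definition

def age_group(age):
    """Map an age in years to a broad category (cut points are illustrative)."""
    if age < ADULT_AGE:
        return "under 18"
    if age < 45:
        return "18-44"
    if age < 65:
        return "45-64"
    return "65+"

# Hypothetical ages, invented for illustration only.
ages = [15, 18, 30, 44, 45, 64, 65, 80]

# Keep adults only, then recode each age into its category.
adults = [a for a in ages if a >= ADULT_AGE]
groups = [age_group(a) for a in adults]
print(list(zip(adults, groups)))
```

The choice of cut points is itself a definition: a different set of bands produces a different table from the same data.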


Identify the data source

Survey title is identified: National Population Health Survey, 1994-95

Public-use microdata file is announced

Page 25 of the article


Locate the variables

Examine the data documentation for the National Population Health Survey, 1994-95

• PDF version is on-line
• Use TOC and link to “Data Dictionary for Health”
• Identify the variables from their content
  NOTE: check how missing data were handled
• Trace the variables back to the questionnaire
• Did the sampling method require weighting cases?
  NOTE: in addition to the other variables, is a weight variable needed to adjust for the sampling method?
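To see why the two NOTEs matter, the sketch below computes a percentage of smokers with and without a weight variable, leaving out cases with a missing answer. The records and weights are invented for illustration and `None` stands in for whatever missing-data code the documentation describes; it is not the NPHS layout.

```python
# Hypothetical records: (smoking status, survey weight).
# None marks a missing answer; a real file would use a documented code.
records = [
    ("smoker", 1500.0),
    ("non-smoker", 500.0),
    ("non-smoker", 500.0),
    (None, 800.0),
    ("smoker", 500.0),
]

def percent_smokers(rows, weighted=True):
    """Percentage of smokers among cases with a valid answer."""
    # Drop missing answers; count each case as 1 when weights are not applied.
    valid = [(status, w if weighted else 1.0)
             for status, w in rows if status is not None]
    total = sum(w for _, w in valid)
    smokers = sum(w for status, w in valid if status == "smoker")
    return 100.0 * smokers / total

unweighted = percent_smokers(records, weighted=False)
weighted = percent_smokers(records, weighted=True)
print(round(unweighted, 1), round(weighted, 1))
```

With these made-up numbers the unweighted and weighted percentages differ noticeably, which is the kind of gap that appears when a published figure is reproduced without the weight variable.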


Retrieve and analyze the data

For universities subscribed to the Statistics Canada Data Liberation Initiative (DLI), the public use microdata from the NPHS can be downloaded without additional cost. See the Statistics Canada Online Catalogue for further cost details.

Make use of local data services to retrieve data from the NPHS.


Lessons from the NPHS example

This example demonstrates the distinction between creating statistics and interpreting statistics that have been created by others.

This is an important distinction because:
• Choices are made in creating statistics.
• Interpreting statistics requires an ability to understand the choices that were made.

Searching for statistics that others have created can be facilitated by understanding these points.


Statistics are about definitions


Statistics are about definitions


Statistics in the News

Newspaper small group activity

In groups of three, find one article in the paper you are given that makes use of statistics in telling its story. Once you have chosen an article, answer the following questions:
• What is the concept represented by the statistic or statistics in this story?
• Is a definition for this concept provided? If it is, what is it? Or is the definition implicit?
• Are the data from which this statistic was derived identified in the article?


Statistics are about definitions

Look at the Census definitions
• Definitions are in the Census Handbook and the Census Dictionary
• Search by Census Variable under Topic-Based Tabulations for value categorizations

Look at some standard classifications used in statistics

SIC, NAICS, NOC, Standard Classification of Goods (SCG), Standard Geographic Classification (SGC), Classification of Instructional Programs (CIP), ICD10