Michel Séguin, DLI Chief, December 2006: The Need to Liberate The Data


Pages 1-2: The Need to Liberate The Data

Michel Séguin

DLI Chief

December 2006

Page 3: The Need to Liberate The Data

Historically, Stats Canada made its published data available to the public through the DSP

These were regular paper publications and did not include electronic numeric files (i.e. public use microdata files)

Data files were available to researchers at marginal cost

Custom tables were another, more costly, way to access unpublished data

Page 4: The Need to Liberate The Data

In the 1980s, federal budget cuts led Stats Canada to put more emphasis on cost recovery

In the early 1990s, the cost of public use microdata files increased dramatically

This put most data files out of reach for the majority of academic researchers and students

Page 5: The Need to Liberate The Data

A consortium of universities had been created to gain access to 1986 Census data

This idea was well received by STC and led to a movement within the academic community to liberate the rest of STC’s electronic data files

A 1991 paper, “Liberating the Data: Proposal for a Proposal”, led to a working group to investigate this idea further

The group was made up of representatives from universities, SSFC, CARL and CAPDU, as well as STC and DSP

Page 6: The Need to Liberate The Data

Champions within both the academic community and Statistics Canada came forward to push this idea

Informal approval was received in 1995

This was followed by the creation of:
An internal STC Steering Committee
A Project Team
An External Advisory Committee

Page 7: The Need to Liberate The Data

A Licence Agreement was drafted and approved

Author divisions were asked to provide their data to the Initiative

Institutions were invited to join the Initiative

Other government agencies became involved, and formal approval for a 5-year pilot was received from Treasury Board in early 1996

Page 8: The Need to Liberate The Data

Use of the Internet as a dissemination tool seen as a key component of the initiative

Established mechanisms for communications, storage, finding and ordering data

Created an FTP Site at STC

DLILIST - a forum for questions and sharing of information

WWW DLI ORDER DESK - for placing orders for products not on the FTP site

Began disseminating files in 1996

Page 9: The Need to Liberate The Data

Before DLI, about 15 institutions offered a data service

Co-operative training of members was therefore seen as extremely important, given members’ varying degrees of experience

Established a training committee and began to develop a curriculum, identify trainers and establish budgets

Regional training workshops started in 1997

Page 10: The Need to Liberate The Data

Training workshops have been given in each region on an annual basis since then

One suggestion was to have another Orientation session for new members who missed the one in 1997

This workshop and this special Orientation session are part of continuing co-operative training

Page 11: The Need to Liberate The Data

In 1996 there were 50 post-secondary members

In 1998 there were 61

Today there are 70 members

There are over 19,000 files in the DLI collection, including data files, documentation, CDs, etc.

The collection can now be accessed via the DLI Web Site as well as FTP

Page 12: The Need to Liberate The Data

The DLI is now a permanent program at Stats Canada, located within the Library and Information Centre

Today’s graduates have had the opportunity to use Canadian data throughout their studies

The DLI has been described as one of the most important developments in the social sciences in Canada in the past 50 years!

Page 13: What is The Data Liberation Initiative?

Page 14:

The Products

The Licence

The Service

The Community

Page 15: The Products

DLI provides access to Stats Canada data produced as standard electronic products available to the public

These data are digitally encoded and stored in a file structure

These include:
Microdata files
Geography files
Databases
Aggregate data in table format

Page 16: The Products

Main focus of the DLI Collection is socio-economic data:
Health
Education, Literacy
Labour Market, Income
Travel
Justice
Census, Demographic
Etc.

Page 17: The Products

Not usually produced as a standard electronic product for public dissemination

DLI includes some business products such as:
Trade data
Financial Performance Indicators CD
Inter-Corporate Ownership
Fleet Report
Survey of Manufacturing

Page 18: The Products

Standard Electronic Product

An “off the shelf” electronic product available to the public

Not included are standard publications available in electronic form, as these are usually part of DSP

Registered in the STC Catalogue of Products and Services and has a Product Number

Page 19: The Products

Metadata provided in both Official Languages whenever available

New data products continually being added to the Collection

Includes:
Updated data from regular on-going surveys
Data from ad-hoc special surveys - one time only
Data from new surveys in the STC program

Page 20: The Products

Updates may be provided in a different format than the earlier version:
For example PUMF Beyond 20/20

As new versions are received, the DLI has to decide whether to replace the existing data or add to the Collection

Over 19,000 files in the Collection including:
Data files
Metadata & Readme files
Census & Geography
CDs

Page 21: The Products

Not all products in the DLI Collection are standard electronic products

Have some “special” products just for DLI which contain non-public data:

KLEMS database
An experimental database of productivity data

Justice Statistics
Complete set of Beyond 20/20 tables normally only available to members of CCJS Initiative

Page 22: The Licence

DLI is open to all accredited Post Secondary Institutions in Canada

Data made available on a subscription basis

All member institutions must sign a Licence Agreement

Data made available to Educators, Students and Other Staff while they have such status at the Institution
E.g. a student who goes to the USA to do a Master’s no longer has access to data

Page 23: The Licence

Data is made available for:
Academic Research and Publishing
Teaching
Planning of academic/educational services

Use of data in textbooks falls under a different set of STC licences and permissions

Data not to be used in any commercial or private activities (even if no $$ involved)

DLI Contact responsible for ensuring eligible use of data

Page 24: The Licence

Important elements of the Licence Agreement:

Data & products offered “as is”

STC remains owner of intellectual property - only access to data is provided

Users must not link data or otherwise try to identify individual respondents

DLI Contact to implement data security measures

May have users sign before allowing access

Page 25: The Services

DLI was conceived as an Internet-based means of dissemination - Internet the main mode of data transfer and communications

DLI Team offers both an FTP and a Web-based service for access to the Collection

DLILIST - forum for making enquiries, sharing information and general communication among members

DLIORDER & WWW DLI ORDER DESK - processes to order hard copy versions of products not available electronically

Page 26: The Community

There are a number of advantages to belonging to DLI:

The DLI provides the academic community with “one stop shopping” for STC products

Provides a forum for sharing information and obtaining advice

Value added to basic STC products (e.g. SPSS)

Participation in training workshops also a great “community builder”