

Making Sense of Usage Statistics for Online Databases

Challenges, Lessons, and Strategies in a Statewide Context

Coalition for Networked Information Task Force Meeting, April 16, 2004, Washington, DC

William E. Moen <[email protected]>
School of Library and Information Sciences
Texas Center for Digital Knowledge
University of North Texas
Denton, TX 72603

Charles R. McClure <[email protected]>
John Carlo Bertot <[email protected]>
School of Library and Information Studies
Information Institute
Florida State University
Tallahassee, FL 32306

Moen, McClure, Bertot CNI Task Force Meeting--April 16, 2004--Washington, DC 2

Overview

• Why and what of usage statistics
• Library of Texas/TexShare experience
• Data issues
• The case of metasearch
• Interpretation issues
• Management issues
• Final comments


Why care about usage statistics

Major investment by libraries and consortia in licensed resources

Usage information as basis for:
  • Securing funding
  • Allocating funding
  • Choosing or deselecting resources
  • Understanding patterns of use

A non-intrusive method for viewing user behaviors
  • Log analysis for user behaviors

Metasearch applications change the nature of the usage and utilization of databases


What statistics to report

At an abstract level:

To what extent do users engage with a resource?
  • Sessions
  • Searches

To what extent do users utilize a resource?
  • Viewing result lists
  • Viewing full records of particular results
  • Downloading or viewing full text (if applicable)
  • Downloading data for further processing (if applicable)


The issues

Do database vendors provide usable usage data for libraries?

Do database vendors provide comparable data?

How do libraries deal with heterogeneous data?

How can analysis and reporting of heterogeneous usage data be automated?

How can data be best integrated into library decision making?


TexShare

A cooperative program to improve service to Texans. It maximizes the effectiveness of library expenditures by enabling libraries to:
  • Share staff expertise
  • Share library resources in print and electronic formats
  • Pursue joint purchasing agreements for information services
  • Encourage cooperative development and deployment of information resources and technologies

The TexShare database service
  • Statewide licensing of databases for users of academic, public libraries, and libraries of clinical medicine
  • Core databases
  • TexSelect databases


Library of Texas

Envisioned as a service-based virtual library that will enable Texans to search an extensive array of resources

The LOT initiative covers four basic components:
  • Indexing and preserving electronic government documents
  • Providing a statewide resource discovery service
  • Training librarians on electronic resources
  • Continuing to offer a wide selection of TexShare databases


Longitudinal usage analysis

Part of an evaluation project for LOT

Detailed analysis and reanalysis of vendor-supplied data (FY01-FY03)

Purposes:
  • Compile database usage and trends
  • Determine the extent to which analysis could be automated


The TexShare statistics challenge

• Different vendors
• Multiple databases per vendor
• Changing number of databases per reporting period
• Creating comparable usage data
    Sessions
    Searches
    Documents: Number of full-text downloads
• Verification of historical data


TexShare databases available

Vendor FY01 FY02 FY03

BIG CHALK 1 1

EBSCO 25 23 20

GALE 12 12 10

GROLIER 4 4

HANDBOOK OF TEXAS 1 1 1

NETLIBRARY 1 1 1

OCLC 11 10 10

PROQUEST 5 5 2

R.R. BOWKER 2 2

TETON DATA SYSTEM 1 1 1

TDNET 1 1

Total 63 65 47


Heterogeneous vendor data

EBSCO
  Reported elements: Logins, Searches, Abstract, Full Text Articles
  Transformation:
    Logins → Sessions
    Searches → Searches
    Full Text Articles → Documents

Gale
  Reported elements: Retrievals, Searches, Turnaways, Views, Sessions, Total connect time
  Transformation:
    Sessions → Sessions
    Searches → Searches
    Views → Documents


Heterogeneous vendor data

ProQuest
  Reported elements:
    Total visitors
    Total pageviews
    Total hits
    Total bytes transferred
    Average visitors per day
    Average pageviews per day
    Average hits per day
    Average bytes transferred per day
    Average pageviews per visitor
    Average hits per visitor
    Average bytes per visitor
    Average length of sessions
  Transformation:
    Total Visitors → Sessions
    “tx” directory tree Page Views → Searches *
    “index.html” directory Page Views → Documents **

* Searches = number of “tx” Page Views / 4

** Documents = number of “index.html” Page Views / 5
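The vendor transformations on these two slides can be sketched in code. What follows is a minimal Python sketch and not the project's actual tooling: the dictionary keys, the function and class names, and the sample numbers are assumptions made for illustration. Only the field mappings and the divisors 4 and 5 come from the slides.

```python
from dataclasses import dataclass


@dataclass
class NormalizedUsage:
    """Common measures used to compare vendors."""
    sessions: float
    searches: float
    documents: float


def normalize_ebsco(report):
    # Logins -> Sessions, Searches -> Searches, Full Text Articles -> Documents
    return NormalizedUsage(report["Logins"], report["Searches"],
                           report["Full Text Articles"])


def normalize_gale(report):
    # Sessions -> Sessions, Searches -> Searches, Views -> Documents
    return NormalizedUsage(report["Sessions"], report["Searches"],
                           report["Views"])


def normalize_proquest(report):
    # Total visitors -> Sessions; page views are scaled by the divisors
    # given on the slide (4 for "tx", 5 for "index.html").
    return NormalizedUsage(report["Total visitors"],
                           report["tx page views"] / 4,
                           report["index.html page views"] / 5)


# Hypothetical registry: one normalizer per vendor name.
NORMALIZERS = {
    "EBSCO": normalize_ebsco,
    "GALE": normalize_gale,
    "PROQUEST": normalize_proquest,
}


def normalize(vendor, report):
    """Convert one vendor report (a dict) into the common measures."""
    return NORMALIZERS[vendor](report)


# Invented sample report, for illustration only.
sample = {"Total visitors": 100, "tx page views": 400,
          "index.html page views": 50}
print(normalize("PROQUEST", sample))
```

Keeping one small function per vendor makes each vendor-specific assumption visible in a single place, which helps when a vendor changes its report format.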


Statistics available per vendor

Vendor              Sessions  Searches  Documents  Lib. Type
BIG CHALK           YES       YES       YES        YES
EBSCO               YES       YES       YES        YES
GALE                YES       YES       YES        YES
GROLIER             YES       NO        YES        NO
NETLIBRARY          NO        NO        YES        YES
OCLC                YES       YES       YES        YES *
PROQUEST            YES       YES       YES        NO
R.R. BOWKER         YES       YES       YES        YES
TETON DATA SYSTEM   YES       YES       YES        NO

* Statistics provided for public libraries only


Summary statistics

FY2002

            Academic     Public      Undifferentiated
Sessions    4,896,326    922,041     111,251
Searches    13,398,506   2,922,277   15,099*
Documents   8,864,893    1,441,872   373,371

FY2003

            Academic     Public      Undifferentiated
Sessions    6,125,215    1,549,281   269,143
Searches    16,932,824   3,317,900   179,005*
Documents   9,483,005    1,382,223   1,421,540

* One vendor does not provide search statistics
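One way to read a trend out of these summary figures is a year-over-year percent change. The short Python sketch below uses the academic and public session counts from this slide, assuming the first table is FY2002 and the second is FY2003 (the order in which the year labels appear). The function name and the dictionary layout are illustrative choices, not part of the original analysis.

```python
def percent_change(old, new):
    """Percent change from an earlier value to a later one."""
    return (new - old) / old * 100.0


# (FY2002, FY2003) session counts from the summary statistics slide.
sessions = {
    "Academic": (4_896_326, 6_125_215),
    "Public": (922_041, 1_549_281),
}

for library_type, (fy2002, fy2003) in sessions.items():
    change = percent_change(fy2002, fy2003)
    print(f"{library_type} sessions: {change:+.1f}% from FY2002 to FY2003")
```

The undifferentiated column is left out here because the set of vendors reporting into it may differ between years, which would make the comparison misleading.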


Summary stats per vendor

FY2003

Vendor (databases)   Sessions    Searches
EBSCO (27)           4,417,429   9,859,812
GALE (12)            1,418,651   6,240,454
GROLIER (4)          69,328      *
NETLIBRARY (1)       **          **
OCLC (10)            1,268,667   3,164,049
PROQUEST (5)         616,636     657,486
R.R. BOWKER (2)      112,366     435,339

* Grolier doesn’t provide data for searches

** NetLibrary doesn’t provide data for sessions and searches


Data issues

• Difficulties in obtaining complete data from vendors
• Differences among vendors in their definitions (or lack thereof) of key terms such as sessions, downloads, etc.
• Problems encountered by TSLAC in maintaining and collating accurate data from the various vendors
• Problems encountered when retrieving vendor data from earlier months and having that data differ from the data originally retrieved from the vendor site
• Difficulties in developing and integrating standardized reporting tools across different vendor data to summarize usage across the TexShare databases
• Difficulties in utilizing NISO Library Statistics Standard


Data issues

BOTTOM LINE: Better to have limited data and use them wisely than to have no data at all!

As research and development continue, we will have higher-quality data and better methods to ensure normalization across vendors.


Emerging community agreements

Working towards vendor data reporting standards

Standards efforts
  • NISO Z39.7 Library Statistics standard
  • ISO 2789 Library Statistics standard
  • Both focus on e-metric (data element) definitional standards

Library/consortia efforts
  • ICOLC (International Coalition of Library Consortia)
  • Guidelines for data elements and reporting

Publisher/vendor efforts
  • Project COUNTER (Counting Online Usage of Networked Electronic Services)
  • Audited data reporting of vendor usage statistics in standard format
  • Code of Practice to which vendors adhere and adherence is independently verified


Vendor usage E-metrics

From NISO:
  • Rejected Sessions (turnaways)
  • Commercial Services Sessions
  • Commercial Services Searches (queries)
  • Commercial Services Full-Content Units Examined
  • Commercial Services Descriptive Records Examined


Usage data in metasearch

Metasearch
  • Single search interface to multiple resources
  • Library of Texas Resource Discovery Service

Opportunity to define and control usage statistics in the metasearch application, e.g.:
  • Total Sessions
  • Total Searches
  • Total Searches per Target Resource
  • Total Full Text Downloads
  • Total Link Outs to Target Native Interfaces per Target Resource
  • Total Get It Requests
  • Total ILL Submitted

Interaction with metasearch and vendor statistics

What are user behaviors in the metasearch environment?


LOT Search Interface


RDS Analyzer


Interpretation issues

What constitutes significant use?

What measures point to impact?

Approaches:
  • Annual usage data could compare the statistics per year by library (1300 academic and public libraries)
  • Annual use data could compare the statistics per year per person (approximately 21.3 million residents)
  • Cost of the TexShare databases for each fiscal year could be used to determine the average cost per session (or download) for (1) all libraries or by academic or public library, and (2) on a per capita basis for residents of Texas
  • Assuming the ability to provide a “legal service population” for each academic and public library participating in TexShare, the use data can also be normalized on a per capita basis for each library, which could indicate possible variations in the use (and therefore the cost) for each library
  • Cost avoidance without regard to usage
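The per-library, per-capita, and cost-per-session approaches listed here reduce to simple ratios. The Python sketch below is one possible way to compute them. The library count (1300) and resident count (21.3 million) come from this slide, and the session total is the sum of the FY2003 session figures on the summary statistics slide. The annual cost is a hypothetical placeholder, since the slides give no cost figure, and the function name is an illustrative choice.

```python
def usage_indicators(total_sessions, libraries, residents, annual_cost=None):
    """Return simple normalized indicators for one fiscal year."""
    indicators = {
        "sessions_per_library": total_sessions / libraries,
        "sessions_per_resident": total_sessions / residents,
    }
    # Cost-based indicators are only computed when a cost is supplied.
    if annual_cost is not None:
        indicators["cost_per_session"] = annual_cost / total_sessions
        indicators["cost_per_resident"] = annual_cost / residents
    return indicators


# FY2003 sessions: academic + public + undifferentiated.
fy2003_sessions = 6_125_215 + 1_549_281 + 269_143

# HYPOTHETICAL cost, used only to show the calculation.
placeholder_cost = 1_000_000.0

result = usage_indicators(fy2003_sessions, libraries=1300,
                          residents=21_300_000,
                          annual_cost=placeholder_cost)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

The same function could be applied per library once a legal service population is available for each one, by passing that population as `residents` and setting `libraries` to 1.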


Management issues

Organizing for data management: Who has overall responsibility for data management and who has specific responsibility for:
  • Collecting the data
  • Designing and maintaining a management information system
  • Entering the data in a management information system
  • Verifying data with vendors
  • Analysis of data
  • Reporting data


Management issues (Cont’d)

What is the budget available to support data management of online usage statistics?

Who are the audiences that the usage statistics will be reported to?

Do staff have the necessary skills to collect, analyze, and report these data?

Are there analyses of usage data that should remain internal to the organization?


Parting Shots

Online database usage statistics are essential for ongoing evaluation of online database services

Basic usage statistics can be used to develop a range of performance measures and indicators for outcomes assessment

Vendors must pay attention to standards compliance in usage stats

The usage statistics can be used to justify or refine database purchases


YOU CAN DO THIS!

As libraries, consortia, and statewide digital libraries continue to grow, they will continue to rely on the provision of online databases.

Ongoing collection, analysis, and reporting of usage statistics is an essential component for collection development as well as overall library services planning.


E-Metrics Instructional System


References

The Library of Texas
  http://www.tsl.state.tx.us/lot/index.html

Library of Texas Resource Discovery Service
  http://www.libraryoftexas.org

ZLOT Project
  http://www.unt.edu/zlot

NISO Z39.7 Library Statistics standard
  http://www.niso.org/emetrics

Project COUNTER
  http://www.projectcounter.org

ICOLC (International Coalition of Library Consortia)
  http://www.library.yale.edu/consortia/2001webstats.htm

E-Metrics Instructional System
  http://www.ii.fsu.edu/emis