5 steps to becoming a JISC IE content provider

a centre of expertise in digital information management
www.ukoln.ac.uk

Andy Powell

UKOLN, University of Bath

The problem space…

• from perspective of ‘data consumer’
  – need to interact with multiple collections of stuff - bibliographic, full-text, data, image, video, etc.
  – delivered through multiple Web sites
  – few cross-collection discovery services (with exception of big search engines like Google, but lots of stuff is not available to Google, i.e. it is part of the ‘invisible Web’)
• from perspective of ‘data provider’
  – few agreed mechanisms for disclosing availability of content

The problem(s)…

• portal problem
  – how to provide seamless discovery across multiple content providers
• appropriate-copy problem
  – how to provide access to the most appropriate copy of a resource (given access rights, preferences, cost, speed of delivery, etc.)

A solution…

• an information environment
• framework of machine-oriented services allowing the end-user to
  – discover, access, use, publish resources across a range of content providers
• move away from lots of stand-alone Web sites...
• ...towards more coherent whole
• remove need for user to interact with multiple content providers

JISC Information Env.

• discover
  – finding stuff across multiple content providers
• access
  – streamlining access to appropriate copy
• content providers expose metadata about their content for
  – searching
  – harvesting
  – alerting
• develop services that bring stuff together
  – portals (subject portals, media-specific portals, geospatial portals, institutional portals, VLEs, …)

Example scenarios

• Integration of local and remote information resources with a variety of 'discovery' services (for example the RDN subject portals, institutional and commercial portals and personal reference managers) allowing students, lecturers and researchers to find quality assured resources from a wide range of content providers including commercial content providers and those within the higher and further education community and elsewhere.
• Seamless linking from 'discovery' services to appropriate 'delivery' services.
• Integration of information resources and learning object repositories with Virtual Learning Environments (for example, allowing seamless, persistent links from a course reading list or other learning objects to the most appropriate copy of an information resource).
• Open access to e-print archives and other systems for managing the intellectual output of institutions.

A note about ‘portals’

• ‘portal’ word possibly slightly misleading
• presentation layer will contain lots of user-focused services…
  – subject portal
  – reading list and other tools in VLE
  – commercial ‘portals’ (ISI Web of Knowledge, ingenta, etc.)
  – library ‘portal’ (e.g. Zportal or MetaLib)
  – SFX service component
  – personal desktop reference manager (e.g. Endnote)

Technical summary

• Z39.50 (Bath Profile), OAI, RSS are key ‘discovery’ technologies...
  – … and by implication, XML and simple/unqualified Dublin Core
• portals provide ‘discovery’ services across multiple content providers…
• access to resources via OpenURL and resolvers where appropriate
• Z39.50 and OAI not mutually exclusive
• general need for all services to know what other services are available to them

JISC Information Env.

[Diagram: JISC Information Environment architecture. Content providers form the provision layer, brokers/aggregators the fusion layer, and portals (used by the end-user) the presentation layer. Shared services: authentication, authorisation, collection description, service description, resolver, institutional profile.]

What do you need to do?

• support machine oriented (m2m) interfaces to your content/services
• not contentious...
  – in line with ‘Web services’ approach
• 5 steps…
  1. expose your metadata
  2. share news and alerts
  3. become an OpenURL source
  4. become an OpenURL target
  5. use persistent URLs

Allow searching…

• support distributed searching of your content by remote services
• offer Bath Profile compliant Z39.50 target
• use Z39.50 to expose simple Dublin Core metadata about your content
• note possible use of SOAP (Simple Object Access Protocol) in the future
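As a rough illustration of what ‘simple Dublin Core metadata’ might look like in XML, here is a minimal Python sketch. It does not speak Z39.50; it only builds the kind of record a target could return. The wrapper element name `record`, the helper `simple_dc_record` and all field values are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def simple_dc_record(fields):
    """Build a simple (unqualified) Dublin Core record from (element, value) pairs."""
    record = ET.Element("record")  # wrapper element name is an assumption
    for element, value in fields:
        child = ET.SubElement(record, f"{{{DC_NS}}}{element}")
        child.text = value
    return record


# Hypothetical resource description, invented for illustration.
record = simple_dc_record([
    ("title", "An example resource"),
    ("creator", "A. N. Other"),
    ("subject", "Digital libraries"),
    ("identifier", "http://www.example.org/resources/42"),
])
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Elements may repeat (for instance several `subject` values), which is why the sketch takes a list of pairs rather than a dictionary.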

Allow harvesting…

• enable remote services to gather your metadata records

• offer Open Archives Initiative repository using the OAI Protocol for Metadata Harvesting

• use OAI-PMH to expose simple DC metadata about your content

• ‘as well as’ OR ‘instead of’ offering Z39.50 target
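To make the harvesting step more concrete, the following Python sketch shows how a remote service might form an OAI-PMH `ListRecords` request and pull identifiers and titles out of an `oai_dc` response. The base URL, the sample response and the helper names are assumptions for illustration; no network call is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build a ListRecords request URL for a repository's base URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # selective harvesting by datestamp
    return f"{base_url}?{urlencode(params)}"


def extract_titles(response_xml):
    """Return (identifier, first dc:title) pairs from a ListRecords response."""
    root = ET.fromstring(response_xml)
    results = []
    for rec in root.iter(f"{{{OAI_NS}}}record"):
        identifier = rec.findtext(f"{{{OAI_NS}}}header/{{{OAI_NS}}}identifier")
        title = next((e.text for e in rec.iter(f"{{{DC_NS}}}title")), None)
        results.append((identifier, title))
    return results


# Hand-written sample response, trimmed to the parts the sketch reads.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example resource</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()

print(list_records_url("http://www.example.org/oai"))
print(extract_titles(SAMPLE_RESPONSE))
```

A real harvester would also need to follow resumption tokens and handle deleted records, which this sketch leaves out.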

Share news/alerts using RSS

• offer machine-readable news and alerting channel(s)
• news/alerts might include
  – service announcements
  – list(s) of new resources
• RSS = RDF Site Summary
  – simple XML application
• use RSS in addition to existing email alerting
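A channel of this kind can be produced with very little code. The sketch below builds an RSS 1.0-style (RDF Site Summary) document in Python; the channel URL, titles and the `build_channel` helper are assumptions invented for the example, and a production feed would likely carry more fields (dates, logo, descriptions per item).

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RSS_NS = "http://purl.org/rss/1.0/"
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("", RSS_NS)  # RSS 1.0 as the default namespace


def rss(tag):
    return f"{{{RSS_NS}}}{tag}"


def rdf(tag):
    return f"{{{RDF_NS}}}{tag}"


def build_channel(channel_url, title, description, items):
    """Build a channel plus one item element per (title, url) pair."""
    root = ET.Element(rdf("RDF"))
    channel = ET.SubElement(root, rss("channel"), {rdf("about"): channel_url})
    ET.SubElement(channel, rss("title")).text = title
    ET.SubElement(channel, rss("link")).text = channel_url
    ET.SubElement(channel, rss("description")).text = description
    # The channel lists its items in an rdf:Seq; the items follow as siblings.
    seq = ET.SubElement(ET.SubElement(channel, rss("items")), rdf("Seq"))
    for item_title, item_url in items:
        ET.SubElement(seq, rdf("li"), {rdf("resource"): item_url})
        item = ET.SubElement(root, rss("item"), {rdf("about"): item_url})
        ET.SubElement(item, rss("title")).text = item_title
        ET.SubElement(item, rss("link")).text = item_url
    return root


# Hypothetical announcements, invented for illustration.
feed = build_channel(
    "http://www.example.org/news",
    "Example service news",
    "Service announcements and new resources",
    [
        ("New collection added", "http://www.example.org/news/1"),
        ("Scheduled downtime", "http://www.example.org/news/2"),
    ],
)
print(ET.tostring(feed, encoding="unicode"))
```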

Become an OpenURL source

• adopt ‘open’, ‘context-sensitive’ linking in the form of OpenURLs
• add OpenURLs into search results
  – e.g. SFX buttons next to each result
• support mechanism to associate preferred OpenURL resolver with each user
  – e.g. cookies or user-preferences database
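The two ingredients above (an OpenURL per result, and a per-user resolver) can be sketched in Python as follows. The key names (`sid`, `genre`, `issn`, `volume`, `spage`, and so on) follow the original key/value OpenURL syntax; the resolver addresses, the `sid` value, the preferences dictionary and the helper names are all assumptions for illustration.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical fallback resolver, used when a user has no stated preference.
DEFAULT_RESOLVER = "http://resolver.example.org/openurl"

# Stand-in for a cookie or user-preferences database (invented values).
USER_RESOLVERS = {"alice": "http://sfx.example.ac.uk/resolve"}


def resolver_for_user(user_id):
    """Look up the user's preferred resolver, falling back to the default."""
    return USER_RESOLVERS.get(user_id, DEFAULT_RESOLVER)


def openurl_for(citation, resolver_base, sid="example:catalogue"):
    """Build an OpenURL describing a journal article for the given resolver."""
    params = {"sid": sid, "genre": "article"}
    for key in ("issn", "date", "volume", "issue", "spage", "atitle", "aulast"):
        if citation.get(key):
            params[key] = citation[key]
    return f"{resolver_base}?{urlencode(params)}"


# Hypothetical search result, invented for illustration.
citation = {"issn": "1234-5679", "volume": "12", "spage": "34"}
link = openurl_for(citation, resolver_for_user("alice"))
print(link)
print(parse_qs(urlsplit(link).query))
```

The same citation yields a different link for a user with a different resolver, which is what makes the link ‘context-sensitive’.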

Become an OpenURL target

• allow links back into your services from OpenURL resolvers
• publicise your ‘link-to’ syntax, e.g.
  – ISSN-based URLs
  – DOI-based URLs
• support deep-linking direct to resources
  – direct to resource, OR
  – indirect via abstract page
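One possible shape for a ‘link-to’ syntax is sketched below in Python. The URL patterns (`/link/doi/<doi>` and `/link/issn/<issn>/<volume>/<start page>`), the holdings table and the function names are hypothetical, chosen only to show ISSN-based and DOI-based links resolving either to the resource itself or to its abstract page.

```python
# Hypothetical holdings, keyed by DOI; every value here is invented.
ARTICLES = {
    "10.9999/example.1": {
        "issn": "1234-5679", "volume": "12", "spage": "34",
        "abstract": "/abstract/12/34", "fulltext": "/fulltext/12/34.pdf",
    },
}


def find_by_doi(doi):
    return ARTICLES.get(doi)


def find_by_issn(issn, volume, spage):
    for article in ARTICLES.values():
        if (article["issn"], article["volume"], article["spage"]) == (issn, volume, spage):
            return article
    return None


def resolve_link(path, direct=True):
    """Map an incoming link-to path to a local path, or None if unknown.

    direct=True links straight to the resource; direct=False goes via
    the abstract page.
    """
    parts = path.strip("/").split("/")
    article = None
    if parts[:2] == ["link", "doi"] and len(parts) > 2:
        article = find_by_doi("/".join(parts[2:]))  # DOIs contain a slash
    elif parts[:2] == ["link", "issn"] and len(parts) == 5:
        article = find_by_issn(*parts[2:])
    if article is None:
        return None
    return article["fulltext"] if direct else article["abstract"]


print(resolve_link("/link/doi/10.9999/example.1"))
print(resolve_link("/link/issn/1234-5679/12/34", direct=False))
```

Whatever patterns are chosen, the point of the slide is that they are published, so that resolvers can construct them without screen-scraping.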

Use persistent IDs

• Z39.50, OAI-PMH and RSS expose your metadata to other services
• allow deep-linking from metadata (e.g. search results) to resource
• deep-linking URLs should be unique and persistent
  – possibly based on DOIs
• why? allows long-term use of URLs, e.g. in course reading list
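The reason persistence matters can be shown with a small Python sketch: the identifier held in a reading list never changes, while the location it resolves to can. The `PersistentLinks` class, the DOI-style identifier and the URLs are assumptions invented for the example, not a description of how DOIs are actually resolved.

```python
class PersistentLinks:
    """Minimal lookup table from persistent identifiers to current locations."""

    def __init__(self):
        self._locations = {}

    def register(self, persistent_id, location):
        # Registering an existing identifier again updates its location.
        self._locations[persistent_id] = location

    def resolve(self, persistent_id):
        return self._locations.get(persistent_id)


links = PersistentLinks()
links.register("doi:10.9999/example.1", "http://www.example.org/2002/paper1.html")

# A lecturer stores only the persistent identifier in a reading list.
reading_list_link = "doi:10.9999/example.1"

# Later the site is reorganised; only the lookup table changes.
links.register("doi:10.9999/example.1", "http://www.example.org/archive/paper1.html")

print(links.resolve(reading_list_link))
```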

Authentication issues

• how do I control access?
• same as currently - using Athens (or your own system)
  – user challenged on entry to ‘portal’
  – portal can determine some ‘search’ access rights from Athens, but you may need to ‘trust’ portal
  – ZBLSA portal/pub query mechanism
  – you retain final control at point of access

Branding vs. visibility

• will exposing metadata to external service lead to loss of branding?
• not really - expect external services to carry your branding as ‘quality stamp’
  – e.g. RSS channel carries your name, URL and logo
• following URL in search results leads direct to your site
  – so more visibility rather than less

Information flow...

• not just about a one-way flow of information - from ‘you’ to ‘us’
• also exposes content within the academic community
• for example...
  – RDN offers Z39.50 access to 60,000 resource descriptions (soon to offer SOAP interface)
  – you can integrate this into your ‘portals’

Common sense

• Z39.50, OAI and RSS based on metadata ‘fusion’ - merging metadata records from multiple content providers
• need shared understanding and metadata practice across DNER
• need to agree ‘cataloguing guidelines’ and terminology
• 4 key areas
  – subject classification - what is this resource about?
  – audience level - who is this resource aimed at?
  – resource type - what kind of resource is this?
  – certification - who has created this resource?
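A toy Python sketch of metadata ‘fusion’ helps show why shared practice matters. Records from two hypothetical providers are merged by identifier, and subject terms are lower-cased as a crude stand-in for an agreed vocabulary. The record layout, provider names and the `fuse` function are assumptions for illustration only.

```python
def fuse(record_sets):
    """Merge records from several providers, keyed on a shared identifier."""
    merged = {}
    for provider, records in record_sets.items():
        for rec in records:
            key = rec["identifier"]
            entry = merged.setdefault(key, {
                "identifier": key,
                "title": rec["title"],  # first title seen wins
                "subjects": set(),
                "providers": [],
            })
            # Lower-casing stands in for mapping to agreed terminology.
            entry["subjects"].update(s.strip().lower() for s in rec.get("subject", []))
            entry["providers"].append(provider)
    return list(merged.values())


# Hypothetical records from two providers, invented for illustration.
record_sets = {
    "provider-a": [
        {"identifier": "http://www.example.org/r/1", "title": "An example resource",
         "subject": ["Digital Libraries"]},
    ],
    "provider-b": [
        {"identifier": "http://www.example.org/r/1", "title": "An example resource",
         "subject": ["digital libraries", "Metadata"]},
        {"identifier": "http://www.example.org/r/2", "title": "Another resource",
         "subject": []},
    ],
}

fused = fuse(record_sets)
for entry in fused:
    print(entry["identifier"], sorted(entry["subjects"]), entry["providers"])
```

Without agreement on identifiers and subject terms, the two descriptions of the first resource would not merge cleanly, which is the practical argument for shared cataloguing guidelines.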

A shared problem space

• problems faced by end-users are shared across sectors and communities
  – student looking for information from variety of bibliographic sources
  – lecturer searching for e-learning resources from multiple repositories
  – researcher working across multiple data-sets and associated research publications
  – a.n.other looking to buy or sell a second-hand car

IMS Digital Repositories

[Diagram: IMS Digital Repositories functional architecture. Labels recoverable from the figure: layers Presentation, Mediation, Provision; Resource Utilizers (Learner, Creator, Infoseeker); People, Agent, Organizations, Traders; Repositories (Assets, Metadata); Registries and Directories (Vocabulary, Competency, Metadata); Access Management (manage rights obligations, control access, authenticate, authorise, audit); Procurement (negotiate, trade, make payment); functions SEARCH, DISCOVER, REQUEST, USE, RESOLVE, STORE, EXPOSE, MANAGE, DELIVER, ACCESS (Query, Browse, Follow Path), GATHER, PUBLISH, ALERT.]
