The Fedora Project DLF Forum Albuquerque, NM November 17, 2003 Sandy Payette Cornell Information Science

The Fedora Project

DLF Forum, Albuquerque, NM

November 17, 2003

Sandy Payette

Cornell Information Science

The Fedora Project

Fedora Digital Object Repository System
- Extensible digital object model
- Repository System exposed via Web service APIs
- Scalable, persistent storage for content and metadata
- Local and remote content
- Associate services with objects
- Content versioning

Fedora Use cases
- Content Management (CMS)
- Digital Library architecture
- Digital Asset Management
- Institutional Repository
- Scholarly publishing
- Preservation

Open source software

Fedora History

Research (1997-present):
- DARPA and NSF-funded research project at Cornell
- Reference implementation developed at Cornell

First Application (1999-2001):
- University of Virginia digital library prototype
- Scale/stress testing for 10,000,000 objects

Open Source Software (2002-present):
- Andrew W. Mellon Foundation granted Virginia and Cornell $1 million to develop a production-quality Fedora system
- Fedora 1.0 released in May 2003

Fedora Motivations

Generic model to manage/access heterogeneous content
- Operations via the digital object abstraction; default disseminator

Extensibility
- Add new functionality to objects via service associations

Object Lifecycle and preservation
- Content versioning and event history

Content repurposing
- Same content in different objects; dynamic transformations

Easy integration with other applications and systems
- Web services with open APIs
- Clear separation of server from clients/web user interfaces
- Does not assume any one workflow or end-user application

Digital Object Model

Digital Object Model: Architectural View

[Diagram: components of a digital object]
- Persistent ID (PID): digital object identifier
- Service view (Default Disseminator, Extension, Extension): methods for disseminating content
- Internal view (System Metadata): key metadata necessary to manage the object
- Content view (Datastream (item), three shown): set of data and metadata items

Digital Object Model: Simple Example

[Diagram: an example image object]
- PID = uva-lib:100
- Default Disseminator: Get Profile, List Items, Get Item, List Methods, Get DC Record
- Image Disseminator: Get Thumbnail, Get Medium, Get High, Get Very High
- System Metadata
- Datastreams: DC (xml), Thumbnail (jpeg), Image (mrsid)
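
The object model above can be pictured as a small data structure. The following is a minimal sketch, not Fedora's implementation: the class names, the `location` field, and the two sample methods are assumptions made for illustration, while the PID and datastream labels come from the slide.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Datastream:
    """One content or metadata item in the object's content view."""
    label: str
    fmt: str       # format label as shown on the slide, e.g. "jpeg"
    location: str  # hypothetical pointer to local or remote content


@dataclass
class Disseminator:
    """A named set of methods in the object's service view."""
    name: str
    methods: Dict[str, Callable[..., str]] = field(default_factory=dict)


@dataclass
class DigitalObject:
    pid: str
    datastreams: Dict[str, Datastream] = field(default_factory=dict)
    disseminators: Dict[str, Disseminator] = field(default_factory=dict)

    def list_methods(self) -> List[str]:
        """Reflect on the object: which disseminations can it provide?"""
        return sorted(
            f"{d.name}/{m}" for d in self.disseminators.values() for m in d.methods
        )

    def disseminate(self, disseminator: str, method: str) -> str:
        """Run one method of one disseminator against this object."""
        return self.disseminators[disseminator].methods[method](self)


def build_example() -> DigitalObject:
    """Build the slide's example object with two illustrative methods."""
    obj = DigitalObject(pid="uva-lib:100")
    for label, fmt in [("DC", "xml"), ("Thumbnail", "jpeg"), ("Image", "mrsid")]:
        # The location scheme is made up; real content may be local or remote.
        obj.datastreams[label] = Datastream(label, fmt, f"hypothetical://store/{label}")
    obj.disseminators["Default"] = Disseminator(
        "Default", {"ListItems": lambda o: ", ".join(sorted(o.datastreams))}
    )
    obj.disseminators["Image"] = Disseminator(
        "Image", {"GetThumbnail": lambda o: o.datastreams["Thumbnail"].location}
    )
    return obj
```

Calling `build_example().list_methods()` lists the methods of both disseminators, which mirrors the idea that clients work with an object through its service view rather than by reading datastreams directly.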

Some Common Use Cases

[Simple Image]

Image Manip + DC graph

[Scholarly Publication]

Document Transformation

Content Versioning

[Demo]

Repository System

software distribution

Fedora 1.2 Software Feature Set

Open Fedora APIs
- Repository as web services (REST and SOAP bindings); WSDL interface defs

Flexible Digital Object Model
- Content View: objects as bundle of items (content and metadata)
- Service View: objects as a set of service methods (“behaviors”)
- Extensible functionality by associating services with objects

Repository System
- Core Services: Management, Access/Search, OAI-PMH
- Storage: XML object store; relational db object cache; relational db object registry
- Mediation: auto-dispatching to distributed web services for content transformation
- Auto-Indexing: system metadata and DC record of each object
- HTTP Basic Authentication and Access Control
- Built-in disseminator services: XSLT x-form, image manipulation, xml-to-PDF

Content Versioning
- Automatic version control (saves version of content/metadata when modified)
- Enables date-time stamped API requests (see object as it looked at a point in time)

Clients
- Fedora Administrator: GUI client to create/maintain objects
- Default Web browser interface: search; access objects via default disseminator
- Command line utilities (batch load, ingest, purge, others)
- Migration Utility: mass export/ingest
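
The content versioning feature above (a version saved on every modification, and requests stamped with a date-time) can be illustrated with a small sketch. This is one simple way to model the behavior, assuming versions are kept in time order; the class and method names are hypothetical and do not reflect Fedora's storage layer.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class VersionedDatastream:
    """Keeps every saved version of one item, ordered by timestamp."""
    timestamps: List[datetime] = field(default_factory=list)
    contents: List[bytes] = field(default_factory=list)

    def modify(self, when: datetime, content: bytes) -> None:
        # A modification appends a new version; earlier versions are kept.
        if self.timestamps and when <= self.timestamps[-1]:
            raise ValueError("versions must be saved in increasing time order")
        self.timestamps.append(when)
        self.contents.append(content)

    def as_of(self, when: Optional[datetime] = None) -> Optional[bytes]:
        """Return the version in effect at `when` (the latest if omitted)."""
        if not self.timestamps:
            return None
        if when is None:
            return self.contents[-1]
        # Count the versions saved at or before `when`.
        index = bisect_right(self.timestamps, when)
        return self.contents[index - 1] if index else None
```

A request with no date-time returns the current version, while a request stamped before the first save returns nothing, since the item did not yet exist at that point.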

Fedora Repository Service Interfaces

Management Service (API-M)
- Ingest: XML-encoded object submission
- Create: interactive object creation via API requests
- Maintain: interactive object modification via API requests
- Validate: application of integrity rules to objects
- Identify: generate unique object identifiers
- Security: authentication and access control
- Preserve: automatic content versioning and audit trail
- Export: XML-encoded object formats

Access Service (API-A and API-A-LITE)
- Search: search repository for objects
- Object Reflection: what disseminations can the object provide?
- Object Dissemination: request a view of the object’s content

OAI-PMH Provider Service
- OAI-DC records
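
As an illustration of the OAI-PMH provider interface listed above, the sketch below builds harvest request URLs. The `verb` and `metadataPrefix` arguments and the `oai_dc` prefix come from the OAI-PMH protocol itself; the base URL is a made-up placeholder, not a documented Fedora endpoint.

```python
from urllib.parse import urlencode

# Hypothetical base URL; a real repository publishes its own.
BASE_URL = "http://repository.example.org/oai"


def oai_request(base_url: str, verb: str, **arguments: str) -> str:
    """Build an OAI-PMH GET request URL from a verb and its arguments."""
    query = urlencode({"verb": verb, **arguments})
    return f"{base_url}?{query}"


# Harvest the Dublin Core record of every object the provider exposes.
list_records_url = oai_request(BASE_URL, "ListRecords", metadataPrefix="oai_dc")

# Ask the provider to describe itself.
identify_url = oai_request(BASE_URL, "Identify")
```

A harvester would fetch `list_records_url` and parse the XML response; that step is omitted here because it depends on the harvester's own tooling.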

Client and Web Service Interactions

[Diagram labels]
- Fedora Repository System, exposing the Fedora Service APIs
- Callers of the Fedora Service APIs: web browser, client applications (two shown), server application, each with a user
- External Service Dispatch
- Content Transform Services (two shown), each with its own API

Fedora Mapping to OAIS

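[Diagram: the Fedora Repository System with ingest formats on one side and export formats on the other]
- Ingest Formats (SIPs): METS 1.2/FO, FOXML, METS 1.3, DIDL
- Archival Format (AIP): FOXML
- Export Formats (DIPs): METS 1.2/FO, FOXML, METS 1.3, DIDL
- Release labels shown on both the ingest and export sides: R1.3, R2.0

FOXML = Fedora Object XML
DIDL = Digital Item Description Language (MPEG-21)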

Fedora Software Distribution Package
- Open Source (Mozilla Public License)
- 100% Java (Sun Java J2SDK1.4)
- Supporting Technologies
  - Apache Tomcat 4.1 and Apache Axis (SOAP)
  - Xerces 2-2.0.2 for XML parsing and validation
  - Saxon 6.5 for XSLT transformation
  - Schematron 1.5 for validation
  - MySQL and Mckoi relational databases
  - Oracle 9i support
- Deployment Platforms
  - Windows 2000, NT, XP
  - Solaris
  - Linux

Fedora in Use

Projects using Fedora

University of Virginia: digital library (images, EAD, e-texts)

VTLS: basis for new commercial product (library system)

Indiana University: EVIA Digital Archive (video)

Northwestern: academic technologies (images, art, video, e-texts)

Rutgers University: digital library (e-journals, numeric data)

Tufts University: educational (VUE/concept maps); digital library

Yale University: Electronic Records Archive

New York University: Humanities Computing Group

Sampling of sites using/evaluating Fedora:

- JSTOR
- American Geophysical Union
- NSDL at Cornell
- Cornell Information Technologies
- British Library
- National Library of Portugal
- Society of Biblical Literature
- National Archives of Australia
- Office of Defense Resources, Thailand
- Monash University, Australia
- Oxford Digital Library

Fedora Downloads since May 2003

- Total downloads: 1427
- Average downloads per day: 9
- Number of countries: 32
- Types of orgs:
  - Universities: libraries, IT, departments
  - Software and technology companies
  - Defense/military
  - Banks
  - National libraries and archives
  - Publishers
  - Research labs
  - Library automation vendors
  - Scholarly societies

design solution

FEDORA is proving to be a flexible application development platform.

Developers may dedicate more time toward building audience specific DL and educational applications.

Content tools and digital resources are more easily shared among DL applications.

Fedora @ Tufts

Slide courtesy of David Kahle

design challenge

Create a visual tool to assist students and faculty in organizing and creating pathways through local files, digital library resources and WWW content.

Fedora @ Tufts

Slide courtesy of David Kahle

content maps

Faculty may sketch out their course content, relationships and pathways through this content using a simple set of moveable objects or nodes.

[Diagram labels: container node, file node, relationship, web resource, notes]

Fedora @ Tufts

Slide courtesy of David Kahle

OKI & FEDORA

Leveraging OKI technical standards will facilitate the sharing, distribution and integration of this new educational tool in educational systems beyond Tufts.

Fedora @ Tufts

Slide courtesy of David Kahle

Images

           Simple        Zoom          Layers
Core       getThumbnail  getThumbnail  getThumbnail
           getCoverpage  getCoverpage  getCoverpage
Basic      getThumbnail  getThumbnail  getThumbnail
           getMedium     getMedium     getMedium
           getHigh       getHigh       getHigh
           getVeryHigh   getVeryHigh   getVeryHigh
Hi-Res                   getRegion     getRegion
                         getViewer     getViewer
Layered                                getRegion
                                       getViewer

Fedora @ Northwestern

Slide courtesy of Bill Parod

[images] [art]
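
The image behaviors table can be read as behavior sets composed per image type. The sketch below encodes that reading; it assumes that the Hi-Res set applies to Zoom and Layers and the Layered set to Layers only, which is inferred from how many times each method appears in the table, and the dictionary and function names are hypothetical.

```python
from typing import Dict, List, Tuple

# Methods in each behavior set, as named on the slide.
BEHAVIOR_SETS: Dict[str, List[str]] = {
    "Core": ["getThumbnail", "getCoverpage"],
    "Basic": ["getThumbnail", "getMedium", "getHigh", "getVeryHigh"],
    "Hi-Res": ["getRegion", "getViewer"],
    "Layered": ["getRegion", "getViewer"],
}

# Assumed assignment of behavior sets to image types (see lead-in).
IMAGE_TYPES: Dict[str, List[str]] = {
    "Simple": ["Core", "Basic"],
    "Zoom": ["Core", "Basic", "Hi-Res"],
    "Layers": ["Core", "Basic", "Hi-Res", "Layered"],
}


def methods_for(image_type: str) -> List[Tuple[str, str]]:
    """List (behavior set, method) pairs available for an image type."""
    return [
        (set_name, method)
        for set_name in IMAGE_TYPES[image_type]
        for method in BEHAVIOR_SETS[set_name]
    ]
```

Under this reading, richer image types gain methods by taking on additional behavior sets, and the Core and Basic methods stay the same across all three types.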

Image dissemination

with Flash zoom viewer

Fedora @ Northwestern

Slide courtesy of Bill Parod

Text

        Newspaper         Book              Video transcript  Audio transcript
Core    getThumbnail      getThumbnail      getThumbnail      getThumbnail
        getCoverPage      getCoverPage      getCoverPage      getCoverPage
Text    getPreview        getPreview        getPreview        getPreview
        getTreeView       getTreeView       getTreeView       getTreeView
        getChunk          getChunk          getChunk          getChunk
        getChunks         getChunks         getChunks         getChunks
        getStaticView     getStaticView     getStaticView     getStaticView
        getDynamicView    getDynamicView    getDynamicView    getDynamicView
        getPrintable      getPrintable      getPrintable      getPrintable
        getMaster         getMaster         getMaster         getMaster

Fedora @ Northwestern

Slide courtesy of Bill Parod

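Behaviors by Type

[Matrix of object types against behavior sets]
- Object types (columns): Image, Map, A/V, Book, News, EText
- Behavior sets (rows): Core, Image, Hi-Res, Layered, Geo, Time, Text

Fedora @ Northwestern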

Slide courtesy of Bill Parod

UVa EAD Collections [Search] [Angelica]

UVa Images [image]

Future Software Releases

- Fedora Object XML (FOXML)
  - Internal storage format; direct expression of Fedora object model
  - Better support for relationships (“kinship” metadata)
  - Better support for audit trail (event history)
  - Format identifiers for dynamic service binding
- Shibboleth authentication
- Policy Enforcement
  - XACML expression language
  - Fedora policy enforcement module
- Web interface for easy content submission
- Batch object modification utility
- Administrative Reporting
- Object Event History (ABC/RDF disseminations)
- Better support for “collections”
- New ingest and export formats (METS 1.3, DIDL)

December 2003 – December 2004

Future Development Proposals

Digital Library in a Box
- Full-featured DL application with “Fedora inside”
- Optimized for common set of content types

Fedora Power Server
- Integrity Management Tools
- Service and link liveness checker
- Fault Tolerance
- Mirroring and Replication
- Peer-to-peer interoperability features
- Repository clustering
- Load balancing

Object Creation Tools
- Workflow applications based on content models
- Web interface for document/content submission

www.fedora.info

Release 1.2 on December 10, 2003

Questions