The Fedora Project
DLF Forum
Albuquerque, NM
November 17, 2003
Sandy Payette
Cornell Information Science
The Fedora Project
Fedora Digital Object Repository System
  Extensible digital object model
  Repository system exposed via Web service APIs
  Scalable, persistent storage for content and metadata
  Local and remote content
  Associate services with objects
  Content versioning
Fedora Use Cases
  Content Management (CMS)
  Digital Library architecture
  Digital Asset Management
  Institutional Repository
  Scholarly publishing
  Preservation
Open source software
Fedora History
Research (1997-present):
  DARPA and NSF-funded research project at Cornell
  Reference implementation developed at Cornell
First Application (1999-2001):
  University of Virginia digital library prototype
  Scale/stress testing for 10,000,000 objects
Open Source Software (2002-present):
  Andrew W. Mellon Foundation granted Virginia and Cornell $1 million to develop a production-quality Fedora system
  Fedora 1.0 released in May 2003
Fedora Motivations
Generic model to manage/access heterogeneous content
  Operations via the digital object abstraction; default disseminator
Extensibility
  Add new functionality to objects via service associations
Object lifecycle and preservation
  Content versioning and event history
Content repurposing
  Same content in different objects; dynamic transformations
Easy integration with other applications and systems
  Web services with open APIs
  Clear separation of server from clients/web user interfaces
  Does not assume any one workflow or end-user application
Digital Object Model
  Persistent ID (PID): digital object identifier
  Default Disseminator: service view; methods for disseminating content
  System Metadata: internal view; key metadata necessary to manage the object
  Datastream (item): content view; set of data and metadata items
Digital Object Model: Architectural View
[Diagram labels: Datastream (item), Datastream (item), Extension, Extension]
Digital Object Model: Simple Example
  PID = uva-lib:100
  System Metadata
  Datastreams: DC (xml), Thumbnail (jpeg), Image (mrsid)
  Default Disseminator: Get Profile, List Items, Get Item, List Methods, Get DC Record
  Image Disseminator: Get Thumbnail, Get Medium, Get High, Get Very High
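The simple example can be made concrete with a small in-memory model. This is a minimal sketch of the idea only, not Fedora's implementation: the class names, the datastream IDs (`DC`, `THUMBNAIL`, `IMAGE`), the MIME type strings, and the method names are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Datastream:
    """One content or metadata item held by a digital object (hypothetical)."""
    ds_id: str
    mime_type: str
    content: bytes = b""


@dataclass
class DigitalObject:
    """A digital object: a PID, a set of datastreams, and named disseminators."""
    pid: str
    datastreams: Dict[str, Datastream] = field(default_factory=dict)
    # disseminator name -> {method name -> function applied to this object}
    disseminators: Dict[str, Dict[str, Callable[["DigitalObject"], object]]] = field(
        default_factory=dict
    )

    def add_datastream(self, ds: Datastream) -> None:
        self.datastreams[ds.ds_id] = ds

    def list_items(self) -> List[str]:
        """Content view: the IDs of the items bundled in the object."""
        return sorted(self.datastreams)

    def list_methods(self) -> List[str]:
        """Service view: every method the object can be asked to run."""
        return sorted(
            f"{name}/{method}"
            for name, methods in self.disseminators.items()
            for method in methods
        )

    def disseminate(self, disseminator: str, method: str) -> object:
        """Run one service method against the object's content."""
        return self.disseminators[disseminator][method](self)


def build_example() -> DigitalObject:
    """Build an object shaped like the uva-lib:100 example (placeholder bytes)."""
    obj = DigitalObject(pid="uva-lib:100")
    obj.add_datastream(Datastream("DC", "text/xml", b"<dc/>"))
    obj.add_datastream(Datastream("THUMBNAIL", "image/jpeg", b"jpeg-bytes"))
    obj.add_datastream(Datastream("IMAGE", "image/x-mrsid", b"mrsid-bytes"))
    obj.disseminators["default"] = {
        "listItems": lambda o: o.list_items(),
        "getDCRecord": lambda o: o.datastreams["DC"].content,
    }
    obj.disseminators["image"] = {
        "getThumbnail": lambda o: o.datastreams["THUMBNAIL"].content,
    }
    return obj
```

A caller never touches the stored bytes directly; it asks the object for a dissemination, for example `build_example().disseminate("image", "getThumbnail")`. That indirection is what lets new services be attached to an object without changing its content.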
Some Common Use Cases
[Simple Image]
Image Manip + DC graph
[Scholarly Publication]
Document Transformation
Content Versioning
[Demo]
Repository System
software distribution
Fedora 1.2 Software Feature Set
Open Fedora APIs
  Repository as web services (REST and SOAP bindings); WSDL interface definitions
Flexible Digital Object Model
  Content View: objects as a bundle of items (content and metadata)
  Service View: objects as a set of service methods (“behaviors”)
  Extensible functionality by associating services with objects
Repository System
  Core Services: Management, Access/Search, OAI-PMH
  Storage: XML object store; relational db object cache; relational db object registry
  Mediation: auto-dispatching to distributed web services for content transformation
  Auto-Indexing: system metadata and DC record of each object
  HTTP Basic Authentication and Access Control
  Built-in disseminator services: XSLT transformation, image manipulation, XML-to-PDF
Content Versioning
  Automatic version control (saves a version of content/metadata when modified)
  Enables date-time stamped API requests (see an object as it looked at a point in time)
Clients
  Fedora Administrator: GUI client to create/maintain objects
  Default Web browser interface: search; access objects via default disseminator
  Command line utilities (batch load, ingest, purge, others)
  Migration Utility: mass export/ingest
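The content versioning feature, where a date-time stamped request shows an object as it looked at a point in time, can be sketched as a lookup over a time-ordered version list. This is an illustration of the idea only; `VersionedDatastream`, `modify`, and `as_of` are hypothetical names, not part of the Fedora APIs.

```python
from bisect import bisect_right
from datetime import datetime
from typing import List, Optional


class VersionedDatastream:
    """Keeps every saved version of an item, ordered by modification time."""

    def __init__(self) -> None:
        self._stamps: List[datetime] = []
        self._contents: List[bytes] = []

    def modify(self, content: bytes, when: datetime) -> None:
        """Save a new version instead of overwriting the previous one."""
        if self._stamps and when <= self._stamps[-1]:
            raise ValueError("versions must be added in increasing time order")
        self._stamps.append(when)
        self._contents.append(content)

    def as_of(self, when: Optional[datetime] = None) -> Optional[bytes]:
        """Return the version in effect at `when`, or the latest if omitted."""
        if not self._stamps:
            return None
        if when is None:
            return self._contents[-1]
        # Index of the first version saved strictly after `when`.
        idx = bisect_right(self._stamps, when)
        return self._contents[idx - 1] if idx else None
```

With versions saved on 2003-05-01 and 2003-11-01, a request stamped 2003-06-01 resolves to the first version, and a request with no date-time resolves to the second. A request stamped before the first save returns `None`, since the item did not yet exist.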
Fedora Repository Service Interfaces
Management Service (API-M)
  Ingest - XML-encoded object submission
  Create - interactive object creation via API requests
  Maintain - interactive object modification via API requests
  Validate - application of integrity rules to objects
  Identify - generate unique object identifiers
  Security - authentication and access control
  Preserve - automatic content versioning and audit trail
  Export - XML-encoded object formats
Access Service (API-A and API-A-LITE)
  Search - search repository for objects
  Object Reflection - what disseminations can the object provide?
  Object Dissemination - request a view of the object’s content
OAI-PMH Provider Service
  OAI-DC records
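A harvester talks to an OAI-PMH provider with plain HTTP requests that carry a `verb` plus arguments. The sketch below builds such request URLs using the standard OAI-PMH verbs and the `oai_dc` metadata prefix; the base URL and the record identifier are placeholders, since the slides do not give the repository's actual endpoint.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint; a real repository publishes its own base URL.
BASE_URL = "http://repository.example.org/oai"


def oai_request_url(base_url: str, verb: str, **arguments: str) -> str:
    """Build an OAI-PMH request URL from a verb and its arguments."""
    params = {"verb": verb}
    params.update({k: v for k, v in arguments.items() if v is not None})
    return f"{base_url}?{urlencode(params)}"


# Harvest all Dublin Core records the provider exposes.
list_records = oai_request_url(BASE_URL, "ListRecords", metadataPrefix="oai_dc")

# Fetch a single record; the identifier format here is an assumption.
get_record = oai_request_url(
    BASE_URL,
    "GetRecord",
    identifier="oai:example:uva-lib:100",
    metadataPrefix="oai_dc",
)
```

`urlencode` percent-encodes reserved characters such as the colons in the identifier, so the resulting URLs are safe to pass to any HTTP client.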
Client and Web Service Interactions
[Diagram labels: user, web browser, client application, server application, Fedora Service APIs, Fedora Repository System, External Service Dispatch, Content Transform Service (API)]
Fedora Mapping to OAIS
[Diagram: Fedora Repository System with ingest formats, archival format, and export formats; release labels R1.3 and R2.0 appear in the diagram]
Ingest Formats (SIPs): METS 1.2/FO, FOXML, METS 1.3, DIDL
Archival Format (AIP): FOXML
Export Formats (DIPs): METS 1.2/FO, FOXML, METS 1.3, DIDL
FOXML = Fedora Object XML
DIDL = Digital Item Description Language (MPEG-21)
Fedora Software Distribution Package
Open Source (Mozilla Public License)
100% Java (Sun Java J2SDK 1.4)
Supporting Technologies
  Apache Tomcat 4.1 and Apache Axis (SOAP)
  Xerces 2-2.0.2 for XML parsing and validation
  Saxon 6.5 for XSLT transformation
  Schematron 1.5 for validation
  MySQL and Mckoi relational databases
  Oracle 9i support
Deployment Platforms
  Windows 2000, NT, XP
  Solaris
  Linux
Fedora in Use
Projects using Fedora
University of Virginia: digital library (images, EAD, e-texts)
VTLS: basis for new commercial product (library system)
Indiana University: EVIA Digital Archive (video)
Northwestern: academic technologies (images, art, video, e-texts)
Rutgers University: digital library (e-journals, numeric data)
Tufts University: educational (VUE/concept maps); digital library
Yale University: Electronic Records Archive
New York University: Humanities Computing Group
Sampling of sites using/evaluating Fedora:
JSTOR
American Geophysical Union
NSDL at Cornell
Cornell Information Technologies
British Library
National Library of Portugal
Society of Biblical Literature
National Archives of Australia
Office of Defense Resources, Thailand
Monash University, Australia
Oxford Digital Library
Fedora Downloads since May 2003
Total downloads: 1427
Average downloads per day: 9
Number of countries: 32
Types of organizations:
  Universities: libraries, IT, departments
  Software and technology companies
  Defense/military
  Banks
  National libraries and archives
  Publishers
  Research labs
  Library automation vendors
  Scholarly societies
design solution
FEDORA is proving to be a flexible application development platform.
Developers may dedicate more time to building audience-specific DL and educational applications.
Content tools and digital resources are more easily shared among DL applications.
Fedora @ Tufts
Slide courtesy of David Kahle
design challenge
Create a visual tool to assist students and faculty in organizing and creating pathways through local files, digital library resources and WWW content.
content maps
Faculty may sketch out their course content, relationships and pathways through this content using a simple set of moveable objects or nodes.
[Diagram labels: container node, file node, web resource, relationship, notes]
OKI & FEDORA
Leveraging OKI technical standards will facilitate the sharing, distribution and integration of this new educational tool in educational systems beyond Tufts.
Images: behaviors by image type

Set      Simple        Zoom          Layers
Core     getThumbnail  getThumbnail  getThumbnail
         getCoverpage  getCoverpage  getCoverpage
Basic    getThumbnail  getThumbnail  getThumbnail
         getMedium     getMedium     getMedium
         getHigh       getHigh       getHigh
         getVeryHigh   getVeryHigh   getVeryHigh
Hi-Res                 getRegion     getRegion
                       getViewer     getViewer
Layered                              getRegion
                                     getViewer
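The image behavior table can be read as composition: each image type picks up one or more behavior sets, and its available methods are the union of those sets. The sketch below encodes that reading. The assignment of sets to types (Hi-Res for Zoom and Layers, Layered for Layers only) is an assumption inferred from how many columns each row fills, and `behaviors_for` is a hypothetical helper.

```python
from typing import Dict, List

# Method names as listed on the slide, grouped by behavior set.
BEHAVIOR_SETS: Dict[str, List[str]] = {
    "Core": ["getThumbnail", "getCoverpage"],
    "Basic": ["getThumbnail", "getMedium", "getHigh", "getVeryHigh"],
    "Hi-Res": ["getRegion", "getViewer"],
    "Layered": ["getRegion", "getViewer"],
}

# Assumed mapping of image types to the behavior sets they subscribe to.
IMAGE_TYPE_SETS: Dict[str, List[str]] = {
    "Simple": ["Core", "Basic"],
    "Zoom": ["Core", "Basic", "Hi-Res"],
    "Layers": ["Core", "Basic", "Hi-Res", "Layered"],
}


def behaviors_for(image_type: str) -> List[str]:
    """Return the distinct methods an image type offers, in first-seen order."""
    methods: List[str] = []
    for set_name in IMAGE_TYPE_SETS[image_type]:
        for method in BEHAVIOR_SETS[set_name]:
            if method not in methods:
                methods.append(method)
    return methods
```

Under these assumptions a Simple image offers only the thumbnail, cover page, and fixed-size methods, while a Zoom image adds `getRegion` and `getViewer` on top of them.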
Fedora @ Northwestern
Slide courtesy of Bill Parod
[images] [art]
Image dissemination with Flash zoom viewer
Text: behaviors by text type

Set   Newspaper       Book            Video transcript  Audio transcript
Core  getThumbnail    getThumbnail    getThumbnail      getThumbnail
      getCoverPage    getCoverPage    getCoverPage      getCoverPage
Text  getPreview      getPreview      getPreview        getPreview
      getTreeView     getTreeView     getTreeView       getTreeView
      getChunk        getChunk        getChunk          getChunk
      getChunks       getChunks       getChunks         getChunks
      getStaticView   getStaticView   getStaticView     getStaticView
      getDynamicView  getDynamicView  getDynamicView    getDynamicView
      getPrintable    getPrintable    getPrintable      getPrintable
      getMaster       getMaster       getMaster         getMaster
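Fedora @ Northwestern: Behaviors by Type
[Matrix of behavior sets (rows) against content types (columns); cell values not recoverable]
Content types: Image, Map, A/V, Book, News, EText
Behavior sets: Core, Image, Hi-Res, Layered, Geo, Time, Text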
UVa EAD Collections [Search] [Angelica]
UVa Images [image]
Future Software Releases
Fedora Object XML (FOXML)
  Internal storage format; direct expression of Fedora object model
  Better support for relationships (“kinship” metadata)
  Better support for audit trail (event history)
  Format identifiers for dynamic service binding
Shibboleth authentication
Policy Enforcement
  XACML expression language
  Fedora policy enforcement module
Web interface for easy content submission
Batch object modification utility
Administrative Reporting
Object Event History (ABC/RDF disseminations)
Better support for “collections”
New ingest and export formats (METS 1.3, DIDL)
December 2003 – December 2004
Future Development Proposals
Digital Library in a Box
  Full-featured DL application with “Fedora inside”
  Optimized for common set of content types
Fedora Power Server
  Integrity Management Tools
  Service and link liveness checker
  Fault Tolerance
  Mirroring and Replication
  Peer-to-peer interoperability features
  Repository clustering
  Load balancing
Object Creation Tools
  Workflow applications based on content models
  Web interface for document/content submission