1 rosetta overview. 2 copyright statement all of the information and material inclusive of text,...

Rosetta Overview

Upload: nicholas-garrison

Post on 18-Dec-2015


Page 1: 1 Rosetta Overview. 2 Copyright Statement All of the information and material inclusive of text, images, logos, product names is either the property of,


Rosetta Overview


What is Rosetta?

Rosetta is a complete digital asset management and preservation solution that addresses the ever-growing need to collect, archive and preserve the digitally-born and digitized materials stored at academic institutions, research organizations, and government institutions, ensuring data integrity and access over time.


Agenda

1. The Need
2. The Challenges
3. Rosetta Solutions
4. Data Model
5. Who’s Using Rosetta and How


The Need


Need for Digital Preservation

Today’s world is digital. When a file cannot be opened, the likely reasons are:

1. Corrupted media
2. Missing rendering application
3. Unidentified file format


Need for Digital Preservation

All kinds of institutions must preserve and provide long-term access to information:

• Legal Documents
• Website Archives
• Medical Records
• Research Data
• Cultural Heritage
• Audiovisual
• Digitized Collections
• Museums


The Challenges


Challenges

Active preservation principles:

1) Ensuring bit integrity
2) Ensuring content health
   • Format viability
   • Complete metadata
   • Provenance
3) OAIS-compliant system


Challenge 1: Bit Integrity

• Fixity checks determine whether data has changed or become corrupted
• A basic feature found in asset management as well as preservation solutions
• Fixity does not guarantee data access, only that the data has not changed
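A fixity check of the kind described above can be sketched in a few lines. This is a minimal illustration, not Rosetta's implementation: it assumes SHA-256 as the checksum algorithm, and the function names `sha256_of` and `fixity_ok` are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fixity_ok(path: Path, recorded_digest: str) -> bool:
    """True if the file still matches the digest recorded at ingest."""
    return sha256_of(path) == recorded_digest


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "sample.bin"
        sample.write_bytes(b"preserved content")
        recorded = sha256_of(sample)        # digest stored at ingest time
        print(fixity_ok(sample, recorded))  # file unchanged
        sample.write_bytes(b"preserved c0ntent")
        print(fixity_ok(sample, recorded))  # file altered after ingest
```

A passing check only shows that the bytes are the same as at ingest; whether the file can still be rendered is a separate question.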


Challenge 2: Content Health

• Formats evolve rapidly and become obsolete
• File access requirements:
  • Positive identification of the format, e.g. PDF
  • Software application, e.g. Acrobat Reader
• Complete metadata:
  • Technical metadata (e.g. size, resolution, compression, etc.)
  • Descriptive metadata (e.g. author, title, publisher, etc.)
• Provenance metadata
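Positive format identification is often done by matching the leading bytes of a file against known signatures. The sketch below illustrates the idea with a tiny hand-written signature table; the table and the function name `identify_format` are assumptions for illustration and do not reproduce any format registry's actual signature data.

```python
from typing import Optional

# Illustrative signature table: (leading bytes, format name).
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"II*\x00", "TIFF"),           # little-endian TIFF
    (b"MM\x00*", "TIFF"),           # big-endian TIFF
    (b"\x89PNG\r\n\x1a\n", "PNG"),
]


def identify_format(header: bytes) -> Optional[str]:
    """Return a format name if the header starts with a known signature."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return None  # unidentified format


if __name__ == "__main__":
    print(identify_format(b"%PDF-1.7\n"))
    print(identify_format(b"plain text"))
```

A file whose header matches no signature stays unidentified, which is exactly the risk case the slide describes.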


Challenge 3: OAIS Compliant System – The Model


Rosetta Solutions


OAIS Compliant System – Rosetta


Rosetta Solutions - Key Features

• Scalable
• Open & Integrative
• Ready-to-use Configuration
• Community-Driven Knowledge Base
• Active Preservation
• Flexible Delivery


Rosetta Solutions – Community Knowledge Base

• Library of formats with metadata and extraction tools
• Based on the PRONOM global library
• Formats associated with applications and risks
• Supports integration with a global library
• Format library updated automatically with each software version


Rosetta Solutions - Active Preservation

• Manages the preservation planning process from risk to action
• Allows evaluation and comparison of alternatives
• Based on best practices and recommended workflows
• Community knowledge sharing

[Diagram labels: Identify, Evaluate, Execute; Operational Storage, Migration Action, Permanent Storage]


Rosetta Solutions - Scalable

• Proven scalable architecture capable of ingesting and processing millions of files/day
• Scale wide and dedicate servers to particular roles
• Flexible configuration to allow for growth
• Failures handled gracefully to minimize manual intervention


Rosetta Ingest Module – Manual Deposits


Rosetta Solutions - Open & Integrative

Rosetta integration points:

• Submission Apps
• ILS/CMS Systems
• Search Engines
• Plug-ins (validation, migration, enrichment, etc.)
• Storage Abstraction


Rosetta Solutions – Submission Applications

• Deposit workflows out of the box
  • Automated (FTP, NFS, etc.)
  • Manual
• An SDK (software development kit) with APIs allows building submission tools that interact with the Rosetta deposit module

[Diagram labels: Publisher (e.g. newspaper), Automatic Submission App, Rosetta]


Rosetta Solutions - ILS/CMS Systems

• Synchronization with ILS / CMS systems

• Interface uses integration standards such as SRU and OAI.

[Diagram label: Other ILS]


Rosetta Storage Abstraction

[Diagram: Rosetta, Storage Abstraction Layer, and one plugin for each storage system (NFS, NetApp, IBM)]

The Rosetta SDK allows creating plugins to interact with any storage system.
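The plugin idea can be illustrated with a small interface that every storage backend implements, so the rest of the system never talks to a storage product directly. Everything below is a hypothetical sketch: the names `StoragePlugin`, `InMemoryStorage` and `StorageLayer` are invented for illustration and are not the Rosetta SDK's API.

```python
from abc import ABC, abstractmethod
from typing import Dict


class StoragePlugin(ABC):
    """Hypothetical contract that each storage backend fulfils."""

    @abstractmethod
    def store(self, key: str, data: bytes) -> None:
        """Persist data under the given key."""

    @abstractmethod
    def retrieve(self, key: str) -> bytes:
        """Return the data previously stored under the key."""


class InMemoryStorage(StoragePlugin):
    """Toy backend standing in for a real storage system."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def store(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def retrieve(self, key: str) -> bytes:
        return self._blobs[key]


class StorageLayer:
    """Routes each call to the plugin registered under a backend name."""

    def __init__(self) -> None:
        self._plugins: Dict[str, StoragePlugin] = {}

    def register(self, name: str, plugin: StoragePlugin) -> None:
        self._plugins[name] = plugin

    def store(self, backend: str, key: str, data: bytes) -> None:
        self._plugins[backend].store(key, data)

    def retrieve(self, backend: str, key: str) -> bytes:
        return self._plugins[backend].retrieve(key)


if __name__ == "__main__":
    layer = StorageLayer()
    layer.register("memory", InMemoryStorage())
    layer.store("memory", "file-1", b"bytes to preserve")
    print(layer.retrieve("memory", "file-1"))
```

Adding support for another storage system then means writing one more `StoragePlugin` subclass and registering it, with no change to callers.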


Rosetta Solutions – Search Engines

• Publishing module allows information exchange with external systems

• Allows publishing different object groups in different formats

• Provides a set of APIs and an SDK for access
• OAI interface out of the box
• Search engine agnostic
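Assuming the OAI interface mentioned above follows OAI-PMH, a harvester requests records with a plain HTTP GET whose query string names a verb and a metadata prefix. The sketch below only builds such a URL; the base URL is a placeholder, not a real Rosetta endpoint, and the function name `list_records_url` is hypothetical.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(
    base_url: str,
    metadata_prefix: str = "oai_dc",
    set_spec: Optional[str] = None,
) -> str:
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec  # restrict harvesting to one set
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder endpoint used purely for illustration.
    print(list_records_url("https://repository.example.org/oai"))
```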


Data Model


PREMIS

• PREservation Metadata: Implementation Strategies
• International working group concerned with developing metadata for use in digital preservation
• Metadata for intellectual entities, events, agents and rights
• The data model consists of several entities:
  • Intellectual entity
  • Representation
  • File
  • Bit-stream


METS

• Ex Libris has a METS profile that will be published and open
• Each Intellectual Entity is one METS document
• Each representation is a file group
• The structure map is on the representation level
• Metadata is stored for all levels: descriptive metadata as DMD and preservation metadata as AMD
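The mapping in the bullets above (one METS document per Intellectual Entity, one file group and one structure map per representation) can be sketched with the standard library. This uses generic METS element names only; it is not the Ex Libris profile, the IDs and file names are invented, and the skeleton is not claimed to be schema-valid (the metadata sections are left empty).

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def q(tag: str) -> str:
    """Qualify a tag name with the METS namespace (Clark notation)."""
    return f"{{{METS_NS}}}{tag}"


def build_mets(representations: Dict[str, List[str]]) -> ET.Element:
    """Build one METS skeleton for one Intellectual Entity."""
    root = ET.Element(q("mets"))
    ET.SubElement(root, q("dmdSec"), ID="ie-dmd")  # descriptive metadata
    ET.SubElement(root, q("amdSec"), ID="ie-amd")  # preservation metadata
    file_sec = ET.SubElement(root, q("fileSec"))
    # One file group per representation.
    for rep_id, file_names in representations.items():
        group = ET.SubElement(file_sec, q("fileGrp"), ID=rep_id)
        for index, name in enumerate(file_names, start=1):
            file_el = ET.SubElement(group, q("file"), ID=f"{rep_id}-file{index}")
            ET.SubElement(
                file_el,
                q("FLocat"),
                {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": name},
            )
    # One structure map per representation, pointing at that group's files.
    for rep_id, file_names in representations.items():
        struct_map = ET.SubElement(root, q("structMap"), ID=f"{rep_id}-map")
        div = ET.SubElement(struct_map, q("div"), LABEL=rep_id)
        for index, _name in enumerate(file_names, start=1):
            ET.SubElement(div, q("fptr"), FILEID=f"{rep_id}-file{index}")
    return root


if __name__ == "__main__":
    # Hypothetical representation IDs and file names.
    mets = build_mets({"master": ["p1.tif", "p2.tif"], "access": ["book.pdf"]})
    print(ET.tostring(mets, encoding="unicode"))
```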


Data Model

Intellectual Entity: a coherent set of content that is reasonably described as a unit, for example, a particular book, map, photograph, or database.

(1 to N)

Representation: the set of files, including structural metadata, needed for a complete and reasonable rendition of an Intellectual Entity.

(1 to N)

File: a named and ordered sequence of bytes that is known by an operating system.

(1 to N)

Bit-Stream: data within a file that has meaningful common properties for preservation purposes.
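The four-level hierarchy and its one-to-many links translate directly into nested data classes. The sketch below is only an illustration of the structure described above; the class fields, the helper `count_files`, and the sample file names are assumptions, not part of PREMIS or Rosetta.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BitStream:
    description: str


@dataclass
class File:
    name: str
    bitstreams: List[BitStream] = field(default_factory=list)


@dataclass
class Representation:
    label: str
    files: List[File] = field(default_factory=list)


@dataclass
class IntellectualEntity:
    title: str
    representations: List[Representation] = field(default_factory=list)


def count_files(entity: IntellectualEntity) -> int:
    """Total number of files across all representations of an entity."""
    return sum(len(rep.files) for rep in entity.representations)


if __name__ == "__main__":
    # Hypothetical book with a master and an access representation.
    book = IntellectualEntity(
        title="Sample book",
        representations=[
            Representation("Master", [File("page1.tif"), File("page2.tif")]),
            Representation("Access Copy", [File("book.pdf")]),
        ],
    )
    print(count_files(book))
```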


Data Model Example - Book

[Diagram: one Intellectual Entity with three Representations (Master, Modified Master, Access Copy); file labels shown: TIFF ×3, JP2 ×3, JPG ×3, PDF]


Data Model Example - Image

[Diagram: one Intellectual Entity with three Representations (Master, Modified Master, Access Copy); file labels shown: TIFF, JP2, JPG]


Who’s Using Rosetta and How


Support for Digitization Projects

Bavarian State Library (BSB): current mass digitization projects

• Public-private partnership with Google
  • More than 1 million books (in less than 10 years), more than 300 million pages
• Books printed in the 16th century
  • 37,000 titles; 7,500,000 pages


Preserving and Managing Local Dissertations

Offering an additional, alternative platform for unpublished materials, for example ETH Bibliothek’s e-collection


Special Collections

Ex Libris Ltd., 2010 - Internal and Confidential


Dedicated Web Sites for Special Collections using Primo


Flexible Delivery Mechanism


Preserving Cultural Heritage Collections

National Library of New Zealand’s Royal Ballet Photos


Digitally-Born Collections (Websites)

Ensure the library stays relevant in the digital era

National Library of New Zealand Web Site Harvest


Selected Rosetta Customers

University (Zurich, Switzerland)
• Background and key areas of collaboration: leading technological institution; DataCite partners
• Collections in Rosetta: research data; special collections; dissertations

National Library (Wellington, New Zealand)
• Background and key areas of collaboration: development partner; mandate for digital preservation
• Collections in Rosetta: the nation’s cultural heritage; private collections; websites


Selected Rosetta Customers

University (Binghamton, NY, USA)
• Background and key areas of collaboration: part of the SUNY system; FTE: ~14K students; staff: 1.5 FTE (not dedicated)
• Collections in Rosetta: special collections (Edwin A. Link collection); born-digital newsletters; university photographs

State Library (Munich, Germany)
• Background and key areas of collaboration: service providers for Bavaria; part of the Google Books project
• Collections in Rosetta: scanned manuscripts and rare books; legal deposit documents; websites


Selected Rosetta Customers

Service Providers (Leuven, Belgium)
• Background: LIBIS service providers; replacing DigiTool; integrating with Aleph and Primo
• Collections in Rosetta: special collections; faculty papers; e-mails; video collections

National Archives (Wellington, New Zealand)
• Background: merged with the National Library; integrating Archway
• Collections in Rosetta: legal documents; archival collections; government papers


China Rosetta Test Server: rosetta.cceu.org.cn

http://rosetta.cceu.org.cn:1801/deposit


Thank You!