Reference Architectures for Repositories and Preservation Archiving. Keith Rajecki, Education Solutions Architect, Sun Microsystems, Inc.

Page 1: Reference Architectures for Repositories and Preservation ...web.stanford.edu/group/dlss/pasig/PASIG_September2009_DC/GWU... · 1 Reference Architectures for Repositories and Preservation

1

Reference Architectures for Repositories and Preservation Archiving

Keith Rajecki
Education Solutions Architect
Sun Microsystems, Inc.

2

Agenda

• Challenges
• Solution Architectures
  > Open Storage/Open Archive
  > Cloud Computing
• Customer Success Stories
• Summary
• Next Steps

3

Challenges

4

Value of Reference Architectures

• Minimizes cost, complexity and deployment time
  > Lowers administrative costs via automated data management and migration across storage tiers
  > Cost-effectively matches the value of data with the appropriately priced media
  > Economical and power-friendly cost of operation
• Flexibility to build performance, economy or mixed archive repository
  > Infinitely scalable archive management
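
The idea of matching data to appropriately priced media through automated, policy-driven migration can be sketched roughly as follows. This is only an illustration and not how SAM-QFS is implemented: the tier names, idle-time thresholds, and relative costs are hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    max_idle_days: float  # objects idle longer than this belong on a cheaper tier
    cost_per_tb: float    # relative cost in arbitrary units (hypothetical)


# Hypothetical tiers, ordered from fastest and most expensive to cheapest.
TIERS = [
    Tier("fc_disk", 30, 10.0),
    Tier("sata_disk", 365, 3.0),
    Tier("tape", float("inf"), 1.0),
]


def select_tier(idle_days: float) -> Tier:
    """Return the first (most expensive) tier whose idle threshold covers the object."""
    for tier in TIERS:
        if idle_days <= tier.max_idle_days:
            return tier
    return TIERS[-1]


def plan_migrations(objects: dict) -> list:
    """objects maps object id -> (current tier name, idle days).

    Returns (object id, from tier, to tier) for each object on the wrong tier.
    """
    moves = []
    for obj_id, (current, idle_days) in sorted(objects.items()):
        target = select_tier(idle_days).name
        if target != current:
            moves.append((obj_id, current, target))
    return moves


if __name__ == "__main__":
    catalog = {
        "scan-001": ("fc_disk", 5),
        "scan-002": ("fc_disk", 90),
        "scan-003": ("sata_disk", 800),
    }
    for move in plan_migrations(catalog):
        print(move)
```

A real archive policy would typically also weigh file size, access frequency, and the number of copies required, not just idle time.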

5

Sun Reference Architectures
Develop Collaborative, Replicable Reference Architectures

• Fedora
• Fedora/Drupal (Islandora)
• DSpace
• EPrints
• DuraSpace (Cloud)
• Ex Libris Rosetta
• VTLS VITAL
• SAM/QFS
• Internet Archive in a Sun Modular Datacenter
• Tessella Safety Deposit Box*

6

Sun Open Storage Product Positioning

• Digital repositories for data and metadata storage
• Fedora, EPrints, and DSpace communities
• Ex Libris Rosetta and VTLS VITAL applications

• Large scale preservation projects needing policies

• Digital asset management
• eResearch databases

• Federated repositories

SAM/QFS

StorageTek 7210

Identity Management and SOA

StorageTek 7410

7

Virtualized Repository Appliance

Single Virtualized Server

Virtual Machine 1:
● Repository
● Entity Preservation
● Index Creation
● Metadata Management
● Security
● Search Engine

Virtual Machine 2:
● Archive DB
● Policies
● Metadata

Virtual Machine 3:
● Open Storage Mgmt
● Storage Preservation
● Physical Storage

Diagram labels: Virtualized Server; Physical Storage; Archive App.; Oracle, MySQL Repository; Solaris + ZFS; Digital Objects

8

Open Repository Tiered Components

Application Server
● Entity Preservation
● Metadata Management
● Relationship connections
● Security
● Search Engine
● Policy Driven

DB Server
● Digital Asset Policies
● Metadata

Digital Objects

Open Storage
● Storage Preservation Abstraction
● OpenSolaris, ZFS, SAM
● Physical Storage Components
● Media Migration

Tape Libraries
● Full Sun portfolio of supported tape libraries

User

9

QFS – Standalone File System

Diagram labels: Metadata; Data; FC or iSCSI LUNs; TCP/IP

10

SAM-QFS – Archiving File System

Diagram labels: Metadata; Data; FC or iSCSI LUNs; TCP/IP; Network-Attached Tape Library

11

SAM Archiving Configuration

Diagram labels: TCP/IP; Copy 1 to Copy 6; NFS File System or Appliance; Solaris; QFS; SAM-QFS

12

Sun’s Infinite Archive System Approach
Factory integrated, solution tested, simple, scalable, and economical

• Infinite Archive System (IAS)
• Library, DB, Surveillance, Web, Email, Unstructured
• IP Communications Protocol
• Intelligent, Policy-Based Automated Archive
  > IAS GUI, Storage Archive Manager SAM-QFS, Sun Cluster 3.2, Solaris
  > Models: Value and Midrange (X4200, T5220)
• Tier 1 + Tier 2 Disk IAS Options
  > 2500 series SAS
  > 2500 series SATA
• Scalable, Eco-Efficient Tape Tier Options
  > SL500, SL3000 and SL8500 Libraries
  > Encryption
  > Access & Capacity Drives

13

Single Node IAS Components

Sun Infinite Archive System

• X4540 Server
  > Disk Cache
  > Disk Archive
• Software
  > Standard: SAM-FS, QFS, Mgmt GUI, Solaris 10
  > Optional: Migrators, ID Manager
• Internal Tape (SL500/LTO4)
• Sun Libraries and Tape can be attached externally for unlimited scalability

14

Dual Node IAS Components

Sun Infinite Archive System

• T5220 Servers
• LAN
• Internal SAN
• FC Disk Cache
• SATA Archive
• Software
  > Standard: SAM-FS, QFS, Mgmt GUI, Solaris 10
  > Optional: Migrators, ID Manager
• Optional Internal Tape (SL500/LTO4)
• Sun Libraries and Tape can be attached externally for unlimited scalability

15

16

Cloud Computing

17

Emerging Cloud Deployment Patterns

• Test and Development
• Functional Offload (Batch Processes – TimesMachine)
• Functional Offload (Storage – SmugMug)
• Augmentation (Temporary Load – Animoto)
• Web Service

18

Storage Service

• What It Is
  > On-demand, API-based access to storage on the network
• Features
  > Ability to store and retrieve data as objects or files
  > REST API with open, AWS S3-like semantics for object storage
  > WebDAV for file storage
  > Fast and inexpensive cloning of objects and files
  > High availability
  > Detailed metering of storage used, I/O requests, bandwidth, etc.
• Customer Benefit
  > Scalable, highly available storage without big hardware investments
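
The features listed above (object put/get, inexpensive cloning, and metering) can be illustrated with a toy in-memory model. It does not reproduce the service's actual REST or WebDAV interface, which the slides do not specify; the class name, method names, and meter fields are hypothetical.

```python
class MeteredObjectStore:
    """Toy in-memory object store with bucket/key addressing (hypothetical API)."""

    def __init__(self):
        self._blobs = {}  # (bucket, key) -> bytes
        self.meter = {"requests": 0, "bytes_in": 0, "bytes_out": 0}

    def put(self, bucket: str, key: str, data: bytes) -> None:
        """Store an object and record the request and inbound bytes."""
        self.meter["requests"] += 1
        self.meter["bytes_in"] += len(data)
        self._blobs[(bucket, key)] = data

    def get(self, bucket: str, key: str) -> bytes:
        """Retrieve an object and record the request and outbound bytes."""
        self.meter["requests"] += 1
        data = self._blobs[(bucket, key)]
        self.meter["bytes_out"] += len(data)
        return data

    def clone(self, bucket: str, src_key: str, dst_key: str) -> None:
        """Clone by reference: bytes are immutable, so no data is copied."""
        self.meter["requests"] += 1
        self._blobs[(bucket, dst_key)] = self._blobs[(bucket, src_key)]

    def stored_bytes(self) -> int:
        """Physical bytes held, counting each shared buffer only once."""
        unique = {id(blob): len(blob) for blob in self._blobs.values()}
        return sum(unique.values())


if __name__ == "__main__":
    store = MeteredObjectStore()
    store.put("archive", "master.tif", b"image-bytes")
    store.clone("archive", "master.tif", "access-copy.tif")
    print(store.get("archive", "access-copy.tif"))
    print(store.meter, store.stored_bytes())
```

Cloning by reference is what makes the copy inexpensive in this sketch: the clone adds a key but no additional stored bytes.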

19

TACC: World’s Top Supercomputer

• The world’s largest computing system for open science research
• Sun Constellation Linux Cluster and Sun StorageTek Mass Storage Facility
• 579.4 Tflops peak performance

• Sun Data Center Switch 3456
  > Dual redundant
  > 110 TB/sec bisectional bandwidth

• Sun Fire X4600
  > 25 systems
  > 800 cores

• Sun Blade 6048
  > 3,936 blades
  > 15,744 CPUs
  > 62,976 cores
  > 125 TB RAM

• Sun Fire X4500
  > 72 systems
  > 1.7 petabytes
  > 64.8 GB/sec total bandwidth
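
As a quick consistency check on the figures above, the per-unit numbers can be derived with simple arithmetic. The sketch reads the CPU count as 15,744 and assumes decimal units (1 PB = 1,000 TB); the function and field names are illustrative only.

```python
def derive_tacc_ratios() -> dict:
    """Derive per-unit ratios from the slide's headline figures."""
    blades = 3936
    cpus = 15744           # CPU count as read from the slide
    cores = 62976
    peak_tflops = 579.4
    x4500_systems = 72
    x4500_capacity_pb = 1.7
    x4500_bandwidth_gb_s = 64.8

    return {
        "cpus_per_blade": cpus / blades,
        "cores_per_cpu": cores / cpus,
        "peak_gflops_per_core": peak_tflops * 1000 / cores,
        "tb_per_x4500": x4500_capacity_pb * 1000 / x4500_systems,
        "gb_s_per_x4500": x4500_bandwidth_gb_s / x4500_systems,
    }


if __name__ == "__main__":
    for name, value in derive_tacc_ratios().items():
        print(f"{name}: {value:.2f}")
```

The blade, CPU, and core counts are mutually consistent: four CPUs per blade and four cores per CPU.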

20

SAM-QFS Recipe for Success
Library of Congress

• Customer Challenges
  • Convert millions of recordings, video & film clips, and photos to digital form
  • Improve its ability to acquire and provide public access to audiovisual content
  • Archive for “The life of the republic”

• Solution
  • A robust storage area network based on Sun disk and tape storage technology
  • Sun SAM-QFS storage management software

• Business Results
  • Significantly increase the:
    > Rate at which it can acquire new content
    > Amount of content it can store
    > Time horizon for preserving content

21

Customer Success Story

• Digital Asset Management w/Open Text Artesia
  > Retrieve, reuse, repurpose content in real time
  > Implemented in 4 months
  > Improved production productivity 10%-40%

SAM/QFS

22

SAM-QFS Customer Snapshot: Healthcare/Life Sciences
Cleveland Clinic

• Customer Challenges
  • Archiving 10 TB of image data per week
  • Accessing 20 TB of image data per week on a read basis
  • Access diagnostic images from any site worldwide as soon as the exam is completed
  • Complete business continuity ... in the event of a disaster

• Solution
  • “SAM QFS is a key technology that lets us do some very critical things” - Robert Cecil, PhD, Cleveland Clinic’s network director

• Business Results
  • “Sun's approach with ... SAM-FS and QFS software is the core of our digital imaging storage strategy.”
  • “Data loss at the institution is so small, that it can’t be measured”
  • “... a tremendous advantage in terms of data recovery and data availability”

http://www.healthimaging.com/index.php?option=com_articles&view=article&id=8528
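
To put the weekly volumes above in perspective, they can be converted to average sustained throughput. This is a back-of-the-envelope sketch assuming decimal units (1 TB = 10^12 bytes) and an evenly spread load; real imaging traffic is bursty, so peak rates would be higher. The function name is illustrative.

```python
SECONDS_PER_WEEK = 7 * 24 * 3600  # 604,800 seconds


def sustained_mb_per_s(tb_per_week: float) -> float:
    """Average throughput in MB/s needed to move tb_per_week in one week."""
    return tb_per_week * 1e12 / SECONDS_PER_WEEK / 1e6


if __name__ == "__main__":
    print(f"archive: {sustained_mb_per_s(10):.1f} MB/s")
    print(f"read:    {sustained_mb_per_s(20):.1f} MB/s")
```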

23

SAM-QFS Customer Snapshot: Media, Entertainment
Major League Baseball

• Customer Challenges
  • Create a high-performance, scalable and available data center infrastructure
  • Deliver real-time and archived audio and video content
  • Develop an on-demand digital asset management system

• Solution
  • Two new data centers and a digital asset management system powered by Sun technology
  • Sun SAM and QFS software manage shared file and archiving capabilities

• Business Results
  • Over one billion minutes of streaming media
  • Over 2,000 full-length games
  • Over one billion visitors
  • 10 million page views per day
  • No application downtime in two years

http://www.sun.com/customers/service/mlbam.xml

24

SAM-QFS Customer Snapshot: Media, Entertainment & Internet Services
HBO

• Customer Challenges
  • Transition 5,000 hours of programming from videotape to digital storage
  • Reduce use of costly videotape and tape machine support costs
  • Enable cost-effective, secure and highly reliable delivery of digital programming to millions of subscribers

• Solution
  • Standardize on Sun and Grass Valley technology for a server-based storage and play-out system with 99.999 percent system availability
  • Sun QFS file-sharing software to provide the scalability and powerful performance required to meet HBO's demanding throughput goals

• Business Results
  • Near-seamless content delivery to broadcast and on-demand subscribers
  • Digital repository for SD and HD programming eliminated 80% of existing videotape equipment
  • Significant savings in both labor and maintenance costs

http://www.sun.com/customers/storage/hbo.xml
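
The 99.999 percent availability figure quoted above translates into a yearly downtime budget. The following is a small worked example assuming a 365-day year; the function name is illustrative.

```python
def downtime_minutes_per_year(availability_percent: float,
                              days_per_year: float = 365.0) -> float:
    """Maximum downtime in minutes per year allowed by an availability target."""
    minutes_per_year = days_per_year * 24 * 60
    return minutes_per_year * (1 - availability_percent / 100)


if __name__ == "__main__":
    print(f"{downtime_minutes_per_year(99.999):.2f} minutes per year")
```

At "five nines" the budget works out to a little over five minutes of downtime per year.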

25

SAM-QFS Customer Snapshot: Government
Federal Ministry of Finance, Germany

• Customer Challenges
  • Develop and implement automated tariff and local customs handling systems (ATLAS) for customs processing
  • Provide a secure, scalable, and highly available infrastructure with mirroring capabilities

• Solution
  • A new three-tier mirroring system architecture with Web browsers leveling the first tier
  • Data stored on Sun systems 10 km apart with Sun StorageTek 6540 arrays for disaster recovery
  • Sun QFS file sharing software

• Business Results
  • Customs clearance is now completed more accurately and much faster than before
  • Exceeded server expectations and delivered solution six months ahead of schedule
  • “We have a zero failure rate .....”

26

Customer Success Story
Digital Content Archiving

● Industry leader in video post-production
● Locations in US and EAME
● Digital Media Environment
  > Managing massive amounts of shared digital content
  > View, edit, store uncompressed data between global facilities
● Implemented tiered storage solution from Sun
  > SAM-QFS, 6540, X4500, SL8500, T10000
● Streamlined digital file-based workflows
  > Archiving content cost effectively
  > Generating new revenue streams

27

Internet Archive
Highest Density Integrated Storage Architecture

Key Requirements
• Build a server infrastructure to support massive amounts of data: 2 PB of storage, growing by 1 PB per year
• Provide an efficient, reliable, and scalable datacenter
• Keep space, energy, management and maintenance costs low
• Web snapshot 100 TB of data, approximately 4 billion Web pages
• Support up to 500 user queries per second

Sun Solution
• Sun Modular Datacenter S20
• Sun Fire X4500 Server
• Solaris 10 with ZFS
• Sun Remote Operations Management

Client Results
• Gained a reliable and flexible datacenter that supports multiple PB of storage
• Increased storage capacity of its servers
• Reduced space and energy needs for lower costs
• Superior data integrity to guard against data loss
• Rapid time to deployment – Sun MD unit delivered in less than 45 days
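
The sizing figures in the key requirements lend themselves to a short worked example. The sketch assumes decimal units (1 TB = 10^12 bytes) and strictly linear growth, which is a simplification; the function names are illustrative.

```python
def bytes_per_page(snapshot_tb: float, pages: float) -> float:
    """Average stored size of one Web page in bytes."""
    return snapshot_tb * 1e12 / pages


def projected_capacity_pb(initial_pb: float, growth_pb_per_year: float,
                          years: int) -> list:
    """Capacity needed at the start and at the end of each year, in PB."""
    return [initial_pb + growth_pb_per_year * y for y in range(years + 1)]


if __name__ == "__main__":
    # 100 TB across roughly 4 billion pages is about 25 KB per page.
    print(f"{bytes_per_page(100, 4e9) / 1000:.0f} KB per page")
    # 2 PB today, growing by 1 PB per year, over five years.
    print(projected_capacity_pb(2, 1, 5))
```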

28

Next Steps

• Sun Edu Essentials – On-going Discounts
  http://www.sun.com/solutions/landing/industry/education/edu_essentials.jsp

• Try & Buy – 60 days up to 40% off Sun Products
  http://www.sun.com/tryandbuy

• Open Archive Architecture Assessment

29

For More Information

• Storage Archive Manager
  http://www.sun.com/storagetek/management_software/data_management/sam/index.xml/

• Join the Sun Preservation and Archiving Community
  http://www.sun-pasig.org

• Join the OpenSolaris Storage community
  http://www.opensolaris.org/os/community/storage/

• Open Storage
  http://www.sun.com/openstorage

• Open Storage Servers
  http://www.sun.com/featured-articles/2008-0709/feature/index.jsp

30

Thank You

Keith Rajecki
Education Solutions Architect
Global Education & Research
Sun Microsystems, Inc.