AWS Webcast - Library Storage Webinar
Library Workloads on Amazon Web Services
July 14th, 2014
Sri Elaprolu
Manager, Solutions Architecture
Agenda
• AWS Overview
• Library Workloads and Customer Use Cases
• Amazon Storage Services
• Solution Overview
• Why AWS for Storage and Archiving?
• Q&A
Amazon Web Services
• Compute
• Storage
• Database
• App Services
• Deployment & Administration
• Networking
• AWS Global Infrastructure
AWS Global Infrastructure
• 10 regions
• 26 availability zones
• 51 edge locations
You decide where your apps and data go!
AWS Global Infrastructure
[Figure: US-EAST-1 Region with Availability Zones (AZs) A, B, C, and D]
Note: Conceptual drawing only. The number of Availability Zones may vary.
Certifications and Accreditations
Why are customers moving to AWS Cloud?
• Variable expense: Replace capital expenditure with variable expense
• Economies of scale: Lower variable expense than companies can achieve themselves
• Elastic capacity: No need to guess capacity requirements and over-provision
• Speed and agility: Infrastructure in minutes, not weeks
• Focus on mission: Not undifferentiated heavy IT lifting
• Global reach: Go global in minutes and reach a global audience
AWS Differentiators
• Experience: 8+ years supporting hundreds of thousands of customers across 190 countries
• Innovation: Rapid delivery of new services and features based on customer feedback
• Robust platform: A broad set of services and features to support virtually every use case imaginable
• Simple pricing philosophy: 44 price reductions; expect more reductions in the future
• Global footprint: 10 Regions, 26 Availability Zones, 51 Edge Locations
• Ecosystem: 3,000 ISVs and 7,000 SIs; 1,500 apps in Marketplace
DSpace
Open Journal Systems
Open Conference Systems
Thesis and Dissertation Systems
Web Properties – WordPress
DuraCloud Preservation System
• Consortium of higher education institutions in Texas that has provided shared digital library services since 2005
• The mission of the Texas Digital Library (TDL) is to enable each of its member libraries to advance a program of digital initiatives in support of research, scholarship, and learning.
Library Use Cases
• Online Public Access Catalogs: Library Catalogs, Online Databases
• Institutional Repositories: Online Archive, Intellectual Output
• Digital Asset Storage: Protect from Loss and Degradation, Offsite Storage, Redundancy and Durability
• Backups: Offsite, Redundant
• Development Space: Disposable Environments, Start and Stop Frequently
Amazon Storage Services
Data Ingestion Options
• AWS Direct Connect: Dedicated bandwidth between your site and AWS
• Internet: Transfer data in a secure SSL tunnel over the public Internet
• AWS Import/Export: Physical transfer of media into and out of AWS
AWS Ingest Options: Internet
One Common Theme: Parallel Uploads
1. Multipart upload
2. Request rate optimization
3. TCP window scaling
4. TCP selective acknowledgement
AWS has customers that ingest roughly 1 PB per day
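As a rough illustration of the multipart upload idea, the sketch below splits an object of a given size into numbered parts and hands them to a pool of worker threads. The names plan_parts and upload_in_parallel, and the upload_part callback, are hypothetical stand-ins for whatever S3 client is actually used. The 5 MiB minimum part size and 10,000-part limit are S3's documented multipart limits, not figures from this webinar.

```python
from concurrent.futures import ThreadPoolExecutor

MIN_PART_SIZE = 5 * 1024 * 1024   # S3 minimum part size (all parts except the last)
MAX_PARTS = 10_000                # S3 maximum number of parts per upload


def plan_parts(total_size, part_size=8 * 1024 * 1024):
    """Return a list of (part_number, offset, length) tuples covering the object."""
    if total_size <= 0:
        raise ValueError("total_size must be positive")
    if part_size < MIN_PART_SIZE:
        raise ValueError("part_size is below the S3 minimum of 5 MiB")
    parts = []
    offset = 0
    number = 1  # S3 part numbers start at 1
    while offset < total_size:
        length = min(part_size, total_size - offset)
        parts.append((number, offset, length))
        offset += length
        number += 1
    if len(parts) > MAX_PARTS:
        raise ValueError("too many parts; choose a larger part_size")
    return parts


def upload_in_parallel(parts, upload_part, workers=4):
    """Call upload_part(part_number, offset, length) for every part concurrently.

    upload_part is a hypothetical callback supplied by the caller. Results come
    back in part-number order, which is the order a completion request needs.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda p: upload_part(*p), parts))
```

In practice the callback would read the byte range from disk and send it as one part. Because the parts are independent, a failed part can be retried without resending the whole object.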
AWS Ingest Options: AWS Direct Connect
• Private connectivity to AWS
  – Physical connection
  – 1 Gbps or 10 Gbps port
• Consistent network performance
• Consider burst models on ingest
• Reduces costs for bandwidth-heavy outbound workloads
US Locations
• CoreSite 32 Avenue of the Americas, NY
• CoreSite One Wilshire & 900 North Alameda, LA
• Equinix DC1 – DC6 & DC10 - DC11, Ashburn, VA
• Equinix SV1 & SV5, San Jose, CA
• Equinix SE2 & SE3, Seattle, WA
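To get a feel for what the 1 Gbps and 10 Gbps port speeds mean for ingest, the sketch below estimates the transfer time for a data set. It is back-of-the-envelope arithmetic only. The function name transfer_hours and the utilization parameter are illustrative assumptions, and it uses decimal units (1 TB = 10^12 bytes, 1 Gbps = 10^9 bits per second).

```python
def transfer_hours(size_tb, link_gbps, utilization=1.0):
    """Hours needed to move size_tb terabytes over a link of link_gbps gigabits/s.

    utilization is the assumed fraction of the link actually available
    to the transfer (1.0 means the whole port, which is optimistic).
    """
    if link_gbps <= 0:
        raise ValueError("link_gbps must be positive")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    bits = size_tb * 10**12 * 8                      # decimal terabytes to bits
    seconds = bits / (link_gbps * 10**9 * utilization)
    return seconds / 3600


# 10 TB over a fully used 1 Gbps port takes roughly 22 hours;
# the same data over a 10 Gbps port takes a little over 2 hours.
print(round(transfer_hours(10, 1), 1), round(transfer_hours(10, 10), 1))
```

An estimate like this can help decide between a network transfer and shipping media for an initial migration.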
AWS Ingest Options: AWS Import/Export
• Rapidly move data into and out of AWS
• Portable storage device shipment to AWS
• Supports
  – Amazon EBS
  – Amazon S3
  – Amazon Glacier
• Use cases
  – Initial data migration
  – Content distribution via portable devices
  – Disaster recovery
AWS Storage and Archive Options
• Amazon Simple Storage Service (S3): Highly scalable object storage
  – 1 byte to 5 TB in size
  – 99.999999999% durability
• Amazon Elastic Block Store (EBS): High-performance block storage device
  – 1 GB to 1 TB in size
  – Mount as drives to instances with snapshot/cloning functionalities
  – Magnetic and General Purpose SSD
• Amazon Glacier: Long-term object archive
  – Extremely low cost per gigabyte
  – 99.999999999% durability
AWS Storage and Archive Options: Amazon Elastic Block Store (EBS)
• High I/O block storage for Amazon EC2
• Point-in-time snapshots to Amazon S3
• 99.999999999% durability for snapshots stored in Amazon S3
• Snapshot software is FREE
• Point-in-time snapshots across regions
AWS Storage and Archive Options: Amazon Simple Storage Service (S3)
• Durable and low cost
• Unlimited number of objects and volume
• Back up to Amazon S3 buckets via HTTP/HTTPS
  – Create scripts using PowerShell, Perl, Python…
  – Numerous solutions for data backup
• Authentication mechanisms ensure data is kept secure
• Reduced redundancy storage (RRS) option
• Time: Instant access, anytime, anywhere
• Money: Pay for what you store
• Effort: Scales as you grow
• Quality: 99.999999999% durability
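To make the eleven-nines durability figure more concrete, the sketch below turns it into an expected number of objects lost per year. It assumes the figure can be read as the probability that a single object survives one year, which is a simplifying interpretation. The function name expected_annual_losses is illustrative.

```python
from decimal import Decimal

DURABILITY = Decimal("0.99999999999")  # eleven nines, as quoted on the slide


def expected_annual_losses(object_count, durability=DURABILITY):
    """Expected number of objects lost per year, assuming durability is the
    per-object probability of surviving one year (an assumption)."""
    return Decimal(object_count) * (Decimal(1) - durability)


losses = expected_annual_losses(10_000_000)
print(losses)        # expected objects lost per year for ten million objects
print(1 / losses)    # average number of years between single-object losses
```

Under this reading, a collection of ten million objects would on average lose a single object about once every 10,000 years.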
AWS Storage and Archive Options Amazon S3: Trillions of Total Objects
AWS Storage and Archive Options: Amazon Glacier
• $0.01 per GB/mo, roughly $120 per TB/yr (at 1,000 GB per TB)
• 3-5 hour data retrieval latency
• Archives: single file or zipped files
• Vaults: collection of archives
• Infinite archival storage
• 99.999999999% durability
• Immutable, encrypted by default
AWS Storage and Archive Options Object Lifecycle Management: Amazon S3 → Amazon Glacier
• Seamlessly move data from Amazon S3 → Amazon Glacier
• 3-5 hour asynchronous retrieval
• Data lifecycle policies
• Amazon Glacier storage costs $0.01 per GB per month
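A data lifecycle policy of the kind described here might look like the sketch below, which builds a rule that transitions objects under a key prefix to Glacier after a number of days. The helper name glacier_lifecycle_rule, the rule ID, the "theses/" prefix, and the 90-day threshold are all illustrative assumptions. The dictionary follows the shape accepted by the S3 lifecycle configuration API as exposed by boto3, a library that postdates this webinar.

```python
def glacier_lifecycle_rule(prefix, days, rule_id="archive-to-glacier"):
    """Build an S3 lifecycle configuration that moves objects under `prefix`
    to the GLACIER storage class once they are `days` days old."""
    if days < 0:
        raise ValueError("days must be non-negative")
    return {
        "Rules": [
            {
                "ID": rule_id,                      # hypothetical rule name
                "Filter": {"Prefix": prefix},       # only objects under this prefix
                "Status": "Enabled",
                "Transitions": [
                    {"Days": days, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


# Example: archive anything under "theses/" after 90 days (illustrative values).
config = glacier_lifecycle_rule("theses/", 90)
print(config["Rules"][0]["Transitions"])
```

With boto3, a configuration like this could be applied by passing it as the LifecycleConfiguration argument of put_bucket_lifecycle_configuration on an S3 client, along with the bucket name.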
Why AWS for Storage and Archiving?
• Protect digital content from fragility
• Protect digital assets from loss and degradation
• Promote learning
• Share research
TCO: On-Premises Cost Considerations
1. Primary storage hardware (primary / remote site)
2. Storage growth (cost of upgrades)
3. Storage management software and 3rd party tools
4. Professional services
5. Hardware maintenance
6. Software maintenance
7. Backup software
8. Backup hardware (primary / remote site)
9. Offsite tape storage / vault
10. Archive software
11. Archive hardware
12. Power
13. Cooling
14. Space
15. Labor
16. Cost of capital
17. Training
18. Asset depreciation
19. Migration
20. Decommission / remove
21. Recycle
22. …
Storage on AWS
• 10 TB S3 = $3,631.20 / year
• 5 TB S3 + 5 TB Glacier = $2,433.12 / year
• 10 TB Glacier = $1,228.80 / year
Correct as of July 11, 2014
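The Glacier-only figure can be reproduced from the $0.01 per GB per month price quoted earlier in this deck, as the sketch below shows. The function name glacier_annual_cost is illustrative, and treating 1 TB as 1,024 GB is an assumption that happens to match the quoted total. The S3 figures depend on tiered per-GB prices that the deck does not list, so they are not reproduced here.

```python
from decimal import Decimal

GLACIER_PER_GB_MONTH = Decimal("0.01")  # price quoted in this deck (July 2014)


def glacier_annual_cost(terabytes, gb_per_tb=1024):
    """Yearly Glacier storage cost in dollars for `terabytes` of data.

    gb_per_tb=1024 (binary terabytes) is an assumption; pass 1000 for
    decimal terabytes.
    """
    gigabytes = Decimal(terabytes) * gb_per_tb
    return gigabytes * GLACIER_PER_GB_MONTH * 12


print(glacier_annual_cost(10))        # 1228.80, the 10 TB Glacier figure
print(glacier_annual_cost(1, 1000))   # 120.00, the per-TB/yr figure at 1,000 GB per TB
```

The two results differ only in how a terabyte is counted, which accounts for the gap between $120 per TB per year and $1,228.80 for 10 TB.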