AWS re:Invent 2016: Deep Dive on AWS Cloud Data Migration Services (ENT210)
TRANSCRIPT
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Marc Trimuschat
AWS Storage Services
November 2016
AWS Data Transfer Services
Data Ingest Strategies into the AWS Cloud
ENT210
Batches and Streams
[Diagram: data ingest paths into AWS storage services]
Ingest paths: Internet/VPN, Direct Connect, Snowball / Snowball Edge / Snowmobile, 3rd Party Connectors, Transfer Acceleration, Storage Gateway, Kinesis Firehose, CloudFront
File: Amazon EFS
Block: Amazon EBS (persistent), Amazon EC2 Instance Store (ephemeral)
Object: Amazon S3, Amazon Glacier
What is Internet/VPN?
Globally available
Default method of ingesting content into Amazon S3
Simple standards-based (HTTP) connection
Use your existing internet connection
Available in a VPC for VPN connectivity
Acceleration through multipart upload
Data transfer into AWS is free
VPN connections using VPC virtual private gateway
• $0.05 per VPN connection-hour
• $0.048 per VPN connection-hour for connections to the Tokyo region
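The per-hour VPN prices above can be turned into a rough monthly figure. The sketch below is only an illustration: the rates are the ones quoted on the slide, while the 730-hour month and the function name `monthly_vpn_cost` are assumptions introduced here, not AWS billing rules.

```python
# Hourly rates as quoted on the slide; the 730-hour month is an assumption.
VPN_RATE_DEFAULT = 0.05   # USD per VPN connection-hour
VPN_RATE_TOKYO = 0.048    # USD per VPN connection-hour, Tokyo region
HOURS_PER_MONTH = 730     # assumed average month (8,760 hours / 12)


def monthly_vpn_cost(connections: int, tokyo: bool = False) -> float:
    """Estimate the monthly USD cost of always-on VPN connections."""
    if connections < 0:
        raise ValueError("connections must not be negative")
    rate = VPN_RATE_TOKYO if tokyo else VPN_RATE_DEFAULT
    return round(connections * rate * HOURS_PER_MONTH, 2)


if __name__ == "__main__":
    print(monthly_vpn_cost(1))              # one connection, default rate
    print(monthly_vpn_cost(2, tokyo=True))  # two connections, Tokyo rate
```

Data transfer into AWS is free, so under these assumptions the connection-hours are the only ingest-side charge modeled here.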
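How does Internet/VPN ingest work?
Accelerate data transfer using multipart upload
Ingest data directly into S3 buckets with existing internet connectivity and through the console or API
[Diagram: a customer gateway reaches an S3 bucket in an AWS Region over the Internet, or over the Internet through a VPN connection and a VPC. Labels: customer gateway, endpoints, VPN connection, Internet, Internet through VPN + VPC, S3 bucket, AWS Region]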
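Multipart upload, mentioned above as the way to accelerate ingest, works by splitting one object into parts that can be sent independently and in parallel. The following is a minimal sketch of the planning step only and does not call any AWS API. The 5 MiB minimum part size and the 10,000-part maximum are S3 limits; the 64 MiB default and the name `plan_parts` are assumptions chosen for illustration.

```python
from typing import List, Tuple

MIB = 1024 * 1024
MIN_PART_SIZE = 5 * MIB   # S3 minimum for every part except the last
MAX_PARTS = 10_000        # S3 maximum number of parts per upload


def plan_parts(object_size: int, part_size: int = 64 * MIB) -> List[Tuple[int, int, int]]:
    """Return (part_number, start_offset, length) for each part of an object."""
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    # Grow the part size when needed so the object fits within MAX_PARTS.
    smallest_allowed = -(-object_size // MAX_PARTS)  # integer ceiling division
    part_size = max(part_size, MIN_PART_SIZE, smallest_allowed)

    parts = []
    offset = 0
    number = 1  # S3 part numbers start at 1
    while offset < object_size:
        length = min(part_size, object_size - offset)
        parts.append((number, offset, length))
        offset += length
        number += 1
    return parts


if __name__ == "__main__":
    for part in plan_parts(200 * MIB):
        print(part)
```

Each tuple describes a byte range that a worker could read and upload on its own, which is where the speed-up over a single sequential PUT comes from.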
What is Transfer Acceleration?
Network- and protocol-based data transfer service
Acceleration of data ingress/egress with S3 buckets
Typically 50% to 300% faster
Feature of S3 enabled at the bucket level
Available in all S3 regions worldwide
No client/server software required
No code changes to your application
No firewall exceptions
Simple pricing model
Ingest & egress with Transfer Acceleration
[Diagram: Uploader to AWS edge location to S3 bucket, with optimized throughput]
Uses AWS's 59 global edge locations
AWS determines best edge location
Data transfer optimized between edge and customer, and edge and S3
Data is not stored on the edge cache
Customers: Frame.io, Hudl, Viocorp
Problem Statement:
• Needed to accelerate customer content ingest into their respective applications running on AWS
• Existing ingest options were proprietary and too expensive
Use of AWS:
• S3 and S3 transfer acceleration for massively scalable ingest
• S3 for storage, CloudFront and S3 transfer acceleration for ingest
Business Benefits:
• Global highly distributed data transport available on demand
• Massive scalability and elasticity
• Lower TCO for storage and data transport infrastructure
Accelerating media content uploads to their platforms
[Diagram: the Uploader resolves b1.s3-accelerate.amazonaws.com through Amazon Route 53, sends an HTTPS PUT/POST of upload_files.zip to the AWS Edge Location, and the edge location sends an HTTP/S PUT/POST of "upload_files.zip" to the S3 Bucket with optimized throughput]
Service traffic flow
Client to S3 bucket example
[Diagram: numbered steps 1 to 4 between the customer client, the AWS edge location, an EC2 proxy in the AWS region, and the S3 bucket addressed as b1.s3-accelerate.amazonaws.com]
Data is not cached on the AWS edge location
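The traffic flow above relies on the client addressing the bucket through its accelerate hostname (b1.s3-accelerate.amazonaws.com in the slides) instead of the regular S3 endpoint. The sketch below only builds that hostname and a target URL; it performs no request and no signing. The helper names `accelerate_endpoint` and `upload_url` are hypothetical, and the check for dots reflects the S3 rule that bucket names used with Transfer Acceleration must not contain periods.

```python
from urllib.parse import quote


def accelerate_endpoint(bucket: str) -> str:
    """Return the Transfer Acceleration hostname for a bucket."""
    if not bucket:
        raise ValueError("bucket name must not be empty")
    # Bucket names with periods cannot be used with the accelerate endpoint.
    if "." in bucket:
        raise ValueError("bucket names containing '.' are not supported")
    return f"{bucket}.s3-accelerate.amazonaws.com"


def upload_url(bucket: str, key: str) -> str:
    """Return the HTTPS URL a client would PUT or POST an object to."""
    # quote() percent-encodes unsafe characters but keeps '/' in the key.
    return f"https://{accelerate_endpoint(bucket)}/{quote(key)}"


if __name__ == "__main__":
    print(upload_url("b1", "upload_files.zip"))
```

Because only the hostname changes, this matches the slide's claim that no application code changes are needed beyond pointing at the accelerate endpoint; acceleration still has to be enabled on the bucket itself.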
Fully managed file transfer acceleration using all AWS edge locations
How fast is S3 Transfer Acceleration?
[Chart: time in hours for a 500 GB upload from these edge locations to a bucket in Singapore, comparing the public internet with S3 transfer acceleration. Edge locations: Rio de Janeiro, Warsaw, New York, Atlanta, Madrid, Virginia, Melbourne, Paris, Los Angeles, Seattle, Tokyo, Singapore]
What is AWS Direct Connect?
Dedicated, 1 or 10 GE private pipes into AWS
Create private (VPC) or public virtual interfaces to AWS
Reduced data-out rates (data-in still free)
Consistent network performance
At least 1 location to each AWS region
Option for redundant connections
Uses BGP to exchange routing information over a VLAN
Physical connection
• Cross-connect at the location
• Single-mode optical fiber
- 1000BASE-LX or 10GBASE-LR
• Potential onward delivery through Direct Connect partner
• Customer router
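At the Direct Connect location
[Diagram: the customer's network (CORP) connects to a customer router in the colocation facility (DX location). A cross-connect links the customer router to the AWS Direct Connect routers, which attach to the AWS backbone network. A demarcation line is marked between the customer side and the AWS side.]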
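Dedicated port through Direct Connect partner
[Diagram: the customer router in the customer network (CORP) connects over an access circuit to partner equipment in the partner network. In the colocation facility (DX location), a cross-connect links the partner network to the AWS Direct Connect routers, which attach to the AWS backbone network. A demarcation line is marked.]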
Hybrid cloud storage expansion:
Amazon EFS through Direct Connect
“Bursting” File Workloads
Data Migration into EFS
[Diagram: On-Premises to AWS Direct Connect to Amazon EFS]
What is AWS Storage Gateway?
Works with your existing applications
Secure and durable storage in AWS
Low latency for frequently used data
Scalable and cost-effective on-premises storage: $0.01/GB written to AWS + S3/Amazon Glacier storage fees
Service connecting an on-premises software appliance with cloud-based storage
Hybrid storage use cases and architectures for AWS Storage Gateway
Enabling cloud workloads: Move data to AWS storage for Big Data, cloud bursting, or migration
Tiered cloud storage: Easily add AWS storage to your on-premises environment
Backup, archive, and disaster recovery: Cost-effective storage in AWS with local or cloud restore
Storage Gateway hybrid storage solutions
Enables using standard storage protocols to access AWS storage services
[Diagram: on the customer premises, application servers and enterprise storage devices connect to Storage Gateway, which stores data in Amazon S3, Amazon Glacier, and Amazon EBS snapshots. Supporting services shown: AWS Identity and Access Management (IAM), AWS Key Management Service (KMS), AWS CloudTrail, Amazon CloudWatch]
Storage gateway – Files, volumes, and tapes
File gateway NFS (v3 and v4.1) interface **NEW!**
On-premises file storage backed by Amazon S3 objects
Volume gateway iSCSI block interface
On-premises block storage backed by Amazon S3 with EBS snapshots
Tape gateway iSCSI virtual tape library (VTL) interface
Virtual tape storage in Amazon S3 and Glacier with VTL management
Detail: AWS File Gateway for S3
NFS Interface: Easy Integration
Elasticity: Cloud Scale
Amazon S3 Bucket: Cloud Access
What is AWS Snowball?
Petabyte-scale data transport
E-ink shipping label
Ruggedized case “8.5G impact”
All data encrypted end-to-end
Rain- and dust-resistant
Tamper-resistant case and electronics
80 TB
10 GE network
AWS storage migration expansion:
AWS Snowball
Transfer Capacity: 80TB model
Integration: HDFS support, 3rd party API, HIPAA support
Regional Availability: All EXCEPT Asia Pacific (Singapore), Asia Pacific (Seoul), China (Beijing)
How fast is Snowball?
• Less than 1 day to transfer 200TB via 3x10G connections with 3 Snowballs, less than 1 week including shipping
• Number of days to transfer 200TB via the Internet at typical utilizations

Internet Connection Speed
Utilization   1Gbps   500Mbps   300Mbps   150Mbps
25%           71      141       236       471
50%           36      71        118       236
75%           24      47        79        157
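The figures above follow from dividing the data volume by the effective link rate. The sketch below is one way to estimate them, assuming decimal units (1 TB = 10^12 bytes, 1 Mbps = 10^6 bits per second) and no protocol overhead. The slides do not state their exact assumptions, so the results land close to the table but do not match it exactly: for 200 TB at 1 Gbps and 25% utilization this calculation gives about 74 days against the 71 on the slide. The function name `transfer_days` is introduced here for illustration.

```python
def transfer_days(terabytes: float, link_mbps: float, utilization: float) -> float:
    """Days needed to move `terabytes` over a link at a given utilization."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    total_bits = terabytes * 1e12 * 8               # decimal TB to bits (assumption)
    effective_bps = link_mbps * 1e6 * utilization   # usable bits per second
    return total_bits / effective_bps / 86_400      # seconds to days


if __name__ == "__main__":
    speeds_mbps = (1000, 500, 300, 150)
    for util in (0.25, 0.50, 0.75):
        row = [round(transfer_days(200, mbps, util)) for mbps in speeds_mbps]
        print(f"{util:.0%}", row)

    # Three Snowballs, each on its own 10 Gbps link, at full line rate.
    hours = transfer_days(200, 3 * 10_000, 1.0) * 24
    print(f"3 x 10 Gbps: {hours:.1f} hours")
```

Under the same assumptions, 200 TB over three 10 Gbps links at full line rate takes roughly 15 hours, which is consistent with the "less than 1 day" claim for loading three Snowballs.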
Customer: Scripps Networks Interactive
Problem Statement:
• Need storage platform to manage active archive content
• Existing content repository too large to migrate via available network-based ingest methods
Use of AWS:
• S3 and Snowball for massively scalable ingest
• S3 for storage, Glacier for content archive
• Snowball to securely transport existing media content from on-premises storage and tape vault
Business Benefits:
• Petabyte-scale data transport without increased network costs
• Massive scalability and elasticity
• Lower TCO for active archive storage
Active archive transport and archival for digital content provider
Hybrid cloud storage expansion:
AWS Snowball Edge
On-premises Capacity: Clustered local storage, 100TB capacity
On-premises Integration: NFS and S3-compatible endpoint
On-premises Compute: AWS Lambda support
Hybrid cloud storage ecosystem
Backup: AWS Storage Gateway VTL, Direct to Amazon S3
File Systems
Object Storage
Block Storage
Backup to AWS approaches
[Diagram: on-premises backup paths to Amazon S3, Amazon S3-IA, and Amazon Glacier over the Internet or AWS Direct Connect. Labels: application servers, media server, local disk, cloud gateway, backup SW cloud connector, media server with cloud connector]
AWS hybrid cloud storage choices
Use cases: Bursting, Migration, Tiering
Options: Accessing EFS via Direct Connect, Storage Gateway, Snowball Edge, 3rd Party Ecosystem
Related Sessions
• Snowball Edge: STG214, ENT211
• Snowmobile: STG214, ENT211
• Storage Gateway: STG213, ENT211