Overview and Best Practices for Amazon Elastic Block Store - September 2016 Webinar Series

Post on 11-Apr-2017


© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Neil Gadhok, Sr. Product Manager

September 27th, 2016

Amazon Elastic Block Store (EBS): Overview and Best Practices

What is Amazon EBS?

EBS !=

EBS volumes are used for:

• Boot volumes
• SQL DBs
• NoSQL DBs
• Big Data workloads
• Data warehouses
• Log processing

[Diagram: an EBS volume attached to an EC2 instance, inside an Availability Zone of an AWS region]

• Availability Zone specific

• Persist independently of the EC2 instance

[Diagram: an EC2 instance in an Availability Zone with one EBS boot volume and two EBS data volumes]

[Diagram: an AWS region with two Availability Zones; the EBS volume is shown with a replica]

EBS is designed for:

• 99.999% Availability

• 0.1% to 0.2% Annual Failure Rate (AFR)

• 20x better than an off-the-shelf disk drive, which has an AFR of about 4%

• Make sure you use a data protection strategy


What is an Amazon EBS Snapshot?

[Diagram: an EBS volume in an Availability Zone and its EBS snapshot stored in Amazon S3, within one AWS region]

What can you do with a snapshot?

[Diagram: within one AWS region, an EBS snapshot in Amazon S3 and EBS volumes in two Availability Zones]

[Diagram: two AWS regions, each with an EBS snapshot and an EBS volume in an Availability Zone]

Public data sets available as EBS snapshots:

• Genomic
• Census
• Global weather
• Transportation

https://aws.amazon.com/public-data-sets/

[Diagram: an EBS snapshot, an AMI, an EC2 instance, and an EBS volume within an Availability Zone of an AWS region]

What is an EBS-Optimized instance?

[Diagram: an EBS volume attached to an EBS-Optimized EC2 instance in an Availability Zone]

[Diagram: c3.2xlarge without EBS optimization. About 125 MB/s is shared between EBS and other traffic (EC2 instances, databases, S3, Internet)]

[Diagram: EBS-optimized c3.2xlarge. About 125 MB/s is dedicated to EBS, and a separate ~ 125 MB/s is shared by other traffic (EC2 instances, databases, S3, Internet)]

• C4 and newer instance families are EBS-Optimized by default
• http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSOptimized.html

What is EC2 instance local store ?

[Diagram: an EC2 instance with local storage and an attached EBS volume in an Availability Zone]

EC2 local storage:

• Directly attached
• Data not persistent
• No snapshot support
• No native encryption
• Useful for stateless applications

What is EBS Encryption ?

• Attach both encrypted and unencrypted volumes

• No volume performance impact

• Any current generation instance

• Supported by all EBS volume types

• Snapshots also encrypted

• No extra cost

• Boot and data volumes can be encrypted

EBS Volume Types

A few definitions…

IOPS: Input/output operations per second (#)

Throughput: Read/write rate to storage (MB/s)

Latency: Delay between request and completion (ms)

Capacity: Volume of data that can be stored (GB)

I/O size: Size of each I/O to disk (KB)

EBS Volume Types

Solid state drive (SSD):

• General Purpose SSD (gp2)
• Provisioned IOPS SSD (io1)

Hard disk drive (HDD):

• Throughput Optimized HDD (st1)
• Cold HDD (sc1)

EBS Volume Types: General Purpose SSD (gp2)

General Purpose SSD

gp2

Throughput: 160 MB/s

Latency: Single-digit ms

Capacity: 1 GB to 16 TB

Baseline: 3 IOPS per GB up to 10,000

Burst: 3,000 IOPS (for volumes up to 1 TB)

Great for boot volumes, low latency applications and bursty databases

Burst & Baseline: General Purpose SSD (GP2)

[Chart: IOPS (0 to 10,000) versus volume size (0 to 16 TB). Baseline IOPS follow a baseline of 3 IOPS/GB; volumes are burstable to 3,000 IOPS. Chart annotations: ~ 3334 GB (where the baseline reaches 10,000 IOPS), 100 IOPS, and a 300 GB volume.]
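The baseline rule on this slide can be written as a small function. This is only a sketch built from the slide's 2016 figures (3 IOPS per GB, a 100 IOPS floor, a 10,000 IOPS cap); the function and constant names are illustrative and are not part of any AWS API.

```python
GP2_IOPS_PER_GB = 3      # baseline rate from the slide
GP2_MIN_IOPS = 100       # floor for small volumes (slide figure)
GP2_MAX_IOPS = 10_000    # baseline cap (2016 slide figure)


def gp2_baseline_iops(size_gb: int) -> int:
    """Return the baseline IOPS of a gp2 volume of the given size in GB."""
    scaled = GP2_IOPS_PER_GB * size_gb
    return min(max(GP2_MIN_IOPS, scaled), GP2_MAX_IOPS)


if __name__ == "__main__":
    for size in (8, 300, 1000, 3334, 16000):
        print(f"{size} GB -> {gp2_baseline_iops(size)} baseline IOPS")
```

With these numbers, a 300 GB volume has a 900 IOPS baseline, and the cap is reached at about 3334 GB.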

Burst Bucket: General Purpose SSD (GP2)

Max I/O credit per bucket is 5.4M

You can spend up to 3,000 I/O credits per second (3,000 IOPS)

Always accumulating 3 I/O credits per GB per second

Baseline performance = 3 IOPS per GB, with a minimum of 100 IOPS

How long can I burst?

[Chart: minutes of burst versus volume size in GB (1 GB to 950 GB), with reference lines at 1 hour and 10 hours]

Volume 300 GB in size can burst at 3,000 IOPS for 43 minutes.
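The 43-minute figure follows from the bucket numbers on the previous slide: a full bucket of 5.4M credits drains at the burst rate minus the refill (baseline) rate. A minimal sketch of that arithmetic, assuming a full bucket at the start and a constant 3,000 IOPS load; the names are illustrative.

```python
import math

BUCKET_CREDITS = 5_400_000   # max I/O credits per bucket (slide figure)
BURST_IOPS = 3_000           # max spend rate (slide figure)
IOPS_PER_GB = 3              # refill rate per GB (slide figure)
MIN_BASELINE_IOPS = 100      # baseline floor (slide figure)


def gp2_burst_minutes(size_gb: float) -> float:
    """Minutes a full bucket lasts when driving a constant 3,000 IOPS."""
    baseline = max(MIN_BASELINE_IOPS, IOPS_PER_GB * size_gb)
    if baseline >= BURST_IOPS:
        # The bucket refills at least as fast as it drains.
        return math.inf
    net_drain_per_second = BURST_IOPS - baseline
    return BUCKET_CREDITS / net_drain_per_second / 60


if __name__ == "__main__":
    print(round(gp2_burst_minutes(300), 1))
```

For 300 GB the net drain is 3,000 - 900 = 2,100 credits per second, so 5,400,000 / 2,100 is roughly 2,571 seconds, or about 43 minutes.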

EBS Volume Types: Provisioned IOPS SSD (io1)

Provisioned IOPS SSD

io1

Baseline: 100 to 20,000 IOPS

Throughput: 320 MB/s

Latency: Single-digit ms

Capacity: 4 GB to 16 TB

Ideal for critical applications and databases with sustained IOPS

Scaling Provisioned IOPS SSD (IO1)

[Chart: available provisioned IOPS (0 to 20,000) versus volume size (0 to 16 TB). Max provisioned IOPS follow a maximum IOPS:GB ratio of 50:1 and reach 20,000 IOPS at 400 GB.]
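The 50:1 ratio and the 400 GB mark can be checked with a short calculation. This sketch uses only the slide's 2016 limits (50 IOPS per GB, 20,000 IOPS maximum, 4 GB minimum size); the function names are hypothetical.

```python
import math

IO1_MAX_IOPS_PER_GB = 50    # maximum IOPS:GB ratio (slide figure)
IO1_MAX_IOPS = 20_000       # per-volume maximum (2016 slide figure)
IO1_MIN_SIZE_GB = 4         # smallest io1 volume (slide figure)


def io1_max_provisioned_iops(size_gb: int) -> int:
    """Largest IOPS value that can be provisioned for a volume of size_gb."""
    return min(IO1_MAX_IOPS_PER_GB * size_gb, IO1_MAX_IOPS)


def io1_min_size_gb(iops: int) -> int:
    """Smallest volume size in GB that can carry the requested IOPS."""
    if iops > IO1_MAX_IOPS:
        raise ValueError("requested IOPS exceed the per-volume maximum")
    return max(IO1_MIN_SIZE_GB, math.ceil(iops / IO1_MAX_IOPS_PER_GB))
```

For example, 20,000 IOPS need at least a 400 GB volume, which is the point marked on the chart.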

GP2/IO1: Common use cases

• Cassandra
  • https://d0.awsstatic.com/whitepapers/Cassandra_on_AWS.pdf

• MongoDB
  • https://aws.amazon.com/blogs/aws/mongodb-on-the-aws-cloud-new-quick-start-reference-deployment/

EBS Volume Type: Throughput Optimized HDD (ST1)

NEW!

Throughput Optimized HDD

st1

Baseline: 40 MB/s per TB up to 500 MB/s

Capacity: 500 GB to 16 TB

Burst: 250 MB/s per TB up to 500 MB/s

Ideal for large block, high throughput sequential workloads

Throughput Optimized HDD – Burst and Base

[Chart: burst and base throughput in MB/s (0 to 600) versus volume size in TB (0.5 TB to 16 TB) for ST1, with 320 MB/s and 500 MB/s marked]

Burst Bucket: Throughput Optimized HDD (ST1)

Max I/O bucket credit is 1 TB of credit per TB in volume

You can spend up to 250 MB/s per TB

Baseline performance = 40 MB/s per TB

Always accumulating 40 MB/s per TB

st1

EBS Volume Types: Cold HDD (sc1)

Cold HDD

sc1

Baseline: 12 MB/s per TB up to 192 MB/s

Capacity: 500 GB to 16 TB

Burst: 80 MB/s per TB up to 250 MB/s

Ideal for sequential throughput workloads such as logging and backup

Cold HDD – Burst and Base

[Chart: burst and base throughput in MB/s (0 to 300) versus volume size in TB (0.5 TB to 16 TB) for SC1, with 192 MB/s marked]

Burst Bucket: Cold HDD (SC1)

Max I/O bucket credit is 1 TB of credit per TB in volume

You can spend up to 80 MB/s per TB

Baseline performance = 12 MB/s per TB

Always accumulating 12 MB/s per TB
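The ST1 and SC1 slides share one throughput model with different constants. The sketch below puts both in one place, using only the slide figures. Two assumptions are mine: 1 TB of credit is treated as 1,000,000 MB, and the burst-duration estimate assumes a full bucket drained at the full burst rate. The class and function names are illustrative.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class HddProfile:
    base_per_tb: float    # MB/s of baseline per TB
    base_cap: float       # MB/s baseline ceiling
    burst_per_tb: float   # MB/s of burst per TB
    burst_cap: float      # MB/s burst ceiling


ST1 = HddProfile(base_per_tb=40, base_cap=500, burst_per_tb=250, burst_cap=500)
SC1 = HddProfile(base_per_tb=12, base_cap=192, burst_per_tb=80, burst_cap=250)


def throughput(profile: HddProfile, size_tb: float) -> Tuple[float, float]:
    """Return (baseline, burst) throughput in MB/s for a volume of size_tb."""
    base = min(profile.base_per_tb * size_tb, profile.base_cap)
    burst = min(profile.burst_per_tb * size_tb, profile.burst_cap)
    return base, burst


def full_bucket_burst_minutes(
    profile: HddProfile, size_tb: float, mb_per_tb: float = 1_000_000
) -> float:
    """Minutes a full bucket (1 TB of credit per TB) lasts at full burst."""
    base, burst = throughput(profile, size_tb)
    if burst <= base:
        return math.inf  # baseline already equals the burst ceiling
    bucket_mb = size_tb * mb_per_tb
    return bucket_mb / (burst - base) / 60
```

Under these assumptions an 8 TB st1 volume has a 320 MB/s baseline and a 500 MB/s burst, and a 16 TB sc1 volume has a 192 MB/s baseline and a 250 MB/s burst.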

ST1/SC1: Common use cases

• Localytics created a petabyte scale data warehouse (Vertica)

• Confluent wrote about deploying Apache Kafka on these volume types

• Splunk wrote about using these volume types for colder data

https://aws.amazon.com/blogs/aws/amazon-ebs-update-new-cold-storage-and-throughput-options/

IO Provisioned Volumes

• gp2: $0.10 per GB
• io1: $0.125 per GB, plus $0.065 per PIOPS

Throughput Provisioned Volumes

• st1: $0.045 per GB
• sc1: $0.025 per GB

* All prices are per month and from the us-west-2 region as of September 2016
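To compare volume types for a given size, the price list can be turned into a small calculator. This sketch hard-codes the September 2016 us-west-2 prices from the slide, which should not be read as current pricing; the function name is hypothetical.

```python
# Monthly prices per GB from the slide (us-west-2, September 2016).
PRICE_PER_GB = {"gp2": 0.10, "io1": 0.125, "st1": 0.045, "sc1": 0.025}
PRICE_PER_PIOPS = 0.065  # io1 only, per provisioned IOPS per month


def monthly_cost(volume_type: str, size_gb: int, provisioned_iops: int = 0) -> float:
    """Estimated monthly cost in USD for one volume."""
    cost = PRICE_PER_GB[volume_type] * size_gb
    if volume_type == "io1":
        cost += PRICE_PER_PIOPS * provisioned_iops
    return round(cost, 2)


if __name__ == "__main__":
    print(monthly_cost("gp2", 500))
    print(monthly_cost("io1", 400, provisioned_iops=20_000))
```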

EBS Deep Dive

Performance Dimensions

How is performance measured with EBS ?

• GP2 and IO1
  • Performance measured in I/Os per second (IOPS)
  • Max size of an I/O is 256 KB
  • Ideal for random I/Os, low latencies

• ST1 and SC1
  • Performance measured in megabytes per second (MB/s)
  • Max size of an I/O is 1 MB
  • Optimized for large sequential I/Os, high throughput

How do we count I/Os for GP2 and IO1 ?

When possible, we merge sequential I/Os (up to 256 KB in size)

...To minimize I/O charges on IO1 & maximize burst on GP2

[Diagram in each example: I/Os of up to 256 KB sent from the EC2 instance to EBS]

Example 1: Random I/Os
• 4 random I/Os (i.e., non-sequential I/Os)
• Each I/O 64 KB
• Counted as 4 I/Os

Example 2: Random I/Os
• 4 random I/Os (i.e., non-sequential I/Os)
• Each I/O 256 KB
• Counted as 4 I/Os

Example 3: Sequential I/O
• 4 sequential I/Os
• Each I/O 64 KB
• Counted as 1 I/O

Example 4: Sequential I/O
• 4 sequential I/Os
• Each I/O 256 KB
• Counted as 4 I/Os
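The four examples can be reproduced with a simplified counting model: consecutive I/Os are merged only while they are contiguous and the merged size stays within the limit. This is a toy model of the rule described on the slides, not the actual EBS implementation; it assumes each individual I/O is no larger than the merge limit, and the offsets used below are made up.

```python
from typing import List, Tuple

IO = Tuple[int, int]  # (offset_kb, size_kb), in submission order


def count_ios(ios: List[IO], max_io_kb: int = 256) -> int:
    """Count I/Os after merging contiguous ones up to max_io_kb."""
    counted = 0
    run_end = None   # offset where the current merged run ends
    run_size = 0     # size of the current merged run in KB
    for offset, size in ios:
        contiguous = run_end is not None and offset == run_end
        if contiguous and run_size + size <= max_io_kb:
            run_size += size          # extend the current merged I/O
        else:
            counted += 1              # start a new counted I/O
            run_size = size
        run_end = offset + size
    return counted


if __name__ == "__main__":
    random_64 = [(0, 64), (4096, 64), (512, 64), (9000, 64)]
    sequential_64 = [(0, 64), (64, 64), (128, 64), (192, 64)]
    sequential_256 = [(0, 256), (256, 256), (512, 256), (768, 256)]
    print(count_ios(random_64), count_ios(sequential_64), count_ios(sequential_256))
```

Passing `max_io_kb=1024` gives the corresponding model for ST1 and SC1, where four sequential 256 KB I/Os merge into one.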

How do we count I/Os for ST1 and SC1 ?

• When possible, we merge sequential I/Os (up to 1 MB in size)

• Workloads with primarily large, sequential I/Os perform best on ST1 and SC1

• Ex: Big Data/MapReduce, Hadoop, Kafka, Log Processing, Data Warehouses

How do we count I/Os for ST1 and SC1 ?

[Diagram in each example: I/Os of up to 1024 KB sent from the EC2 instance to EBS]

Example 1: Small, Random I/Os
• 4 random I/Os
• Each I/O 64 KB
• Counted as 4 I/Os or 4 MB/s of burst

Example 2: Larger Random I/Os
• 4 random I/Os
• Each I/O 256 KB
• Counted as 4 I/Os or 4 MB/s

Example 3: Large, Sequential I/Os
• 4 sequential I/Os
• Each I/O 64 KB in size
• Counted as 1 I/O or 1 MB/s
• Note: very small I/Os can’t be merged with perfect efficiency

Example 4: Small Sequential I/Os
• 4 sequential I/Os
• Each I/O 256 KB in size
• Counted as 1 I/O or 1 MB/s


EBS Deep Dive

Encryption

How does EBS encryption work ?

• Data encrypted at rest and in flight
• Encryption and decryption of EBS data happen in the memory of the server hosting your EC2 instances
• The data key is NEVER saved to disk in plain text
• Resources encrypted with a Customer Master Key (CMK)
• Two types of CMKs
  • Default Key (per account per region)
  • Custom Key

How does EBS encryption work?

[Diagram: AWS KMS holds the EBS master key; EBS volumes 1, 2, and 3 are shown with data keys 1, 2, and 3]

Envelope encryption:

• Limits exposure risk
• Performance
• Simplifies key management


EBS Best Practices

EBS Best Practices

Availability

Use Multiple Availability Zones for Higher Availability

EBS volume

Availability Zone

AWS region

Availability Zone

EBS volume

• EC2 and EBS are availability zone independent services.

Use EC2 Autorecovery

Instance status check fails? REBOOT

System status check fails? RECOVER

Instance retains:

• Instance ID
• Instance metadata
• Private IP addresses
• Elastic IP addresses
• EBS volume attachments

• Limited to C3, C4, M3, M4, R3, and T2 instance types with EBS only storage
• https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-recover.html
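Auto recovery is configured as a CloudWatch alarm on the system status check whose action is the EC2 recover action. A minimal sketch of the alarm parameters, assuming a boto3 CloudWatch client is passed in by the caller; the alarm name, period, and evaluation settings are illustrative choices, not values from the webinar.

```python
def recover_alarm_params(instance_id: str, region: str) -> dict:
    """Build keyword arguments for CloudWatch put_metric_alarm."""
    return {
        "AlarmName": f"auto-recover-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed_System",    # system status check
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Minimum",
        "Period": 60,                # seconds; illustrative
        "EvaluationPeriods": 2,      # illustrative
        "Threshold": 0,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [f"arn:aws:automate:{region}:ec2:recover"],
    }


def create_recover_alarm(cloudwatch, instance_id: str, region: str) -> None:
    """Create the alarm; `cloudwatch` is assumed to be boto3.client('cloudwatch')."""
    cloudwatch.put_metric_alarm(**recover_alarm_params(instance_id, region))
```

Keeping the parameter construction separate from the API call makes the alarm definition easy to inspect without AWS credentials.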

EBS Best Practices

Durability

Take Snapshots

Quiesce I/O:

1. Database: FLUSH and LOCK tables
2. Filesystem: sync and fsfreeze
3. EBS: snapshot all volumes

• When CreateSnapshot API returns success, it is safe to resume

• A snapshot of an entire 16 TB volume is designed to take no longer than the time it takes to snapshot a 1 TB volume
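The filesystem and EBS steps above might be scripted roughly as follows. This is a sketch under several assumptions: it runs as root on the instance, `ec2` is a boto3 EC2 client supplied by the caller, the database flush-and-lock step is handled separately, and the mount point and volume IDs are placeholders.

```python
import subprocess
from typing import Callable, List, Sequence


def snapshot_with_freeze(
    ec2,
    volume_ids: Sequence[str],
    mountpoint: str,
    run: Callable[..., object] = subprocess.run,
) -> List[str]:
    """Freeze the filesystem, start a snapshot of each volume, then thaw."""
    run(["sync"], check=True)                               # flush dirty pages
    run(["fsfreeze", "--freeze", mountpoint], check=True)   # block new writes
    try:
        snapshot_ids = []
        for volume_id in volume_ids:
            # Once create_snapshot returns, I/O can safely resume.
            response = ec2.create_snapshot(
                VolumeId=volume_id,
                Description=f"Consistent snapshot of {volume_id}",
            )
            snapshot_ids.append(response["SnapshotId"])
        return snapshot_ids
    finally:
        # Always thaw, even if a snapshot call fails.
        run(["fsfreeze", "--unfreeze", mountpoint], check=True)
```

The `run` parameter exists only so the shell commands can be replaced in a dry run or test; by default it is `subprocess.run`.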

Automate Snapshot Creation and Retention

AWS Lambda scheduled event: daily snapshots

1. Search for instances tagged “Backup”
2. EC2 Run Command to fsfreeze
3. Snapshot all attached volumes
4. Tag snapshots with expire date

[Diagram: EC2 instances tagged Backup; retention 30 days]
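The retention part of this workflow comes down to date arithmetic on tags. The sketch below shows one way to compute the expiry tag and to pick out snapshots that are due for deletion. The tag key `ExpireOn` is a made-up name, and the snapshot dictionaries only mimic the `SnapshotId` and `Tags` fields returned by the EC2 describe-snapshots call.

```python
from datetime import date, timedelta
from typing import Dict, List

RETENTION_DAYS = 30        # retention period from the slide
EXPIRE_TAG = "ExpireOn"    # hypothetical tag key


def expire_tag(created: date, retention_days: int = RETENTION_DAYS) -> Dict[str, str]:
    """Tag to attach to a snapshot created on the given date."""
    expires = created + timedelta(days=retention_days)
    return {"Key": EXPIRE_TAG, "Value": expires.isoformat()}


def expired_snapshot_ids(snapshots: List[dict], today: date) -> List[str]:
    """IDs of snapshots whose expiry tag is on or before today."""
    expired = []
    for snap in snapshots:
        tags = {t["Key"]: t["Value"] for t in snap.get("Tags", [])}
        expires = tags.get(EXPIRE_TAG)
        if expires and date.fromisoformat(expires) <= today:
            expired.append(snap["SnapshotId"])
    return expired
```

A scheduled function could apply `expire_tag` right after each snapshot is created and delete whatever `expired_snapshot_ids` returns on each daily run.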

Avoid RAIDing for redundancy

• RAID1 halves available EBS bandwidth

• RAID5/6 loses 20 – 30% of usable I/O to parity

• EBS data is already replicated

EBS Best Practices

Performance

Use RAID 0 for performance

Use RAID 0 when your:

• Storage requirement > 16 TB

• Throughput requirement > 500 MB/s

• IOPS requirement > 20,000 @ 16K

• Use a stripe size of 128KB or 256KB to maintain a sufficient queue depth when driving high IOPS
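As a rough sizing aid, the three thresholds above can be turned into a stripe-count estimate. This is a sketch that treats 16 TB, 500 MB/s, and 20,000 IOPS as per-volume ceilings, as the slide does; real per-volume limits depend on the volume type, and the instance's own EBS bandwidth is not considered here. The function name is illustrative.

```python
import math

MAX_VOLUME_TB = 16        # largest single volume (slide figure)
MAX_VOLUME_MBPS = 500     # highest single-volume throughput (slide figure)
MAX_VOLUME_IOPS = 20_000  # highest single-volume IOPS (slide figure)


def raid0_volumes_needed(storage_tb: float, throughput_mbps: float, iops: float) -> int:
    """Minimum number of striped volumes that covers every requirement."""
    return max(
        1,
        math.ceil(storage_tb / MAX_VOLUME_TB),
        math.ceil(throughput_mbps / MAX_VOLUME_MBPS),
        math.ceil(iops / MAX_VOLUME_IOPS),
    )
```

A result of 1 means a single volume is enough and striping is not needed for these requirements.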

Use the right sized EBS-Optimized instance

Workload: want to send 6,000 16 KB IOPS (~ 96 MB/sec) to a selected 2 TB GP2 volume

• c3.xlarge, EBS-optimized: ~ 62.5 MB/s, 4,000 16K IOPS
• c3.2xlarge, EBS-optimized: ~ 125 MB/s, 8,000 16K IOPS

• http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSOptimized.html
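The comparison above is a throughput check: IOPS times I/O size must fit within the instance's dedicated EBS bandwidth. A small sketch of that arithmetic, using decimal units (1 MB = 1,000 KB) to match the slide's ~ 96 MB/s figure; the function names are illustrative.

```python
def required_mb_per_s(iops: int, io_size_kb: int) -> float:
    """Throughput in MB/s needed to drive the given IOPS at the given I/O size."""
    return iops * io_size_kb / 1000


def instance_can_sustain(instance_mb_per_s: float, iops: int, io_size_kb: int) -> bool:
    """True if the instance's EBS bandwidth covers the workload."""
    return required_mb_per_s(iops, io_size_kb) <= instance_mb_per_s


if __name__ == "__main__":
    for name, bandwidth in (("c3.xlarge", 62.5), ("c3.2xlarge", 125.0)):
        print(name, instance_can_sustain(bandwidth, 6000, 16))
```

With the slide's numbers, 6,000 IOPS at 16 KB needs 96 MB/s, which exceeds the c3.xlarge figure of ~ 62.5 MB/s but fits within the c3.2xlarge figure of ~ 125 MB/s.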

Understand workload I/O patterns

iostat for Linux

256 sectors x 512 bytes/sector = 128 KiB

perfmon for Windows

Use a Modern Linux Kernel

Stuck around 44 KiB?

CloudWatch Console

Upgrade your kernel to at least 3.8

For ST1/SC1 increase the read-ahead buffer

• For throughput heavy workloads (e.g. Log processing)

• Configure the read-ahead setting to 1 MB
  • This is a runtime, per-volume configuration
  • This may degrade performance if you are accessing the disk with small random I/Os

• https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSPerformance.html
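On Linux, read-ahead is commonly set with `blockdev --setra`, which takes a count of 512-byte sectors, so 1 MiB corresponds to 2048 sectors. The sketch below only builds the command; the device name is a placeholder, and running the command requires root.

```python
from typing import List

SECTOR_BYTES = 512  # blockdev --setra counts 512-byte sectors


def readahead_command(device: str, readahead_kib: int = 1024) -> List[str]:
    """Build the blockdev command that sets read-ahead for one device."""
    sectors = readahead_kib * 1024 // SECTOR_BYTES
    return ["blockdev", "--setra", str(sectors), device]


if __name__ == "__main__":
    # /dev/xvdf is a placeholder device name.
    print(" ".join(readahead_command("/dev/xvdf")))
```

Because the setting is applied at runtime and per volume, it has to be reapplied after a reboot, for example from a startup script.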

EBS Best Practices

Security

Create an AWS KMS master key for EBS

• Control who can use the key
• Control who can administer the key
• Resources encrypted with a default key cannot be shared
• Enable AWS CloudTrail for auditing

Initialize Volume for performance sensitive apps

New EBS volume?

• Attach and it's ready to go

New EBS volume from snapshot?

• Initialize for best performance
• Random read across volume

https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-initialize.html
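Initialization amounts to reading every block of the restored volume once. The AWS documentation linked above describes doing this with tools such as `dd` or `fio`; the Python below is only an illustration of the same idea, using a plain sequential read rather than the random read mentioned on the slide. The device path is supplied by the caller, and reading a raw device requires root.

```python
def read_all_blocks(path: str, block_size: int = 1024 * 1024) -> int:
    """Read the whole device or file once and return the number of bytes read."""
    total = 0
    # buffering=0 gives unbuffered reads straight from the file object.
    with open(path, "rb", buffering=0) as device:
        while True:
            chunk = device.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    return total
```

On a large volume a single-threaded read like this can take a long time, which is one reason dedicated tools with parallel I/O are usually preferred.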

Consider re-encrypting when copying encrypted snapshots to another account

EBS encryption: Snapshot Copy

Summary

Use encryption if you need it

Take snapshots, tag snapshots

Select the right instance for your workload

Select the right volume for your workload

Next – Live Q&A
