aws re:invent 2016: running lean architectures: how to optimize for cost efficiency (arc313)
TRANSCRIPT
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Markus Ostertag, Head of Development, Team Internet AG, @osterjour
Constantin Gonzalez, Principal Solutions Architect, AWS, @zalez
November 30, 2016
Running Lean ArchitecturesHow to Optimize for Cost Efficiency
ARC313
What you’ll get out of this session
• Best practices on how to lower your AWS bill
• A more scalable, robust, dynamic architecture
• More time to innovate
• Real-world customer examples
• Easy to implement
Structure
Business
Architecture
Operations
Business Goals
“Pay as little as possible
for what we use.”
AWS pricing philosophy
Ecosystem
Global Footprint
New Features
New Services
More AWS
Usage
More
Infrastructure
Lower
Infrastructure
Costs
Reduced
Prices
More
CustomersInfrastructure
Innovation
57 price reductions
since 2006
Economies
of Scale
AWS TCO calculator
AWS simple monthly calculator
AWS billing alerts
AWS billing console
AWS Trusted Advisoraws.amazon.com/premiumsupport/trustedadvisor/
Free with Business or Enterprise Support
Free Trusted Advisor Trial!
• Free trial begins on 12/6/16
• Runs for 30 days
• Full suite of checks and best practice
recommendations available
• For customers not already on
business/enterprise support plans
• No action required:
Just log in and start using!
https://console.aws.amazon.com/trustedadvisor
Reserved Instances
Reserved Instances basics
1y RI
Break even
3y RI
Break even
Markus OstertagHead of Development @ Team Internet
Who is Team Internet?
Domain monetization business
30 people
HQ in Munich, Germany
Tech focused
Reserved Instances
Reserved Instances – our usage
Use the regional benefit
Change existing Reserved Instances to scope “Region”
Capacity reservation decoupled from cost optimization
Convertible Reserved Instances
Use case: “(Very) Long running, but flexible”
Architecture Goals
“Avoid waste as much
as possible.”
Turn off unused instances
• Developer, test, training instances
• Use simple instance start and stop
• Or tear down and build up all together
using AWS CloudFormation
• Instances are disposable!
Customer example
Monday Friday End of Vacation Season35% saved
Automate, automate, automate
• AWS SDKs
• AWS CLI
• AWS CloudFormation
• AWS OpsWorks
• Netflix Janitor Monkey
• Cloudlytics EC2 Scheduler
• Auto Scaling
How Auto Scaling works
AWS CloudFormation example launch configuration
"LaunchConfig": {
"Type" : "AWS::AutoScaling::LaunchConfiguration",
"Metadata" : {
"AWS::CloudFormation::Init" : {
"config" : {
… packages, sources, files, services …
}
}
},
"Properties": {
"ImageId" : "ami-149f7863",
"InstanceType" : "m1.small",
"SecurityGroups" : [ {"Ref" : "WebServerSecurityGroup"} ],
"KeyName" : "MySSHKey",
"UserData" : { "Fn::Base64" : { "Fn::Join" : ["", [
"#!/bin/bash -v\n",
… your user data script …
]]}}
}
}
AWS CloudFormation exampleAuto Scaling group definition
"WebServerGroup" : {
"Type" : "AWS::AutoScaling::AutoScalingGroup",
"Properties" : {
"AvailabilityZones" : [
"us-east-1a","us-east-1b","us-east-1c",
],
"LaunchConfigurationName" : { "Ref" : "LaunchConfig" },
"MinSize" : “3",
"MaxSize" : “6",
"DesiredCapacity" : “3",
"LoadBalancerNames" : [ { "Ref" : "ElasticLoadBalancer" } ]
}
}
Align resources with demand
Use Spot Instances
• Price based on supply/demand
• You choose your maximum price/hour
• Your instance is started if the Spot price is lower
• Your instance is terminated if
the Spot price is higher, with 2 minutes notice
• But: You did plan for fault tolerance, didn’t you?
Spot Instance example
On-Demand:
$0.24
$0.028 (11.7%) $0.026 (10,8%)
$3.28
(1367%)
Spot Instance use cases
• Stateless Web/App server fleets
• Amazon EMR
• Continuous Integration (CI)
• High Performance Computing (HPC)
• Grid Computing
• Media Rendering/Transcoding
aws.amazon.com/ec2/spot
Spot Bid Advisor
Spot Instances recap
• Very dynamic pricing
• Opportunity to save 80-90% cost• But there are risks
• Different prices per AZ
• Leverage Auto Scaling!• One group with Spot Instances
• One group with On-Demand
• Get the best of both worlds
• Spot fleets – Manage thousands of
Spot Instances with one API call
“But my applications are
too small
for Auto Scaling!”
Amazon EC2 Container Service
• Easily manage Docker containers
• Flexible container placement
• Designed for use with other AWS services
• Extensible
• Performance at Scale
• Secure
10%
15%
7%
12%
20%
9%
Consolidate with Amazon ECS
App 1 App 2
App 3 App 4
App 5 App 6
6
12 345
Amazon ECS
cluster
AWS Lambda
Amazon S3 bucket events AWS Lambda
Original object Compressed object
1
2
3
Get rid of idle time with AWS Lambda
• Automatic scaling
• Automatic provisioning
• No need to manage infrastructure
• Just bring your code
• $0.20 per million requests, 1M free
• 100 ms payment granularity
• Never pay for idle
Less than 40% utilization?
Consider using AWS Lambda instead!
Optimizing database utilization
Database optimization through caching
DynamoDB Amazon RDS DynamoDB Amazon RDS
ElastiCache
Caching saves money
DynamoDB reads
Saved 3k reads per second (>20k reads per second in total)
Cache really everything!
Cache hit ratio 25-30% Cache hit ratio 89-95%
Without negative caching With negative caching
Hit
Miss
Cache invalidation
“There are only two hard things in Computer Science:
cache invalidation and naming things” -- Phil Karlton
Two ways to cache:
Time to Live (TTL) with invalidation
Keep the cache in sync all the time
Synchronous writes
Sync/check in the app with after-write return values
DynamoDB
ElastiCache
write/update always
write/update always
Uncoupled writes/invalidation
Sync/updates via Lambda (uncoupled)
DynamoDB
ElastiCache
write/update always
no update
LambdaDynamoDB
Stream
update
Think of strategies for optimizing CU use
• Use multiple tables to support varied access patterns
• Understand access patterns for time series data
• Compress large attribute values
Consider read (4K) vs. write (1K) sizes
Use Amazon SQS to buffer over-capacity writes
Resize capacity units dynamically
Amazon SQS can buffer requests
Dynamic DynamoDB
Offload popular traffic to
Amazon S3 and/or Amazon CloudFront
Operational Goals
“Focus on what you do best,
let AWS do the rest.”
Leverage existing services
• Use Amazon RDS, DynamoDB,
ElastiCache for Redis or
Amazon Redshift
• Instead of running your own database
• Amazon Elasticsearch Service• Instead of running your own cluster
• Amazon SQS
• Amazon Kinesis,
Amazon Kinesis Firehose, Amazon SNS, and more …
AWS has experts for each service
RDSAmazon Redshift
Amazon
Elasticsearch
Service
Amazon KinesisSQS
DynamoDB
Pick the right tool for the job
Key/Value
Scalable
throughput
Low latency
Amazon Aurora
More complex
data/queries
Scalable
storage
Amazon
Redshift
Big (complex)
data
Higher
latency
ElastiCache
for Redis
Key/Value
In-Memory
(Very) low
latency
There is not one database to rule them all
MongoDB
Tracking
API
RTB
Engine
User&Stats
API
Tracking
API
RTB
Engine
DynamoDB
Decoupled
Amazon
Aurora
Amazon
Redshift
User&Stats
API
Advantages
• No undifferentiated heavy lifting
(= cost savings)
• AWS operates the DB
infrastructure for us
• Simple and more granular scale out
• No interference between
RTB/Tracking and User/Stats
Let’s recap
1. Use AWS TCO/cost/billing tools
2. Use Reserved Instances
3. Avoid idle instances through automation
4. Use Spot Instances
5. Optimize database utilization
6. Pick the right tool for the job
7. Offload your architecture
But wait! There’s even more!
https://youtu.be/SG1DsYgeGEk
AWS re:Invent 2015
ARC302 Running Lean Architectures: Optimizing for Cost Efficiency
• DynamoDB GSI Tweaking
• Using CloudFront to avoid
multi-region setups
• Amazon S3 storage tiering
• … and much more!
Thank you!
Remember to complete
your evaluations!