aws re:invent 2016: amazon cloudfront flash talks: best practices on configuring, securing and...
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Best Practices for Configuring, Securing, and
Monitoring Your Amazon CloudFront Distribution
Alec Peterson
General Manager, Amazon CloudFront
Anton Radlein
Software Development Manager, Amazon CloudFront
Cherie Wong
Sr. Software Development Manager, Amazon CloudFront
Efrain Fuentes
Enterprise Solutions Architect
CTD301
What to Expect from the Session
• How Amazon CloudFront delivers content
• Configuring your cache on CloudFront
• Measure application performance with real user
monitoring (RUM)
• Stop malicious viewers with CloudFront and AWS WAF
How CloudFront delivers
content
Definitions
• Viewer
• An end-user requesting content from CloudFront
• On a mobile device, desktop or other internet-connected
device
• CloudFront POP
• Point Of Presence, also referred to as an Edge Location
• Located in datacenters in major metropolitan areas, directly
connected to multiple ISPs
• Several racks of servers and network equipment, terminating
viewer connections
CloudFront delivering content
• Multiple identical (more or less) locations
• Location selection is critical
• Viewer perspective
• Latency
• Throughput
• CloudFront perspective
• Availability
• Capacity
• Location
What does ‘routing’ actually mean?
• Packet routing
• Purely destination-based
• Limited ability to route around congestion
What does ‘routing’ actually mean?
• Request routing
• Latency
• Throughput
• Capacity
• Geography
• Done at the DNS layer (or higher)
How does CloudFront perform routing?
[Diagram: the viewer asks its ISP name server (NS) for distribution-id.cloudfront.net; the ISP NS performs a recursive lookup against the cloudfront.net authoritative NS, which returns the IP address of the optimal CloudFront edge location (1.1.1.1 in this example).]
• Primarily at the DNS layer
• Recursive resolver IP routing
What’s wrong with this picture?
What happened?
• A divergent resolver
• Resolvers that serve a wide set of users across many
networks/geographies
• VPN users
• Distributed corporate networks
• What can be done?
• Use a local resolver
• Use a resolver that supports EDNS0 ECS
What is EDNS0 client-subnet (ECS)?
• IETF open internet-draft
• Informational RFC 7871
• DNS query includes information about the network that
originated the query:
• First three octets of an IPv4 address commonly used
(1.2.3.0/24)
• No client-side resolver modifications necessary
• Some common open resolvers (such as Google’s 8.8.8.8
anycast resolver) support it
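As a rough illustration of the /24 truncation described above, the following Python sketch derives the subnet an ECS-enabled resolver might attach to a query. The function name `ecs_subnet` and the default prefix length are assumptions for this example; real resolvers choose their own prefix length.

```python
import ipaddress


def ecs_subnet(client_ip: str, prefix_len: int = 24) -> str:
    """Return the network a resolver might send in the ECS option.

    Hypothetical helper: keeps the first prefix_len bits of the
    client address and zeroes the rest.
    """
    network = ipaddress.ip_network(f"{client_ip}/{prefix_len}", strict=False)
    return str(network)


print(ecs_subnet("1.2.3.77"))
```

The authoritative name server sees only the truncated network, not the full viewer address, yet that is usually enough to pick a nearby edge location.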
EDNS0 ECS-enabled DNS resolution
Key takeaways
• Where you are routed depends on many factors
• Network
• Geographic Location
• Individual POP status
• DNS is an imperfect request routing mechanism
• But it is also ubiquitous
• If your customers use ECS-enabled resolvers, their
experience will improve
Configuring your cache on
CloudFront
Why cache?
Two Laws:
1. Better performance for your viewers.
2. Less load on your origin.
What to expect
• What do we do with a viewer request?
• How do we cache?
• Generating cache keys
• Managing your cache
• Setting Cache-Control headers
• Configuring your distribution and cache behaviors
• Additional Best Practices
• Versioning your assets
• Forwarding only required values
• Monitor your logs
Caching tiers
[Diagram: Origin; IAD Edge Cache with IAD12, ATL50, JFK1, JAX1; NRT Edge Cache with NRT12, NRT53, NRT52, NRT20]
What happens with each request?
• Viewer request: is it in cache?
  • No: forward the request to the origin and cache the response (Miss)
  • Yes: is it expired?
    • No: serve it from cache (Hit)
    • Yes: revalidate with the origin
      • Origin responds with 304 (Not Modified): serve the cached object (Refresh Hit)
      • Origin responds with 200 (OK) and the latest version of the object: cache it
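The decision flow above can be sketched as a toy edge cache in Python. This is a minimal model, not CloudFront's implementation: `ToyOrigin`, `ToyEdgeCache`, the single fixed TTL, and the use of ETags for revalidation are all assumptions, and labeling a revalidation that returns 200 as a "Miss" is my reading of the flow.

```python
import time
from dataclasses import dataclass


@dataclass
class CachedObject:
    body: bytes
    etag: str
    expires_at: float


class ToyOrigin:
    """Hypothetical origin: maps path -> (body, etag)."""

    def __init__(self, objects):
        self.objects = objects

    def fetch(self, path, if_none_match=None):
        body, etag = self.objects[path]
        if if_none_match == etag:
            return 304, None, etag  # Not Modified
        return 200, body, etag


class ToyEdgeCache:
    def __init__(self, origin, ttl_seconds):
        self.origin = origin
        self.ttl = ttl_seconds
        self.store = {}

    def handle(self, path, now=None):
        now = time.time() if now is None else now
        cached = self.store.get(path)
        if cached is None:
            # Not in cache: forward to origin, then cache the response.
            _, body, etag = self.origin.fetch(path)
            self.store[path] = CachedObject(body, etag, now + self.ttl)
            return "Miss", body
        if now < cached.expires_at:
            return "Hit", cached.body
        # Expired: revalidate with the origin.
        status, body, etag = self.origin.fetch(path, if_none_match=cached.etag)
        if status == 304:
            cached.expires_at = now + self.ttl
            return "RefreshHit", cached.body
        self.store[path] = CachedObject(body, etag, now + self.ttl)
        return "Miss", body


origin = ToyOrigin({"/a.css": (b"v1", "etag-1")})
edge = ToyEdgeCache(origin, ttl_seconds=60)
results = [edge.handle("/a.css", now=t)[0] for t in (0, 10, 100)]
print(results)
```

The three requests exercise the three branches in turn: an empty cache, a fresh object, and an expired object that the origin confirms is unchanged.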
How do we generate a cache key?
Use the host header to create an internal canonical URL.
E.g., d123.cloudfront.net, example.com
Then…
- Remove query strings
- Remove the protocol
- Add accept-encoding (i.e., gzip, identity)
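The steps above can be sketched in Python as follows. The key layout (host, path, then a normalized encoding separated by `|`) is purely an assumption for illustration; the slide does not show CloudFront's internal format, and this sketch assumes no query strings, headers, or cookies are forwarded.

```python
from urllib.parse import urlsplit


def cache_key(url: str, host_header: str, accept_encoding: str = "") -> str:
    """Build an illustrative cache key (hypothetical format)."""
    parts = urlsplit(url)
    # Drop the protocol and query string; keep only the path.
    path = parts.path or "/"
    # Normalize Accept-Encoding to one of two values.
    encoding = "gzip" if "gzip" in accept_encoding.lower() else "identity"
    return f"{host_header.lower()}{path}|{encoding}"


print(cache_key("https://example.com/css/site.css?v=1", "example.com", "gzip, deflate"))
print(cache_key("http://example.com/css/site.css", "example.com", "gzip"))
```

Both calls yield the same key, so the two requests would share one cached object. A production normalizer would also need to handle cases such as `gzip;q=0`, which this sketch ignores.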
Managing your cache from your origin
Expires headers from origin
Expires reflects when the cache must go back to the origin
server to see if the object has changed.
It is a fixed point in time and accuracy relies on clock
synchronization.
< Expires: Fri, 01 Dec 2017 12:34:50 GMT
Cache-Control headers from origin
These directives give you fine-grained control over what is cached and
for how long (in seconds):
< Cache-Control: max-age=300
< Cache-Control: max-age=30, s-maxage=3600
Example for display ads: max-age=0 (browser), s-maxage=86400 (shared edge cache)
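To make the browser versus shared-cache split concrete, here is a small Python sketch that reads a Cache-Control value and reports the freshness lifetime each kind of cache would use. The function names are hypothetical, and the parser is deliberately naive: it splits on commas, so quoted values that contain commas are not handled.

```python
def parse_cache_control(header: str) -> dict:
    """Parse 'a=1, b' into {'a': '1', 'b': None} (naive sketch)."""
    directives = {}
    for part in header.split(","):
        name, _, value = part.strip().partition("=")
        if name:
            directives[name.lower()] = value.strip('"') or None
    return directives


def freshness_lifetimes(header: str):
    """Return (browser_seconds, shared_cache_seconds); None if unspecified."""
    d = parse_cache_control(header)
    max_age = int(d["max-age"]) if d.get("max-age") else None
    s_maxage = int(d["s-maxage"]) if d.get("s-maxage") else None
    browser = max_age  # browsers ignore s-maxage
    shared = s_maxage if s_maxage is not None else max_age
    return browser, shared


print(freshness_lifetimes("max-age=0, s-maxage=86400"))
```

For the display-ad example, the browser always revalidates while the shared edge cache may keep the object for a day.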
Cache-Control headers (examples)
• Static assets (*.css, *.js, images, software downloads): Cache-Control: public, max-age=31536000
• Login landing pages (index.html): Cache-Control: no-cache="Set-Cookie", max-age=30
• Live streaming manifests (/*.m3u8): Cache-Control: public, max-age=2
• Media fragments (/*.ts): Cache-Control: public, max-age=31536000
Dynamic content? Cache it.
Use Cache-Control directives to minimize load on your origin:
- no-cache: cache & ask origin
- max-age=0: cache & ask origin
Other options:
- no-store: never cached, neither at the edge nor by the browser
- private: never cached at the edge, but might be cached
by the browser
Managing your cache from CloudFront
Cache behaviors
on CloudFront
Specify caching configurations
based on URL path matching
(i.e., for different content).
Whatever you forward affects
your cache key. Use Trusted
Advisor checks!
Be wary of:
• Forwarded headers
• Query string forwarding
• Cookie forwarding
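As a sketch of URL path matching, the Python below picks the first cache behavior whose path pattern matches a request, with `*` as the default. The behaviors and their settings are hypothetical examples. `fnmatchcase` is used as a stand-in for the matcher; it also accepts `[...]` character sets, so it only approximates real path-pattern rules.

```python
from fnmatch import fnmatchcase

# Hypothetical behaviors in precedence order; "*" acts as the default.
CACHE_BEHAVIORS = [
    ("assets/*", {"query_strings": False, "cookies": "none", "headers": []}),
    ("*.m3u8", {"query_strings": False, "cookies": "none", "headers": []}),
    ("*", {"query_strings": True, "cookies": "all", "headers": ["Host"]}),
]


def select_behavior(request_path: str):
    """Return (pattern, settings) for the first matching behavior."""
    path = request_path.lstrip("/")
    for pattern, settings in CACHE_BEHAVIORS:
        if fnmatchcase(path, pattern.lstrip("/")):
            return pattern, settings
    raise LookupError(f"no behavior matches {request_path!r}")


for p in ("/assets/v1/app.css", "/live/stream.m3u8", "/index.html"):
    print(p, "->", select_behavior(p)[0])
```

Keeping the static-asset behaviors free of forwarded query strings, cookies, and headers leaves their cache keys small, which is the point of the warning above.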
Set Min, Max, and Default TTLs for CloudFront
[Diagram: the browser caches according to max-age / Expires; the edge cache uses max-age / s-maxage / Expires, kept within the range from Min TTL to Max TTL.]
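One way to reason about these settings is as a clamp: the origin's value is used at the edge, but never below Min TTL or above Max TTL, and Default TTL applies when the origin sends nothing. The Python sketch below is a simplification under those assumptions. It ignores Expires, no-cache, no-store, and private, and the function name is hypothetical.

```python
from typing import Optional


def edge_ttl(min_ttl: int, default_ttl: int, max_ttl: int,
             max_age: Optional[int] = None,
             s_maxage: Optional[int] = None) -> int:
    """Approximate how long the edge keeps an object, in seconds."""
    # For a shared cache, s-maxage takes precedence over max-age.
    origin_value = s_maxage if s_maxage is not None else max_age
    if origin_value is None:
        origin_value = default_ttl  # origin gave no caching headers
    return max(min_ttl, min(origin_value, max_ttl))


print(edge_ttl(60, 86400, 31536000, max_age=5))
```

With a Min TTL of 60 seconds, an origin that sends `max-age=5` is still cached at the edge for 60 seconds, which is worth remembering before raising Min TTL on dynamic content.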
A couple tips…
Errors? Cache them too!
Cache and return a custom error
page and response code for each
HTTP error code.
Give your origin just the right
amount of time to recover.
Version your assets
Enable faster iteration of new styles without issuing invalidations.
Protect against browsers that don’t honor your Cache-Control headers.
<link href="//assets.example.com/assets/v1/css/jumbotron-narrow.css" rel="stylesheet">
<link href="//assets.example.com/assets/v2/css/jumbotron-narrow.css" rel="stylesheet">
<link href="//assets.example.com/assets/css/jumbotron-narrow.css?<md5sum>" rel="stylesheet">
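The `?<md5sum>` variant can be generated at build time. A minimal Python sketch, assuming the asset bytes are available to the build step; the helper name `versioned_url` is hypothetical.

```python
import hashlib


def versioned_url(base_url: str, content: bytes) -> str:
    """Append the MD5 of the asset content as a query string."""
    digest = hashlib.md5(content).hexdigest()
    return f"{base_url}?{digest}"


css_v1 = b"body { color: black; }"
css_v2 = b"body { color: navy; }"
base = "//assets.example.com/assets/css/jumbotron-narrow.css"
print(versioned_url(base, css_v1))
print(versioned_url(base, css_v2))
```

Any change to the file yields a new URL, so viewers fetch the new version without an invalidation. Note that query-string versioning only separates versions at the edge if the cache behavior forwards the query string; path-based versions (`/v1/`, `/v2/`) do not have that dependency.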
Minimize forwarded values
All forwarded headers are used as part of the cache key, which means forwarding them can dramatically reduce your cacheability.
When in doubt, check the logs!
#Version: 1.0
#Fields: date time x-edge-location sc-bytes c-ip cs-method cs(Host) cs-uri-stem sc-status cs(Referer) cs(User-Agent) cs-uri-query cs(Cookie) x-edge-result-type x-edge-request-id x-host-header cs-protocol cs-bytes time-taken x-forwarded-for ssl-protocol ssl-cipher x-edge-response-result-type cs-protocol-version
2014-05-23 01:13:11 FRA2 182 192.0.2.10 GET d111111abcdef8.cloudfront.net /view/my/file.html 200 www.displaymyfiles.com Mozilla/4.0%20(compatible;%20MSIE%205.0b1;%20Mac_PowerPC) - zip=98101 RefreshHit MRVMF7KydIvxMWfJIglgwHQwZsbG2IhRJ07sn9AkKUFSHS9EXAMPLE== d111111abcdef8.cloudfront.net http - 0.001 - - - RefreshHit HTTP/1.1
2014-05-23 01:13:12 LAX1 2390282 192.0.2.202 GET d111111abcdef8.cloudfront.net /soundtrack/happy.mp3 304 www.unknownsingers.com Mozilla/4.0%20(compatible;%20MSIE%207.0;%20Windows%20NT%205.1) a=b&c=d zip=50158 Hit xGN7KWpVEmB9Dp7ctcVFQC4E-nrcOcEKS3QyAez--06dV7TEXAMPLE== d111111abcdef8.cloudfront.net http - 0.002 - - - Hit HTTP/1.1
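A quick way to check the logs is to tally `x-edge-result-type`. The Python sketch below reads the `#Fields:` header to find the column, then counts result types and derives a hit ratio. The sample is a trimmed version of the log above (fewer columns), and treating both Hit and RefreshHit as hits is an assumption of this example.

```python
from collections import Counter

# Trimmed sample: only a subset of the real access-log columns.
SAMPLE_LOG = """\
#Version: 1.0
#Fields: date time x-edge-location sc-bytes c-ip cs-method cs(Host) cs-uri-stem sc-status x-edge-result-type
2014-05-23 01:13:11 FRA2 182 192.0.2.10 GET d111111abcdef8.cloudfront.net /view/my/file.html 200 RefreshHit
2014-05-23 01:13:12 LAX1 2390282 192.0.2.202 GET d111111abcdef8.cloudfront.net /soundtrack/happy.mp3 304 Hit
"""


def result_type_counts(log_text: str) -> Counter:
    """Count requests per x-edge-result-type value."""
    fields = []
    counts = Counter()
    for line in log_text.splitlines():
        if line.startswith("#Fields:"):
            fields = line[len("#Fields:"):].split()
        elif line and not line.startswith("#"):
            record = dict(zip(fields, line.split()))
            counts[record["x-edge-result-type"]] += 1
    return counts


def hit_ratio(counts: Counter) -> float:
    total = sum(counts.values())
    hits = counts["Hit"] + counts["RefreshHit"]
    return hits / total if total else 0.0


counts = result_type_counts(SAMPLE_LOG)
print(dict(counts), hit_ratio(counts))
```

A falling hit ratio after a configuration change is often the first sign that a newly forwarded header, cookie, or query string has fragmented the cache key.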
Log CloudFront request IDs
Nginx:
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                '$status $body_bytes_sent "$http_referer" '
                '"$http_user_agent" "$http_x_forwarded_for" "$http_x_amz_cf_id"';
Apache:
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" \"%{X-Amz-Cf-Id}i\"" combined
Key takeaways
• Set Cache-Control headers appropriately for your
content
• Cache dynamic content
• Create multiple cache behaviors and adapt
configurations for your content type, including errors
• Forward only required values
• Version your assets
• Log your request IDs!
Measure application
performance with RUM
Measure application performance with RUM
Synthetic monitoring vs. real user monitoring (RUM):
• Synthetic monitoring overview
• RUM overview
• When to use one over the other (baselining vs. gaining
situational insight)
What is synthetic monitoring?
Pros:
• Consistent signal of service health
• Easy to set up (kind of)
• Baseline performance
[Diagram: synthetic monitoring configuration, synthetic monitoring portal, web application, simulated users]
Where synthetic measurements go wrong
Cons:
• Network path to your application might not be representative
• Special cases and snowflakes
[Diagram: synthetic monitoring configuration, web application, simulated users, real user]
How do you feel about RUM?
[Diagram: web application, real users, script injected in web page HTTP response, RUM provider portal]
• Script injected in web page
• Script beacons data back from the user’s browser session to the
RUM provider
• RUM provider portal aggregates the data for analysis
What can RUM tell you?
• What should my next optimization be?
• What is the cause of a loss of availability?
*Reference: https://developers.google.com
Network optimizations: connections
Connection definitions:
• Queueing – Time spent waiting to begin processing
• Stalled/Blocking – Total time spent in queue or proxying
• DNS lookup – Time taken to receive DNS records (like A or
AAAA)
• Initial connection – Inclusive of TCP handshake and negotiating
SSL
Network optimizations: requests
Request definitions:
• Request sent – HTTP request sent time
• TTFB – Time to first byte
• Content download – Time to last byte
Network optimizations: head of line blocking
Serialized requests could be your bottleneck due to head of line blocking in
HTTP 1.1 if you’re serving from the same origin!
Network optimizations: Key takeaways
Insights from this example:
• Evaluate your user-base
• Know your data
• Look at the right data
Optimizations:
• Use CloudFront!
• Origin as close to your end-users as possible (multi-region)
• HTTP/2
Best practices for configuring RUM on CloudFront
• Availability: Test your critical resources
• Index pages
• Video manifests
• Critical resources required for page load
• Performance: Capture Total Load time
• First-Byte latency is not always important. Know your content
and optimize on the appropriate dimension!
Stop malicious viewers with
CloudFront and AWS WAF
Securing your CloudFront distribution
• Leverage AWS WAF with preconfigured protections
• Configure CloudFront to serve private content
• Automate security response by using services like AWS
Lambda
• Leverage AWS Certificate Manager for SSL
AWS WAF
AWS WAF preconfigured protections
• Access Handler
• Log Parser
• IP List Parser
http://docs.aws.amazon.com/solutions/latest/aws-waf-security-automations/
Private content – restrict origin access
Amazon S3: Origin Access Identity (OAI)
• Prevents direct access to your Amazon S3 bucket
• Ensures performance benefits to all customers
Custom origin: Block by IP address
• Whitelist only the Amazon CloudFront IP range
• Protects origin from overload
• Ensures performance benefits to all customers
Private content – signed URLs and cookies
Signed URLs
• Add a signature to the query string in the URL
• Your URL changes
• Use to restrict access to individual files
Signed cookies
• Add a signature to a cookie
• Your URL does not change
• Use to restrict access to multiple files
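To give a feel for what goes into a signed URL, the Python sketch below builds a canned policy statement for one resource and applies the URL-safe base64 substitutions CloudFront expects (`+` to `-`, `=` to `_`, `/` to `~`). The helper names are hypothetical, and the signing step itself (an RSA signature over the policy with your CloudFront key pair) is deliberately left out because it needs a cryptography library.

```python
import base64
import json


def canned_policy(resource_url: str, expires_epoch: int) -> str:
    """Return a compact canned policy: one resource, one expiry time."""
    policy = {
        "Statement": [{
            "Resource": resource_url,
            "Condition": {"DateLessThan": {"AWS:EpochTime": expires_epoch}},
        }]
    }
    # Compact separators: the policy must not contain extra whitespace.
    return json.dumps(policy, separators=(",", ":"))


def cloudfront_safe_b64(data: bytes) -> str:
    """Base64-encode, then swap characters that are unsafe in URLs."""
    encoded = base64.b64encode(data).decode("ascii")
    return encoded.replace("+", "-").replace("=", "_").replace("/", "~")


policy = canned_policy("https://d111111abcdef8.cloudfront.net/view/my/file.html", 1480550400)
print(policy)
```

In a real flow, the signature over `policy` would be passed through `cloudfront_safe_b64` and attached to the URL (signed URLs) or set in a cookie (signed cookies).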
Automate security response
• Subscribe to Amazon SNS notifications for changes to
IP ranges
• Automatically update security groups
[Diagram: an Amazon SNS message about a change to the AWS IP ranges triggers an AWS Lambda function, which updates the IP ranges in the security groups in front of the web app servers behind Amazon CloudFront.]
https://github.com/awslabs/aws-cloudfront-samples
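The core of such an automation is a diff between the published CloudFront ranges and what a security group currently allows. The Python sketch below shows only that logic, under stated assumptions: the sample document mimics the shape of the AWS IP ranges file but uses made-up documentation addresses, and the function names are hypothetical. Fetching the file and calling the EC2 API are left out.

```python
import json

# Made-up sample in the shape of the published IP ranges document.
SAMPLE_IP_RANGES = json.loads("""
{
  "prefixes": [
    {"ip_prefix": "192.0.2.0/24", "region": "GLOBAL", "service": "CLOUDFRONT"},
    {"ip_prefix": "203.0.113.0/24", "region": "GLOBAL", "service": "CLOUDFRONT"},
    {"ip_prefix": "198.51.100.0/24", "region": "us-east-1", "service": "EC2"}
  ]
}
""")


def cloudfront_cidrs(ip_ranges: dict) -> list:
    """Return the sorted CIDR blocks tagged with the CLOUDFRONT service."""
    return sorted({p["ip_prefix"] for p in ip_ranges["prefixes"]
                   if p["service"] == "CLOUDFRONT"})


def plan_security_group_update(current: list, desired: list) -> dict:
    """Work out which CIDRs to authorize and which to revoke."""
    return {
        "authorize": sorted(set(desired) - set(current)),
        "revoke": sorted(set(current) - set(desired)),
    }


desired = cloudfront_cidrs(SAMPLE_IP_RANGES)
plan = plan_security_group_update(["192.0.2.0/24", "198.51.100.0/24"], desired)
print(plan)
```

A Lambda function subscribed to the SNS topic would run this kind of diff on each notification and apply the resulting plan to the security groups.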
Leverage AWS Certificate Manager for SSL
Key takeaways
• Leverage AWS WAF
• Secure your origin and content
• Automate security response
Thank you!
Remember to complete
your evaluations!