dynamic infrastructure and the cloud
TRANSCRIPT
© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Lee Atchison ∙ Senior Director Strategic Architecture, New Relic, Inc.
Sydney, Australia
Dynamic Infrastructure and The Cloud: Adventures in Keeping Your Application Running…at Scale
leeatchison@leeatchison
Who am I?
Lee Atchison
30 years in industry
5 in New Relic (Architect Lead, Cloud, Service Migration)
7 in Amazon Retail & AWS (Built First AppStore, AWS Elastic Beanstalk)
Specializes in:
• Cloud computing
• Services & Microservices
• Scalability, Availability
The conversation…
“We were wondering how changing a setting on our MySQL database might impact our performance…
… but we were worried that the change may cause our production database to fail…”
The “scary” overheard conversation…
“… Since we didn’t want to bring down production, we decided to make the change to our backup (replica, hot standby) database instead…
… After all, it wasn’t being used for anything at the moment.”
Until, of course, the backup was needed…
This was a true story
Confidential ©2008-15 New Relic, Inc. All rights reserved.
[Slide figures; only the timing labels 300ms, 1.5s, and .9s survive]
The Data from Monitoring Your App Dwarfs the Data Inside the App
User Experience
Business Outcome
Servers
Apps
Big Data Problem
Need Data at Every Level
Typical Server / Amazon EC2 Instance
• Application & Application Microservices
• Server OS
• Hardware (virtual)
[Diagram: Browser / Mobile; Amazon EC2 Instance containing Application & Application Microservices, Server OS, Server (Virtual) Hardware]
Low Level Monitoring
Amazon CloudWatch (AWS Console)
Monitors:
• EC2 instance
• Virtualization
• Hardware
• [CPU / Disk / Networking]
Doesn’t know about:
• Server OS
• Memory / Filesystem
• Processes
• Configuration
• Application
  - Latency
  - Error rates
Infrastructure / Application Monitoring
[Diagram: New Relic Application Monitoring, New Relic Infrastructure Monitoring, and Dashboards shown next to Amazon CloudWatch (AWS Console) and the Browser / Mobile and Amazon EC2 Instance stack]
Monitors (Server):
• How O.S. is performing
• Configuration Changes
• Processes
• Hardware
Monitors (Application):
• App health
• App performance
• Microservices
Doesn’t know:
• Virtualization
Full Stack Monitoring
[Diagram: New Relic Application Monitoring and New Relic Infrastructure Monitoring (New Relic monitors), Amazon CloudWatch in the AWS Console (CloudWatch monitors), Integrations, and Dashboards shown with the Browser / Mobile and Amazon EC2 Instance stack]
AWS / CloudWatch
• Visibility into virtualization
• CPU / Disk / Networking
• 14 AWS Services
APM
• CPU / Disk / Networking
• Memory / Filesystem
• Processes
  - Infrastructure components
  - Configuration inventory
• Application / Microservices:
  - Latency
  - Error rates
  - App insights
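The coverage lists above can be compared mechanically. What follows is a minimal sketch, assuming hypothetical layer names transcribed from the slide bullets; it calls no CloudWatch or New Relic API and only uses set arithmetic to show which layers each tool would leave unseen on its own.

```python
# Hypothetical layer names, transcribed from the slide bullets above.
STACK = {
    "virtualization", "cpu_disk_network", "memory_filesystem",
    "processes", "configuration", "app_latency", "app_error_rates",
}

# What each tool sees, per the slides (an illustration, not a product spec).
COVERAGE = {
    "cloudwatch": {"virtualization", "cpu_disk_network"},
    "new_relic": {
        "cpu_disk_network", "memory_filesystem", "processes",
        "configuration", "app_latency", "app_error_rates",
    },
}


def blind_spots(tools):
    """Return the stack layers that none of the given tools can see."""
    seen = set()
    for tool in tools:
        seen |= COVERAGE[tool]
    return STACK - seen


if __name__ == "__main__":
    for combo in (["cloudwatch"], ["new_relic"], ["cloudwatch", "new_relic"]):
        print(combo, "misses", sorted(blind_spots(combo)))
```

Under these assumptions, only the combination of both tools leaves no layer unobserved, which is the point of the full-stack slide.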
Success in Software Analytics
Application Performance
Customer Experience
Business Outcome
Keeping Your App Running…At Scale
Availability…
…is more than you think it is.
Dynamic Cloud…
...make availability happen.
The Cloud Can Help
Better Data Center
Dynamic Environment
How do we use the cloud to accomplish this?
Cloud as a “Better Data Center”
Resources are allocated to uses, just like in a data center
Provisioning process is faster
Lifetime of components is relatively long
Capacity planning is still important and still applies
Why use a “Better Data Center”?
Add new Capacity (faster)
Improve Application Availability (redundancy)
Compliance
Cloud as a “Dynamic Tool for Dynamic Apps”
Use Only the Resources you need
Allocate / de-allocate resources on the fly
Resource allocation is an integral part of your application architecture
Dynamic Cloud
Resources are:
• Allocated
• Consumed
• De-allocated
Application in charge:
• Application is aware of and is controlling traditional Ops resources
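The allocate, consume, de-allocate cycle can be sketched in code when the application itself is in charge. This is a minimal illustration, assuming a hypothetical in-memory pool (`allocated`) and hypothetical worker names; a real application would call its cloud provider's provisioning API at the marked points.

```python
from contextlib import contextmanager

# Hypothetical in-memory "pool"; a real application would call its cloud
# provider's provisioning API here instead.
allocated = set()


@contextmanager
def resource(name):
    """Allocate a resource, hand it to the caller, then always de-allocate it."""
    allocated.add(name)          # Allocated
    try:
        yield name               # Consumed by the caller
    finally:
        allocated.discard(name)  # De-allocated, even if the work failed


def handle_burst(jobs):
    """Process each job on its own short-lived worker."""
    results = []
    for job in jobs:
        with resource(f"worker-{job}") as worker:
            results.append(f"{worker} processed job {job}")
    return results


if __name__ == "__main__":
    print(handle_burst([1, 2, 3]))
    print("still allocated:", allocated)
```

Tying de-allocation to the end of the work, rather than to a separate operations task, is one way to make resource allocation part of the application architecture.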
Dynamic Usage Example… Docker Container Age (by Minute and Hour)
[Chart: container count by container age (minutes); axis label 1,200,000; callout: 11% under one minute]
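A figure such as "11% under one minute" is a simple share of a container-age distribution. The sketch below shows the arithmetic, assuming a hypothetical list of container ages in minutes; the sample values are made up for illustration and are not the data behind the chart.

```python
def share_under(ages_minutes, threshold_minutes=1.0):
    """Fraction of containers whose age is below the threshold."""
    if not ages_minutes:
        return 0.0
    short = sum(1 for age in ages_minutes if age < threshold_minutes)
    return short / len(ages_minutes)


if __name__ == "__main__":
    # Made-up ages in minutes, for illustration only (not the slide's data).
    sample = [0.2, 0.5, 3, 12, 45, 90, 240, 600, 1440, 4320]
    print(f"{share_under(sample):.0%} of sample containers lived under one minute")
```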
Dynamic Cloud Technologies
Dynamic Cloud is about scaling and availability
EC2 Auto Scaling
Mobile / IoT Dynamic routing
Load balancing
Queues and notifications
Docker
Dynamic Cloud Enables Better Applications Faster
Traditional Data Center | Cloud Data Center | Dynamic Cloud
Good | Better | Best
The way you’ve done things in the past won’t work in the future.
Dynamic Cloud
EC2: Server running application / processes
Docker: Process running a command
Lambda: Function performing a task or operation
Things happen faster because of…
Microcomputing & AWS Lambda
• Highly dynamic
• Incredibly scalable
• No infrastructure to provision
• Massively shared infrastructure
Also known as:
• Functions as a Service (FaaS)
• Compute as a Service (CaaS)
• Serverless
AWS Lambda
[Diagram: some resources (S3 Bucket, DynamoDB, API Gateway, SQS), a Lambda script, and Lambda instances]
• Takes an event from an AWS resource (a trigger)
• Creates an instance to execute
• Can impact original or different AWS resource
• Any number of instances can run at a time
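The trigger-and-execute flow above can be sketched as a Python Lambda handler. This is a minimal example, assuming an S3 notification as the trigger; the bucket name, key, and `fake_event` are hypothetical, and the function only reports what it received rather than changing any AWS resource.

```python
def handler(event, context):
    """Entry point Lambda calls once per invocation with the trigger's event."""
    touched = []
    # S3 notifications deliver one or more records, each naming a bucket and key.
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        # A real function would act here, for example by writing to another
        # AWS resource; this sketch only reports what it saw.
        touched.append(f"{bucket}/{key}")
    return {"processed": touched}


if __name__ == "__main__":
    # Hypothetical, trimmed-down S3 event for local experimentation.
    fake_event = {
        "Records": [
            {"s3": {"bucket": {"name": "example-bucket"},
                    "object": {"key": "uploads/photo.jpg"}}}
        ]
    }
    print(handler(fake_event, None))
```

Because the handler keeps no state between calls, any number of copies can run at once, one per incoming event.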
Dynamic Cloud has unique monitoring requirements…
How do I track what the dynamic cloud is doing for me (or to me)?
What is a Dynamic Cloud Application?
Responsible for the parts you care about:
• Application & Application Microservices
Let cloud manage rest:
• Infrastructure
• Allocation/Provisioning
• Scaling
[Diagram: Browser / Mobile; several Application & Application Microservices blocks with Provisioning, Server OS, and Server (Virtual) Hardware]
Monitoring Dynamic Cloud Applications
[Diagram, built up over several slides: the dynamic application stack (Browser / Mobile; several Application & Application Microservices blocks; Provisioning; Server OS; Server (Virtual) Hardware) with CloudWatch in the AWS Console, New Relic Application Monitoring, New Relic Infrastructure Monitoring, Integrations, and Dashboards. Labels: AWS Infrastructure; Application Performance; New Relic Monitors; CloudWatch & AWS monitors]
How do you monitor this?
Where did it go? It was just here!!
The thing you monitored 10 minutes ago…
...doesn’t exist anymore!?
Monitoring the Dynamic Cloud
Monitor the Cloud Components themselves
Monitor the lifecycle of the Cloud Components
Very different than monitoring traditional Data Center components
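Monitoring the lifecycle, and not only the live component, means keeping a record after the component is gone. The following is a minimal sketch of that idea, assuming hypothetical component IDs and caller-supplied timestamps in seconds; it is not the API of any monitoring product.

```python
class LifecycleTracker:
    """Records when components appear and disappear, keyed by a hypothetical ID."""

    def __init__(self):
        self.started = {}    # component_id -> start time (seconds)
        self.lifetimes = {}  # component_id -> lifetime (seconds)

    def on_start(self, component_id, now):
        self.started[component_id] = now

    def on_stop(self, component_id, now):
        # Keep the history after the component is gone, so it can still be
        # reported on even though it no longer exists.
        begun = self.started.pop(component_id, None)
        if begun is not None:
            self.lifetimes[component_id] = now - begun

    def running(self):
        return sorted(self.started)


if __name__ == "__main__":
    tracker = LifecycleTracker()
    tracker.on_start("container-a", now=0)
    tracker.on_start("container-b", now=5)
    tracker.on_stop("container-a", now=40)
    print("running:", tracker.running())
    print("lifetimes:", tracker.lifetimes)
```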
Changing World
Dev
Now - DYNAMIC World
Ops
• We know:
  - Change is inevitable
• We must:
  - Embrace and drive change
• Enabling:
  - Quicker growth
  - More reliable growth
Keeping Your App Running…At Scale
Dynamic Cloud…
...make availability happen.
Migration…
...how do I get my app to the cloud?
High Expectations
Blame Game Intensity Rises
“The problem must be the cloud’s fault”
Pressure to declare victory in the migration
The Politics of Migration
Show me the new apps!!?
Promised Performance gains?
Cost controls?
Optimize costs?
Why is it taking so long?
Migration failure…
Ops
Use the Cloud
• Move in a controlled way
• Learn as you go
• Measure everything
Does not have to be painful…
Experiment
Secure the Cloud
Enable Servers, Enable SaaS
Enable Value-Added Services
Enable Unique Services
Mandate Cloud Usage
Progressions in Cloud Adoption… The Controlled Way
Standard steps most companies follow
Enterprise IT Cloud Adoption Strategy
Experiment
Non-invasive, safe technologies
- S3
- Perhaps: CloudFront, SQS, SES
Stay away from EC2/Servers
Security: Easy as one-offs
No “Policies” implemented yet
“Just seeing what this is all about”
Progressions in Cloud Adoption
What is this cloud thing?
Secure the Cloud
IAM (Credentials)
VPC (Secure network)
AWS Direct Connect (just another data center)
Cloud policies begin to be formed
All parts of the company are now involved
Critical evolution point
Can we trust the cloud?
Enable Servers, Enable SaaS
EC2
- Basic “data center migration”
- Just another server type available…
Multiple AZs/Regions
- Part of multi-datacenter resiliency strategy
Independently: SaaS usage increases
- Non-critical or internal uses first
The cloud seems to work pretty well…
Enable Value-Added Services
Managed Databases
- RDS, Aurora
Other Managed Services
- Elastic Beanstalk, SES, SQS, ElasticSearch
Dynamic Cloud becomes a thing…
Enable Unique Services
High value, Cloud-specific services
- Lambda, Kinesis
- DynamoDB
- SWF, Elastic Transcoder
- Redshift
Point of commitment...
...dependent on cloud
Dynamic Cloud is deeply ingrained…
Mandate Cloud Usage
Cloud as a data center replacement
Company is now “all in” with cloud
Netflix…
Why do we need our own data centers?
What is the cloud?
Can we trust the cloud?
The cloud works pretty well…
Dynamic Cloud becomes a thing…
Dynamic Cloud is deeply ingrained…
Why do we need our own data centers?
Progressions in Cloud Adoption
The steps aren’t easy…
Experiment
Secure the Cloud
Enable Servers, Enable SaaS
Enable Value-Added Services
Enable Unique Services
Mandate Cloud Usage
Progressions in Cloud Adoption
Different Companies
Different Speed
Different Needs
Cloud Adoption Strategies
Enterprise IT Cloud Adoption Strategy
Experiment
Secure the Cloud
Enable Servers, Enable SaaS
Enable Value-Added Services
Enable Unique Services
Mandate Cloud Usage
Application Cloud Adoption Strategy
Experiment/Peripheral Usage
Cloud Servers
Managed Components
Unique Components
Application Cloud Committed
[Chart, built up over several slides: Cloud Adoption. Corporate Adoption axis: Experiment, Secure, Allow Servers, Allow SaaS, Allow Value-Added, Committed, Mandate. Application Adoption axis: Experiment, Servers, Managed Components, Unique Components, Committed. Plotted items: S3, IAM, VPC, Non-Integral SaaS, EC2, Integral SaaS, RDS, SES, Lambda, Kinesis. Annotations: First Steps; Step #1 to Step #4; Non-Critical/Internal Applications; New Applications; Critical Applications; Application Re-Writes; Adoption Sweet Spot; Cloud Adoption Center of Gravity]
Adoption Success Strategies
Understand where your culture is
Consciously plan your acceptance
Drive your cultural change to your desired level
Monitor your adoption
Understand your needs
Monitor Your Adoption
Before Migration
Baseline application (servers, databases, caches, applications, microservices)
Determine your steady state
Monitor Your Adoption
During Migration
Incorporate cloud’s internal monitoring
Continue application monitoring
Understand and solve all deviations from steady state…
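One way to make "steady state" and "deviation" concrete is to summarise pre-migration measurements and flag later samples that fall far outside them. This is a minimal sketch, assuming made-up response times in milliseconds and a hypothetical three-standard-deviation tolerance; the right metric and threshold depend on the application.

```python
from statistics import mean, stdev


def steady_state(samples):
    """Summarise pre-migration samples as (mean, standard deviation)."""
    return mean(samples), stdev(samples)


def deviations(baseline, samples, tolerance=3.0):
    """Return samples further than `tolerance` standard deviations from the mean."""
    centre, spread = baseline
    return [s for s in samples if abs(s - centre) > tolerance * spread]


if __name__ == "__main__":
    # Made-up response times in milliseconds, for illustration only.
    before = [300, 310, 295, 305, 290, 300]
    during = [302, 298, 900, 310, 1500]
    baseline = steady_state(before)
    print("baseline:", baseline)
    print("needs investigation:", deviations(baseline, during))
```

The baseline has to be captured before the migration starts; without it there is nothing to compare the in-migration numbers against.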
The Biggest Role Monitoring Plays In Migration
Performance Post Migration & During Optimization
Pre-migration Feasibility & Benchmarking
Continue Monitoring…
Infrastructure is now out of your control
Some cloud-specific concerns (EC2 instance failures, instance degradation)
Dynamic Technologies Impact Our Applications
Understand application impact
Ongoing application & infrastructure monitoring is essential
Monitor Your Adoption
Fairfax Media Limited is a leading multi-platform media company in Australasia, reaching 10.6 million Australians and 2.9 million New Zealanders.
Media/Entertainment
“Because we monitored our on-premises systems with New Relic before we migrated them to Amazon Web Services, we were able to identify potential issues and fix them during the migration process.”
- Cheesun Choong, Head of Product Platforms
Results
• Reduced diagnosis time from hours to minutes
• Migrated to AWS with confidence
• Identified underutilized servers to save money
Keeping Your App Running…At Scale
Dynamic Cloud…
...make availability happen.
Migration…
...how do I get my app to the cloud?
Availability…
…is more than you think it is.
Monitor your application and infrastructure
Monitoring just the server
[Diagram: CloudWatch (AWS Console) and an EC2 Instance containing Application & Application Microservices, Server OS, Server (Virtual) Hardware]
Worked when rate of change was low…
Full Stack Monitoring
[Diagram: the dynamic application stack (Browser / Mobile; several Application & Application Microservices blocks; Provisioning; Server OS; Server (Virtual) Hardware) with New Relic Application Monitoring, New Relic Infrastructure Monitoring, and Dashboards]
You need:
• Top to bottom monitoring…
• Full stack accountability...
• Dynamic infrastructure control...
Digital Fan Experience for Major League Baseball
“New Relic empowers our developers to experiment and work fast without compromising on the quality of the MLB fan experience.”
– Sean Curtis, Senior Vice President of Engineering
Change is speeding up
Dynamic Cloud enables better applications faster.
Traditional Data Center | Cloud Data Center | Dynamic Cloud
Good | Better | Best
The way you’ve done things in the past won’t work in the future.
Thank you
Lee Atchison ∙ Senior Director Strategic Architecture
New Relic
Architecting for Scale
By: Lee Atchison
Published by: O’Reilly Media
www.architectingforscale.com
leeatchison@leeatchison