aws meetup auto scaling deep dive
TRANSCRIPT
What is a scalable service?
● Increasing resources results in a proportional increase in performance
● Fault tolerance
● Capable of handling heterogeneity
● Amortized operational cost diminishes as scale increases
Efficient Scaling
Think cattle, not pets. -- CERN
What is AWS Auto Scaling?
• Groups: EC2 instances are organized into groups so that they can be treated as a logical unit for the purposes of scaling and management. When you create a group, you can specify its minimum, maximum, and desired number of EC2 instances.
• Launch configurations: Template for a group's EC2 instances. AMI, security group, device mappings, IAM policy, etc.
• Scaling policies: Plan that dictates the policies used to drive scaling actions.
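The relationship between a group's minimum, maximum, and desired size can be sketched in a few lines. This is a toy model, not the Auto Scaling API: the `Group` class and its `adjust` method are hypothetical names, and clamping is an assumption about how a policy-driven change is bounded (a direct request outside the bounds may be rejected instead).

```python
from dataclasses import dataclass


@dataclass
class Group:
    """Toy model of an Auto Scaling group's size bounds (hypothetical class)."""
    min_size: int
    max_size: int
    desired: int

    def adjust(self, delta: int) -> int:
        """Apply a capacity change, keeping the result inside [min_size, max_size]."""
        requested = self.desired + delta
        self.desired = max(self.min_size, min(self.max_size, requested))
        return self.desired


group = Group(min_size=1, max_size=5, desired=2)
print(group.adjust(+6))   # a scale-out larger than the group allows
print(group.adjust(-10))  # a scale-in larger than the group allows
```

The point of the bounds is that no policy, however aggressive, can take the group below its floor or above its ceiling.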
What can I do with AWS Auto Scaling?
● Scale in and out based on manual intervention
● Scale in and out based on metrics/alarms
● Scale in and out based on schedules
● Auto-heal failing nodes to maintain capacity
● Reuse template-based group configurations
Elastic & self-healing
What should I scale on?
Auto-scaling your Web/REST API tiers:
● Number of HTTP requests (outstanding or rate)
● Network I/O
● Response time (average or geometric mean)
● Concurrent connection count
Auto-scaling your worker/batch tiers:
● Queue depth
● Average aggregate CPU
● Schedule-based
● Custom CloudWatch metrics
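To see why the choice between average and geometric mean matters for a response-time metric, here is a small sketch. The sample latencies are made up for illustration, and `summarize` is a hypothetical helper; it relies only on the standard library (`statistics.geometric_mean` needs Python 3.8 or later).

```python
import statistics


def summarize(latencies_ms):
    """Return (arithmetic mean, geometric mean) of response times in ms."""
    return statistics.mean(latencies_ms), statistics.geometric_mean(latencies_ms)


# Three ordinary requests and one very slow outlier (invented sample data).
samples = [100, 100, 100, 10000]
average, geometric = summarize(samples)
print(f"average={average:.0f} ms, geometric mean={geometric:.0f} ms")
```

A single slow request drags the average far more than the geometric mean, so an alarm on the average is more likely to trigger a scale-out on an outlier.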
What should I scale on?
Scheduled scaling actions:
• Guaranteed order of execution for scheduled actions within the same group, but not for scheduled actions across groups
• Generally executes within seconds but may be delayed for up to two minutes
• Can create a maximum of 125 scheduled actions per month per Auto Scaling group (4x per day for a 31-day month, per Auto Scaling group)
• Must have a unique time value
• Cooldown periods are not supported
• Can be one-time-only or recurring
aws autoscaling put-scheduled-update-group-action --scheduled-action-name ScaleOut --auto-scaling-group-name foo --start-time "2016-05-12T09:00:00Z" --desired-capacity 3
aws autoscaling put-scheduled-update-group-action --scheduled-action-name ScaleIn --auto-scaling-group-name foo --start-time "2016-05-13T17:00:00Z" --desired-capacity 1
aws autoscaling put-scheduled-update-group-action --scheduled-action-name nighttime --auto-scaling-group-name foo --recurrence "0 17 * * *" --desired-capacity 1
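The recurrence string in the last command uses cron syntax (minute, hour, day of month, month, day of week), so "0 17 * * *" means every day at 17:00 UTC. A minimal sketch of how such a daily recurrence expands into concrete run times follows. `next_daily_runs` is a hypothetical helper that handles only the simple daily form; it is not a general cron parser and not how the service itself is implemented.

```python
from datetime import datetime, timedelta, timezone


def next_daily_runs(recurrence, after, count=3):
    """Expand a daily 'M H * * *' recurrence into the next `count` run times."""
    minute, hour, day_of_month, month, day_of_week = recurrence.split()
    if (day_of_month, month, day_of_week) != ("*", "*", "*"):
        raise ValueError("this sketch only handles daily 'M H * * *' recurrences")
    first = after.replace(hour=int(hour), minute=int(minute), second=0, microsecond=0)
    if first <= after:
        # Today's slot has already passed, so start tomorrow.
        first += timedelta(days=1)
    return [first + timedelta(days=i) for i in range(count)]


start = datetime(2016, 5, 12, 18, 0, tzinfo=timezone.utc)
for run in next_daily_runs("0 17 * * *", start):
    print(run.isoformat())
```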
Multi-input Scaling Policies
● Scale out:
  ● +2 instances if > 50 visible SQS messages for > 5 min
  ● +50% if > 1000 visible SQS messages for > 2 min
  ● + fixed 100 instances if > 10000 visible SQS messages for > 1 min
● Scale in:
  ● -10 instances if 0 visible SQS messages for > 15 min
  ● -25% if 0 visible SQS messages for > 30 min
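The scale-out rules above can be sketched as an ordered rule table, most aggressive rule first. This is an illustration under stated assumptions, not how CloudWatch alarms and policies are evaluated internally: `Rule` and `scale_out` are hypothetical names, "fixed 100 instances" is interpreted as setting capacity to exactly 100, and the percentage rounding (round down, but add at least one instance) is a choice made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    threshold: int       # visible SQS messages must exceed this...
    sustained_min: int   # ...for longer than this many minutes
    kind: str            # "change", "percent", or "exact"
    value: int


# Ordered from most to least aggressive so the strongest matching rule wins.
SCALE_OUT_RULES = [
    Rule(10000, 1, "exact", 100),   # assumption: "fixed 100" means exactly 100
    Rule(1000, 2, "percent", 50),
    Rule(50, 5, "change", 2),
]


def scale_out(current, visible_messages, sustained_min):
    """Return the new capacity after applying the first matching rule."""
    for rule in SCALE_OUT_RULES:
        if visible_messages > rule.threshold and sustained_min > rule.sustained_min:
            if rule.kind == "change":
                return current + rule.value
            if rule.kind == "percent":
                return current + max(1, current * rule.value // 100)
            return rule.value
    return current  # no rule matched, capacity is unchanged


print(scale_out(4, 60, 6))      # small backlog held for a while
print(scale_out(4, 1500, 3))    # larger backlog
print(scale_out(4, 20000, 2))   # very large backlog
```

Shorter wait times on the larger thresholds let the group react quickly to a big backlog while staying conservative about small ones.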
Termination: Custom Policies
• OldestInstance - the oldest instance in the group
• NewestInstance - the newest instance in the group
• OldestLaunchConfiguration - instances that have the oldest launch config
• ClosestToNextInstanceHour - instances that are closest to the next billing hour
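A minimal sketch of how each of these policies would pick a single instance, assuming every instance is described by a launch time and the creation time of its launch configuration. The `Instance` class and `pick` function are hypothetical, and the sketch ignores Availability Zone balancing and tie-breaking.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Instance:
    instance_id: str
    launched_at: datetime
    launch_config_created_at: datetime


def pick(instances, policy, now):
    """Return the instance the named termination policy would select."""
    if policy == "OldestInstance":
        return min(instances, key=lambda i: i.launched_at)
    if policy == "NewestInstance":
        return max(instances, key=lambda i: i.launched_at)
    if policy == "OldestLaunchConfiguration":
        return min(instances, key=lambda i: i.launch_config_created_at)
    if policy == "ClosestToNextInstanceHour":
        # Seconds already used in the current billing hour; the largest
        # value belongs to the instance closest to its next billed hour.
        return max(instances, key=lambda i: (now - i.launched_at).total_seconds() % 3600)
    raise ValueError(f"unknown policy: {policy}")


fleet = [
    Instance("a", datetime(2016, 5, 12, 8, 0), datetime(2016, 5, 1)),
    Instance("b", datetime(2016, 5, 12, 9, 10), datetime(2016, 4, 1)),
    Instance("c", datetime(2016, 5, 12, 10, 30), datetime(2016, 5, 10)),
]
now = datetime(2016, 5, 12, 11, 5)
for name in ("OldestInstance", "NewestInstance",
             "OldestLaunchConfiguration", "ClosestToNextInstanceHour"):
    print(name, "->", pick(fleet, name, now).instance_id)
```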
Termination: Default Policy
Termination: Instance protection
• Protect a particular instance (or multiple) from being terminated via scaling in.
• Can be enabled on a whole group or individual instance.
• Starts once the instance state is InService
• Detaching an instance from the ASG clears this setting
• If all instances in an ASG are protected from termination and scale-in happens, the desired instance count is decremented.
• Does not protect from termination via the EC2 TerminateInstances API
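The interaction between scale-in and protection can be sketched as a filter over termination candidates. `scale_in` is a hypothetical function; how the desired count is handled when only some instances are protected is an assumption of this sketch, while the all-protected case follows the bullet above.

```python
def scale_in(instances, protected, desired, remove):
    """Return (terminated_ids, new_desired) for a scale-in of `remove` instances.

    `instances` is ordered by termination preference; protected ones are skipped.
    """
    candidates = [i for i in instances if i not in protected]
    terminated = candidates[:remove]
    # The desired count drops even if nothing could be terminated.
    return terminated, desired - remove


print(scale_in(["a", "b", "c"], {"a"}, desired=3, remove=1))
print(scale_in(["a", "b", "c"], {"a", "b", "c"}, desired=3, remove=1))
```

In the second call the group ends up with more running instances than its desired count, which is the situation the slide warns about.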
Auto Scaling Cooldown
• Prevents the launching or termination of additional instances before the previous scaling activity takes effect
• Doesn't apply to manual scaling by default
• Doesn't affect the replacement of unhealthy instances
• Default value of 5 minutes
• Can create scaling-policy specific cooldowns
• Applies after an instance moves out of Wait state
• Period begins for spot when any bid is successful
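The cooldown rule amounts to a time gate in front of each scaling activity. A minimal sketch, assuming timestamps in seconds and a hypothetical `may_scale` function; the 300-second default reflects the 5 minutes mentioned above, and the `manual` flag models the default bypass for manual scaling.

```python
DEFAULT_COOLDOWN_S = 300  # 5 minutes


def may_scale(now_s, last_activity_s, cooldown_s=DEFAULT_COOLDOWN_S, manual=False):
    """Return True if a new scaling activity may start at `now_s`."""
    if manual:
        return True   # manual scaling skips the cooldown by default
    if last_activity_s is None:
        return True   # nothing has scaled yet
    return now_s - last_activity_s >= cooldown_s


print(may_scale(now_s=1000, last_activity_s=900))               # 100 s after the last activity
print(may_scale(now_s=1300, last_activity_s=900))               # 400 s after the last activity
print(may_scale(now_s=1000, last_activity_s=900, manual=True))  # manual request
```

Passing a smaller `cooldown_s` for one policy corresponds to the scaling-policy specific cooldowns in the list.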
Auto Scaling Health Checks
Auto Scaling periodically performs health checks on the instances in your Auto Scaling group and identifies any instances that are unhealthy. Health status is determined by:
• EC2 status checks (instanceStatus != running || systemStatus == impaired)
• ELB health checks
• Custom health checks (aws autoscaling set-instance-health …)
Set your HealthCheckGracePeriod appropriately based on expected launch time!
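The health decision, including the grace period, can be sketched as a single predicate. This is a simplification: `is_unhealthy` and its parameters are hypothetical, and treating every check as ignored during the grace period is an assumption of the sketch rather than a statement of exact service behavior.

```python
def is_unhealthy(instance_status, system_status, seconds_since_launch,
                 grace_period_s, elb_healthy=True, custom_healthy=True):
    """Return True if the instance should be replaced."""
    if seconds_since_launch < grace_period_s:
        # Still booting: do not judge the instance yet (sketch assumption).
        return False
    ec2_failed = instance_status != "running" or system_status == "impaired"
    return ec2_failed or not elb_healthy or not custom_healthy


# An instance that needs 4 minutes to boot, checked with a 5-minute grace period.
print(is_unhealthy("running", "ok", seconds_since_launch=120,
                   grace_period_s=300, elb_healthy=False))
print(is_unhealthy("running", "ok", seconds_since_launch=400,
                   grace_period_s=300, elb_healthy=False))
```

If the grace period were shorter than the boot time, the first call would report the instance as unhealthy and it would be replaced before it ever served traffic.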
Advanced use: SNS
Auto Scaling supports the sending of Amazon SNS notifications when the following events occur:
• Successful instance launch
• Failed instance launch
• Successful instance termination
• Failed instance termination
SNS supports delivery to Lambda, SQS, HTTP, Email, and SMS.
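A Lambda function subscribed to the topic could sort these four notifications roughly as follows. This is a sketch, not a verified handler: the event-type strings and the `Event` and `EC2InstanceId` fields reflect my understanding of the notification format and should be checked against a real message, and `handler` is a hypothetical name.

```python
import json

# Assumed mapping from notification type to the four events listed above.
EVENTS = {
    "autoscaling:EC2_INSTANCE_LAUNCH": "launch succeeded",
    "autoscaling:EC2_INSTANCE_LAUNCH_ERROR": "launch failed",
    "autoscaling:EC2_INSTANCE_TERMINATE": "termination succeeded",
    "autoscaling:EC2_INSTANCE_TERMINATE_ERROR": "termination failed",
}


def handler(event, context=None):
    """Return a list of (instance_id, outcome) pairs from an SNS-triggered event."""
    results = []
    for record in event.get("Records", []):
        # The SNS message body arrives as a JSON-encoded string.
        message = json.loads(record["Sns"]["Message"])
        outcome = EVENTS.get(message.get("Event"), "unknown")
        results.append((message.get("EC2InstanceId"), outcome))
    return results


sample = {"Records": [{"Sns": {"Message": json.dumps(
    {"Event": "autoscaling:EC2_INSTANCE_LAUNCH", "EC2InstanceId": "i-5b73d709"})}}]}
print(handler(sample))
```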
Advanced use: Lifecycle Hooks
Lifecycle hooks support custom actions when the ASG launches and terminates instances.
Instances can stay in Wait state for up to 48 hours.
Cooldown & HealthCheckGracePeriod apply upon entering InService.
Actions can provide either: ABANDON or CONTINUE
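The effect of the two results can be sketched as a tiny state machine. The state names follow the lifecycle naming as I understand it (Pending:Wait, Terminating:Wait, and so on) and `next_state` is a hypothetical function. The sketch assumes a single hook per transition and does not model the wait timeout.

```python
def next_state(state, result):
    """Return the state an instance moves to when a hook completes."""
    if result not in ("CONTINUE", "ABANDON"):
        raise ValueError(f"unknown result: {result}")
    if state == "Pending:Wait":
        # CONTINUE lets the launch finish; ABANDON gives up on the instance.
        return "Pending:Proceed" if result == "CONTINUE" else "Terminating"
    if state == "Terminating:Wait":
        # Either result lets the termination go ahead.
        return "Terminating:Proceed"
    raise ValueError(f"no lifecycle hook applies in state: {state}")


print(next_state("Pending:Wait", "CONTINUE"))
print(next_state("Pending:Wait", "ABANDON"))
print(next_state("Terminating:Wait", "ABANDON"))
```

A launch hook is therefore a natural place for bootstrap checks: if configuration fails, returning ABANDON keeps the broken instance out of service.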
Auto Scaling Debugging
Standby state offers a means to remove instances from the group (and ELB) for maintenance/debugging.
aws autoscaling enter-standby --instance-ids i-5b73d709 --auto-scaling-group-name my-asg --should-decrement-desired-capacity
Everything Fails
"Everything fails, all the time" -- Werner Vogels, CTO Amazon.com
Design for failure!
Simian Army
• Chaos Monkey
• Chaos Gorilla
• Chaos Kong
• Conformity Monkey
• Janitor Monkey
Continuous delivery
• Automate everything possible
• Infrastructure as code
• Immutable systems
• Rolling deployments, Red-black
Continuous delivery
Twitter: @gnethercutt @IN_Intelligence
Open Source: github.com/mypurecloud