kapil thangavelu - cloud custodian

Cloud Custodian Fleet Management in AWS

Upload: serverlessconf

Post on 16-Apr-2017


TRANSCRIPT

Page 1: Kapil Thangavelu - Cloud Custodian

Cloud Custodian: Fleet Management in AWS

Page 3: Kapil Thangavelu - Cloud Custodian

Serverless

BUT

Page 4: Kapil Thangavelu - Cloud Custodian

We still have Servers

Lots

And

Lots

:(

Page 5: Kapil Thangavelu - Cloud Custodian

A sea of policies

- fleet wide savings policies
  - off hours stops for dev environments
  - garbage collect ebs, elb, etc
  - detect over-provisioned resources

- numerous security policies
  - encrypt all the things
  - access control
  - ssl ciphers

- numerous compliance policies
  - tag compliance / chargeback
  - current images
  - backups


Page 6: Kapil Thangavelu - Cloud Custodian

Fleet Management

Across lots of federated accounts.

Natural tendency
- One off scripts

But
- How are they implemented
- How are they deployed
- How are they configured
- How are they managed
- Who owns them

Software Engineering
- How are they tested
- Are they reviewed

Who knows?

Page 7: Kapil Thangavelu - Cloud Custodian

Cloud Custodian

• A rules engine for infrastructure management.

• YAML DSL for policies: query resources or subscribe to events, apply filters, take actions.

• Integrated Lambda provisioning and event sources.

• Outputs to Amazon S3, Amazon CloudWatch Logs, Amazon CloudWatch Metrics

Open source @ https://github.com/capitalone/cloud-custodian

- name: require-rds-encrypt-and-non-public
  resource: rds
  mode:
    type: cloudtrail
    events:
      - CreateDBInstance
  filters:
    - or:
      - StorageEncrypted: false
      - PubliclyAccessible: true
  actions:
    - type: delete
      skip-snapshot: true

Page 8: Kapil Thangavelu - Cloud Custodian

Amazon CloudWatch Events

Features

● Powerful infrastructure observation capabilities

● Enables “realtime” rules enforcement and reaction with wide coverage of AWS product APIs.

Sources

● All CloudTrail events (P99 @ 90s delivery window as of April 2016)

● EC2 instance state changes (600ms)

● ASG instance membership changes (600ms)

● Periodic scheduling (custom)

● Custom events
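To make the CloudTrail event source concrete, here is a minimal sketch of the kind of CloudWatch Events pattern that selects a CloudTrail-recorded API call, plus a deliberately simplified matcher. The slides do not show how Custodian builds its rules, so the helper names `cloudtrail_pattern` and `matches` are hypothetical, and the matcher covers only exact-value matching, not the full event pattern syntax.

```python
import json


def cloudtrail_pattern(event_source, event_names):
    """Build an event pattern for API calls recorded by CloudTrail.

    Hypothetical helper for illustration; not taken from Cloud Custodian.
    """
    return {
        "detail-type": ["AWS API Call via CloudTrail"],
        "detail": {
            "eventSource": [event_source],
            "eventName": list(event_names),
        },
    }


def matches(pattern, event):
    """Simplified matcher: each pattern key must exist in the event.

    A list in the pattern means "any of these values"; a dict recurses.
    """
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


pattern = cloudtrail_pattern("rds.amazonaws.com", ["CreateDBInstance"])

# Trimmed-down sample event; real events carry many more fields.
sample_event = {
    "source": "aws.rds",
    "detail-type": "AWS API Call via CloudTrail",
    "detail": {
        "eventSource": "rds.amazonaws.com",
        "eventName": "CreateDBInstance",
    },
}

print(json.dumps(pattern, indent=2))
print(matches(pattern, sample_event))
```

A pattern like this would route only the named API call to a policy's Lambda function; other RDS calls carry a different `eventName` and fall through.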

Page 9: Kapil Thangavelu - Cloud Custodian

Cloud Custodian

Resource type policies (ec2 instance, ami, auto scale group, bucket, elb, etc).

Filter resources

Invoke actions on filtered set

Output resource json to s3, metrics to cloudwatch

Vocabularies of actions, and filters for policy construction.

- name: ebs-copy-instance-tags
  resource: ebs
  filters:
    - type: value
      key: "Attachments[0].Device"
      value: not-null
  actions:
    - type: copy-instance-tags
      tags:
        - App
        - Env
        - Owner
        - Name
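The query, filter, action, output flow of the policy above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Custodian's implementation: tags are held as a plain dict rather than the AWS list of Key/Value pairs, instances are supplied as an in-memory lookup instead of an API call, and the function names are hypothetical.

```python
import json

TAGS_TO_COPY = ["App", "Env", "Owner", "Name"]


def is_attached(volume):
    """Rough equivalent of the filter: Attachments[0].Device is not null."""
    attachments = volume.get("Attachments") or []
    return bool(attachments) and attachments[0].get("Device") is not None


def copy_instance_tags(volume, instances, tag_keys):
    """Return a copy of the volume with selected instance tags applied."""
    instance_id = volume["Attachments"][0]["InstanceId"]
    instance_tags = instances.get(instance_id, {})
    tags = dict(volume.get("Tags", {}))
    for key in tag_keys:
        if key in instance_tags:
            tags[key] = instance_tags[key]
    return {**volume, "Tags": tags}


def run_policy(volumes, instances):
    """Filter resources, apply the action, and emit the result as JSON."""
    matched = [v for v in volumes if is_attached(v)]
    updated = [copy_instance_tags(v, instances, TAGS_TO_COPY) for v in matched]
    return json.dumps(updated, indent=2, sort_keys=True)


# Sample data, invented for the example.
volumes = [
    {
        "VolumeId": "vol-1",
        "Attachments": [{"InstanceId": "i-1", "Device": "/dev/sda1"}],
        "Tags": {},
    },
    {"VolumeId": "vol-2", "Attachments": []},
]
instances = {"i-1": {"App": "web", "Owner": "ops", "Team": "platform"}}

print(run_policy(volumes, instances))
```

Only the attached volume passes the filter, and only the tag keys listed in the policy are copied across.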

Page 10: Kapil Thangavelu - Cloud Custodian

Filtering resources

Generic value filter

- JMESPath expressions on the resource’s json representation

- Lots of operator matching (in, not-in, absent, not-null, gte, regex, etc)

Arbitrary nesting of filters with ‘or’ and ‘and’ blocks.

Simple key/value pairs are equality matches with value expressions

- type: value
  # Ignore keys that start with 'aws:' as they
  # don't count towards the limit.
  key: "[length(Tags[?!starts_with(Key,'aws:')])][0]"
  op: less-than
  value: 10

- or:
  - "tag:App": absent
  - "tag:Env": absent
  - and:
    - Encrypted: false
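The filter semantics described above (value filters with operators, key/value shorthand, and nested or/and blocks) can be approximated with a small recursive evaluator. This is a sketch, not Custodian's code: it looks up plain top-level keys and `tag:` keys instead of evaluating JMESPath, tags are assumed to be a plain dict, and the operator table is a small subset chosen for the example.

```python
import re

# Subset of operators, keyed by the names used in the policy DSL.
OPERATORS = {
    "eq": lambda actual, expected: actual == expected,
    "gte": lambda actual, expected: actual is not None and actual >= expected,
    "less-than": lambda actual, expected: actual is not None and actual < expected,
    "in": lambda actual, expected: actual in expected,
    "not-in": lambda actual, expected: actual not in expected,
    "regex": lambda actual, expected: (
        isinstance(actual, str) and re.match(expected, actual) is not None
    ),
}


def lookup(resource, key):
    """Resolve 'tag:Name' against the tags, anything else as a top-level key."""
    if key.startswith("tag:"):
        return resource.get("Tags", {}).get(key[len("tag:"):])
    return resource.get(key)


def matches(resource, flt):
    """Evaluate one filter (possibly nested) against one resource."""
    if "or" in flt:
        return any(matches(resource, f) for f in flt["or"])
    if "and" in flt:
        return all(matches(resource, f) for f in flt["and"])
    if flt.get("type") == "value":
        actual = lookup(resource, flt["key"])
        expected = flt["value"]
        if expected == "absent":
            return actual is None
        if expected == "not-null":
            return actual is not None
        return OPERATORS[flt.get("op", "eq")](actual, expected)
    # Shorthand form: a single {key: value} pair is an equality match.
    ((key, value),) = flt.items()
    return matches(resource, {"type": "value", "key": key, "value": value})


tag_or_encryption = {
    "or": [
        {"tag:App": "absent"},
        {"tag:Env": "absent"},
        {"and": [{"Encrypted": False}]},
    ]
}

untagged = {"Encrypted": True, "Tags": {"App": "web"}}
compliant = {"Encrypted": True, "Tags": {"App": "web", "Env": "dev"}}

print(matches(untagged, tag_or_encryption))
print(matches(compliant, tag_or_encryption))
```

Expanding the shorthand into a `value` filter keeps a single evaluation path, so adding an operator only means adding an entry to the table.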

Page 11: Kapil Thangavelu - Cloud Custodian

Multi Step Workflows

“Poorly tagged instances, should be stopped in 1 day, and then terminated in 3”

- mark-for-op
- marked-for-op

Chain together multiple policies.

- name: ec2-tag-compliance-mark
  resource: ec2
  description: |
    Find all non-compliant tag instances for stoppage in 1 day.
  mode:
    type: periodic
    schedule: rate(1 day)
  filters:
    - "tag:maid_status": absent
    - or:
      - "tag:App": absent
      - "tag:Env": absent
      - "tag:Owner": absent
  actions:
    - type: mark-for-op
      op: stop
      days: 1

- name: ec2-tag-compliance-stop
  resource: ec2
  description: |
    Stop poorly tagged and schedule Terminate.
  mode:
    type: periodic
    schedule: rate(1 day)
  filters:
    - type: marked-for-op
      op: stop
    - or:
      - "tag:App": absent
      - "tag:Env": absent
      - "tag:Owner": absent
  actions:
    - stop
    - type: mark-for-op
      op: terminate
      days: 4
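The two policies above cooperate through a tag: the `mark-for-op` action records a pending operation and its due date, and the `marked-for-op` filter in a later run selects resources whose date has arrived. A minimal sketch of that handshake follows. The `op@date` tag value format and the function signatures are assumptions made for the example, not Custodian's actual encoding; only the `maid_status` tag name comes from the slides.

```python
from datetime import date, timedelta

STATUS_TAG = "maid_status"


def mark_for_op(resource, op, days, today):
    """Record a pending operation and its due date in the status tag."""
    due = today + timedelta(days=days)
    resource.setdefault("Tags", {})[STATUS_TAG] = f"{op}@{due.isoformat()}"


def marked_for_op(resource, op, today):
    """True when the resource is marked for `op` and the due date has arrived."""
    value = resource.get("Tags", {}).get(STATUS_TAG)
    if not value or "@" not in value:
        return False
    marked_op, _, due = value.rpartition("@")
    return marked_op == op and date.fromisoformat(due) <= today


instance = {"InstanceId": "i-123", "Tags": {}}
day0 = date(2017, 4, 16)

# First policy run: the instance is poorly tagged, so mark it for stop.
mark_for_op(instance, "stop", days=1, today=day0)

# Second policy, one day later: the mark is due, so stop and re-mark.
day1 = day0 + timedelta(days=1)
if marked_for_op(instance, "stop", today=day1):
    instance["State"] = "stopped"
    mark_for_op(instance, "terminate", days=4, today=day1)

print(instance)
```

Because the state lives on the resource itself, each policy run stays stateless: any later run can pick up where the previous one left off by reading the tag.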

Page 12: Kapil Thangavelu - Cloud Custodian

Custodian Vocabularies

asg:
  actions: [delete, mark-for-op, rename-tag, suspend, tag, remove-tag,
            resume, propagate-tags]
  filters: [vpc-id, time, marked-for-op, not-encrypted, image-age, onhour,
            tag-count, offhour, launch-config]

ec2:
  actions: [mark-for-op, remove-tag, snapshot, tag, start, tag-trim, stop,
            terminate]
  filters: [ebs, marked-for-op, ephemeral, image, instance-age, onhour,
            tag-count, offhour, image-age]

s3:
  actions: [attach-encrypt, remove-statements, encrypt-keys,
            encryption-policy, delete-global-grants]
  filters: [missing-statement, global-grants, is-log-target, has-statement]

Additional resource types

- RDS
- ELB
- Redshift
- CloudFormation
- AMI
- EBS
- EBS Snapshot

Page 13: Kapil Thangavelu - Cloud Custodian

Metrics

- Resource Count
- Action Time
- Query/Filter Time
- Custom

Page 14: Kapil Thangavelu - Cloud Custodian

Example Policy - Amazon S3 Encryption

Require encryption for objects

name: s3-require-encryption
resource: s3
description: |
  Apply encryption required policy to new buckets
mode:
  type: cloudtrail
  events:
    - CreateBucket
actions:
  - encryption-policy
  - encrypt-keys

Find elb/s3 log sinks and switch to lambda encrypt

name: s3-remediate
resource: s3
description: |
  Encryption required policy
mode:
  type: periodic
  schedule: rate(1 day)
filters:
  - type: is-log-target
actions:
  - attach-encrypt
  - type: remove-statements
    statement_ids:
      - RequireEncryptedPutObject

Page 15: Kapil Thangavelu - Cloud Custodian
Page 16: Kapil Thangavelu - Cloud Custodian

Roadmap

- Elasticsearch indexing of records / outputs (programmatic dashboards / historical trending)
- Flourish ??
- Cross Language support (lambda invoke actions)
- Moar filters/actions/resources

https://github.com/capitalone/cloud-custodian/milestones