Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

Upload: pivotal

Post on 25-Jul-2015

TRANSCRIPT

Page 1: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)
Page 2: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

Building a Robust Cloud Foundry: HA, Security and DR

Haydon Ryan | Duncan Winn

Page 3: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

This Talk

• High Availability (HA)

• Security

• Backing Up to Mitigate Disasters

Page 4: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

© Copyright 2014 Pivotal. All rights reserved.

HA

Page 5: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

High Availability Focus

Keep apps and services running in a performant, reliable and recoverable manner with timely error detection

1. Application Instances

2. Platform Processes

3. Platform VMs

4. Availability Zones

Keep Cloud Foundry running in a performant, reliable and recoverable manner with timely error detection

Page 6: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

HA Deployments

[Diagram: Single Foundation Deployment (one data center, two AZs, RDS) vs Dual Foundation Deployment (two data centers)]

Page 7: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

WHAT IF I TOLD YOU

IT’S POSSIBLE TO SANELY STRETCH LAYER 2

Page 8: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

User Targets myapp.mycf.com

[Diagram: DNS Resolution goes through DNS Global Traffic Management (GTM) to one of two sites. Labels shown for each site: NSX Boundary, VIP, SSL Termination, LTM Appliance (x2), HA Proxy (x2)]

Page 9: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

Domains: System and Application

CF1
• Developer: cf api targets api.runtime-cf1.com (system domain)
• Developer: cf push myapp targets cf1.com (application domain)

CF2
• Developer: cf api targets api.runtime-cf2.com (system domain)
• Developer: cf push myapp targets cf2.com (application domain)

Client targets myapp.mycf.com

Page 10: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

Services

Page 11: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

Services

[Diagram labels: App, App]

Page 12: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

Services

[Diagram labels: Service, Service, App, App]

Page 13: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

Services

Page 14: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

HA Deployments

[Diagram: Single Foundation Deployment (one data center, two AZs, RDS) vs Dual Foundation Deployment (two data centers)]

Page 15: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

Customer Requirements

• AWS with One VPC
• Specific IP Ranges
• Using their internal corporate DNS
• No ELBs or Route 53 due to security setup
• Multiple Deployments of Cloud Foundry
• Availability Requirements:
  • App uptime
  • Failure matrix for downtime situations

Page 16: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

HA Proxy HA Proxy

Bind DNS

CF Router CF Router

HA Proxy HA Proxy

SSL Termination

Page 17: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

Who does the deployment need to be highly available for?

• Users
• Developers
• Operations

Page 18: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

Any non-critical jobs?

• clock_global
  • Used to clean up cc jobs.
  • Rely on Resurrector?
  • Redeploy to a different AZ by changing the resource_pool

Page 19: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

Critical Jobs & VMs

• haproxy
• router
• nats
• cloud controller
• uaa/login?
• doppler?

Page 20: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

Any less-critical jobs?

• loggregator / doppler
• loggregator traffic controller
• etcd
• Jumpbox?
• bosh?

Page 21: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

Caveats with this design

• Single points of failure?
  • DNS
  • Bosh
  • Jumpbox
• Human interaction required in outage
• Bind DNS does not do health monitoring. Monitoring scripts were outside the scope of the engagement.
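Since Bind DNS does no health monitoring, some external check has to decide which HA Proxy is still answering. The following is only a minimal sketch of such a check, not the script used in the engagement (none was written). It assumes each HA Proxy answers HTTPS on `/`, possibly with a self-signed certificate; the function names and the example addresses are hypothetical. It only reports the first healthy address; updating the DNS record is left to the operator.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: pick the first HA Proxy that answers a health probe.

# probe ADDRESS: succeed if ADDRESS answers an HTTPS request within 5 seconds.
# Assumptions: a GET on / is a good enough health signal, and certificates
# may be self-signed (hence --insecure).
probe() {
  curl --silent --fail --insecure --max-time 5 --output /dev/null "https://$1/"
}

# first_healthy ADDRESS...: print the first address whose probe succeeds.
# Returns non-zero when no address is healthy.
first_healthy() {
  local addr
  for addr in "$@"; do
    if probe "$addr"; then
      echo "$addr"
      return 0
    fi
  done
  return 1
}

# Example (placeholder addresses):
#   first_healthy 10.202.64.10 10.202.80.10 || echo "no healthy proxy" >&2
```

Keeping `probe` as a separate function makes it easy to swap in a TCP check or a different path without touching the selection logic.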

Page 22: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

[Architecture diagram: a single AWS VPC (10.202.64.0/19) connected by Direct Connect to a customer-managed interstate data center. Labels recovered from the diagram:
• Customer Managed: bind dns, NAT, bastion
• Bosh Subnet (Bosh SG): jumpbox, bosh
• AZ 1 Private Subnet and AZ 2 Private Subnet (CF SG): ha proxy, router, login, uaa, dea, cc, cc worker, nats, health, etcd, doppler, loggregator traffic controller
• Shown once only: clock, apps manager
• RDS Subnet (RDS SG): boshdb, uaadb, ccdb]

Page 23: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

How We Deployed Services

• Proxy is a Single Point of Failure
• No Load Balancer to use
• Acceptable by customer in failure matrix

[Diagram labels: App, Proxy (x3), Server (x2)]

Page 24: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

Best Practices for Services

• By default, the service binding uses the first proxy address only

[Diagram labels: App, Load Balancer, Proxy (x2), Server (x3)]

Page 25: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

Which Deployment

[Diagram: three deployment options]
• Dual Foundation Deployment: two data centers
• Single Foundation Dual AZs: one data center, two AZs, RDS
• Single Foundation Single DC: one data center

Page 26: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

Security and Networking (on AWS)

Page 27: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

Security

• Security is Hard
• Three main concepts
  • Restrict
  • Limit scope if Compromised
  • Mitigate
• Feedback Loop

Page 28: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

Restrict Users

• Individual Multi Factor Authentication
  • IaaS Console/Hypervisor
  • Jumpbox
• Separate accounts
  • jumpbox
  • bosh
  • github

Page 29: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

Restrict Packets

• IaaS
  • Security Groups (Instance Level) (better)
  • ACLs (Subnet Level)
  • Routes

Page 30: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

Restrict Containers

• Cloud Foundry
  • Application Security Groups
  • dea network properties
    • (allow_networks, deny_networks)
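As an illustration of Application Security Groups: the rules are a JSON array of allowed egress destinations, applied with the cf CLI. This is a sketch only. The group name, org, space, CIDR and port are all assumptions chosen for the example (a MySQL subnet), not values from the talk.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: define an Application Security Group that only lets
# app containers reach a MySQL subnet. The CIDR, port and names are assumptions.

ASG_FILE="${ASG_FILE:-asg-mysql.json}"

# ASG rules are a JSON array; each rule allows egress to one destination.
cat > "$ASG_FILE" <<'EOF'
[
  {
    "protocol": "tcp",
    "destination": "10.0.11.0/24",
    "ports": "3306"
  }
]
EOF

# Apply with the cf CLI (requires admin rights; not run here):
#   cf create-security-group restrict-to-mysql "$ASG_FILE"
#   cf bind-security-group restrict-to-mysql my-org my-space
```

Binding the group to a specific org and space, rather than to all running apps, keeps the opened destination limited to the apps that need it.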

Page 31: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

Pivotal Cloud Foundry for AWS 1.4

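[Architecture diagram: Pivotal Cloud Foundry on AWS in a single VPC (10.0.0.0/16) with an Internet Gateway. Labels recovered from the diagram:
• Subnets: Public Subnet, Private Subnet, RDS Subnet
• Security groups: ELB SG, Ops Manager SG, NAT SG, Elastic Runtime SG, RDS SG
• Components: ELB, Ops Manager, NAT, micro, login, uaa, router, dea, cc, cc worker, nats, health, etcd, doppler, loggregator traffic controller, clock, autoscaling
• Databases: boshdb, uaadb, ccdb, apps manager db
• Rule annotations: "vpc all"; "restricted ip 80, 443, 22*"; "80?, 443"
• Legend: common traffic flow; sg allow rules]

Was it just DEAs that used NAT?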

Page 32: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

Limit Scope if Compromised

• Different user/pass for each component
• Strong passwords (and usernames)
  • 20 Characters Long
  • RANDOM
  • Both Cases
  • Best avoid special characters
  • eg: YxLIodYrUBQJrvMRYSQL
• Avoid cloud cow

http://vanmethod.deviantart.com/art/Purple-Cow-on-a-Cloud-146265642
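The password rules above (20 characters, random, both cases, no special characters) can be sketched in a few lines of shell. This is one possible approach, assuming bash and `/dev/urandom` are available; the function name `gen_password` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: generate a random password made of ASCII letters only
# (both cases, no special characters), 20 characters by default.

gen_password() {
  local length="${1:-20}" candidate
  # At least 2 characters are needed to contain both cases.
  (( length >= 2 )) || return 1
  while true; do
    # Read random bytes and keep only ASCII letters.
    candidate="$(head -c 1024 /dev/urandom | LC_ALL=C tr -dc 'A-Za-z')"
    candidate="${candidate:0:length}"
    # Retry if too few letters survived or only one case is present.
    if (( ${#candidate} == length )) &&
       [[ $candidate =~ [[:upper:]] && $candidate =~ [[:lower:]] ]]; then
      echo "$candidate"
      return 0
    fi
  done
}

gen_password   # prints one 20-character password
```

Calling it once per component gives each job in the manifest its own credentials, which is what limits the scope of a single leaked password.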

Page 33: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

Limit Scope if Compromised

[Diagram labels: Runner, UAA, Login, uaadb, MySQL, App Data]

Page 34: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

Post Breach Security Measures

• Roll
  • AWS Credentials
  • Username and password (Manifest)
  • PEMs
• Investigate:
  • VM Logs (stored in Splunk / CloudWatch Logs)
  • Bosh and Login Audit Trail
  • Isolate the VM for investigation
  • Resurrector will resurrect a non-compromised VM
• Feedback:
  • Incident Reports and Management Support

Page 35: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

Paranoid Level Security for AWS

• CloudTrail
  • Alerts
  • Audit Logs
  • Rollback
• Remove ability to delete
  • S3 buckets
  • subnets / VPC
  • backups
• Everything else can be recovered from a backup…

Page 36: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

Disaster Recovery

Page 37: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

Backing Up Cloud Foundry

• Configuration
• Databases: CCDB, UAADB, Apps Man DB, BOSH DB
• Blobstore / NFS Server

Page 38: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

SCENARIO ONE

LOSE PCF OPS-MGR OR CF DEPLOYMENT

Page 39: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

Restoring Ops Manager

Export Configuration

Create New Ops Manager

Import Configuration

Page 40: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

Configuration

Backup Ops Manager

Backup Deployment Manifests

scp ubuntu@<OPS MGR HOST>:/var/tempest/workspaces/default/deployments/*yml .
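The scp command above can be wrapped so that each backup lands in its own dated directory. This is a sketch under assumptions: the function name, the `opsmgr-backups` directory and the `OPS_MGR_HOST` variable are hypothetical; only the `ubuntu` user and the remote path come from the slide.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: copy the Ops Manager deployment manifests into a
# dated local directory. OPS_MGR_HOST is a placeholder you must set.

REMOTE_DIR="/var/tempest/workspaces/default/deployments"

# backup_manifests HOST [DEST_ROOT]: scp the *yml files from HOST into
# DEST_ROOT/<UTC timestamp>/ and print the directory used.
backup_manifests() {
  local host="$1" dest_root="${2:-./opsmgr-backups}" dest
  dest="$dest_root/$(date -u +%Y%m%dT%H%M%SZ)"
  mkdir -p "$dest" || return 1
  # SCP defaults to scp; override it (e.g. SCP=echo) for a dry run.
  "${SCP:-scp}" "ubuntu@${host}:${REMOTE_DIR}/*yml" "$dest/" >&2 || return 1
  echo "$dest"
}

# Example:
#   backup_manifests "$OPS_MGR_HOST"
```

Timestamped directories keep older manifests around, which helps when a restore needs the configuration from before a bad change.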

Page 41: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

Deployment Manifests in BOSH

~$ bosh deployments

bosh download manifest cf-c700aee17d9f801eb152 cfmanifest.yml
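When there are several deployments, the same `bosh download manifest` command can be repeated for each name listed by `bosh deployments`. A minimal sketch, assuming the deployment names are passed in by hand; the function name and output directory are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: download the manifest of each named deployment with the BOSH CLI
# command shown on the slide. Names are taken from `bosh deployments`.

# download_manifests OUT_DIR DEPLOYMENT...: writes OUT_DIR/<deployment>.yml
download_manifests() {
  local out_dir="$1" name
  shift
  mkdir -p "$out_dir" || return 1
  for name in "$@"; do
    # BOSH defaults to bosh; override it (e.g. BOSH=echo) for a dry run.
    "${BOSH:-bosh}" download manifest "$name" "$out_dir/$name.yml" || return 1
  done
}

# Example, using the deployment name from the slide:
#   download_manifests ./manifests cf-c700aee17d9f801eb152
```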

Page 42: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

SCENARIO TWO

LOSE BOSH

Page 43: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

Restoring Bosh With PCF

Export Configuration

Import Configuration

:/var/tempest/workspaces/default/deployments/micro

BOSH Director

+ bosh.yml

Page 44: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

Restoring Bosh Manually

[Diagram labels: BOSH; bosh.yml; BOSH DB (pg_dump, or External MySQL); Volume: /var/vcap/store (/dev/xvda, /dev/sdb, /dev/sdf); Blobstore]

Page 45: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

Critical Databases

Backup Cloud Controller DB Encryption Credentials

Locate Databases Info From Deployment Manifest

bosh download manifest cf-c700aee17d9f801eb152 cfmanifest.yml
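Once the manifest is downloaded, the database settings can be located with a simple search. The sketch below is illustrative only: the key names it searches for (`db_encryption_key`, `ccdb`, `uaadb`) are assumptions about how the manifest is laid out, and the function name is hypothetical. The matching lines will contain credentials, so treat the output as sensitive.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: list the lines of a downloaded manifest that mention
# database settings. The key names searched for are assumptions about the
# manifest layout, not a definitive list.

# find_db_settings MANIFEST: print matching lines prefixed with line numbers.
find_db_settings() {
  local manifest="$1"
  grep -nE 'db_encryption_key|ccdb|uaadb' "$manifest"
}

# Example:
#   find_db_settings cfmanifest.yml
```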

Page 46: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

NFS / Blobstore

✦ Managing Access with ACLs

✦ Create Group Bucket Policy for “Deny DeleteBucket”

✦ Turn on versioning

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Deny",
      "Action": [
        "s3:DeleteBucket",
        "s3:DeleteObjectVersion"
      ],
      "Resource": [
        "*"
      ]
    }
  ]
}
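One way to apply the two blobstore protections with the AWS CLI is sketched below: write the deny policy to a file, attach it to an IAM group, and enable versioning on the bucket. The group name `cf-operators`, the policy name and the bucket name `my-cf-blobstore` are placeholders, not values from the talk.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: write the deny policy from the slide to a file, then
# attach it to an IAM group and enable bucket versioning with the AWS CLI.
# The group, policy and bucket names are placeholders.

POLICY_FILE="${POLICY_FILE:-deny-bucket-delete.json}"

cat > "$POLICY_FILE" <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Deny",
      "Action": ["s3:DeleteBucket", "s3:DeleteObjectVersion"],
      "Resource": ["*"]
    }
  ]
}
EOF

# Not run here (needs AWS credentials):
#   aws iam put-group-policy --group-name cf-operators \
#     --policy-name deny-bucket-delete \
#     --policy-document "file://$POLICY_FILE"
#   aws s3api put-bucket-versioning --bucket my-cf-blobstore \
#     --versioning-configuration Status=Enabled
```

With versioning on and `s3:DeleteObjectVersion` denied, a deleted or overwritten blob remains recoverable from its earlier version.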

Page 47: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

Takeaway

Page 48: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

Takeaways

✦ Tradeoffs: No “One Size Fits All”

✦ Service Layer

✦ Existing: Environmental Security and Networking Constraints

✦ Backup: Configuration, Databases, Blobstore (This is your CF).

Page 49: Cloud Foundry Summit 2015: Building a Robust Cloud Foundry (HA, Security and DR)

KEEP CALM

AND

CF PUSH