OpenStack Summit – Hong Kong - 2013: OPENSTACK HA @PAYPAL

ABOUT PAYPAL

PayPal offers flexible and innovative payment solutions for consumers and merchants of all sizes.

• 137,000,000 users

• $300,000 in payments processed each minute

• 193 markets / 26 currencies

• The World’s Most Widely Used Digital Wallet

AGENDA

Why is HA important for PayPal?

Our Learning

Our Solution

What is not solved?

Q&A

WHY IS HA IMPORTANT?

“No perceived downtime” for cloud users

Enterprise Class

Auto Scaling & Flex up/down can never break

API Integrations always succeed

Everyone is expected to use the cloud

AVAILABILITY REQUIREMENTS

No SPOF “Under the Cloud”

Scale Across the Data Center(s)

Scale Across Racks & Containers

Respect natural availability zones within the data centers

No ‘cloud’ can impact any other ‘cloud’

INFRASTRUCTURE RACK

[Diagram. Link labels: 10g Active, 10g Passive, 1g Mgmt. Other labels: LB Active, LB Passive, Access, Compute Racks … Infrastructure / Controller Racks.]

Layer 2 versus Layer 3

Cattle & Puppies

INFRASTRUCTURE RACK

OpenStack services all run as VMs on KVM

Every infra component resides on 2+ nodes

Redundant physical racks

Redundant power/switches in each rack

Layer-3 connectivity between racks (no Layer 2)

Enterprise Grade Physical LB (floating VIP)
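The redundancy rules on this slide ("every infra component resides on 2+ nodes", "redundant physical racks") lend themselves to a mechanical check. The following is a minimal sketch, assuming a hypothetical placement map from component name to the (rack, node) pairs hosting it. The function name `find_spofs` and the sample inventory are illustrative and do not come from the talk.

```python
from typing import Dict, List, Tuple

# component name -> list of (rack, node) pairs hosting it (assumed data model)
Placement = Dict[str, List[Tuple[str, str]]]


def find_spofs(placement: Placement) -> Dict[str, str]:
    """Return components that break the 2+ nodes / redundant racks rule."""
    problems = {}
    for component, hosts in placement.items():
        nodes = set(hosts)                    # distinct (rack, node) pairs
        racks = {rack for rack, _ in hosts}   # distinct racks
        if len(nodes) < 2:
            problems[component] = "runs on fewer than 2 nodes"
        elif len(racks) < 2:
            problems[component] = "all nodes share one rack"
    return problems


# Hypothetical inventory, for illustration only.
inventory = {
    "keystone": [("rack-a", "kvm01"), ("rack-b", "kvm07")],
    "glance": [("rack-a", "kvm02"), ("rack-a", "kvm03")],
    "rabbitmq": [("rack-b", "kvm08")],
}
problems = find_spofs(inventory)
```

Here `keystone` passes, while `glance` and `rabbitmq` are flagged. A check like this could run against whatever inventory source is available before a deployment is accepted.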

COMPUTE

[Diagram: four compute nodes, each labeled “Compute Node / 96 Hyperscale / 16 Core / 256GB Ram / 1.1T Disk”, with link labels 10g Active, 10g Passive, 1g Mgmt. Other labels: LB Active, LB Passive, Access; callouts 1, 2, 3.]

COMPUTE

[Diagram: compute node wiring. Labels: Active, Passive, 10g, 1g, Management, bond0 (×2), Hyperscale Raid-10 (×2), Top Of Rack (×2).]

OPENSTACK SERVICES

OPENSTACK CONSIDERATIONS

LB VIP for every service (unless the service can’t use one)

Connect to the LB VIP, not to individual nodes

Script to close server connections

Pacemaker only works inside a single Layer-2 domain (not across a large enterprise network)

Auto restart using Monit

MySQL

Swift Cluster
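The "auto restart" consideration boils down to a check-and-restart cycle. Monit itself is configured declaratively, so the Python below is not Monit syntax. It is only a minimal sketch of the idea, assuming hypothetical probe and restart callables. `supervise_once`, `fake_restart`, and the service states are made up for illustration.

```python
from typing import Callable, Dict, List


def supervise_once(
    services: Dict[str, Callable[[], bool]],
    restart: Callable[[str], None],
) -> List[str]:
    """Run one check cycle and restart every service whose probe fails."""
    restarted = []
    for name, is_alive in services.items():
        try:
            healthy = is_alive()
        except Exception:
            healthy = False  # a crashing probe counts as a failed check
        if not healthy:
            restart(name)
            restarted.append(name)
    return restarted


# Hypothetical state; a real probe might check a PID file or an API port.
state = {"nova-api": True, "glance-api": False}
restart_log = []


def fake_restart(name: str) -> None:
    restart_log.append(name)
    state[name] = True  # pretend the restart brought the service back


probes = {name: (lambda n=name: state[n]) for name in state}
first_pass = supervise_once(probes, fake_restart)
second_pass = supervise_once(probes, fake_restart)
```

The first pass restarts only the failed service, and the second pass finds nothing to do. A supervisor of this kind repeats the cycle on a fixed interval.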

Palanisamy, Anand

OPENSTACK CONSIDERATIONS (CONTINUED)

Heat with Corosync/Pacemaker/keepalived (for now)

Keystone / Nova / Glance / Swift Proxy

RabbitMQ Cluster

Cinder Volume Service

CINDER SERVICES WORKFLOW

The figure shows a typical interaction between Cinder components serving an end-user request (creating a new volume in this example).

[Figure: User request (create volume), Cinder API, Cinder Scheduler, Cinder Volume, AMQP, Storage Back-end 1, Storage Back-end 2; numbered steps 1 to 6.]
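The create-volume workflow can be sketched as three services passing messages through a queue. This is a simplified model and not Cinder code. The `Bus` class stands in for the AMQP broker, the single "volume" topic and the "most free space" scheduling policy are assumptions, and every name here is hypothetical.

```python
from collections import deque


class Bus:
    """Stand-in for the AMQP broker: one FIFO queue per topic."""

    def __init__(self):
        self.topics = {}

    def publish(self, topic, message):
        self.topics.setdefault(topic, deque()).append(message)

    def consume(self, topic):
        queue = self.topics.get(topic)
        return queue.popleft() if queue else None


def api_create_volume(bus, size_gb):
    # The API accepts the user request and queues it for the scheduler.
    bus.publish("scheduler", {"op": "create_volume", "size_gb": size_gb})


def scheduler_step(bus, free_gb):
    # The scheduler picks a back-end; "most free space" is an assumed policy.
    request = bus.consume("scheduler")
    if request is None:
        return None
    fits = {name: free for name, free in free_gb.items()
            if free >= request["size_gb"]}
    if not fits:
        raise ValueError("no back-end has enough free space")
    backend = max(fits, key=fits.get)
    bus.publish("volume", dict(request, backend=backend))
    return backend


def volume_step(bus, free_gb):
    # The volume service takes the request and allocates on the back-end.
    request = bus.consume("volume")
    if request is None:
        return None
    free_gb[request["backend"]] -= request["size_gb"]
    return "{}GB volume on {}".format(request["size_gb"], request["backend"])


bus = Bus()
free_gb = {"backend1": 100, "backend2": 500}
api_create_volume(bus, 50)
chosen = scheduler_step(bus, free_gb)
result = volume_step(bus, free_gb)
```

Because the services only talk through the queue, any of them can be replaced or duplicated without the others knowing, which is the property an HA setup builds on.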

CINDER SERVICES WITH HA

How HA is implemented for Cinder components:

• API (stateless) – Load Balancer (A/A or A/P);

• Scheduler (stateless) – Pacemaker, Queue itself (A/A or A/P);

• Volume – Pacemaker, Queue itself (A/A or A/P).

[Figure: User request (create volume), Load Balancer, Cinder API A and B, Cinder Scheduler A and B, Cinder Volume A and B, AMQP Cluster, Storage Back-end 1, Storage Back-end 2; numbered steps 1 to 6.]
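The "queue itself" option for active/active can be illustrated with two instances of a service consuming from one shared queue. The sketch below assumes round-robin delivery among live consumers, which is roughly how a broker shares a queue. `Worker` and `dispatch` are hypothetical names, and no real AMQP client is involved.

```python
from collections import deque


class Worker:
    """One instance of a queue-driven service, e.g. scheduler A or B."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.handled = []


def dispatch(queue, workers):
    """Deliver each queued message to the next live worker, round-robin."""
    turn = 0
    while queue:
        live = [w for w in workers if w.alive]
        if not live:
            break  # messages stay queued until a worker comes back
        worker = live[turn % len(live)]
        worker.handled.append(queue.popleft())
        turn += 1


scheduler_a, scheduler_b = Worker("A"), Worker("B")
pool = [scheduler_a, scheduler_b]

# Both instances up: the work is shared (active/active).
dispatch(deque([1, 2, 3, 4]), pool)

# Instance B fails: A takes over without any failover step.
scheduler_b.alive = False
dispatch(deque([5, 6]), pool)

# Both down: nothing is lost, the message simply waits.
scheduler_a.alive = False
pending = deque([7])
dispatch(pending, pool)
```

No cluster manager is needed for this kind of failover, since the surviving consumer simply keeps draining the queue. That is presumably why the slide lists the queue as an alternative to Pacemaker for the scheduler and volume services.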

UNRESOLVED

VIP-friendly Cinder Volume service

Seamless Upgrade Flip

Failed DB TX Reconciliation

Consistent API Response Time

THANK YOU

HTTP://GITHUB.COM/PAYPAL/AURORA

SCOTT CARLSON - @RELAXED137

RAJ GEDA

ZHITENG HUANG - IRC: WINSTON-D