Deep Dive into Highly Available OpenStack Architecture: OpenStack Summit Vancouver 2015

Post on 21-Apr-2017


Arthur Berezin, Sr. Technical Product Manager, Red Hat

Deep Dive into Highly Available OpenStack Architecture

OpenStack Summit Vancouver, May 2015

Agenda

★ HA Enabling Services: Pacemaker and HAProxy

★ Shared Services: MariaDB w/ Galera, RabbitMQ w/ Mirrored Queues

★ OpenStack Services: Keystone, Nova, Neutron, Glance, Cinder, Horizon

★ Topologies: Controller, Compute, Network, Storage

cc: Morio2015 Source: https://www.wikiwand.com/en/Scuderia_Ferrari

Losing Your Controller

https://www.youtube.com/watch?v=Kb43Nxuwc4I

High Availability

● Minimize downtime by avoiding single points of failure (SPOF)
● Create service redundancy
  ○ Active-Active when possible
    ■ Stateless services or services with internal HA support
    ■ Active-Passive if nothing else is applicable
● Scale-out architecture
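The effect of redundancy on downtime can be illustrated with a small calculation. This is a minimal sketch, assuming replicas fail independently and that the service is up as long as at least one replica is up; the 99% per-node figure is an illustrative assumption, not a number from the talk.

```python
HOURS_PER_YEAR = 24 * 365


def combined_availability(single: float, replicas: int) -> float:
    """Availability of `replicas` independent copies of a service.

    Assumes independent failures and that one healthy copy is enough.
    """
    if not 0.0 <= single <= 1.0:
        raise ValueError("single must be between 0 and 1")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    # The service is down only when every replica is down at once.
    return 1.0 - (1.0 - single) ** replicas


def downtime_hours_per_year(availability: float) -> float:
    """Expected hours of downtime per (non-leap) year."""
    return (1.0 - availability) * HOURS_PER_YEAR
```

For example, `combined_availability(0.99, 3)` shows how three replicas of a 99%-available node compare with a single one. Real clusters rarely have fully independent failures, so treat the result as an upper bound.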

HA Enabling Technologies: Pacemaker, HAProxy

● Cluster Resource Manager
● Uses Corosync for cluster communication
● Monitors and controls resources:
  ○ Floating Virtual IP address (VIP)
  ○ systemd/LSB/OCF services
  ○ Cloned services (Active/Active)
● STONITH: fencing with power management
  ○ Important for ensuring data consistency

Pacemaker

● Virtual IP (VIP)
● systemd cloned resource
● STONITH fencing

[Diagram: Pacemaker managing an OpenStack service across Node 1 (192.168.1.1) and Node 2 (192.168.1.2). Each node runs pcsd, STONITH, and a cloned Service instance, reachable through the service Virtual IP 10.0.0.1.]
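The VIP and cloned systemd resource described above are typically created with the `pcs` command-line tool. The following is a minimal sketch that only builds the command lines; it does not run them. The resource names `vip-service` and `haproxy` are hypothetical, and the `--clone` form reflects pcs releases from around the time of the talk, so check the syntax against the pcs version in use. STONITH devices are omitted because their options depend on the fence agent and hardware.

```python
import shlex


def vip_resource(name: str, ip: str, cidr_netmask: int = 24) -> list[str]:
    """Command line for a floating IP managed by the IPaddr2 OCF agent."""
    return [
        "pcs", "resource", "create", name, "ocf:heartbeat:IPaddr2",
        f"ip={ip}", f"cidr_netmask={cidr_netmask}",
        "op", "monitor", "interval=30s",
    ]


def cloned_systemd_resource(name: str, unit: str) -> list[str]:
    """Command line for a systemd unit cloned on every cluster node."""
    return ["pcs", "resource", "create", name, f"systemd:{unit}", "--clone"]


# Hypothetical names; the VIP address is the one shown in the diagram.
for command in (
    vip_resource("vip-service", "10.0.0.1"),
    cloned_systemd_resource("haproxy", "haproxy"),
):
    print(shlex.join(command))
```

Building the commands as argument lists keeps quoting explicit, which matters if they are later passed to `subprocess.run`.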

HAProxy Load Balancer

Load Balancing and Proxy for HTTP/TCP

● Mature and popular with web applications
● Health checking
● Load distribution
  ○ Round robin
  ○ Stick-table
● API isolation
● Failure detection

[Diagram: an HAProxy load balancer on Node 1 distributing requests to Service instances on Node 2 and Node 3.]
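Round-robin distribution combined with health checking can be sketched in a few lines. This is an illustrative model of the idea, not HAProxy's implementation; the class and method names are hypothetical, and a real health check would probe the backend over the network rather than being set by hand.

```python
class RoundRobinBalancer:
    """Cycle through backends in order, skipping ones marked unhealthy."""

    def __init__(self, backends):
        self._backends = list(backends)
        self._healthy = set(self._backends)
        self._next = 0  # index of the next backend to try

    def mark_down(self, backend):
        """Record a failed health check."""
        self._healthy.discard(backend)

    def mark_up(self, backend):
        """Record a passed health check for a known backend."""
        if backend in self._backends:
            self._healthy.add(backend)

    def pick(self):
        """Return the next healthy backend in round-robin order."""
        for _ in range(len(self._backends)):
            candidate = self._backends[self._next]
            self._next = (self._next + 1) % len(self._backends)
            if candidate in self._healthy:
                return candidate
        raise RuntimeError("no healthy backend available")
```

With three backends, marking one down means requests alternate between the remaining two until `mark_up` restores it.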

Avoiding SPOFs: A Day in a Highly Available Service Life

[Diagram 1: A client requests the Horizon web UI ("Give Me Horizon Web UI NOW!") from Horizon on a single controller.]

[Diagram 2: The same setup, with the single Horizon controller marked as a Single Point Of Failure.]

[Diagram 3: Horizon runs on Controllers 1, 2, and 3, fronted by HAProxy on Controller 1.]

[Diagram 4: The same layout. HAProxy on Controller 1 is marked as a Single Point Of Failure, and each Horizon instance could fail.]

[Diagram 5: Horizon becomes a Pacemaker cloned service. HAProxy on Controller 1 is still a Single Point Of Failure.]

[Diagram 6: HAProxy runs on Controllers 1, 2, and 3 as a Pacemaker cloned service, alongside the Pacemaker cloned Horizon service.]

[Diagram 7: The client reaches the Horizon VIP, which fronts the cloned HAProxy and cloned Horizon services on all three controllers.]
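The final layout, a VIP in front of HAProxy balancing across three Horizon instances, corresponds to a short HAProxy `listen` section. The sketch below renders one as text. The directives (`listen`, `bind`, `balance roundrobin`, `server ... check`) are standard HAProxy configuration keywords, while the section name, port, and controller addresses are assumptions for illustration.

```python
def render_listen_block(name, vip, port, backends):
    """Render an HAProxy `listen` section for one load-balanced service.

    `backends` is a list of (server_name, address) pairs.
    """
    lines = [
        f"listen {name}",
        f"    bind {vip}:{port}",      # clients connect to the VIP
        "    balance roundrobin",
    ]
    for server_name, address in backends:
        # `check` enables periodic health checks for this server.
        lines.append(f"    server {server_name} {address}:{port} check")
    return "\n".join(lines) + "\n"


# Hypothetical addresses for the three controllers.
print(render_listen_block(
    "horizon", "10.0.0.1", 80,
    [("controller1", "192.168.1.1"),
     ("controller2", "192.168.1.2"),
     ("controller3", "192.168.1.3")],
))
```

Because HAProxy itself is cloned on every controller, the same rendered section would be deployed to all three nodes, and Pacemaker decides which node holds the VIP.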

Shared Components: Database, Messaging

Galera with MariaDB

● Active-Active multi-master synchronous replication
● Automatic node joining
● Row-level parallel replication
● Native with MariaDB

[Diagram: DB Node 1, DB Node 2, and DB Node 3 each run MariaDB with the wsrep API, linked by Galera replication.]
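One reason Galera clusters are commonly built with three nodes is majority-based quorum: a partition keeps accepting writes only if it contains more than half of the cluster. The sketch below captures that rule under simplifying assumptions (all nodes have equal weight, no arbitrator node); the function names are hypothetical.

```python
def has_quorum(reachable: int, cluster_size: int) -> bool:
    """True if `reachable` nodes form a strict majority of the cluster."""
    if not 0 <= reachable <= cluster_size:
        raise ValueError("reachable must be between 0 and cluster_size")
    return 2 * reachable > cluster_size


def tolerated_failures(cluster_size: int) -> int:
    """Largest number of nodes that can be lost while keeping quorum."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return (cluster_size - 1) // 2
```

Under these assumptions a two-node cluster tolerates no failures, because one surviving node is exactly half and not a majority, while a three-node cluster tolerates one.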

RabbitMQ Clustering with Mirrored Queues

[Diagram: three RabbitMQ nodes joined by RabbitMQ clustering, with a mirrored queue replicated across them.]
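Queue mirroring in RabbitMQ releases of this era is enabled through a policy rather than per queue. The sketch below builds a `rabbitmqctl set_policy` command that mirrors every matching queue to all cluster nodes. The policy name `ha-all` and the catch-all pattern are assumptions; a real deployment may prefer a narrower pattern. Classic mirrored queues were later deprecated in favor of quorum queues, so this applies to the RabbitMQ versions contemporary with the talk.

```python
import json
import shlex


def mirror_all_policy_command(policy_name: str = "ha-all",
                              pattern: str = "^") -> list[str]:
    """Command line for a policy that mirrors matching queues to all nodes."""
    definition = json.dumps({"ha-mode": "all"})
    return ["rabbitmqctl", "set_policy", policy_name, pattern, definition]


# Print a shell-safe version of the command instead of running it.
print(shlex.join(mirror_all_policy_command()))
```

Generating the JSON with `json.dumps` avoids hand-written quoting mistakes in the policy definition.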

OpenStack Services: Keystone, Glance, Cinder, Nova, Neutron, Horizon

Keystone

Service:
★ httpd/Keystone
  ○ API
  ○ Assignments
  ○ Identities
    ■ LDAP
    ■ SQL

[Diagram: an API call reaches Keystone running under HTTPD, with backends labeled SQL: Assignments, SQL: Identities, and LDAP: Identities.]

● Cloned stateless HTTPD service
● Same SSL certs on all nodes
● Cache is local on each host

[Diagram: Node 1 and Node 2 each run pcsd, STONITH, HAProxy, and a cloned HTTPd/Keystone instance behind the Keystone VIP. API calls arrive through the VIP; backends are labeled SQL: Assignments, SQL: Identities, and LDAP: Identities.]


Glance

Service:
★ Glance-API
  ○ API
  ○ Storage calls
★ Glance-Registry
  ○ Keeps the image registry in the database

[Diagram: Glance-API and Glance-Registry, with labels for SQL, Storage, HTTP, RabbitMQ, and Ceilometer notifications.]

● Both services are cloned Active/Active
● Both services sit behind a load balancer (LB) and a VIP

[Diagram: Node 1 and Node 2 each run pcsd, STONITH, HAProxy, cloned Glance-API, and cloned Glance-Registry. A Glance API VIP and a Glance Registry VIP front the services, which use SQL and the image store.]


Cinder

★ Cinder-API
  ○ API
★ Cinder-Scheduler
  ○ Volume placement
★ Cinder-Volume
  ○ Manages storage
★ Cinder-Backup

[Diagram: Cinder-API, Cinder-Scheduler, Cinder-Volume with its storage driver, and Cinder-Backup, with labels for SQL, RabbitMQ, and Storage, plus a VM data path to storage.]

● Cinder-API is a stateless cloned service
● LB and VIP
● Cinder-Volume is A/P due to potential races
● Cinder-Backup is A/P

[Diagram: Node 1 and Node 2 each run pcsd, STONITH, HAProxy, cloned Cinder-API, and cloned Scheduler behind the Cinder API VIP. Cinder-Volume and its driver run A/P in front of the storage backend.]

Nova

★ Nova-API
  ○ API
★ Nova-Scheduler
  ○ VM placement
★ Nova-Conductor
  ○ Updates the DB on Compute's behalf
★ Nova-Compute
  ○ Runs VM instances

[Diagram: controller services (Nova-API, Nova-Scheduler, Nova-Conductor, SQL, RabbitMQ) and the compute side (Nova-Compute on libvirt/KVM running VMs).]


Controller Services

● Nova-API configured with LB and VIP
● Nova-API, Nova-Scheduler, and Nova-Conductor are stateless A/A cloned services

[Diagram: Node 1 and Node 2 each run pcsd, STONITH, and cloned HAProxy, Nova-API, Scheduler, and Conductor instances behind the Nova-API VIP, backed by SQL and RabbitMQ.]

Compute Service

● Each host is independent
● Nova-compute is watched locally by systemd
● VM HA is not supported (yet), probably in Liberty

[Diagram: Compute1 and Compute2 each run Nova-Compute on libvirt/KVM with their VMs.]

Compute Service

● Probably supported in Liberty
● Each host is independent
● Nova-compute is watched locally by systemd
● Liberty Blueprint: Mark Host Down

Nova VM HA

[Diagram: two compute nodes, each running Nova-Compute on libvirt/KVM with VMs, plus STONITH and pacemaker_remote.]

Neutron

★ Neutron Server
  ○ API and management
★ Neutron L2 Agent
  ○ L2 traffic on compute
★ Neutron L3 Agent
  ○ Network routing
★ DHCP Agent
★ LBaaS Agent

[Diagram: Neutron Server with SQL and RabbitMQ, the L2 agent(s) on Open vSwitch, the L3 agent, the DHCP agent, and the LBaaS agent, with a Controller label.]

[Diagram: Neutron Server with SQL and RabbitMQ; a Neutron network node running the L3, DHCP, and LBaaS agents plus L2 agent(s) on Open vSwitch, with router R1 and a link to the Internet; Compute1 and Compute2 each running L2 agent(s) on Open vSwitch with their VMs.]

[Diagram: Neutron HA layout. Controller1 and Controller2 each run pcsd, cloned HAProxy, and cloned Neutron-API behind the Neutron API VIP. NetworkNode1 and NetworkNode2 run the L2, L3, DHCP, and LBaaS agents, with router R1 on both nodes; the agents carry the labels A/P, Cloned, and Cloned + VRRP. Compute1 runs an L2 agent, and agents communicate over RPC.]

Neutron

● Kilo
  ○ L3 Agent HA with VRRP
  ○ DHCP Agent HA
● Liberty
  ○ L3 Agent - DVR
  ○ DVR + VRRP
● Longer term
  ○ Distributed DHCP on compute nodes
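L3 agent HA with VRRP relies on the protocol's master election: among the live routers in a group, the one with the highest priority becomes master, and a tie is broken in favor of the higher primary IP address. The sketch below models only that rule. It ignores timers and preemption settings, and it is a generic VRRP illustration rather than a description of how Neutron configures its router instances; the class and field names are hypothetical.

```python
from dataclasses import dataclass, replace
from ipaddress import IPv4Address


@dataclass(frozen=True)
class VrrpRouter:
    name: str
    priority: int          # higher wins the election
    address: IPv4Address   # primary IP, used as the tie-breaker
    alive: bool = True


def elect_master(routers):
    """Return the live router that wins the election, or None."""
    candidates = [r for r in routers if r.alive]
    if not candidates:
        return None
    return max(candidates, key=lambda r: (r.priority, int(r.address)))


# Hypothetical router instances on two network nodes.
node1 = VrrpRouter("networknode1", 100, IPv4Address("192.168.1.1"))
node2 = VrrpRouter("networknode2", 50, IPv4Address("192.168.1.2"))
print(elect_master([node1, node2]).name)
print(elect_master([replace(node1, alive=False), node2]).name)
```

When the current master stops advertising, the remaining router wins the next election and takes over the virtual address, which is what makes the failover transparent to tenants.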

Horizon

Service:
★ httpd/OpenStack-Dashboard
  ○ Django web app
  ○ Uses the services' APIs

[Diagram: a browser connects to Horizon, which calls the Keystone, Nova, Glance, Cinder, and Neutron APIs.]

● Cloned stateless HTTPd service
● Same SSL certs on all nodes
● Cache is local on each host

[Diagram: Node 1 and Node 2 each run pcsd, STONITH, HAProxy, and a cloned HTTPd/Horizon instance behind the Horizon VIP.]

Topologies: Controller, Compute, Network, Storage

Active - Active Controller Cluster

[Diagram: Controller 1, Controller 2, and Controller 3 each run HAProxy, Pacemaker, Keystone, Neutron, Cinder, and the other controller services, on top of MariaDB with Galera multi-master replication and RabbitMQ with mirrored queues.]

Controller Cluster

[Diagram: a controller cluster of Controller 1, Controller 2, and Controller 3 running the controller services, plus Compute1 and Compute2 running Nova Compute, connected by the Public, Tenant, and Management networks.]

Controller Cluster

[Diagram: the controller cluster (Controller 1, Controller 2, Controller 3), a network cluster (Neutron NetworkNode1, NetworkNode2, NetworkNode3), and Compute1 and Compute2 running Nova Compute, connected by the Public, Tenant, and Management networks.]

Controller Cluster

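[Diagram: the controller cluster (Controller 1, Controller 2, Controller 3), a storage cluster (CinderGlanceNode1, CinderGlanceNode2, CinderGlanceNode3) providing volume storage and the image store, and Compute1 and Compute2 running Nova Compute, connected by the Storage and Management networks.]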

Resources

RDO HA Ref Arch

https://github.com/beekhof/osp-ha-deploy

Layer 3 High Availability - VRRP DVR DHCP

http://assafmuller.com/2014/08/16/layer-3-high-availability/

DVR

http://assafmuller.com/2015/04/15/distributed-virtual-routing-overview-and-eastwest-routing/

Creating a Highly Available Red Hat OpenStack Platform Configuration (OSP5 and RHEL 7)

https://access.redhat.com/articles/1150463

About High Availability with OpenStack Platform

https://access.redhat.com/articles/1274203

New nova API call to mark nova-compute down

https://review.openstack.org/#/c/169836/

The Different Facets of OpenStack HA

http://blog.russellbryant.net/2015/03/10/the-different-facets-of-openstack-ha/

Implementation of Pacemaker Managed OpenStack VM Recovery

http://blog.russellbryant.net/2015/04/08/implementation-of-pacemaker-managed-openstack-vm-recovery/

HA Talks during Summit

HA Infrastructure Talks

Pacemaker: OpenStack’s PID 1

MariaDB Galera cluster : Best practices

High Availability Architecture

Deep Dive Into a Highly Available OpenStack Architecture

Real World Practices

Highly Available OpenStack: From Theory to Reality

Lessons learned on upgrades: the importance of HA and automation

Providing OpenStack Service High-Availability Through Anycast Routing

HA Storage Talks

Keeping OpenStack storage trendy with Ceph and containers

DRBD9 for OpenStack

The Road to Enterprise-Ready OpenStack Storage as Service

Dude, where is my volume

HA Networking Talks

Highly Available, Performant, VXLAN Service Node

IPv6 impact on Neutron L3 High Availability

High Availability and Resiliency Testing Strategies for OpenStack Clouds

Thank You
