Chef cookbooks for OpenStack HA

Adam Spiers
Senior Software Engineer

[email protected]

SUSE® OpenStack Cloud
Chef cookbooks for HA
Technical overview for curious upstream #openstack-chef developers



Agenda

These slides were extracted from internal HA training for SUSE OpenStack Cloud developers, and slightly modified for the benefit of the upstream #openstack-chef community.

• barclamp-pacemaker

• Synchronization

• Maintenance mode

• HA-enabled barclamps

Tip: some handy hyperlinks in this deck!


barclamp-pacemaker

• SUSE OpenStack Cloud uses the Crowbar deployment framework, which is extensible via plugins called “barclamps”

• The core of the HA functionality is provided via the Pacemaker barclamp, which:

‒ exposes cluster membership/configuration options via the Crowbar UI

‒ sets up the bare cluster and related components

‒ provides Chef cookbooks so other barclamps (Keystone, Glance etc.) can make their own services HA

• This barclamp is mature, heavily tested, and deployed in many production OpenStack clouds around the world.

https://github.com/crowbar/barclamp-pacemaker

barclamp-pacemaker internals

corosync cookbook

• Completely independent of Crowbar

‒ TODO: desperately needs to be upstreamed

• Under chef/cookbooks/corosync/

• Configures /etc/corosync/

‒ including authkey generation / propagation

‒ Founder node generates it

‒ Other nodes get a copy

• Contains fail-safe cluster startup logic (e.g. to prevent STONITH loops)

https://github.com/crowbar/barclamp-pacemaker/tree/master/chef/cookbooks/corosync
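The "founder generates, others copy" rule for the authkey can be illustrated with a small plain-Ruby sketch. This is not the cookbook's code: the shared store, the helper name authkey_for and the 128-byte key length are all assumptions made for illustration.

```ruby
require "securerandom"

# Hypothetical stand-in for a store that every node can read
# (not the mechanism the cookbook actually uses).
SHARED_STORE = {}

AUTHKEY_BYTES = 128 # assumed key length, for illustration only

# The founder generates the key once and publishes it; every other node
# copies the published key instead of generating its own.
def authkey_for(node_name, founder_name, store = SHARED_STORE)
  if node_name == founder_name
    store[:authkey] ||= SecureRandom.random_bytes(AUTHKEY_BYTES)
  else
    store.fetch(:authkey) { raise "founder has not published the authkey yet" }
  end
end

founder_key = authkey_for("node1", "node1")
member_key  = authkey_for("node2", "node1")
puts founder_key == member_key
```

A non-founder that runs before the founder has published the key fails loudly in this sketch rather than inventing its own key, since nodes with different keys could not form a cluster.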


pacemaker cookbook

• The heart of the barclamp!

• Under chef/cookbooks/pacemaker/

• Completely independent of Crowbar

‒ TODO: upstreaming desperately needs to be finished!

‒ already used git subtree to export subdirectory to https://github.com/stackforge/cookbook-pacemaker

‒ need to document properly

‒ need to set up Travis CI

‒ automate propagation of changes between repos via ci.opensuse.org Jenkins instance?

• Depends on corosync cookbook

• Important code, so let's look inside ...

https://github.com/crowbar/barclamp-pacemaker/tree/master/chef/cookbooks/pacemaker

https://github.com/stackforge/cookbook-pacemaker


pacemaker cookbook internals

Two parallel sets of code:

1. Pacemaker::CIBObject class hierarchy

● Takes care of communicating with Pacemaker via crm(8)

2. LWRPs for cluster resources

● Makes it really easy to write recipes which create / manage cluster resources

● Back-end provider uses the Pacemaker::CIBObject class hierarchy

Both sets of code have comprehensive unit test suites!


Pacemaker::CIBObject hierarchy

• Class hierarchy under libraries/pacemaker*

• Independent of Chef

‒ TODO: should be spun out into a separate gem!

• Pacemaker::CIBObject

‒ Pacemaker::Resource

  ‒ Pacemaker::Resource::Primitive

  ‒ Pacemaker::Resource::Clone etc.

‒ Pacemaker::Constraint

  ‒ Pacemaker::Constraint::Location

  ‒ Pacemaker::Constraint::Order etc.

https://github.com/crowbar/barclamp-pacemaker/tree/master/chef/cookbooks/pacemaker/libraries
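To give a feel for how such a hierarchy can turn object attributes into crm(8) command lines, here is a minimal sketch. The Sketch module and its constructors are hypothetical and far simpler than the real classes under libraries/pacemaker*; only the general shape (a base class, a Resource subclass, a Primitive subclass) follows the slide.

```ruby
# Hypothetical, simplified classes for illustration only.
module Sketch
  class CIBObject
    attr_reader :name

    def initialize(name)
      @name = name
    end

    # Command that would remove this object from the cluster configuration.
    def delete_command
      "crm configure delete #{name}"
    end
  end

  class Resource < CIBObject
    def crm_start_command
      "crm resource start #{name}"
    end

    def crm_stop_command
      "crm resource stop #{name}"
    end
  end

  class Primitive < Resource
    def initialize(name, agent, ops = {})
      super(name)
      @agent = agent
      @ops = ops
    end

    # Build a "crm configure primitive ..." command line from the attributes.
    def configure_command
      op_args = @ops.map do |op, attrs|
        pairs = attrs.map { |k, v| %(#{k}="#{v}") }.join(" ")
        "op #{op} #{pairs}"
      end
      (["crm configure primitive", name, @agent] + op_args).join(" ")
    end
  end
end

keystone = Sketch::Primitive.new("keystone", "lsb:openstack-keystone",
                                 { monitor: { interval: "10s" } })
puts keystone.configure_command
puts keystone.crm_start_command
```

Because nothing here touches Chef, classes like these can be unit tested in isolation, which is one reason the slide suggests spinning the real hierarchy out into a gem.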


LWRPs for cluster resources

• Under resources/ and providers/

• pacemaker_primitive, pacemaker_clone etc.

• Has to re-use code via mixins, because LWRPs don't support inheritance :-/

• With hindsight, should have used https://github.com/poise/poise or at least written as a HWRP :-/

https://github.com/crowbar/barclamp-pacemaker/tree/master/chef/cookbooks/pacemaker/resources

https://github.com/crowbar/barclamp-pacemaker/tree/master/chef/cookbooks/pacemaker/providers

https://github.com/poise/poise
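The mixin workaround mentioned above can be sketched in plain Ruby: shared actions live in a module, and each provider-like class includes it and supplies only what differs. All names below are hypothetical and do not match the cookbook's real mixins or providers.

```ruby
# Hypothetical names throughout; this only illustrates the mixin pattern.
module SketchResourceActions
  # Shared "create" logic; each including class supplies #definition.
  def action_create
    "crm configure #{definition}"
  end

  # Shared "delete" logic; relies only on #name.
  def action_delete
    "crm configure delete #{name}"
  end
end

class SketchPrimitiveProvider
  include SketchResourceActions
  attr_reader :name

  def initialize(name, agent)
    @name = name
    @agent = agent
  end

  def definition
    "primitive #{name} #{@agent}"
  end
end

class SketchCloneProvider
  include SketchResourceActions
  attr_reader :name

  def initialize(name, rsc)
    @name = name
    @rsc = rsc
  end

  def definition
    "clone #{name} #{@rsc}"
  end
end

puts SketchPrimitiveProvider.new("keystone", "lsb:openstack-keystone").action_create
puts SketchCloneProvider.new("cl-keystone", "keystone").action_create
```

With class-based resources (an HWRP or poise), the shared actions could instead sit in a common base class.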


Example usage of LWRPs

service_name = "keystone"

pacemaker_primitive service_name do
  agent node[:keystone][:ha][:agent] # "lsb:openstack-keystone"
  # If we used the OCF RA instead of the LSB init script:
  # params ({
  #   "os_auth_url"    => node[:keystone][:api][:admin_auth_URL],
  #   "os_tenant_name" => monitor_creds[:tenant],
  #   "os_username"    => monitor_creds[:username],
  #   "os_password"    => monitor_creds[:password],
  #   "user"           => node[:keystone][:user]
  # })
  op node[:keystone][:ha][:op] # { :monitor => { :interval => "10s" } }
  action :create
end

pacemaker_clone "cl-#{service_name}" do
  rsc service_name
  action [:create, :start]
end


SUSE Cloud HA architecture

[Architecture diagram. Labels: Admin server (Crowbar); cluster nodes, each running chef-client with an HA recipe (e.g. pacemaker_primitive "keystone"); LWRPs + mixins; Pacemaker::CIBObject (#parse_definition, #configure_command, #delete_command); Pacemaker::Resource (#running?, #crm_start_command, #crm_stop_command); Pacemaker::Constraint; crm(8); crmd; CIB (XML); Corosync / OpenAIS.]

crowbar-pacemaker cookbook

• Crowbar-specific code

• Under chef/cookbooks/crowbar-pacemaker/

• LWRPs (under resources/ and providers/)

‒ service (covered next)

‒ sync_mark (more detail later)

‒ drbd and drbd_create_internal

• Recipes:

‒ maintenance-mode (more detail later)

‒ apache, drbd, haproxy, stonith

• Libraries

‒ Various helpers (more detail later)

https://github.com/crowbar/barclamp-pacemaker/tree/master/chef/cookbooks/crowbar-pacemaker/

https://github.com/crowbar/barclamp-pacemaker/tree/master/chef/cookbooks/crowbar-pacemaker/resources/

https://github.com/crowbar/barclamp-pacemaker/tree/master/chef/cookbooks/crowbar-pacemaker/providers/


Chef::Provider::CrowbarPacemakerService

• Alternative provider for HA-enabled service resources

• Ensures that all service management operations (start, stop, restart, reload) are handled safely with respect to Pacemaker

• Was really hard to get this right!!

‒ 119 lines of comments for 92 lines of code

• Despite complexity, goal was ease of use

https://github.com/crowbar/barclamp-pacemaker/blob/master/chef/cookbooks/crowbar-pacemaker/providers/service.rb


Using C::P::CrowbarPacemakerService

service "keystone" do
  service_name node[:keystone][:service_name]
  supports :status => true, :start => true, :restart => true
  action [ :enable, :start ]
  ...
  if ha_enabled
    provider Chef::Provider::CrowbarPacemakerService
  end
end

C::P::CrowbarPacemakerService implementation

• start / stop

‒ always ignored (handled by pacemaker_* LWRP)

• enable / disable

‒ both always translate to disable

• reload

‒ proxied to original service resource iff service is running

• restart

‒ puts node in maintenance mode then restarts
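The action mapping above can be summarised as a small decision function. This is a sketch of the described behaviour, not the provider's code: the helper name effective_service_action and the returned symbols are hypothetical.

```ruby
# Hypothetical helper: returns a symbol describing what would happen for
# each requested action, following the mapping described on the slide.
def effective_service_action(requested, running:)
  case requested
  when :start, :stop
    :ignored                      # left to the pacemaker_* LWRPs
  when :enable, :disable
    :disable                      # both translate to disable
  when :reload
    running ? :reload : :ignored  # only proxied if the service is running
  when :restart
    :restart_in_maintenance_mode  # node enters maintenance mode first
  else
    raise ArgumentError, "unknown action: #{requested.inspect}"
  end
end

%i[start enable reload restart].each do |action|
  puts "#{action} -> #{effective_service_action(action, running: true)}"
end
```

Mapping enable to disable presumably keeps the init system from starting the service at boot, so that only Pacemaker starts it.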


Maintenance mode

• Goal: make it safe to restart a service on a single node without confusing the whole cluster

• Pacemaker provides per-node maintenance mode for exactly this

‒ (not to be confused with per-resource maintenance mode, which is completely different)

• Degrades cluster

‒ need to minimise time spent in maintenance mode

• Multiple resources within one chef-client run might need maintenance mode

‒ but don't want mode to flip-flop a lot


How does maintenance mode work?

• JIT approach:

‒ Switch to maintenance mode first time it's needed within the chef-client run

‒ Switch out at end of run

• Need to handle case where node was already placed in maintenance mode prior to beginning of run (e.g. manually by cloud operator)

• Handlers in /etc/chef/client.rb

‒ pacemaker_start_handler

‒ pacemaker_report_handler

‒ pacemaker_exception_handler

‒ /var/chef/handlers/pacemaker_maintenance_handlers.rb

• libraries/maintenance_mode_helpers.rb
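The just-in-time logic, including the "operator already set it" case, can be modelled roughly as follows. This is a hypothetical model rather than the handlers' code: the class name is invented, and the cluster is faked with a Set of node names that are currently in maintenance mode.

```ruby
require "set"

# Hypothetical model of the just-in-time approach described above.
class MaintenanceModeSketch
  attr_reader :log

  def initialize(node, cluster)
    @node = node
    @cluster = cluster # Set of nodes currently in maintenance mode
    @set_by_us = false
    @log = []
  end

  # Called by any resource that is about to restart a service.
  def ensure_on
    return if @cluster.include?(@node) # already on (by us, or by an operator)
    @cluster << @node
    @set_by_us = true
    @log << :entered
  end

  # Called once at the end of the run, whether it succeeded or failed.
  def finish
    return unless @set_by_us # never undo an operator's manual setting
    @cluster.delete(@node)
    @set_by_us = false
    @log << :left
  end
end

cluster = Set.new
run = MaintenanceModeSketch.new("node1", cluster)
run.ensure_on # first restart in the run: switches the mode on
run.ensure_on # second restart: no flip-flop
run.finish
p run.log
```

Tracking whether this run was the one that switched the mode on is what lets the end-of-run step leave an operator's manual setting alone.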


barclamp-pacemaker: other cookbooks

• Under chef/cookbooks/:

‒ drbd

‒ lvm

‒ haproxy

‒ hawk

• Fairly self-explanatory

Synchronization


Cluster-wide synchronization ‒ the problem

Why is synchronization needed?

Example 1:

• Keystone proposal is applied, with keystone-server role assigned to cluster.

• All nodes start running chef-client more or less in parallel

• Necessary keystone rpms get installed

• Two or more nodes could reach keystone database resource block at more or less the same time

• action :create only creates the database if it doesn't already exist

• Potential race where >= 2 nodes test for existence before any node creates it

• >= 2 nodes attempt to create database at the same time



One will lose the race ...
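The race in example 1 can be reproduced deterministically with a toy model. Everything here is hypothetical (an in-memory fake database server and a helper that forces the unlucky interleaving); it only illustrates why "test, then create" is unsafe when nodes run in parallel.

```ruby
# Hypothetical in-memory "database server": creating a database that
# already exists is treated as an error.
class FakeDbServer
  def initialize
    @databases = []
  end

  def exists?(name)
    @databases.include?(name)
  end

  def create(name)
    raise "database #{name} already exists" if exists?(name)
    @databases << name
  end
end

# Every node runs the existence test before any node creates the database,
# which is the interleaving described in example 1.
def racy_create(server, nodes, db)
  needs_create = nodes.select { |_node| !server.exists?(db) }
  needs_create.map do |node|
    begin
      server.create(db)
      [node, :created]
    rescue RuntimeError
      [node, :failed]
    end
  end
end

p racy_create(FakeDbServer.new, %w[node1 node2], "keystone")
```

Both nodes conclude that the database is missing, so both attempt the create and the second attempt fails.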



Example 2:

• Continuation of scenario from example 1

• keystone::server recipe configures keystone.conf etc.

• then invokes crm configure to add keystone service to cluster.

• Pacemaker starts keystone service ...

• ... but it could start on any node!

• ... even a node which hasn't yet finished installing / configuring keystone!



Founder node initiated failure on non-founder node



Turns out we need two types of synchronization:

1. “Founder goes first”

Ensure one node in cluster (the founder) enters and completes a critical section of a recipe (e.g. "create database") before any other nodes can enter it.

2. “Wait for all nodes”

Ensure all nodes reach the same point ("keystone installed, configured, and ready to start anywhere") before any can proceed further.


Cluster-wide synchronization ‒ how to use

Type 1: “founder goes first”

crowbar_pacemaker_sync_mark "wait-keystone_database"
...
# Create the Keystone database (critical section)
...
crowbar_pacemaker_sync_mark "create-keystone_database"

N.B. the cluster founder gets to perform the critical section before any other node, but every node still performs the critical section, which needs to be idempotent.

What if we only want one node to perform the critical section?



execute "keystone-manage db_sync" do
  command "keystone-manage db_sync"
  user node[:keystone][:user]
  group node[:keystone][:group]
  action :run
  # We only do the sync the 1st time, and only if
  # we're not doing HA or if we are the founder of
  # the HA cluster (so that it's really only done once).
  only_if { !node[:keystone][:db_synced] && (!ha_enabled || CrowbarPacemakerHelper.is_cluster_founder?(node)) }
end
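The only_if guard is easier to check once it is pulled out as a pure function. In this sketch the function name run_db_sync? is hypothetical, and the founder argument stands in for the result of CrowbarPacemakerHelper.is_cluster_founder?(node).

```ruby
# Hypothetical pure-Ruby version of the guard, with the same boolean
# structure as the only_if block.
def run_db_sync?(db_synced:, ha_enabled:, founder:)
  !db_synced && (!ha_enabled || founder)
end

# Print the full truth table.
[true, false].product([true, false], [true, false]).each do |synced, ha, founder|
  result = run_db_sync?(db_synced: synced, ha_enabled: ha, founder: founder)
  puts "db_synced=#{synced} ha_enabled=#{ha} founder=#{founder} -> #{result}"
end
```

The sync runs only when it has not been done yet, and then only on a non-HA node or on the founder of the HA cluster.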



Type 2: “wait for all nodes”

# Wait for all nodes to reach this point so we know
# that all nodes will have all the required packages
# installed before we create the pacemaker resources.
crowbar_pacemaker_sync_mark "sync-keystone_before_ha"


Cluster-wide synchronization ‒ result

All nodes functioning harmoniously


Cluster-wide synchronization ‒ internals

How does it work?

• Hopefully you don't need to know

‒ It should Just Work™

• Chef node attributes used as synchronization “marks”

• See libraries/synchronization.rb for details

• Value defaults to crowbar-revision from proposal

‒ Assumes cookbook name == barclamp name

https://github.com/crowbar/barclamp-pacemaker/blob/master/chef/cookbooks/crowbar-pacemaker/libraries/synchronization.rb
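For the curious, the idea of using attributes as marks can be sketched like this. It is a toy model, not the code in libraries/synchronization.rb: the helper names are hypothetical, and a plain hash (node name to mark name to revision) stands in for Chef node attributes that every node can read.

```ruby
# Hypothetical model: attrs maps node name => { mark name => revision }.
def set_mark(attrs, node, mark, revision)
  (attrs[node] ||= {})[mark] = revision
end

# "Founder goes first": a non-founder may proceed only once the founder
# has set the mark for the current revision.
def founder_done?(attrs, founder, mark, revision)
  attrs.fetch(founder, {})[mark] == revision
end

# "Wait for all nodes": proceed only once every node has set the mark.
def all_nodes_reached?(attrs, nodes, mark, revision)
  nodes.all? { |n| attrs.fetch(n, {})[mark] == revision }
end

attrs = {}
nodes = %w[node1 node2 node3]

set_mark(attrs, "node1", "create-keystone_database", 7)
puts founder_done?(attrs, "node1", "create-keystone_database", 7)

set_mark(attrs, "node1", "sync-keystone_before_ha", 7)
puts all_nodes_reached?(attrs, nodes, "sync-keystone_before_ha", 7)
```

Comparing against a revision rather than a plain boolean means that a mark left over from an earlier proposal does not satisfy the check after the proposal is re-applied.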

HA-enabled barclamps


Patterns for HA-enabled barclamps

HA code in recipes often interleaved with non-HA code:

• Ugly if ha_enabled conditionals

• Synchronization points

• Incompatible with using upstream cookbooks

• but we don't have anything better yet :-/

• Possible solution: split cookbooks into chunks at synchronization points

‒ but would still require intrusive upstream changes

Patterns for HA-enabled barclamps

Interim solution: minimise ugliness!

• Split HA code into separate recipes where possible

if ha_enabled
  include_recipe "keystone::ha"
end

• Use helpers

my_admin_host = CrowbarHelper.get_host_for_admin_url(node, ha_enabled)
my_public_host = CrowbarHelper.get_host_for_public_url(node, node[:keystone][:api][:protocol] == "https", ha_enabled)

• Use custom provider for service resources

if ha_enabled
  provider Chef::Provider::CrowbarPacemakerService
end

Questions?

• I lurk on the Freenode #openstack-chef IRC channel, nick aspiers

• I also lurk on the Chef OpenStack Google group, but am not currently doing a good job of monitoring traffic

• Feel free to mail me at <[email protected]>
