easypostgresqlclusteringwithpatroni antsaasma 21.02 · whyishatricky i...

28
Easy PostgreSQL Clustering with Patroni Ants Aasma 21.02.2017 Ants Aasma 21.02.2017

Upload: lynhi

Post on 11-Nov-2018

217 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: EasyPostgreSQLClusteringwithPatroni AntsAasma 21.02 · WhyisHAtricky I PostgreSQLprovidessingle-masterreplication I Havingmorethanonemasterisworsethanhavingnone. I Needtoagreeonwhogetstobemaster

Easy PostgreSQL Clustering with Patroni

Ants Aasma21.02.2017

Ants Aasma21.02.2017

Page 2: EasyPostgreSQLClusteringwithPatroni AntsAasma 21.02 · WhyisHAtricky I PostgreSQLprovidessingle-masterreplication I Havingmorethanonemasterisworsethanhavingnone. I Needtoagreeonwhogetstobemaster

Introduction

Ants Aasma21.02.2017

Page 3: EasyPostgreSQLClusteringwithPatroni AntsAasma 21.02 · WhyisHAtricky I PostgreSQLprovidessingle-masterreplication I Havingmorethanonemasterisworsethanhavingnone. I Needtoagreeonwhogetstobemaster

About me

I Support engineer at CybertecI Helping others run PostgreSQL for 5 years.I Helping myself run PostgreSQL since 7.4 days.

Ants Aasma21.02.2017

Page 4: EasyPostgreSQLClusteringwithPatroni AntsAasma 21.02 · WhyisHAtricky I PostgreSQLprovidessingle-masterreplication I Havingmorethanonemasterisworsethanhavingnone. I Needtoagreeonwhogetstobemaster

What are we going to talk about

I Patroni - a tool to build high availability clusters withPostgreSQL

I https://github.com/zalando/patroni

I Questions welcome during the talk

Ants Aasma21.02.2017

Page 5: EasyPostgreSQLClusteringwithPatroni AntsAasma 21.02 · WhyisHAtricky I PostgreSQLprovidessingle-masterreplication I Havingmorethanonemasterisworsethanhavingnone. I Needtoagreeonwhogetstobemaster

PostgreSQL clustering state of the art

I pgpool2, Pacemaker, repmgr, . . .

Ants Aasma21.02.2017

Page 6: EasyPostgreSQLClusteringwithPatroni AntsAasma 21.02 · WhyisHAtricky I PostgreSQLprovidessingle-masterreplication I Havingmorethanonemasterisworsethanhavingnone. I Needtoagreeonwhogetstobemaster

What Patroni does

Ants Aasma21.02.2017

Page 7: EasyPostgreSQLClusteringwithPatroni AntsAasma 21.02 · WhyisHAtricky I PostgreSQLprovidessingle-masterreplication I Havingmorethanonemasterisworsethanhavingnone. I Needtoagreeonwhogetstobemaster

Parts of the HA problem

I Detect failureI Promote new masterI Route clients to the correct master

Ants Aasma21.02.2017

Page 8: EasyPostgreSQLClusteringwithPatroni AntsAasma 21.02 · WhyisHAtricky I PostgreSQLprovidessingle-masterreplication I Havingmorethanonemasterisworsethanhavingnone. I Needtoagreeonwhogetstobemaster

Why is HA tricky

I PostgreSQL provides single-master replicationI Having more than one master is worse than having none.I Need to agree on who gets to be master

I When servers failI When networks failI When things almost work but not quite

Ants Aasma21.02.2017

Page 9: EasyPostgreSQLClusteringwithPatroni AntsAasma 21.02 · WhyisHAtricky I PostgreSQLprovidessingle-masterreplication I Havingmorethanonemasterisworsethanhavingnone. I Needtoagreeonwhogetstobemaster

Distributed databases to the rescue

I Distributed consensus algorithms were inveneted to solve this.Paxos, Raft

I Many existing distirbuted consensus databases:I etcdI ConsulI Zookeeper

Ants Aasma21.02.2017

Page 10: EasyPostgreSQLClusteringwithPatroni AntsAasma 21.02 · WhyisHAtricky I PostgreSQLprovidessingle-masterreplication I Havingmorethanonemasterisworsethanhavingnone. I Needtoagreeonwhogetstobemaster

How Patroni solves the HA problem

I Each node runs a Patroni agentI Patroni agent runs a constant loop to checkI health of local PostgreSQLI health of clusterI fix things when they are not idealI A distributed consensus store is used to pick a leader

Ants Aasma21.02.2017

Page 11: EasyPostgreSQLClusteringwithPatroni AntsAasma 21.02 · WhyisHAtricky I PostgreSQLprovidessingle-masterreplication I Havingmorethanonemasterisworsethanhavingnone. I Needtoagreeonwhogetstobemaster

How Patroni works

I Picks one node to initialize the databaseI Clones new nodes joining the cluster using pg_basebackupI Monitors health of PostgreSQLI Promotes a new master if existing master failsI Sets primary_conninfo on other standbysI Rejoins old master using pg_rewind

Ants Aasma21.02.2017

Page 12: EasyPostgreSQLClusteringwithPatroni AntsAasma 21.02 · WhyisHAtricky I PostgreSQLprovidessingle-masterreplication I Havingmorethanonemasterisworsethanhavingnone. I Needtoagreeonwhogetstobemaster

Load balancer

I Patroni does not do client routing or virtual IP movementI libpq expects to see a single host to connect to.

I This will be fixed in PostgreSQL 10I JDBC already supports this directly

I Use your favourite load balancer to perform the connectionforwarding

I HAProxy/nginx + VIP failover/run on app servers and connectto localhost

I F5 BigIP, etc.I Your cloud providers load balancerI Customize your connection pooler

Ants Aasma21.02.2017

Page 13: EasyPostgreSQLClusteringwithPatroni AntsAasma 21.02 · WhyisHAtricky I PostgreSQLprovidessingle-masterreplication I Havingmorethanonemasterisworsethanhavingnone. I Needtoagreeonwhogetstobemaster

Patroni architecture

Figure 1: Patroni overviewAnts Aasma21.02.2017

Page 14: EasyPostgreSQLClusteringwithPatroni AntsAasma 21.02 · WhyisHAtricky I PostgreSQLprovidessingle-masterreplication I Havingmorethanonemasterisworsethanhavingnone. I Needtoagreeonwhogetstobemaster

Demo time

Ants Aasma21.02.2017

Page 15: EasyPostgreSQLClusteringwithPatroni AntsAasma 21.02 · WhyisHAtricky I PostgreSQLprovidessingle-masterreplication I Havingmorethanonemasterisworsethanhavingnone. I Needtoagreeonwhogetstobemaster

Local etcd for testing

# Downloadcurl -sL https://github.com/coreos/etcd/releases/download/\v3.1.1/etcd-v3.1.1-linux-amd64.tar.gz | tar xz

# And runetcd-v3.1.1-linux-amd64/etcd

Ants Aasma21.02.2017

Page 16: EasyPostgreSQLClusteringwithPatroni AntsAasma 21.02 · WhyisHAtricky I PostgreSQLprovidessingle-masterreplication I Havingmorethanonemasterisworsethanhavingnone. I Needtoagreeonwhogetstobemaster

Setting up Patroni

# Install Patronivirtualenv --quiet patroni-venv && source patroni-venv/bin/activatepip install patroni

# Sample configwget https://github.com/zalando/patroni/raw/master/postgres0.ymlvim postgres0.yml

# Ready to gopatroni postgres0.yml

Ants Aasma21.02.2017

Page 17: EasyPostgreSQLClusteringwithPatroni AntsAasma 21.02 · WhyisHAtricky I PostgreSQLprovidessingle-masterreplication I Havingmorethanonemasterisworsethanhavingnone. I Needtoagreeonwhogetstobemaster

Second node

# Change node name, datadir name and portscat postgres0.yml |\

sed s/postgresql0/postgresql1/ |\sed s/:5432/:5433/ |\sed s/:8008/:8009/ > postgres1.yml

patroni postgres1.yml

Ants Aasma21.02.2017

Page 18: EasyPostgreSQLClusteringwithPatroni AntsAasma 21.02 · WhyisHAtricky I PostgreSQLprovidessingle-masterreplication I Havingmorethanonemasterisworsethanhavingnone. I Needtoagreeonwhogetstobemaster

Setting up a load balancer

# Get the system HAProxy package installedsudo apt-get/yum/... install haproxy

# Get confdcurl -O https://github.com/kelseyhightower/confd/releases/download/\v0.11.0/confd-0.11.0-linux-amd64chmod a+x confd-0.11.0-linux-amd64 && ln -s confd-0.11.0-linux-amd64 confd

# Get Patroni confd config examplescurl -sL https://github.com/zalando/patroni/archive/v1.2.3.tar.gz | tar xz# Adjust HAProxy to run locallysed -i 's#/etc/haproxy' patroni-1.2.3/extras/confd/conf.d/haproxy.tomlsed -i 's#/var/run/#haproxy/#' patroni-1.2.3/extras/confd/conf.d/haproxy.toml

Ants Aasma21.02.2017

Page 19: EasyPostgreSQLClusteringwithPatroni AntsAasma 21.02 · WhyisHAtricky I PostgreSQLprovidessingle-masterreplication I Havingmorethanonemasterisworsethanhavingnone. I Needtoagreeonwhogetstobemaster

Setting up a load balancer continued

# Run confd./confd -prefix=/service/batman -backend etcd \

-node http://localhost:2379/ \-interval 10 \-confdir patroni-1.2.3/extras/confd/

Ants Aasma21.02.2017

Page 20: EasyPostgreSQLClusteringwithPatroni AntsAasma 21.02 · WhyisHAtricky I PostgreSQLprovidessingle-masterreplication I Havingmorethanonemasterisworsethanhavingnone. I Needtoagreeonwhogetstobemaster

Wrapping up

Ants Aasma21.02.2017

Page 21: EasyPostgreSQLClusteringwithPatroni AntsAasma 21.02 · WhyisHAtricky I PostgreSQLprovidessingle-masterreplication I Havingmorethanonemasterisworsethanhavingnone. I Needtoagreeonwhogetstobemaster

How we avoid split brain

I All nodes try to acquire leader key in DCS.I Leader key has a timeout, the master runs a loop that keeps

updating the leader key.I If DCS gives an error - PostgreSQL gets demotedI If DCS access times out - PostgreSQL gets demotedI If we discover leader key was timed out - PostgreSQL gets

demotedI When there is no leader other nodes try to contact the previous

leader.I If Patroni is not responding, load balancer removes that node

from rotation.I Future versions will have kernel watchdog support

I If Patroni does not get to run, the whole OS gets rebooted.Ants Aasma21.02.2017

Page 22: EasyPostgreSQLClusteringwithPatroni AntsAasma 21.02 · WhyisHAtricky I PostgreSQLprovidessingle-masterreplication I Havingmorethanonemasterisworsethanhavingnone. I Needtoagreeonwhogetstobemaster

Leader elections

I If there is no leader key in the cluster, the remaining nodesI Check if the old leader is still respondingI Contact all other members, the ones with most xlog get to

participate in the leader raceI Check if they are too far behind from last known master xlog

position.

Ants Aasma21.02.2017

Page 23: EasyPostgreSQLClusteringwithPatroni AntsAasma 21.02 · WhyisHAtricky I PostgreSQLprovidessingle-masterreplication I Havingmorethanonemasterisworsethanhavingnone. I Needtoagreeonwhogetstobemaster

That’s all

Ants Aasma21.02.2017

Page 24: EasyPostgreSQLClusteringwithPatroni AntsAasma 21.02 · WhyisHAtricky I PostgreSQLprovidessingle-masterreplication I Havingmorethanonemasterisworsethanhavingnone. I Needtoagreeonwhogetstobemaster

Thank you

I Questions?I If you need professional support, contact us [email protected]

Ants Aasma21.02.2017

Page 25: EasyPostgreSQLClusteringwithPatroni AntsAasma 21.02 · WhyisHAtricky I PostgreSQLprovidessingle-masterreplication I Havingmorethanonemasterisworsethanhavingnone. I Needtoagreeonwhogetstobemaster

Extra content

Ants Aasma21.02.2017

Page 26: EasyPostgreSQLClusteringwithPatroni AntsAasma 21.02 · WhyisHAtricky I PostgreSQLprovidessingle-masterreplication I Havingmorethanonemasterisworsethanhavingnone. I Needtoagreeonwhogetstobemaster

Configuration

I Configuration is merged from cluster configuration stored inetcd and local configuration given as a parameter.

I Patroni does configuration management for PostgreSQLI You can update Patroni configuration through the REST API

curl -XPATCH \-d '{"postgresql": {"parameters": {"work_mem": "32MB"}}}' \http://localhost:8008/config

I Patroni updates PostgreSQL configuration on all nodes and issuesSIGHUP

Ants Aasma21.02.2017

Page 27: EasyPostgreSQLClusteringwithPatroni AntsAasma 21.02 · WhyisHAtricky I PostgreSQLprovidessingle-masterreplication I Havingmorethanonemasterisworsethanhavingnone. I Needtoagreeonwhogetstobemaster

Other nifty features

I Schedule a node restart in the futureI Optionally, only if there are config changes that require a restartI Optionally, only if still running a specified version

I Voluntary restart runs a checkpoint before shutdownI Run a manual failover

I Schedule a failover in the future

Ants Aasma21.02.2017

Page 28: EasyPostgreSQLClusteringwithPatroni AntsAasma 21.02 · WhyisHAtricky I PostgreSQLprovidessingle-masterreplication I Havingmorethanonemasterisworsethanhavingnone. I Needtoagreeonwhogetstobemaster

Nifty features, continued.

I Clone new nodes from a special node instead of master.I Clone new nodes using a backupI Turn off cluster management to do whatever.

Ants Aasma21.02.2017