asynchronous replication for postgresql slony

Replicating PostgreSQL Databases Using Slony-I Christopher Browne Afilias Canada An introduction to the use of Slony-I, an asynchronous single-master to multiple slaves replication system for use with PostgreSQL

Upload: elliando-dias

Post on 10-May-2015


Page 1: Asynchronous Replication for PostgreSQL Slony

Replicating PostgreSQL Databases Using Slony-I

Christopher Browne, Afilias Canada

An introduction to the use of Slony-I, an asynchronous single-master to multiple slaves replication system for use with PostgreSQL

Page 2: Asynchronous Replication for PostgreSQL Slony

What is Slony-I?

Slony-I is an asynchronous single master to multiple slave replication system for PostgreSQL supporting multiple database releases, cascaded replication, and slave promotion.

Page 3: Asynchronous Replication for PostgreSQL Slony

What is Slony-I?

Slony-I is a replication system for PostgreSQL

Replication: Updates applied to one database are applied to another database

INSERT, DELETE, UPDATE

Some replication systems intercept transaction logs; Slony-I uses triggers on tables to collect changes
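The trigger-based capture idea can be sketched in a few lines. This is a toy illustration only: SQLite stands in for PostgreSQL so the example runs anywhere, and the table and trigger names (`account`, `change_log`, `log_ins`, ...) are hypothetical, not Slony-I's own.

```python
import sqlite3

# Toy model of trigger-based change capture. SQLite is used instead of
# PostgreSQL, and every name below is made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER);
CREATE TABLE change_log (seq INTEGER PRIMARY KEY AUTOINCREMENT,
                         op TEXT, row_id INTEGER, balance INTEGER);
CREATE TRIGGER log_ins AFTER INSERT ON account BEGIN
    INSERT INTO change_log (op, row_id, balance) VALUES ('I', NEW.id, NEW.balance);
END;
CREATE TRIGGER log_upd AFTER UPDATE ON account BEGIN
    INSERT INTO change_log (op, row_id, balance) VALUES ('U', NEW.id, NEW.balance);
END;
CREATE TRIGGER log_del AFTER DELETE ON account BEGIN
    INSERT INTO change_log (op, row_id, balance) VALUES ('D', OLD.id, NULL);
END;
""")

# Ordinary application writes; the triggers record each one in change_log.
conn.execute("INSERT INTO account VALUES (1, 100)")
conn.execute("UPDATE account SET balance = 150 WHERE id = 1")
conn.execute("DELETE FROM account WHERE id = 1")

captured = conn.execute(
    "SELECT op, row_id, balance FROM change_log ORDER BY seq").fetchall()
print(captured)
```

The application never writes to the log table itself; the triggers do, which is what lets a separate process replay the changes elsewhere.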

Page 4: Asynchronous Replication for PostgreSQL Slony

Why Replicate?

● Redundancy – allowing rapid recovery from some hardware failures

● Separate reporting from on-line activity: Performance

● Separate reporting from on-line system: Security

Page 5: Asynchronous Replication for PostgreSQL Slony

What is Slony-I?

Slony-I is an asynchronous replication system

Asynchronous: Updates are COMMITted to one database, and are applied to others later (possibly much later)

Contrast with synchronous, where updates must commit on multiple hosts at the same time
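A minimal sketch of what "applied to others later" means, using two in-memory dictionaries as stand-in databases. The `Origin` and `Subscriber` classes are hypothetical teaching aids and do not mirror Slony-I internals.

```python
from collections import deque

class Origin:
    """Stand-in for the database where updates are committed."""
    def __init__(self):
        self.data = {}
        self.log = deque()          # committed changes not yet replicated

    def commit(self, key, value):
        self.data[key] = value      # visible on the origin immediately
        self.log.append((key, value))

class Subscriber:
    """Stand-in for a database that receives the changes later."""
    def __init__(self):
        self.data = {}

    def sync(self, origin):
        # Drain the pending changes in commit order.
        while origin.log:
            key, value = origin.log.popleft()
            self.data[key] = value

origin, sub = Origin(), Subscriber()
origin.commit("a", 1)
origin.commit("b", 2)
before_sync = dict(sub.data)        # subscriber has not caught up yet
sub.sync(origin)
print(before_sync, sub.data)
```

Between the commit and the sync the two copies differ; a synchronous system would not return from `commit` until both had the change.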

Page 6: Asynchronous Replication for PostgreSQL Slony

What is Slony-I?

Slony-I is an asynchronous single master to multiple slave replication system

Updates are applied at the origin, and are replicated to a set of subscribers

Slony-I allows each table to have a different origin – but only one origin!

Alternative: multimaster – more complex, heavy locking costs and/or conflict resolution problems

Page 7: Asynchronous Replication for PostgreSQL Slony

What is Slony-I?

Slony-I supports multiple PostgreSQL releases

Supports PostgreSQL versions 7.3.3 and higher

Earlier versions not supported – no namespaces

Page 8: Asynchronous Replication for PostgreSQL Slony

Preparing version upgrade

● Prepare upgrade...

[Diagram: the master, running PostgreSQL 7.3.9, replicates to a slave running PostgreSQL 8.0.3]

Slony-I may take 48h to populate the replica, but that requires no outage on the master system

Page 9: Asynchronous Replication for PostgreSQL Slony

Database version upgrade (2)

● Once “in sync,” MOVE SET takes seconds

[Diagram: after MOVE SET, the 8.0.3 node is the master and the 7.3.9 node is the slave]

The former master can become a slave; this provides a fallback strategy “Just in Case”

Page 10: Asynchronous Replication for PostgreSQL Slony

What is Slony-I?

Slony-I supports cascaded replication

[Diagram: NY1 (master) and NY2 in the NYC data center; LA1, LA2 and LA3 in the LA data center; the two data centers are linked by a WAN]

Several LA servers feeding from just one source

Page 11: Asynchronous Replication for PostgreSQL Slony

What is Slony-I?

Slony-I supports slave promotion

[Diagram: NY1 and NY2 in the NYC data center; LA1, LA2 and LA3 in the LA data center, which now holds the master]

MOVE SET shifts origin to LA

Page 12: Asynchronous Replication for PostgreSQL Slony

Uses for Slave Promotion

● Upgrades: Set up subscriber running new PG version, then promote it to “master”

● Maintenance: Shift origin to a new server allowing extended maintenance

● Avoid failure: Shift origin from a failing server to one that is “less ailing.”

Page 13: Asynchronous Replication for PostgreSQL Slony

Fail Over

● For extreme cases of problems with master, Slony-I supports full scale fail-over

● Since Slony-I is asynchronous, some transactions committed on the master won't have made it to other nodes

● As a result, FAIL OVER must lead to dropping the failed node
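To make the reasoning above concrete, here is a small sketch of why the failed node cannot simply rejoin: it holds transactions the surviving nodes never received. The transaction names and the `applied_count` value are invented for the example.

```python
def split_on_failover(committed, applied_count):
    """Return (transactions the backup node has, transactions only the
    failed master ever saw)."""
    return committed[:applied_count], committed[applied_count:]

# Hypothetical state at the moment the master dies: five transactions were
# committed there, but only the first three had been replicated.
committed_on_master = ["t1", "t2", "t3", "t4", "t5"]
kept, lost = split_on_failover(committed_on_master, applied_count=3)
print(kept, lost)
```

After fail-over the new origin continues from `kept`; the old master's copy includes `lost`, so its history has diverged from the rest of the cluster.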

Page 14: Asynchronous Replication for PostgreSQL Slony

Slony-I Components

● Databases – one for each node

● Slon daemon – one for each node

● Slonik – configuration controller

● Virtual: Network configuration

Page 15: Asynchronous Replication for PostgreSQL Slony

Components: Databases

Each node is a PostgreSQL database with:

● C libraries to implement trigger functions

● pl/pgsql functions for non-time-critical code

● Database namespace/schema for configuration

● On origin nodes – triggers to capture updates

● On subscriber nodes – triggers to protect data

● Slony-I processes connect as a superuser, e.g. slony

Page 16: Asynchronous Replication for PostgreSQL Slony

Components: slon daemons

Each node has a slon instance.

This is a C program that propagates events between nodes

● Most event types are for configuring Slony-I

● SYNC events are where data “providers” tell subscribers that they have new data to replicate

Page 17: Asynchronous Replication for PostgreSQL Slony

Components: slonik

The slonik command processor takes configuration scripts in an SQL-like language and submits them to the cluster

● STORE NODE, STORE PATH, STORE LISTEN

● CREATE SET, SET (ADD|DROP|MOVE) TABLE, SET (ADD|DROP|MOVE) SEQUENCE

● SUBSCRIBE SET, LOCK SET, MOVE SET, FAIL OVER, EXECUTE SCRIPT

Slonik is only used when modifying configuration; when the system is stable, it remains unused

Page 18: Asynchronous Replication for PostgreSQL Slony

Components: Network Configuration

● Configuration paths from “admin workstation” to all nodes – connections may be temporary and/or slow but must be comprehensive

● Communications paths between slon daemons and Slony-I nodes – need to be fast and reliable; create only the links you need

Page 19: Asynchronous Replication for PostgreSQL Slony

Administrative Connections

● slonik needs to communicate with every node

Page 20: Asynchronous Replication for PostgreSQL Slony

Replication Connections

● Persistent, reliable, fast 2-way connections between some nodes

Redundancy wanted!

Page 21: Asynchronous Replication for PostgreSQL Slony

Configuration Terminology

● Cluster – the set of databases

● Node – each database in the cluster

● Replication set – a set of tables and sequences replicated from a single origin

● Origin, subscribers, providers – the shape of the replication “network”

● Paths – routes slon uses to talk to DBs

● Listen paths – determine path usage

Page 22: Asynchronous Replication for PostgreSQL Slony

Cluster

A Slony-I cluster is the set of database instances where replication takes place.

Give the thing being replicated a name:

● ORG, Payroll, PriceDB, STorm

The name identifies the schema used to store Slony-I data, thus _ORG, _Payroll, _PriceDB, ...

Slonik scripts specify: cluster name=ORG;

Slon daemons are passed the cluster name:

slon -d4 -g80 STorm 'host=db1 dbname=nws_storm'
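The naming rule and the daemon invocation can be sketched as two small helpers. The function names `slony_schema` and `slon_command` are hypothetical; the conninfo string uses libpq's `dbname` keyword, and the `-d`/`-g` values are simply the ones from the slide.

```python
import shlex

def slony_schema(cluster_name):
    """Schema that holds Slony-I's own tables: the cluster name prefixed with '_'."""
    return "_" + cluster_name

def slon_command(cluster_name, conninfo, debug_level=4, group_size=80):
    """Build a shell-safe slon command line for one node (illustrative only)."""
    argv = ["slon", f"-d{debug_level}", f"-g{group_size}", cluster_name, conninfo]
    return shlex.join(argv)     # quotes the conninfo because it contains spaces

schema = slony_schema("STorm")
command = slon_command("STorm", "host=db1 dbname=nws_storm")
print(schema)
print(command)
```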

Page 23: Asynchronous Replication for PostgreSQL Slony

Node

Each PostgreSQL database being used for replication is a Slony-I “node”

Each has a schema containing Slony-I-specific tables, sequences, functions

Cluster T1 has the following tables:

_T1.sl_config_lock _T1.sl_confirm _T1.sl_event _T1.sl_listen
_T1.sl_log_1 _T1.sl_log_2 _T1.sl_log_status _T1.sl_node _T1.sl_path
_T1.sl_seqlastvalue _T1.sl_seqlog _T1.sl_sequence _T1.sl_set
_T1.sl_setsync _T1.sl_status _T1.sl_subscribe _T1.sl_table
_T1.sl_trigger

Slonik commands: store node, drop node, uninstall node

Page 24: Asynchronous Replication for PostgreSQL Slony

Replication Sets

● Replication sets are the “container” for each set of tables and sequences being replicated

● Each set's data originates on one node and is published to other nodes

● By having multiple sets with multiple origins, you can get a sort of multimaster replication

Slonik commands: create set, drop set, subscribe set, merge set, set add table, set add sequence, ...

Page 25: Asynchronous Replication for PostgreSQL Slony

Weak Form of Multimaster Replication

● DB1 is origin for table tab1 in set 1

● DB2 is origin for table tab2 in set 2

● DB3 is origin for table tab3 in set 3

[Diagram: DB1, DB2 and DB3 with tables tab1, tab2 and tab3]

Three replication sets can have different propagation networks
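The "one origin per set" rule behind this arrangement can be sketched as a lookup. The dictionaries and the `can_write` helper are hypothetical; they only mirror the three bullets on this slide.

```python
# Which node is the origin of each replication set (from the slide).
set_origin = {1: "DB1", 2: "DB2", 3: "DB3"}
# Which set each table belongs to (from the slide).
table_set = {"tab1": 1, "tab2": 2, "tab3": 3}

def can_write(node, table):
    """A table accepts updates only on the origin node of its set."""
    return set_origin[table_set[table]] == node

print(can_write("DB1", "tab1"))   # DB1 is the origin of set 1
print(can_write("DB1", "tab2"))   # tab2 originates on DB2, so DB1 is read-only for it
```

Every node accepts writes for some table, which is why it resembles multimaster, but no single table is ever writable in two places.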

Page 26: Asynchronous Replication for PostgreSQL Slony

Origin, Provider, Subscriber

The terms “master” and “slave” become inaccurate if there are more than 2 nodes

Nodes are not themselves either master or slave

For each replication set, and hence for each table, there is exactly one “origin”

All the other nodes can be “subscribers”

Each subscriber draws data from a “provider” which may either be the origin, or a subscriber downstream from the origin.
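A sketch of the origin/provider/subscriber relationship for a single replication set. The node names echo the NYC/LA example from the cascading slide, but which LA node feeds the others is an assumption made for illustration.

```python
# provider[node] is the node that `node` pulls its data from.
# The origin is the only node without a provider.
provider = {
    "NY1": None,    # origin
    "NY2": "NY1",
    "LA1": "NY1",   # crosses the WAN once
    "LA2": "LA1",   # assumed: fed by LA1, a subscriber acting as provider
    "LA3": "LA1",
}

def path_to_origin(node):
    """Follow providers upstream until the origin is reached."""
    chain = [node]
    while provider[chain[-1]] is not None:
        chain.append(provider[chain[-1]])
    return chain

origins = [n for n, p in provider.items() if p is None]
print(origins)
print(path_to_origin("LA3"))
```

LA1 is both a subscriber (of NY1) and a provider (to LA2 and LA3), which is why "master" and "slave" stop being useful labels for whole nodes.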

Page 27: Asynchronous Replication for PostgreSQL Slony

Admin Paths

● One variety of “communications path” is the one used by slonik from “admin server” to all nodes

● These are encoded in a preamble to each slonik script

● There needs to be one 'admin conninfo' path to each Slony-I node

● These are used sporadically, just to do configuration, so they could be temporary (e.g. SSH tunnels)
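As a rough sketch, the preamble can be generated from a table of node connection strings. The helper name `slonik_preamble`, the host names and the database name are all hypothetical; the two statement forms (`cluster name = ...;` and `node N admin conninfo = '...';`) are the ones the slides refer to.

```python
def slonik_preamble(cluster_name, admin_conninfo):
    """Build the header that tells slonik how to reach every node.

    admin_conninfo maps node id -> libpq connection string as seen from
    the admin workstation.
    """
    lines = [f"cluster name = {cluster_name};"]
    for node_id, conninfo in sorted(admin_conninfo.items()):
        lines.append(f"node {node_id} admin conninfo = '{conninfo}';")
    return "\n".join(lines)

preamble = slonik_preamble("ORG", {
    2: "dbname=org host=db2 user=slony",
    1: "dbname=org host=db1 user=slony",
})
print(preamble)
```

Keeping the preamble in one generated place makes it easier to honour the "one admin conninfo per node" requirement as nodes are added.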

Page 28: Asynchronous Replication for PostgreSQL Slony

Paths Between Nodes

● The store path command stores the paths used so the slon for node X knows how to access the database for node Y

● If there are n nodes, there could be as many as n(n-1) paths in sl_path; no fewer than 2n

● Multiple sites with complex firewall policies may mean nonuniform communications paths

● But there still must be some form of 2-way communications between sites
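The n(n-1) upper bound is just the number of ordered node pairs, since a path from X to Y is distinct from one from Y to X. A short check, with made-up node ids:

```python
from itertools import permutations

def full_mesh_paths(nodes):
    """Every ordered (server, client) pair: one path in each direction."""
    return list(permutations(nodes, 2))

nodes = [1, 2, 3, 4]
paths = full_mesh_paths(nodes)
print(len(paths))   # n(n-1) with n = 4
```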

Page 29: Asynchronous Replication for PostgreSQL Slony

Complex Path Scenario

[Diagram: network #1 contains db1, db2 and db3; network #2 contains db4, db5 and db6]

Tightly interconnected in network #1

Tightly interconnected in network #2

But between the networks, only db2 and db4 talk to one another
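This scenario can be modelled as a directed graph to confirm that every node is still reachable with far fewer than the 30 paths a full mesh of six nodes would need. The path set below is one reading of the slide (full interconnection inside each network, a single two-way link between them), not a dump of a real sl_path table.

```python
from collections import deque
from itertools import permutations

net1 = ["db1", "db2", "db3"]
net2 = ["db4", "db5", "db6"]

# Full interconnection inside each network, plus the single cross-site link.
paths = set(permutations(net1, 2)) | set(permutations(net2, 2))
paths |= {("db2", "db4"), ("db4", "db2")}

def reachable(start):
    """Breadth-first search over the directed paths."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for src, dst in paths:
            if src == node and dst not in seen:
                seen.add(dst)
                queue.append(dst)
    return seen

print(len(paths))
print(sorted(reachable("db1")))
```

Traffic between, say, db1 and db6 has to be relayed through db2 and db4, which is the situation listen paths exist to describe.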

Page 30: Asynchronous Replication for PostgreSQL Slony

Security Considerations

● slonik and slon must connect as PostgreSQL superuser because they do DB alterations

● In practice, events propagate so any action could come from anywhere

● Connection issues and mechanisms are the same as for any software using libpq

● Slony-I doesn't mess around with users or roles; you manage security your way

Page 31: Asynchronous Replication for PostgreSQL Slony

.pgpass for Password Storage

Use .pgpass for storing passwords

● Removes passwords from command lines

● Removes passwords from environment variables

● Removes passwords from sl_path

All of those are vulnerable to capture
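A minimal sketch of writing a .pgpass entry, assuming a POSIX system. The line format (`hostname:port:database:username:password`), the backslash escaping and the owner-only permission requirement come from libpq; the helper name, host, database and password are invented, and the file is written to a temporary directory rather than a real home directory.

```python
import os
import stat
import tempfile

def pgpass_line(host, port, database, user, password):
    """Format one .pgpass entry, escaping ':' and '\\' inside fields."""
    def esc(value):
        return str(value).replace("\\", "\\\\").replace(":", "\\:")
    return ":".join(esc(v) for v in (host, port, database, user, password))

line = pgpass_line("db1", 5432, "nws_storm", "slony", "s3cret:pw")

# Written to a scratch directory here; the real file lives in the home
# directory of the account that runs slon or slonik.
path = os.path.join(tempfile.mkdtemp(), ".pgpass")
with open(path, "w") as f:
    f.write(line + "\n")
os.chmod(path, 0o600)   # libpq ignores the file if group or others can access it

print(line)
print(oct(stat.S_IMODE(os.stat(path).st_mode)))
```

With the entry in place, conninfo strings in slon command lines and in sl_path can omit the password entirely.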

Page 32: Asynchronous Replication for PostgreSQL Slony

SSH Connections

Consider using ssh connections across untrusted networks

● Slony-I opens connections infrequently – costs are amortized over much usage

● Presently, PostgreSQL has decent support for server certificates but client certificates are not usable for authentication

● Use of client certificates for authentication would provide a more “opaque” token than .pgpass – future PostgreSQL enhancement

Page 33: Asynchronous Replication for PostgreSQL Slony

Slony-I User

Use of “slony” PG user to run Slony-I processes is highly recommended

● Makes replication connections highly identifiable

● Makes it easy to lock out all but the slony user when running COPY_SET

● Separating maintenance roles to multiple users (molly, dumpy, slony) has proven useful.

Page 34: Asynchronous Replication for PostgreSQL Slony

Security – Log Shipping

● Conventional nodes require 2-way communications between participating servers

● Log shipping allows serializing updates to files

● Transmit (1-way!) via FTP, scp, rsync, DVD-ROM, USB Key, RFC 1149, 2549

Page 35: Asynchronous Replication for PostgreSQL Slony

Sometimes Slony-I isn't the Answer

● If you truly, honest to goodness, need multimaster, watch for Slony-II...

● If you need DDL to be handled automagically, look at PG 8.0 PITR

● If you have a loosely run environment, Slony-I will not go well

● No answer yet for async multimaster!

Page 36: Asynchronous Replication for PostgreSQL Slony

More Cases Not to Use Slony-I

● Slony-I needs to define a node (and spawn a slon + connections) for each database – replicating 80 databases on one cluster may not turn out well

● If you have 300 sequences, all 300 are propagated with each sync – this may not turn out well

Page 37: Asynchronous Replication for PostgreSQL Slony

Summary

● What is Slony-I and what does it do?

● What are the components of Slony-I?

● What are some security issues surrounding the use of Slony-I?