Presentation from PgEast 2011 (license: CC-BY)
Getting Started with PL/Proxy
Peter [email protected]
F-Secure Corporation
PostgreSQL Conference East 2011
Concept
• a database partitioning system implemented as a procedural language
• “sharding”/horizontal partitioning
• PostgreSQL’s No(t-only)SQL solution
Concept
[Diagram: four applications connect to a single frontend, which sits in front of partitions 1 to 4.]
Areas of Application
• high write load
• (high read load)
• allow for some “eventual consistency”
• have reasonable partitioning keys
• use/plan to use server-side functions
Example
Have: [1]
CREATE TABLE products (
    prod_id serial PRIMARY KEY,
    category integer NOT NULL,
    title varchar(50) NOT NULL,
    actor varchar(50) NOT NULL,
    price numeric(12,2) NOT NULL,
    special smallint,
    common_prod_id integer NOT NULL
);
INSERT INTO products VALUES (...);
UPDATE products SET ... WHERE ...;
DELETE FROM products WHERE ...;
plus various queries
[1] dellstore2 example database
Installation
• Download: http://plproxy.projects.postgresql.org, Deb, RPM, ...
• Create language: psql -d dellstore2 -f ...../plproxy.sql
Backend Functions I
CREATE FUNCTION insert_product(p_category int, p_title varchar,
    p_actor varchar, p_price numeric, p_special smallint,
    p_common_prod_id int) RETURNS int
LANGUAGE plpgsql
AS $$
DECLARE
    cnt int;
BEGIN
    INSERT INTO products (category, title, actor, price, special,
        common_prod_id)
    VALUES (p_category, p_title, p_actor, p_price, p_special,
        p_common_prod_id);
    GET DIAGNOSTICS cnt = ROW_COUNT;
    RETURN cnt;
END;
$$;
Backend Functions II
CREATE FUNCTION update_product_price(p_prod_id int, p_price numeric)
    RETURNS int
LANGUAGE plpgsql
AS $$
DECLARE
    cnt int;
BEGIN
    UPDATE products SET price = p_price WHERE prod_id = p_prod_id;
    GET DIAGNOSTICS cnt = ROW_COUNT;
    RETURN cnt;
END;
$$;
Backend Functions III
CREATE FUNCTION delete_product_by_title(p_title varchar) RETURNS int
LANGUAGE plpgsql
AS $$
DECLARE
    cnt int;
BEGIN
    DELETE FROM products WHERE title = p_title;
    GET DIAGNOSTICS cnt = ROW_COUNT;
    RETURN cnt;
END;
$$;
Frontend Functions I
CREATE FUNCTION insert_product(p_category int, p_title varchar,
    p_actor varchar, p_price numeric, p_special smallint,
    p_common_prod_id int) RETURNS SETOF int
LANGUAGE plproxy
AS $$
    CLUSTER 'dellstore_cluster';
    RUN ON hashtext(p_title);
$$;
CREATE FUNCTION update_product_price(p_prod_id int, p_price numeric)
    RETURNS SETOF int
LANGUAGE plproxy
AS $$
    CLUSTER 'dellstore_cluster';
    RUN ON ALL;
$$;
Frontend Functions II
CREATE FUNCTION delete_product_by_title(p_title varchar) RETURNS int
LANGUAGE plproxy
AS $$
    CLUSTER 'dellstore_cluster';
    RUN ON hashtext(p_title);
$$;
Frontend Query Functions I
CREATE FUNCTION get_product_price(p_prod_id int) RETURNS SETOF numeric
LANGUAGE plproxy
AS $$
    CLUSTER 'dellstore_cluster';
    RUN ON ALL;
    SELECT price FROM products WHERE prod_id = p_prod_id;
$$;
Frontend Query Functions II
CREATE FUNCTION get_products_by_category(p_category int)
    RETURNS SETOF products
LANGUAGE plproxy
AS $$
    CLUSTER 'dellstore_cluster';
    RUN ON ALL;
    SELECT * FROM products WHERE category = p_category;
$$;
Unpartitioned Small Tables
CREATE FUNCTION insert_category(p_categoryname varchar)
    RETURNS SETOF int
LANGUAGE plproxy
AS $$
    CLUSTER 'dellstore_cluster';
    RUN ON 0;
$$;
Which Hash Key?
• natural keys (names, descriptions, UUIDs)
• not serials (Consider using fewer “ID” fields.)
• single columns
• group sensibly to allow joins on backend
Set Basic Parameters
• number of partitions (2^n), e.g. 8
• host names, e.g.
  • frontend: dbfe
  • backends: dbbe1, ..., dbbe8 (or start at 0?)
• database names, e.g.
  • frontend: dellstore2
  • backends: store01, ..., store08 (or start at 0?)
• user names, e.g. storeapp
• hardware:
  • frontend: lots of memory, normal disk
  • backends: full-sized database server
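The naming scheme above can be expanded mechanically. The following is a minimal Python sketch, assuming the example host and database names from this slide (dbbe1 to dbbe8, store01 to store08). The helper names is_power_of_two and partition_dsns are hypothetical and not part of PL/Proxy; the sketch only illustrates the power-of-two rule and the one-connection-string-per-partition layout.

```python
from typing import List


def is_power_of_two(n: int) -> bool:
    """Return True for 1, 2, 4, 8, ..."""
    return n > 0 and (n & (n - 1)) == 0


def partition_dsns(count: int) -> List[str]:
    """Build one connection string per partition, in partition order.

    Host and database names follow the example scheme on this slide;
    the helper itself is hypothetical.
    """
    if not is_power_of_two(count):
        raise ValueError("number of partitions must be a power of two")
    return [f"dbname=store{i:02d} host=dbbe{i}" for i in range(1, count + 1)]


for dsn in partition_dsns(8):
    print(dsn)
```

The order of the list matters, because the position of a connection string determines which partition number it serves.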
Configuration
CREATE FUNCTION plproxy.get_cluster_partitions(cluster_name text)
    RETURNS SETOF text LANGUAGE plpgsql AS $$...$$;
CREATE FUNCTION plproxy.get_cluster_version(cluster_name text)
    RETURNS int LANGUAGE plpgsql AS $$...$$;
CREATE FUNCTION plproxy.get_cluster_config(IN cluster_name text,
    OUT key text, OUT val text)
    RETURNS SETOF record LANGUAGE plpgsql AS $$...$$;
get_cluster_partitions
Simplistic approach:
CREATE FUNCTION plproxy.get_cluster_partitions(cluster_name text)
    RETURNS SETOF text
LANGUAGE plpgsql
AS $$
BEGIN
    IF cluster_name = 'dellstore_cluster' THEN
        RETURN NEXT 'dbname=store01 host=dbbe1';
        RETURN NEXT 'dbname=store02 host=dbbe2';
        ...
        RETURN NEXT 'dbname=store08 host=dbbe8';
        RETURN;
    END IF;
    RAISE EXCEPTION 'Unknown cluster';
END;
$$;
get_cluster_version
Simplistic approach:
CREATE FUNCTION plproxy.get_cluster_version(cluster_name text)
    RETURNS int
LANGUAGE plpgsql
AS $$
BEGIN
    IF cluster_name = 'dellstore_cluster' THEN
        RETURN 1;
    END IF;
    RAISE EXCEPTION 'Unknown cluster';
END;
$$;
get_cluster_config
CREATE OR REPLACE FUNCTION plproxy.get_cluster_config(
    IN cluster_name text, OUT key text, OUT val text)
    RETURNS SETOF record
LANGUAGE plpgsql
AS $$
BEGIN
    -- same config for all clusters
    key := 'connection_lifetime';
    val := 30*60; -- 30m
    RETURN NEXT;
    RETURN;
END;
$$;
Table-Driven Configuration I
CREATE TABLE plproxy.partitions (
    cluster_name text NOT NULL,
    host text NOT NULL,
    port text NOT NULL,
    dbname text NOT NULL,
    PRIMARY KEY (cluster_name, dbname)
);
INSERT INTO plproxy.partitions VALUES
    ('dellstore_cluster', 'dbbe1', '5432', 'store01'),
    ('dellstore_cluster', 'dbbe2', '5432', 'store02'),
    ...
    ('dellstore_cluster', 'dbbe8', '5432', 'store08');
Table-Driven Configuration II
CREATE TABLE plproxy.cluster_users (
    cluster_name text NOT NULL,
    remote_user text NOT NULL,
    local_user text NOT NULL,
    PRIMARY KEY (cluster_name, remote_user, local_user)
);
INSERT INTO plproxy.cluster_users VALUES
    ('dellstore_cluster', 'storeapp', 'storeapp');
Table-Driven Configuration III
CREATE TABLE plproxy.remote_passwords (
    host text NOT NULL,
    port text NOT NULL,
    dbname text NOT NULL,
    remote_user text NOT NULL,
    password text,
    PRIMARY KEY (host, port, dbname, remote_user)
);
INSERT INTO plproxy.remote_passwords VALUES
    ('dbbe1', '5432', 'store01', 'storeapp', 'Thu1Ued0'),
    ...
-- or use .pgpass?
Table-Driven Configuration IV
CREATE TABLE plproxy.cluster_version (
id int PRIMARY KEY
);
INSERT INTO plproxy.cluster_version VALUES (1);
GRANT SELECT ON plproxy.cluster_version TO
PUBLIC;
/* extra credit: write trigger that changes the
version when one of the other tables changes
*/
Table-Driven Configuration V
CREATE OR REPLACE FUNCTION plproxy.get_cluster_partitions(p_cluster_name text)
    RETURNS SETOF text
LANGUAGE plpgsql
SECURITY DEFINER
AS $$
DECLARE
    r record;
BEGIN
    FOR r IN
        SELECT 'host=' || host || ' port=' || port || ' dbname=' || dbname
            || ' user=' || remote_user || ' password=' || password AS dsn
        FROM plproxy.partitions
            NATURAL JOIN plproxy.cluster_users
            NATURAL JOIN plproxy.remote_passwords
        WHERE cluster_name = p_cluster_name
            AND local_user = session_user
        ORDER BY dbname -- important
    LOOP
        RETURN NEXT r.dsn;
    END LOOP;
    IF NOT found THEN
        RAISE EXCEPTION 'no such cluster: %', p_cluster_name;
    END IF;
    RETURN;
END;
$$;
Table-Driven Configuration VI
CREATE FUNCTION
plproxy.get_cluster_version(p_cluster_name
text) RETURNS int
LANGUAGE plpgsql
AS $$
DECLARE
ret int;
BEGIN
SELECT INTO ret id FROM
plproxy.cluster_version;
RETURN ret;
END;
$$;
SQL/MED Configuration
CREATE SERVER dellstore_cluster FOREIGN DATA WRAPPER plproxy
OPTIONS (
    connection_lifetime '1800',
    p0 'dbname=store01 host=dbbe1',
    p1 'dbname=store02 host=dbbe2',
    ...
    p7 'dbname=store08 host=dbbe8'
);
CREATE USER MAPPING FOR storeapp SERVER dellstore_cluster
    OPTIONS (user 'storeapp', password 'sekret');
GRANT USAGE ON SERVER dellstore_cluster TO storeapp;
Hash Functions
RUN ON hashtext(somecolumn);
• want a fast, uniform hash function
• typically use hashtext
• problem: implementation might change
• possible solution: https://github.com/petere/pgvihash
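To make the routing step concrete, here is a small Python sketch of how a hash value is turned into a partition number by keeping only its low bits, which is why the partition count is a power of two. It is an illustration only: zlib.crc32 is a stand-in chosen for this sketch and does not produce the same values as PostgreSQL's hashtext, and the function names and sample titles are made up.

```python
import zlib

NUM_PARTITIONS = 8  # must be a power of two


def stand_in_hash(key: str) -> int:
    """Stand-in for hashtext(); NOT the values PostgreSQL would return."""
    return zlib.crc32(key.encode("utf-8"))


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Keep the low bits of the hash to get a partition number."""
    return stand_in_hash(key) & (num_partitions - 1)


# Sample titles are made up for the illustration.
for title in ["ACADEMY ACADEMY", "ACADEMY ACE", "ACADEMY ADAPTATION"]:
    print(title, "->", partition_for(title))
```

A key keeps landing on the same partition only as long as the hash function returns the same value for it, which is the concern behind the "implementation might change" bullet.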
Sequences
shard 1:
ALTER SEQUENCE products_prod_id_seq MINVALUE 1
MAXVALUE 100000000 START 1;
shard 2:
ALTER SEQUENCE products_prod_id_seq MINVALUE
100000001 MAXVALUE 200000000 START 100000001;
etc.
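The per-shard ranges follow a simple pattern, sketched below in Python under the assumption that every shard gets a block of 100,000,000 IDs as in the two statements above. The helper names sequence_range and alter_sequence_sql are hypothetical; the script only prints the statements and does not connect to any database.

```python
from typing import Tuple

RANGE_SIZE = 100_000_000  # IDs per shard, as in the example above


def sequence_range(shard: int, size: int = RANGE_SIZE) -> Tuple[int, int]:
    """Return (minvalue, maxvalue) for a 1-based shard number."""
    if shard < 1:
        raise ValueError("shard numbers start at 1")
    low = (shard - 1) * size + 1
    return low, shard * size


def alter_sequence_sql(shard: int, seq: str = "products_prod_id_seq") -> str:
    """Render the ALTER SEQUENCE statement for one shard."""
    low, high = sequence_range(shard)
    return (f"ALTER SEQUENCE {seq} MINVALUE {low} "
            f"MAXVALUE {high} START {low};")


for shard in range(1, 5):
    print(f"shard {shard}:", alter_sequence_sql(shard))
```

Since prod_id is a serial column (a 32-bit integer with a maximum of 2147483647), blocks of this size leave room for 21 shards.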
Aggregates
Example: count all products
Backend:
CREATE FUNCTION count_products() RETURNS bigint
LANGUAGE SQL STABLE AS $$SELECT count(*) FROM products$$;
Frontend:
CREATE FUNCTION count_products() RETURNS SETOF bigint
LANGUAGE plproxy AS $$
    CLUSTER 'dellstore_cluster';
    RUN ON ALL;
$$;
SELECT sum(x) AS count FROM count_products() AS t(x);
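The pattern here is that each partition returns a partial result and the caller combines them. A rough Python sketch of that merge step follows; the per-partition numbers are made up, and the function names are hypothetical. It also shows why an average cannot be merged the same way: the partitions would have to return sums and counts rather than averages.

```python
def merge_counts(per_shard_counts):
    """Final count is the sum of the partial counts, one per partition."""
    return sum(per_shard_counts)


def merge_averages(per_shard_sum_count):
    """Final average from (sum, count) pairs returned by each partition.

    Averaging the per-partition averages would weight small and large
    partitions equally, which gives the wrong answer.
    """
    total = sum(s for s, _ in per_shard_sum_count)
    count = sum(c for _, c in per_shard_sum_count)
    return total / count if count else None


# Made-up partial results from four partitions.
print(merge_counts([2500, 2498, 2503, 2499]))      # 10000
print(merge_averages([(10.0, 2), (50.0, 8)]))      # 6.0
```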
Dynamic Queries I
a.k.a. “cheating” ;-)
Frontend:
CREATE FUNCTION execute_query(sql text) RETURNS SETOF RECORD
LANGUAGE plproxy
AS $$
    CLUSTER 'dellstore_cluster';
    RUN ON ALL;
$$;
Backend:
CREATE FUNCTION execute_query(sql text) RETURNS SETOF RECORD
LANGUAGE plpgsql
AS $$
BEGIN
    RETURN QUERY EXECUTE sql;
END;
$$;
Dynamic Queries II
SELECT * FROM execute_query('SELECT title, price FROM products')
    AS (title varchar, price numeric);
SELECT category, sum(sum_price)
FROM execute_query('SELECT category, sum(price) FROM products GROUP BY category')
    AS (category int, sum_price numeric)
GROUP BY category;
Repartitioning
• changing partitioning key is extremely cumbersome
• adding partitions is somewhat cumbersome, e.g., to split shard 0:
COPY (SELECT * FROM products
    WHERE hashtext(title::text) & 15 <> 0) TO 'somewhere';
DELETE FROM products WHERE hashtext(title::text) & 15 <> 0;
Better start out with enough partitions!
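The mask arithmetic behind this split can be checked with a short Python simulation. It assumes the cluster grows from 8 to 16 partitions, so rows on shard 0 either stay (low four bits are 0) or move to the new shard 8. As a labeled substitution, zlib.crc32 stands in for hashtext, and split_shard is a hypothetical helper written for this sketch.

```python
import zlib


def stand_in_hash(key: str) -> int:
    """Stand-in for hashtext(); NOT the values PostgreSQL would return."""
    return zlib.crc32(key.encode("utf-8"))


def split_shard(keys, old_count, shard):
    """Classify the keys of one shard when the partition count doubles.

    Returns (stay, move): keys that remain on `shard` and keys that
    belong on the new partition `shard + old_count`.
    """
    old_mask = old_count - 1
    new_mask = 2 * old_count - 1
    stay, move = [], []
    for key in keys:
        h = stand_in_hash(key)
        if (h & old_mask) != shard:
            continue  # key lives on another partition
        if (h & new_mask) == shard:
            stay.append(key)
        else:
            move.append(key)
    return stay, move


keys = [f"title {i}" for i in range(1000)]  # made-up keys
stay, move = split_shard(keys, old_count=8, shard=0)
print(len(stay), "keys stay on shard 0;", len(move), "move to shard 8")
```

The move list corresponds to the rows selected by the `& 15 <> 0` condition when it is run on shard 0.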
PgBouncer
[Diagram: four applications connect to the frontend; a PgBouncer sits in front of each of partitions 1 to 4.]
Use
pool_mode = statement
Development Issues
• foreign keys
• notifications
• hash key check constraints
• testing (pgTAP), no validator
Administration
• centralized logging
• distributed shell (dsh)
• query canceling/timeouts
• access control, firewalling
• deployment
High Availability
Frontend:
• multiple frontends (DNS, load balancer?)
• replicate partition configuration (Slony, Bucardo, WAL)
• Heartbeat, UCARP, etc.
Backend:
• replicate backend shards individually (Slony, WAL, DRBD)
• use partition configuration to configure load spreading or failover
Advanced Topics
• generic insert, update, delete functions
• frontend joins
• backend joins
• finding balance between function interface and dynamic queries
• arrays, SPLIT BY
• use for remote database calls
• cross-shard calls
• SQL/MED (foreign table) integration
The End