Stored Procedures are Good Enough

Nikolay Samokhvalov Twitter: @postgresmen [email protected]


Post on 21-Jan-2018

Page 2: Stored Procedures are Good Enough

History

Year of Birth: 1995

Page 5: Stored Procedures are Good Enough

History

1995: Postgres95 – POSTQUEL query language replaced with SQL

1996: Postgres95 departed from academia, renamed to PostgreSQL

1998: PL/pgSQL added (PostgreSQL 6.4)

Page 6: Stored Procedures are Good Enough

And a bit more history...

Object Management in POSTGRES Using Procedures, M. Stonebraker

http://www.dtic.mil/dtic/tr/fulltext/u2/a181411.pdf

Page 7: Stored Procedures are Good Enough

What’s now?

- Postgres speaks a lot of PL languages:
  - “native”: PL/pgSQL
  - included: PL/Tcl, PL/Perl, PL/Python
  - additional-traditional: PL/Java, PL/R, PL/sh, PL/v8 (JavaScript)
  - not active: PL/Scheme, PL/PHP, PL/Ruby
  - special/exotic/new:
    - PL/Proxy (sharding, from Skype)
    - PL/Container (Python, R)
    - plgo (Go), etc.
    - PgOpenCL (GPU!)
- Functions can also be created in:
  - C (anything is possible!)
  - SQL (plain! standard! with [recursive] CTEs!)

Page 8: Stored Procedures are Good Enough

What are Stored Procedures?

In Postgres: Functions = UDFs (user-defined functions) = Stored Procedures

(In other DBMSes, you can call a function/UDF inside a SELECT, while a stored procedure can only be invoked with PERFORM/EXEC/EXECUTE.)
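A minimal sketch of what this means in practice: a function written in plain SQL can be called from any SELECT. The function name and its arguments are made up for illustration.

```sql
-- Hypothetical function; any SQL-language function behaves the same way.
CREATE FUNCTION full_name(first_name text, last_name text)
RETURNS text
LANGUAGE sql
IMMUTABLE
AS $$
    SELECT first_name || ' ' || last_name;
$$;

-- Being a function, it can be used anywhere an expression is allowed.
SELECT full_name('Ada', 'Lovelace');
```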

Page 9: Stored Procedures are Good Enough

Functions & Triggers

Page 11: Stored Procedures are Good Enough

Why?

Page 13: Stored Procedures are Good Enough

Reason #1: Data Clearness & Integrity

Data Checks (format, constraints, etc.) in App (Ruby or Python or PHP or …)

Page 16: Stored Procedures are Good Enough

Reason #1: Data Clearness & Integrity

App (Ruby or Python or PHP or …)

CHECKS

Control your Data Quality

Page 17: Stored Procedures are Good Enough

Data Validation, an example: validate email address

Source: https://www.postgresql.org/message-id/20050907175305.GA20501%40isis.sigpipe.cz
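The code from the linked message is not reproduced in this transcript. As a rough sketch of the idea only, the check can live in a function and be attached to a domain, so every column of that type is validated by the database. The function, domain, and table names are hypothetical, and the regular expression is a deliberately loose assumption, not a full RFC-compliant validator.

```sql
-- Hypothetical validator: "something@something.something", no whitespace.
-- This is a loose sanity check, not RFC 5322 validation.
CREATE FUNCTION is_valid_email(addr text)
RETURNS boolean
LANGUAGE sql
IMMUTABLE
STRICT
AS $$
    SELECT addr ~ '^[^@\s]+@[^@\s]+\.[^@\s]+$';
$$;

-- The domain applies the check wherever the type is used.
CREATE DOMAIN email_address AS text
    CHECK (is_valid_email(VALUE));

CREATE TABLE subscriber (
    id    bigserial PRIMARY KEY,
    email email_address NOT NULL
);

INSERT INTO subscriber (email) VALUES ('alice@example.com');  -- accepted
INSERT INTO subscriber (email) VALUES ('not-an-email');       -- fails the CHECK
```

Whatever the application language is, bad values cannot reach the table, which is the point of keeping the check in the database.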

Page 18: Stored Procedures are Good Enough

Reason #2: Access Control

- SECURITY DEFINER allows a user to do what she/he cannot usually do (but under strict control)
- GRANT/REVOKE – a standard way to control permissions
- Good approach: forbid direct access to tables, provide functions and views with proper GRANTs
- Pay attention to:
  - objects (tables, views, functions)
  - columns (can REVOKE/GRANT individually!)
  - rows (check what Row-Level Security is)
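A sketch of the "no direct table access" approach, assuming a hypothetical account table and a hypothetical application role app_user. The function runs with its owner's privileges, so the role can read one balance without being able to read the table.

```sql
-- Hypothetical table and role, for illustration only.
CREATE TABLE public.account (
    id      bigint  PRIMARY KEY,
    owner   text    NOT NULL,
    balance numeric NOT NULL
);

CREATE ROLE app_user NOLOGIN;

-- Forbid direct access to the table.
REVOKE ALL ON public.account FROM PUBLIC, app_user;

-- SECURITY DEFINER: executes with the privileges of the function owner.
-- Pinning search_path is the usual precaution for such functions.
CREATE FUNCTION public.get_balance(account_id bigint)
RETURNS numeric
LANGUAGE sql
STABLE
SECURITY DEFINER
SET search_path = public, pg_temp
AS $$
    SELECT balance FROM public.account WHERE id = account_id;
$$;

-- Functions are executable by PUBLIC by default, so tighten that too.
REVOKE ALL ON FUNCTION public.get_balance(bigint) FROM PUBLIC;
GRANT EXECUTE ON FUNCTION public.get_balance(bigint) TO app_user;
```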

Page 19: Stored Procedures are Good Enough

Reason #3: speed (first of all, IO/network-related)

DBMS (Postgres 9.6) – AWS RDS, USA

Client (psql) – somewhere in Germany

Getting all 10M rows is ~7x slower

Use your RDBMS for Data Manipulation. It is not just a Storage.

Page 20: Stored Procedures are Good Enough

Reason #3: speed

There are a LOT of cases here.

- ORMs (ActiveRecord, Hibernate, etc.) and how people work with them
- Analytics (doing R or Python calculations inside RAM, etc.)
- Massive data updates (retrieve IDs and then DELETE rows? Doh.)

Just look around and you’ll find more.

Again: Work with Data Inside Database First.

Pay attention to:

- cardinality (how many rows you touch?)
- RTT (round trip time), reduce network calls
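The "retrieve IDs and then DELETE" case can be sketched as follows. The event_log table and the purge_event_log function are hypothetical; the point is that the rows never leave the server and only a count travels over the network.

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE event_log (
    id         bigserial   PRIMARY KEY,
    created_at timestamptz NOT NULL DEFAULT now(),
    payload    text
);

-- Anti-pattern (two round trips, and every ID crosses the network twice):
--   SELECT id FROM event_log WHERE created_at < now() - interval '90 days';
--   DELETE FROM event_log WHERE id IN (... IDs sent back by the app ...);

-- One statement, one round trip.
DELETE FROM event_log
WHERE created_at < now() - interval '90 days';

-- The same thing wrapped in a function that reports how many rows it removed.
CREATE FUNCTION purge_event_log(older_than interval)
RETURNS bigint
LANGUAGE plpgsql
AS $$
DECLARE
    deleted bigint;
BEGIN
    DELETE FROM event_log WHERE created_at < now() - older_than;
    GET DIAGNOSTICS deleted = ROW_COUNT;
    RETURN deleted;
END;
$$;

SELECT purge_event_log(interval '90 days');
```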

Page 21: Stored Procedures are Good Enough

Reason #4: Data Integration

Data Manipulation Logic in App (Ruby or Python or PHP or …)

Something*

* ElasticSearch, Sphinx, Analytics DBMS, etc.

Page 24: Stored Procedures are Good Enough

Reason #4: Data Integration

App (Ruby or Python or PHP or …)

Something*

* ElasticSearch, Sphinx, Analytics DBMS, etc.

Data Manipulation

Use:

- functions, triggers
- Foreign Data Wrappers (FDW)
- Logical Decoding (e.g. pglogical)
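One possible combination of the first two items, as a sketch: a trigger function copies every new row into a foreign table exposed through postgres_fdw. The server name, host, credentials, and tables are all hypothetical, and the remote database is assumed to already have a matching public.page_view table. Note that the remote write happens synchronously inside the local transaction, which may not suit every workload.

```sql
-- Everything named here (server, credentials, tables) is hypothetical.
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

CREATE SERVER analytics_srv
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'analytics.example.com', port '5432', dbname 'analytics');

CREATE USER MAPPING FOR CURRENT_USER
    SERVER analytics_srv
    OPTIONS (user 'loader', password 'change-me');

-- Local source table.
CREATE TABLE page_view (
    id        bigserial   PRIMARY KEY,
    url       text        NOT NULL,
    viewed_at timestamptz NOT NULL DEFAULT now()
);

-- Local handle to public.page_view on the remote server.
CREATE FOREIGN TABLE remote_page_view (
    id        bigint,
    url       text,
    viewed_at timestamptz
)
SERVER analytics_srv
OPTIONS (schema_name 'public', table_name 'page_view');

-- Trigger function: forward each inserted row to the remote table.
CREATE FUNCTION copy_page_view()
RETURNS trigger
LANGUAGE plpgsql
AS $$
BEGIN
    INSERT INTO remote_page_view (id, url, viewed_at)
    VALUES (NEW.id, NEW.url, NEW.viewed_at);
    RETURN NULL;  -- the return value is ignored for AFTER row triggers
END;
$$;

CREATE TRIGGER page_view_copy
    AFTER INSERT ON page_view
    FOR EACH ROW
    EXECUTE PROCEDURE copy_page_view();
```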

Page 25: Stored Procedures are Good Enough

#5: HTTP API w/o middleware, “declarative”

PostgREST: http://postgrest.com

- Written in Haskell
- MIT license
- Actively developed
- chat: https://gitter.im/begriffs/postgrest

CREATE VIEW v1.person AS SELECT * FROM public.person; → /person

- GET → SELECT
- POST → INSERT
- PATCH → UPDATE
- DELETE → DELETE

CREATE FUNCTION v1.myfunc(...) … LANGUAGE ...; → /rpc/myfunc

- Only POST

(write functions in any language: SQL, plpgsql, plpython, plr, plv8, etc!)
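A slightly more concrete sketch of the same mapping, assuming PostgREST is configured to expose the v1 schema and that the role it connects as has been granted access to these objects (neither is shown here). The add_person function and the columns of public.person are made up for illustration.

```sql
-- Hypothetical schema and table.
CREATE SCHEMA v1;

CREATE TABLE public.person (
    id   bigserial PRIMARY KEY,
    name text NOT NULL
);

-- Exposed by PostgREST as /person
CREATE VIEW v1.person AS
    SELECT id, name FROM public.person;

-- Exposed by PostgREST as /rpc/add_person
CREATE FUNCTION v1.add_person(person_name text)
RETURNS bigint
LANGUAGE sql
AS $$
    INSERT INTO public.person (name)
    VALUES (person_name)
    RETURNING id;
$$;

-- Example requests (host and port depend on your PostgREST setup):
--   GET  /person                 -> SELECT from v1.person
--   POST /rpc/add_person         -> calls v1.add_person
--        with JSON body {"person_name": "Alice"}
```

The keys of the JSON body are matched to the function's parameter names, which is why the parameter is given a descriptive name here.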

Page 26: Stored Procedures are Good Enough

#6: PL/Proxy: sharding

- All work via functions
- Special functions (in PL/Proxy “language”) are in the middle
- Developed in Skype, and still there
- Yandex.Mail migrated from Oracle to Postgres + PL/Proxy in 2014-2016 (300+ TB, 250k RPS)
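A sketch of what such a function "in the middle" can look like, following the shape of the PL/Proxy tutorial. It assumes the plproxy extension is installed on the proxy database, that a cluster named usercluster has already been configured, and that each shard has a users table; all of these names are illustrative.

```sql
-- Lives on the proxy database; it contains routing rules, not data.
CREATE FUNCTION get_user_email(i_username text)
RETURNS SETOF text
LANGUAGE plproxy
AS $$
    CLUSTER 'usercluster';            -- which set of shards to use
    RUN ON hashtext(i_username);      -- pick the shard by hashing the key
    SELECT email FROM users WHERE username = i_username;  -- runs on that shard
$$;

-- The application calls it like any other function.
SELECT * FROM get_user_email('alice');
```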

Page 28: Stored Procedures are Good Enough

#7: MADlib: Machine Learning inside your DBMS

- A lot of ML algorithms implemented (added in each release)
- PL/Python
- Very easy and quick start to do machine learning with your Postgres data

http://madlib.incubator.apache.org/
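To give a feel for the "quick start", here is a sketch that trains a linear regression with MADlib's linregr_train. It assumes MADlib is installed into a schema named madlib; the houses table and its rows are made-up sample data, not results from the talk.

```sql
-- Made-up sample data, for illustration only.
CREATE TABLE houses (
    id    serial PRIMARY KEY,
    tax   integer,
    bath  double precision,
    size  integer,
    price integer
);

INSERT INTO houses (tax, bath, size, price) VALUES
    ( 590, 1.0,  770,  50000),
    (1050, 2.0, 1410,  85000),
    (  20, 1.0, 1060,  22500),
    ( 870, 2.0, 1300,  90000),
    (1320, 2.0, 1500, 133000),
    (1350, 1.0,  820,  90500);

-- Train the model; the output table is created by the call.
SELECT madlib.linregr_train(
    'houses',                     -- source table
    'houses_linregr',             -- output table
    'price',                      -- dependent variable
    'ARRAY[1, tax, bath, size]'   -- independent variables (1 = intercept)
);

-- Inspect the fitted coefficients and the goodness of fit.
SELECT coef, r2 FROM houses_linregr;
```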

Page 29: Stored Procedures are Good Enough

Cons

● Tooling can be considered weak (packaging, dependencies, editors, debugging, profiling, etc.)

● Version control and schema migrations

● Testing

● Stored Procedures consume resources in the DBMS. Can be tricky to scale
  ○ Example: call external API via plpythonu function and save data -- consumes CPU on your server unpredictably!

Page 30: Stored Procedures are Good Enough

Cons - fixes

● Tooling can be considered weak (packaging, dependencies, editors, debugging, profiling, etc.) → vim + plpgsql highlighting; DataGrip; Debugger, Profiler (pgAdmin)

● Version control and schema migrations → Sqitch and others

● Testing → pgTAP

● Stored Procedures consume resources in the DBMS. Can be tricky to scale
  ○ Example: call external API via plpythonu function and save data -- consumes CPU on your server unpredictably!
  → Avoid I/O things inside your master if you need to scale
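For the testing item, a pgTAP test file might look roughly like this. It assumes the pgtap extension is available; add_tax is a hypothetical function under test, created inside the transaction so that the ROLLBACK leaves the database untouched.

```sql
CREATE EXTENSION IF NOT EXISTS pgtap;

BEGIN;

-- Hypothetical function under test.
CREATE FUNCTION add_tax(amount numeric)
RETURNS numeric
LANGUAGE sql
IMMUTABLE
AS $$
    SELECT round(amount * 1.2, 2);
$$;

SELECT plan(3);  -- number of assertions that follow

SELECT has_function('add_tax');
SELECT is(add_tax(100), 120.00, 'add_tax adds 20% to 100');
SELECT is(add_tax(0),     0.00, 'add_tax leaves zero unchanged');

SELECT * FROM finish();

ROLLBACK;
```

Files like this are typically run with pg_prove or plain psql, and they can live in version control next to the migrations.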

Page 32: Stored Procedures are Good Enough

Thank you!

Twitter: @postgresmen (new Postgres tweets daily!)

[email protected]

RuPostgres.org