Pro PostgreSQL, OSCON 2008

Upload: robert-treat

Post on 16-Apr-2017

Pro PostgreSQL

Robert Treat

omniti.com

brighterlamp.org

Who Am I? (Why Listen To Me)

O-0

PostgreSQL User Since 6.5.x

DBA of High Traffic / Large PostgreSQL Instances

Long Time Contributor to PostgreSQL Project

Contribute / Maintain Several Open Source Projects

Co-Author Beginning PHP & PostgreSQL 8 (Apress)

Outline

O-1

What you need to know about the project

Getting started

Upgrading

Configuring your server

Hardware

Availability

Scalability

Query tuning

Tablespaces

Partitioning

Stuff you should know about

K-0

Know Your Way Around The Project

Know Your Way Around The Project

K-1

www.postgresql.org

downloads

documentation

bug reports

security alerts

wiki

support companies

rss -> news, events, versions

Know Your Way Around The Project

K-2

www.pgfoundry.org

projects.postgresql.org

Modules

Programs

Resources

URI Type

CIText

SkyTools

Npgsql

Pl/Proxy

pg_bulkload

plpgsql-debugger

sample databases

Know Your Way Around The Project

K-3

www.planetpostgresql.org

Project News

Community News

Helpful Tips / Examples

Know Your Way Around The Project

K-4

archives.postgresql.org

mailing list archives back to 1997

full text search via PostgreSQL 8.3

keyword search suggestions

lists for users, developers, regional, user groups

Know Your Way Around The Project

K-5

#postgresql

irc.freenode.net

real time help

rtfm_please - ??help

Know Your Way Around The Project

K-6

project management

core team

committers

-hackers

roadmap

web team

S-0

Get Off To A Good Start

S-1

Get Off To A Good Start

Use package management

Consistent

Standardized

Simple

Different across systems

Upgrades are an issue

Trust your packager?

S-2

Get Off To A Good Start

Use package management

Don't Be Afraid To Roll Your Own

S-4

Get Off To A Good Start

Configure Logging

$PGDATA/pg_log

/var/log/pgsql

when in doubt... (postgresql.conf)

separate disk

Logging is often overlooked, but is the first step toward troubleshooting!
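
When in doubt about where a server is logging, the running instance can be asked directly instead of hunting through postgresql.conf. A minimal sketch, assuming a psql session on 8.3 or later (where the logging_collector setting exists); the list of setting names is just a reasonable starting set:

```sql
-- Show the settings that decide where log output goes.
SELECT name, setting
FROM pg_settings
WHERE name IN ('log_destination', 'logging_collector',
               'log_directory', 'log_filename', 'log_line_prefix')
ORDER BY name;
```

A relative log_directory such as pg_log is resolved against $PGDATA; an absolute path is the usual way to put logs on a separate disk.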

S-5

Get Off To A Good Start

Configure Authentication

most systems have different defaults

firewalls / selinux (FATAL)

rtfm (pg_hba.conf, grant, revoke)
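
pg_hba.conf decides whether a connection gets in at all; GRANT and REVOKE decide what that connection may do afterwards. A small sketch of the second half, assuming the pagila sample database; the role name report_reader and its password are made up for illustration:

```sql
-- Hypothetical read-only role for reporting.
CREATE ROLE report_reader LOGIN PASSWORD 'change-me';

-- Remove any access granted to everyone, then grant only what is needed.
REVOKE ALL ON TABLE customer FROM PUBLIC;
GRANT SELECT ON TABLE customer TO report_reader;
```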

S-6

Get Off To A Good Start

Authentication Methods

TRUST

md5

IDENT

S-7

Get Off To A Good Start

/contrib

trust these more than your own code

package dependent

use different schemas (when able)

pgcrypto

pgstattuple, pg_buffercache, pg_freespacemap
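
One way to keep a contrib module in its own schema is sketched here. It assumes a server new enough to have CREATE EXTENSION (9.1 and later); on the 8.x servers this talk targets, the equivalent step is loading the module's SQL script with search_path pointed at the target schema. The schema name contrib is an arbitrary choice:

```sql
-- Keep contrib objects out of the public schema.
CREATE SCHEMA contrib;
CREATE EXTENSION pgcrypto SCHEMA contrib;

-- Call the functions schema-qualified (or add contrib to search_path).
SELECT encode(contrib.digest('hello', 'sha1'), 'hex');
```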

S-8

Get Off To A Good Start

procedural languages

package dependent

some are non-core (plruby, plr, plphp)

varying functionality

varying levels of trust

don't be afraid, test!

U-0

Let's Talk About Upgrades

U-1

Let's Talk About Upgrades

Versioning

First Digit (7.4.16 -> 8.2.0)

Second Digit (8.2.4 -> 8.3.0)

Third Digit (8.3.0 -> 8.3.1)
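
Before planning an upgrade it helps to confirm exactly which version is running, since the digit that changes determines how much work is involved. Either statement works from any psql session:

```sql
-- Just the version number, e.g. the three-part numbers shown on the slide.
SHOW server_version;

-- Full build string, including compiler and platform.
SELECT version();
```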

U-4

Let's Talk About Upgrades

Achtung!!

Make Backups!

Read the Release Notes!

U-5

Let's Talk About Upgrades

pg_dump/pg_restore

simple

-Fc is your friend

dump with new version of pg_dump

pitfalls (time, hdd)

U-6

Let's Talk About Upgrades

the slony method

not simple

create slave on new version

switchover (switch back?)

pitfalls (initial synch, compatibility)

U-7

Let's Talk About Upgrades

pg_migrator

in place upgrades

rewrites system catalog info

no way to go back (fs snapshots)

still new, in flux

8.1 -> 8.2 only (for now)

U-8

Let's Talk About Upgrades

upgrading older db

cold standby

8.2 -> warm standby

8.4 -> hot standby ?

A-6

Availability

slony

asynchronous, master-slave replication

controlled switchover, failover

low i/o, time constraints

other benefits (upgrades, scaling)

A-7

Availability

bucardo

asynchronous, multi-master replication

also does master-slave

low i/o, time constraints

other benefits (upgrades, scaling)

A-8

Availability

shared disk

one copy of PGDATA on shared storage

standby takes over akin to db crash

shared disk is point of failure (raid)

STONITH

A-9

Availability

filesystem replication

drbd

filesystem mirrored between servers

synchronized, ordered writes

single disk system?

A-10

Availability

pgpool

dual-master, statement based

little caveats (random(),now(),sequences)

bigger caveats (security, password, pg_hba)

pgpool becomes failure point

A-11

Availability

postgres-r

multi-master, synchronous

just open sourced this month!

small community

not proven

H-0

Scalability

H-1

Scalability

what is scaling?

How well a solution to some problem will work when the relative size of the problem increases

- Theo Schlossnagle

H-2

Scalability

bigger, better, faster, more!

postgresql scales up pretty well

more disks (tablespaces)

more CPUs, more RAM

connection pooling

1000+ connections, TB+ data

H-3

Scalability

pgpool

dual-master, statement based

little caveats (random(),now(),sequences)

bigger caveats (security, password, pg_hba)

pgpool becomes failure point

H-4

Scalability

pgbouncer

simple connection pooler

10/1 -> 40/1

caveats (prepared statements, temp tables)

skype, myyearbook.com

H-5

Scalability

slony

asynchronous, master-slave replication

multiple, cascading slaves

scales read operations

other benefits (upgrades, scaling)

solid user base

H-6

Scalability

bucardo

asynchronous, multi-master replication

also does master-slave

low i/o, time constraints

other benefits (upgrades, scaling)

H-7

Scalability

pgpool-II

single db over multiple machines

scales read operations

replication, load balance, parallel query

green technology

H-8

Scalability

pgcluster

synchronous multi-master replication

significant complexity

scales read operations

other uses (failover abilities)

green technology

H-9

Scalability

postgres-r

multi-master, synchronous

just open sourced this month!

small community

other uses (failover abilities)

not proven

H-10

Scalability

pitr read-only slaves

based on pitr, warm standby operation

core team officially supporting development

8.4 -> synchronous wal shipping

8.? -> read only slaves

J-0

Query Your Queries

J-1

Query Your Queries

finding slow queries: log_min_duration_statement

-1 (disabled), 0 (log all statements), n (log statements running n ms or longer)

superuser only

alter user

LOG: duration: 5005.273 ms statement: select pg_sleep(5);
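
Because the setting is superuser only, it is usually applied per role or per session rather than flipped globally. A minimal sketch, run as a superuser; the role name webapp and the 250 ms threshold are illustrative assumptions:

```sql
-- Log any statement from this (hypothetical) role that runs 250 ms or longer.
-- Takes effect for the role's new sessions.
ALTER USER webapp SET log_min_duration_statement = 250;

-- Log everything in the current session only, then put the setting back.
SET log_min_duration_statement = 0;
SELECT pg_sleep(1);
RESET log_min_duration_statement;
```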

J-2

Query Your Queries

finding slow queries: pgfouine / pqa

log analyzers

command line, generate reports

i/o load

http://pgfouine.projects.postgresql.org/reports.html

http://pqa.projects.postgresql.org/example.html

J-3

Query Your Queries

finding slow queries: pg_stat_all_tables

pagila=# \d pg_stat_all_tables
  View "pg_catalog.pg_stat_all_tables"
      Column      |    Type     |
------------------+-------------+
 relid            | oid         |
 schemaname       | name        |
 relname          | name        |
 seq_scan         | bigint      |
 seq_tup_read     | bigint      |
 idx_scan         | bigint      |
 idx_tup_fetch    | bigint      |
 n_tup_ins        | bigint      |
 n_tup_upd        | bigint      |
 n_tup_del        | bigint      |
 n_tup_hot_upd    | bigint      |
 n_live_tup       | bigint      |
 n_dead_tup       | bigint      |
 last_vacuum      | timestamptz |
 last_autovacuum  | timestamptz |
 last_analyze     | timestamptz |
 last_autoanalyze | timestamptz |
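
A typical way to read this view is to look for tables that are being read sequentially far more than through an index. The query is a sketch using only the columns listed in the view; the schema filter and the LIMIT are arbitrary choices:

```sql
-- Tables with the most rows read by sequential scans: candidates for indexing.
SELECT schemaname, relname, seq_scan, seq_tup_read, idx_scan
FROM pg_stat_all_tables
WHERE schemaname NOT IN ('pg_catalog', 'pg_toast', 'information_schema')
ORDER BY seq_tup_read DESC
LIMIT 10;
```

The n_dead_tup and last_autovacuum columns can be read the same way to spot tables that vacuum is not keeping up with.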

J-9

Query Your Queries

finding slow queries: pg_stat_all_indexes

pagila=# \d pg_stat_all_indexes
View "pg_catalog.pg_stat_all_indexes"
    Column     |  Type  |
---------------+--------+
 relid         | oid    |
 indexrelid    | oid    |
 schemaname    | name   |
 relname       | name   |
 indexrelname  | name   |
 idx_scan      | bigint |
 idx_tup_read  | bigint |
 idx_tup_fetch | bigint |
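
One common use of this view is finding indexes that are never scanned and so only add write overhead. A sketch using the columns shown; the schema filter is an assumption about what counts as application data:

```sql
-- Indexes with zero scans since statistics were last reset.
SELECT schemaname, relname, indexrelname, idx_scan
FROM pg_stat_all_indexes
WHERE schemaname NOT IN ('pg_catalog', 'pg_toast', 'information_schema')
  AND idx_scan = 0
ORDER BY relname, indexrelname;
```

Indexes backing primary key or unique constraints can show zero scans and still be required, so treat the result as a list to investigate rather than a drop list.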

J-11

Query Your Queries

finding slow queries: pg_statio_all_tables

pagila=# \d pg_statio_all_tables
View "pg_catalog.pg_statio_all_tables"
     Column      |  Type  |
-----------------+--------+
 relid           | oid    |
 schemaname      | name   |
 relname         | name   |
 heap_blks_read  | bigint |
 heap_blks_hit   | bigint |
 idx_blks_read   | bigint |
 idx_blks_hit    | bigint |
 toast_blks_read | bigint |
 toast_blks_hit  | bigint |
 tidx_blks_read  | bigint |
 tidx_blks_hit   | bigint |
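
The read and hit counters can be combined into a rough buffer-cache hit percentage per table. A sketch assuming the application tables live in the public schema:

```sql
-- Tables doing the most physical heap reads, with their cache hit percentage.
SELECT relname,
       heap_blks_read,
       heap_blks_hit,
       round(100.0 * heap_blks_hit
             / nullif(heap_blks_hit + heap_blks_read, 0), 1) AS heap_hit_pct
FROM pg_statio_all_tables
WHERE schemaname = 'public'
ORDER BY heap_blks_read DESC
LIMIT 10;
```

A "read" here means the block was not in PostgreSQL's shared buffers; it may still have come from the operating system cache rather than disk.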

J-13

Query Your Queries

fixing slow queries: explain analyze

universal tool

good for specific queries

explain for large queries

could be its own talk

http://wiki.postgresql.org/Using_EXPLAIN
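
The difference between the two commands is that EXPLAIN only plans the query, while EXPLAIN ANALYZE actually runs it and reports real timings. A sketch against the pagila sample database; the customer id 42 is an arbitrary value:

```sql
-- Plan only: nothing is executed, so this is safe for very large queries.
EXPLAIN
SELECT c.last_name, count(*)
FROM customer c
JOIN rental r ON r.customer_id = c.customer_id
GROUP BY c.last_name;

-- Plan plus actual row counts and timings: the query really runs.
EXPLAIN ANALYZE
SELECT * FROM rental WHERE customer_id = 42;

-- EXPLAIN ANALYZE on a write performs the write, so wrap it and roll back.
BEGIN;
EXPLAIN ANALYZE
UPDATE customer SET activebool = false WHERE customer_id = 42;
ROLLBACK;
```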

J-15

Query Your Queries

fixing slow queries: indexing (basic)

use explain to find large sequential reads

use pg_stat_* tables to find numerous reads

btree (gist/gin)

enable_indexscan, enable_bitmapscan

dual column vs. single column
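
The enable_* settings are handy for checking whether an index actually helps: turn the index paths off for one session and compare plans. A sketch against pagila's rental table; the index names and the filter values are invented for the example:

```sql
-- Single-column and two-column variants to compare.
CREATE INDEX rental_customer_idx ON rental (customer_id);
CREATE INDEX rental_customer_date_idx ON rental (customer_id, rental_date);

-- Plan with indexes available.
EXPLAIN ANALYZE
SELECT * FROM rental
WHERE customer_id = 42 AND rental_date >= DATE '2005-07-01';

-- Same query with index and bitmap scans discouraged, for comparison.
SET enable_indexscan = off;
SET enable_bitmapscan = off;
EXPLAIN ANALYZE
SELECT * FROM rental
WHERE customer_id = 42 AND rental_date >= DATE '2005-07-01';

-- Restore the defaults for this session.
RESET enable_indexscan;
RESET enable_bitmapscan;
```

These settings are a diagnostic aid for a single session, not something to leave switched off in postgresql.conf.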

J-16

Query Your Queries

fixing slow queries: indexing (partial)

create index address_ba_part_idx on address (district) where district = 'Buenos Aires';

restrain index to rows that matter

can give significant speed improvements

where clause of index should match where clause of query
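
To illustrate the matching rule with the address_ba_part_idx index from the slide: the planner can only consider a partial index when the query's WHERE clause implies the index's WHERE clause. A sketch assuming the pagila address table; 'California' is just another district value, and on a small table the planner may still prefer a sequential scan:

```sql
-- Predicate matches the index definition: the partial index is a candidate.
EXPLAIN
SELECT address_id, address
FROM address
WHERE district = 'Buenos Aires';

-- Different district: address_ba_part_idx cannot be used for this query.
EXPLAIN
SELECT address_id, address
FROM address
WHERE district = 'California';
```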

J-17

Query Your Queries

fixing slow queries: indexing (partial)

create index customer_active_part_idx on customer (customer_id) where activebool is true;

J-18

Query Your Queries

fixing slow queries: indexing (functional)

some people prefer to call these expressional indexes

J-19

Query Your Queries

fixing slow queries: indexing (expressional)

create unique index one_true_email_xidx on customer (lower(email));

push expensive functions into your index

system sees just WHERE indexedcolumn = 'constant'

expression of index should match expression of queries

narrow scope, but nice gains
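
For the one_true_email_xidx index on lower(email), the query has to use the same expression for the index to be considered. A sketch against pagila's customer table; the email address is a made-up value:

```sql
-- Same expression as the index: eligible for one_true_email_xidx.
EXPLAIN
SELECT customer_id
FROM customer
WHERE lower(email) = 'mary.smith@example.com';

-- Bare column: the expression index does not apply to this predicate.
EXPLAIN
SELECT customer_id
FROM customer
WHERE email = 'MARY.SMITH@example.com';
```

Because the index is unique, it also enforces case-insensitive uniqueness of email addresses as a side effect.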

J-20

Query Your Queries

fixing slow queries: indexing (expressional)

create index fullname_xidx on customer ((first_name||' '||last_name));

J-21

Query Your Queries

fixing slow queries: full text search

uses lexemes and word stemming to find common words

replacement for LIKE '%x%', ~* 'x';

supports multiple languages, custom dictionaries

special indexing options
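
A quick way to see what lexemes and stemming mean in practice is to run the built-in functions directly. A sketch assuming 8.3 or later (where full text search is in core) and the pagila film table; the sample sentence and search terms are arbitrary:

```sql
-- Inspect how a sentence is reduced to lexemes by the english configuration.
SELECT to_tsvector('english', 'The foxes were jumping over lazy dogs');

-- Search title and description together instead of LIKE '%...%'.
SELECT film_id, title
FROM film
WHERE to_tsvector('english', title || ' ' || coalesce(description, ''))
      @@ to_tsquery('english', 'astronaut & drama');
```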

J-22

Indexing Options

full text indexing: gist vs. gin

gist

old school

slower for queries

faster insert / update

mature

gin

new in 8.2

faster for queries

slower insert / update

stable
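
Both index types are created the same way; only the access method differs. A sketch on pagila's film table with invented index names. In practice only one of the two would be kept, chosen by whether the table is read-heavy or write-heavy:

```sql
-- GIN: faster lookups, slower to update.
CREATE INDEX film_title_gin_idx
    ON film USING gin (to_tsvector('english', title));

-- GiST: faster to update, slower lookups.
CREATE INDEX film_title_gist_idx
    ON film USING gist (to_tsvector('english', title));

-- Queries must use the same expression to be able to use either index.
SELECT film_id, title
FROM film
WHERE to_tsvector('english', title) @@ to_tsquery('english', 'academy');
```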

N-0

PostgreSQL Tablespaces

N-1

PostgreSQL Tablespaces

tablespaces?

define logical locations for object placement

point to locations on disk (uses symlinks)

size determined by disk size (not pre-ordained)

dedicate per db, split db across multiple tblspc

N-2

PostgreSQL Tablespaces

tablespaces!

split database over separate disks

use stat, statio tables to gauge disk access

create dedicated storage for workloads

disk for read / write

disk for read only

large, slow disk for archiving

disk for indexes
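
The workload split described here might look like the sketch below. The directory paths, tablespace names and object names are all assumptions; the directories must already exist, be empty, and be owned by the PostgreSQL operating system user, and CREATE TABLESPACE must be run by a superuser:

```sql
-- One tablespace per kind of storage (paths are hypothetical).
CREATE TABLESPACE fast_idx LOCATION '/mnt/fast_disk/pg_idx';
CREATE TABLESPACE archive  LOCATION '/mnt/big_slow_disk/pg_archive';

-- Put a hot index on the fast disk.
CREATE INDEX rental_date_idx ON rental (rental_date) TABLESPACE fast_idx;

-- Create an archive table on the large, slow disk.
CREATE TABLE rental_archive (LIKE rental) TABLESPACE archive;

-- See which tablespaces exist.
SELECT spcname FROM pg_tablespace;
```

Moving an existing table with ALTER TABLE ... SET TABLESPACE copies the whole table and locks it while doing so, so it is best scheduled for a quiet period.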

Q-0

PostgreSQL Partitioning

Q-1

PostgreSQL Partitioning

partitioning?

as table size grows, it becomes unmanageable

use inheritance, rules, constraints to split data

queries ignore non-relevant partitions

could be its own talk

http://www.pgcon.org/2007/schedule/events/41.en.html

Q-3

PostgreSQL Partitioning

partitioning : key points

determine list vs. range

use triggers rather than rules

partition creation vs. data population

automate maintenance
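
A minimal sketch of range partitioning with inheritance, a CHECK constraint and an insert trigger (rather than a rule). The table, function and trigger names are invented for illustration, only one monthly partition is shown, and plpgsql is assumed to be installed in the database:

```sql
-- Parent table: queries go here, data lives in the children.
CREATE TABLE measurement (
    logdate date NOT NULL,
    reading integer
);

-- One child per month; the CHECK constraint describes what it holds.
CREATE TABLE measurement_2008_07 (
    CHECK (logdate >= DATE '2008-07-01' AND logdate < DATE '2008-08-01')
) INHERITS (measurement);

-- Trigger function that routes inserts on the parent to the right child.
CREATE OR REPLACE FUNCTION measurement_insert_router() RETURNS trigger AS $$
BEGIN
    IF NEW.logdate >= DATE '2008-07-01' AND NEW.logdate < DATE '2008-08-01' THEN
        INSERT INTO measurement_2008_07 VALUES (NEW.*);
    ELSE
        RAISE EXCEPTION 'no partition for date %', NEW.logdate;
    END IF;
    RETURN NULL;  -- the row has been stored in the child, not the parent
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER measurement_insert_trg
    BEFORE INSERT ON measurement
    FOR EACH ROW EXECUTE PROCEDURE measurement_insert_router();

-- Let the planner skip children whose CHECK constraint rules them out.
SET constraint_exclusion = on;
EXPLAIN SELECT * FROM measurement WHERE logdate = DATE '2008-07-15';
```

Creating next month's child table and extending the trigger function is the part worth automating, since inserts fail once dates run past the last partition.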

I-0

Other Stuff I Should Mention

I-1

Other Stuff I Should Mention

pgcrypto

cryptography type functions

/contrib (export issues)

md5, sha1, blowfish, many more
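
Two of the most commonly used pgcrypto functions, as a sketch. It assumes the module is installed and on the search_path; the input strings are placeholders:

```sql
-- One-way digest, hex encoded.
SELECT encode(digest('hello world', 'sha1'), 'hex') AS sha1_hex;

-- Salted password hash using blowfish-based crypt.
SELECT crypt('s3cret', gen_salt('bf')) AS stored_hash;

-- Verify a password: re-hash with the stored hash as the salt and compare.
SELECT crypt('s3cret', h.stored_hash) = h.stored_hash AS password_ok
FROM (SELECT crypt('s3cret', gen_salt('bf')) AS stored_hash) AS h;
```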

I-2

Other Stuff I Should Mention

dblink

pg -> pg connections

/contrib (still under development?)

can have performance issues on large queries

make it live in its own schema
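
A basic dblink call, as a sketch. The connection string (host, database, user) is hypothetical, and the column list in the AS clause has to be supplied by the caller because dblink returns generic records:

```sql
-- Run a query on another PostgreSQL server and read the result locally.
SELECT *
FROM dblink('host=replica.example.com dbname=pagila user=report_reader',
            'SELECT customer_id, email FROM customer WHERE activebool')
     AS remote(customer_id integer, email text);
```

If dblink has been installed into its own schema, call it schema-qualified or add that schema to search_path. The whole remote result set is fetched before the local query can filter it, which is where the performance issues on large queries come from.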

I-3

Other Stuff I Should Mention

*-link

heterogeneous connections for postgresql

db specific and db independent options

any pl/u language can implement this

similar performance issues to dblink

dblink-tds, dbi-link, oralink, odbclink

http://www.pgfoundry.org/ (db link)

I-4

Other Stuff I Should Mention

autonomous logging tool

persistent logging for postgresql functions

built on top of dblink

make it live in its own schema

https://labs.omniti.com/trac/pgsoltools

I-5

Other Stuff I Should Mention

snapshot pitr clones

full read/write copy of pitr slave

static snapshot

need solaris (zfs zone mojo)

could re-implement on other systems

https://labs.omniti.com/trac/pgsoltools

I-6

Other Stuff I Should Mention

check_postgres

nagios based monitoring script

common items for warnings and alerts

can be adapted to other uses

http://bucardo.org/check_postgres

I-7

Other Stuff I Should Mention

reconnoiter

monitoring / graphing tool

postgres based

still pretty green

https://labs.omniti.com/trac/reconnoiter

I-8

Other Stuff I Should Mention

phpPgAdmin

web based gui for postgresql

remote administration of multiple servers

implements much of postgresql functionality

support back to 7.2?

http://phppgadmin.sourceforge.net/

I-9

Other Stuff I Should Mention ;-)

my book?

I-10

Other Stuff I Should Mention ;-)

we're hiring

Ops Ninjas

Perl Kung-Fu Artists

PHP Ninjas

Database Samurai

http://omniti.com/is/hiring

L-0

El Fin