Scaling PostgreSQL PgDay EU 2009

Upload: gavin-m-roy

Post on 30-May-2018

  • 8/14/2019 Scaling PostgreSQL PgDay EU 2009

    1/29

    Scaling PostgreSQL Under Fire

    Gavin M. Roy

    Chief Technology Officer

    myYearbook.com

    pgDay Europe 2009

  • 2/29

    About myYearbook.com

    2007 - 100M Page views per Month

    2009 - 1.5B Page views per Month

    Top 5 Social Network in the United States as measured by Hitwise

    Top 25 trafficked site in the United States as measured by ComScore

    99% Uptime

  • 3/29

    The 1am Phone Call

  • 4/29

    Growth is a double-edged sword.

  • 5/29

    Database Project Considerations

    Good Schema

    Is it designed to Scale?

    Is it designed to be Maintained?

    Good Hardware

    Will you have enough space for unexpected growth?

    Will it be fast enough to handle additional load?

    Will it be stable in production under load?

  • 6/29

    Database Project Considerations

    Good Planning

    What will you do when there is a failure?

    How long will it take to recover?

    What kind of failures will you have?

    How will you handle upgrades and downtime?

  • 7/29

    Internet Startup Growth Cycle

    1. Prototype

    2. Launch

    3. Re-Engineer (Fix problems)

    4. Add new functionality

    5. Repeat Steps 3 and 4

  • 8/29

    Internet Startup Growth Cycle

    Steps 1 & 2

    Limited Budget

    Limited Time & Resources

    Steps 3 & 4

    Increased Budget

    Limited Time & Resources

  • 9/29

    The best laid schemes o' mice an' men

    Gang aft agley

    - Robert Burns, To a Mouse

  • 10/29

    Prepare for Growth

    Hardware

    CPU Horsepower based upon need

    Disk based upon need

    RAM based upon budget.

    Get 2

  • 11/29

    The Cloud

  • 12/29

    Plan for Growth: Concurrency & Data Growth

  • 13/29

    Concurrency in PostgreSQL

    Each running PostgreSQL connection carries overhead

    More connections == Slower queries

    Pool your connections

    pgBouncer - lightweight, libevent-based pooling daemon

    pgPool II - Does pooling and much, much more

    Language specific pooling
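
    A minimal sketch of language-specific pooling, using only the Python standard library. SimplePool and FakeConnection are hypothetical names, not part of any driver; in practice the connect callable would wrap a real driver's connect call, or a daemon such as pgBouncer would do this job outside the application.

```python
import queue
from contextlib import contextmanager


class SimplePool:
    """Hands out a fixed number of reusable connections."""

    def __init__(self, connect, size):
        # `connect` is any zero-argument callable that returns a connection.
        # All connections are opened once, up front, and then reused.
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=None):
        # Wait for an idle connection instead of opening a new one, so the
        # database never sees more than `size` backends from this process.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)


class FakeConnection:
    """Stand-in for a real database connection (hypothetical)."""

    opened = 0  # counts how many connections were ever created

    def __init__(self):
        FakeConnection.opened += 1
        self.queries = []

    def execute(self, sql):
        self.queries.append(sql)
        return "ok"


pool = SimplePool(FakeConnection, size=2)
for i in range(10):
    with pool.connection() as conn:
        conn.execute("SELECT %d" % i)
```

    Ten queries run here, but only two connections are ever opened. When both are checked out, a third caller waits (or times out with queue.Empty) rather than adding another backend.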

  • 14/29

    Table Partitioning

    Supported in PostgreSQL as of 8.1

    Excellent method for maintaining data

    Allows for removal of aged data without bloat

    Focused SELECTs against a single partition, while still allowing ad-hoc SELECTs across all partitions
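
    A sketch of how monthly partitions might be generated with the inheritance-plus-CHECK-constraint approach available in the 8.x series. The table name page_views and the column viewed_at are assumptions for illustration; routing INSERTs to the right child table (by trigger, rule, or application code) is left out.

```python
from datetime import date


def next_month(d):
    """First day of the month after `d`."""
    if d.month == 12:
        return date(d.year + 1, 1, 1)
    return date(d.year, d.month + 1, 1)


def partition_name(parent, d):
    """Child table name such as page_views_2009_11."""
    return "%s_%04d_%02d" % (parent, d.year, d.month)


def create_partition_sql(parent, column, d):
    """DDL for one monthly child table with a range CHECK constraint."""
    lo, hi = d, next_month(d)
    return (
        "CREATE TABLE %s (CHECK (%s >= DATE '%s' AND %s < DATE '%s')) "
        "INHERITS (%s);"
        % (partition_name(parent, d), column, lo.isoformat(),
           column, hi.isoformat(), parent)
    )


def drop_partition_sql(parent, d):
    """Removing aged data is a DROP TABLE, not a bulk DELETE."""
    return "DROP TABLE %s;" % partition_name(parent, d)


# Hypothetical table and column names.
print(create_partition_sql("page_views", "viewed_at", date(2009, 11, 1)))
print(drop_partition_sql("page_views", date(2008, 11, 1)))
```

    With constraint_exclusion enabled, the planner can skip child tables whose CHECK constraint rules out the WHERE range, while a query on the parent still sees every partition. Dropping a whole child table leaves no dead tuples behind, which is why aged data can go without bloat.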

  • 15/29

    Vertically Partitioning Data

    Isolate application data in different database servers

    Replicate common data needed for joins

  • 16/29

    Horizontally Partitioning Data

    Spread table data across multiple database servers

    Use a CRC or Hashing algorithm to determine server location of data

    Roll your own in your client code

    Use pl/Proxy

    Plan for growth: use multiple slots per server
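
    A rough sketch of the roll-your-own variant: hash the key with CRC32 into a fixed number of slots, then map slots to servers. The slot count and the host names db1, db2 and db3 are assumptions for illustration.

```python
import zlib

SLOTS = 16  # hypothetical; chosen once and never changed

# slot -> server. Two servers to start with, eight slots each.
SLOT_TO_SERVER = {slot: ("db1" if slot < 8 else "db2") for slot in range(SLOTS)}


def slot_for(key):
    """Stable slot number for a key, derived from its CRC32."""
    return zlib.crc32(str(key).encode("utf-8")) % SLOTS


def server_for(key, slot_map=SLOT_TO_SERVER):
    """Server that currently owns the key's slot."""
    return slot_map[slot_for(key)]


# Growing to a third server: reassign whole slots, do not change the hash.
GROWN_MAP = dict(SLOT_TO_SERVER)
for moved_slot in (4, 5, 6, 7):
    GROWN_MAP[moved_slot] = "db3"

print(server_for(12345), server_for(12345, GROWN_MAP))
```

    Because a key's slot never changes, adding a server only means moving the data for the reassigned slots. Hashing directly to a server count would instead reshuffle most keys whenever the count changed.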

  • 17/29

    Anything that can possibly go wrong, does.

    - Jack Sack

  • 18/29

    Recovering from Failures

    Daily backup is not enough

    Disaster recovery option

    Replicate data for failover and maintenance

    via Replication Tools like Londiste, Bucardo and Slony

    via Warm Standby (PITR Log Shipping)

    Hot standby? Maybe in 8.5

  • 19/29

    PostgreSQL is YeSQL: Saying no to the naysayers

  • 20/29

    No[SQL] is Not an Option*

    Key/Value store databases are not new, they're just trendy

    Facebook still has to use Memcached even though it developed the Cassandra distributed key/value database.

    Same type of developers who jumped on MySQL because it was fast

    Data without schema has limited use

    Ad-hoc reporting is a key business value that is realistically unachievable in non-structured data

    Find the balance of speed and simplicity with normalization

  • 21/29

    How to kill PostgreSQL performance.

  • 22/29

    Inadequate Hardware

    Not enough RAM

    Not enough disk bandwidth

    Depending on topology

    Not enough disks

    Not enough controllers

    Slow communication bus

    Over-saturated CPU

  • 23/29

    Lock Contention == Death

    Nothing brings a server to its knees faster than a long-running exclusive lock

    Even high share count locks go hand in hand with slowness

    Reduce lock contention

    Use partitioning schemes

    Use concurrent operations for maintenance related activities

  • 24/29

    Bloat == Slower Death

    MVCC is integral to PostgreSQL performance, but it also carries a performance cost

    Enter Heap Only Tuples (HOT)

    Updates to non-indexed columns can stay on the same heap page without adding index entries

    Space from dead tuples can be re-used

    Index Bloat

    Massive index bloat can occur in high-write transaction databases

    Address by concurrently reindexing tables without locking*
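
    One common way to rebuild a bloated index without a long lock is to build a replacement with CREATE INDEX CONCURRENTLY and then swap it in. The sketch below only generates the statements; the slide does not say which method was used, and the table, index and column names are hypothetical.

```python
def rebuild_index_sql(table, index, columns):
    """Statements that replace `index` with a freshly built copy."""
    tmp = index + "_rebuild"  # hypothetical temporary name
    cols = ", ".join(columns)
    return [
        # Builds the new index while reads and writes continue.
        # Must run outside a transaction block.
        "CREATE INDEX CONCURRENTLY %s ON %s (%s);" % (tmp, table, cols),
        # Short exclusive lock while the bloated index is dropped.
        "DROP INDEX %s;" % index,
        # Give the new index the old name.
        "ALTER INDEX %s RENAME TO %s;" % (tmp, index),
    ]


for statement in rebuild_index_sql("messages", "messages_sender_idx", ["sender_id"]):
    print(statement)
```

    The caveats behind the asterisk apply here as well: the concurrent build takes longer than a plain one, the final drop still needs a brief lock, and an index that backs a primary key or unique constraint cannot simply be dropped this way.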

  • 25/29

    Knowledge is Power

    - Sir Francis Bacon

  • 26/29

    Reacting can be Proactive

    check_postgres.pl by Greg Mullane

    Nagios plugin

    Bloat

    Management activity such as last analyze and vacuum

    WAL file count, txid wraparound, sequence exhaustion

    Many other checks

  • 27/29

    Be Trendy

    Know your database behavior over time

    Predict future issues and behavior

    Identify issues as they occur

    Review impact of maintenance

    Know if your heap use exceeds your index use

    Look for daily change
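
    A small sketch of what looking for daily change could mean in practice: take one size sample per day and report the day-over-day difference. The sample figures and the 25 percent threshold are made up for illustration; real samples would come from whatever trending tool collects them.

```python
def daily_change(samples):
    """samples: ordered list of (day, size). Returns (day, delta, percent)."""
    changes = []
    for (_, previous), (day, current) in zip(samples, samples[1:]):
        delta = current - previous
        # Percent change relative to the previous day; undefined if it was 0.
        percent = delta * 100.0 / previous if previous else None
        changes.append((day, delta, percent))
    return changes


def flag_spikes(changes, threshold_percent):
    """Days whose growth exceeded the threshold."""
    return [day for day, _, percent in changes
            if percent is not None and percent > threshold_percent]


# Made-up table sizes in megabytes, one sample per day.
sizes = [("2009-11-01", 1000), ("2009-11-02", 1100), ("2009-11-03", 1650)]
changes = daily_change(sizes)
spikes = flag_spikes(changes, 25)
```

    With these figures the second day grows by 10 percent and the third by 50 percent, so only the third day is flagged. The same comparison works for index size, dead tuple counts, or any other value sampled once a day.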

  • 28/29

    Trending and Analysis at myYearbook.com

    Cacti - http://www.cacti.net

    Posuta - http://code.google.com/p/posuta/

    pgFouine - http://pgfouine.projects.postgresql.org/

    Staplr - http://github.com/gmr/staplr

  • 29/29

    Fin

    Questions?

    Follow me on Twitter: http://twitter.com/Crad

    Leave feedback: http://2009.pgday.eu/feedback
