commone mysql

Upload: javier

Post on 02-Apr-2018


    Scaling MySQL

    A Case Study of Hyperic HQ

    Scott Feldstein, Senior Software Engineer

    S297257


    2008 CommunityOne Conference | developers.sun.com/events/communityone

    What is Hyperic HQ?

    Hyperic HQ provides a single remote console that allows operation teams to track performance/event data, create complex alerts and escalations, run diagnostics, and issue control actions.


    HQ Database Support

    MySQL

    Oracle

    PostgreSQL (current embedded database solution)


    How Much Data

    300 Platforms (300 remote agents collecting data)

    2,100 Servers

    21,000 Services

    468,000 metrics enabled (20 metrics per resource)

    20,000 metric data points per minute (average)

    28,800,000 metric data rows per day

    Medium Size Deployment Scenario

    MEASUREMENT_DATA table

    Columns: MEASUREMENT_ID, TIMESTAMP, VALUE

    PRIMARY KEY (TIMESTAMP, MEASUREMENT_ID)
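
    The figures on this slide can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above; the derived average collection interval is an inference from those numbers, not something the slide states.

```python
# Deployment figures taken from the slide.
PLATFORMS = 300
SERVERS = 2_100
SERVICES = 21_000
METRICS_PER_RESOURCE = 20
POINTS_PER_MINUTE = 20_000  # average insert rate

# Every platform, server, and service counts as one resource.
resources = PLATFORMS + SERVERS + SERVICES
metrics_enabled = resources * METRICS_PER_RESOURCE
rows_per_day = POINTS_PER_MINUTE * 60 * 24

# Inferred: average minutes between two collections of the same metric.
avg_interval_minutes = metrics_enabled / POINTS_PER_MINUTE

print(resources, metrics_enabled, rows_per_day, avg_interval_minutes)
```

    The totals agree with the slide: 23,400 resources give 468,000 enabled metrics, and 20,000 points per minute give 28,800,000 rows per day.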


    Metric Data Flow

    Agent collects data and sends reports to server with multiple data points

    Server batch inserts metric data points

    If network connection fails, agent continues to collect and spool data, while server backfills the missing data points as unavailable

    When agent reconnects, spooled data overwrites backfilled data points

    [Diagram: Agent, Server]


    MySQL Batch Insert Statement

    Syntax

        INSERT INTO TABLE (a,b,c) values (0, 0, 0),(1,1,1),(2,2,2),(3,3,3),...,...

    Extremely fast since there is only one round trip to the database for a batch of inserts

    Only limitation on statement size is determined by server configuration variable "max_allowed_packet"

    Other options for increasing insert speed

        Set unique_checks=0, insert, set unique_checks=1

        Set foreign_key_checks=0, insert, set foreign_key_checks=1
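
    As an illustration of the multi-row syntax, here is a minimal Python sketch that builds one parameterized statement per batch. The helper names (`build_batch_insert`, `chunked`, `flatten`) are hypothetical and not part of HQ. The `%s` placeholders assume a DB-API driver whose paramstyle is `format`. Capping the batch size is one simple way to keep each statement under "max_allowed_packet".

```python
from typing import Iterator, List, Sequence, Tuple

Row = Tuple[int, int, float]  # (timestamp, measurement_id, value)


def build_batch_insert(table: str, columns: Sequence[str], n_rows: int) -> str:
    """Return one multi-row INSERT with a placeholder group per row."""
    if n_rows < 1:
        raise ValueError("a batch needs at least one row")
    cols = ", ".join(columns)
    one_row = "(" + ", ".join(["%s"] * len(columns)) + ")"
    return f"INSERT INTO {table} ({cols}) VALUES " + ", ".join([one_row] * n_rows)


def chunked(rows: Sequence[Row], batch_size: int) -> Iterator[Sequence[Row]]:
    """Yield consecutive slices of at most batch_size rows."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


def flatten(rows: Sequence[Row]) -> List[object]:
    """Flatten rows into the parameter list that matches the placeholders."""
    return [value for row in rows for value in row]
```

    Each chunk would then be sent in a single round trip, for example `cursor.execute(build_batch_insert(table, columns, len(chunk)), flatten(chunk))`, where `cursor` comes from whichever driver is in use.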


    INSERT ... ON DUPLICATE KEY UPDATE

    Application is sensitive to time. In some circumstances, this results in duplicate data rows (by primary key), and row values have to be updated.

    When batch insert fails, retry batch with INSERT ... ON DUPLICATE KEY UPDATE syntax

    By comparison, on other databases HQ iteratively updates failed rows and attempts batch insert on the rest, retrying the process until the batch has completed.
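
    The retry path might look roughly like the following sketch. It assumes the MEASUREMENT_DATA columns shown earlier. `execute` is a caller-supplied callable and `DuplicateKeyError` is a hypothetical stand-in for whatever integrity error the database driver raises. This is not HQ's actual code.

```python
from typing import Callable, List, Sequence, Tuple

Row = Tuple[int, int, float]  # (timestamp, measurement_id, value)

INSERT_HEAD = "INSERT INTO MEASUREMENT_DATA (timestamp, measurement_id, value) VALUES "


class DuplicateKeyError(Exception):
    """Hypothetical stand-in for the driver's duplicate-key error."""


def build_plain_insert(n_rows: int) -> str:
    """Multi-row INSERT that fails if any primary key already exists."""
    return INSERT_HEAD + ", ".join(["(%s, %s, %s)"] * n_rows)


def build_upsert(n_rows: int) -> str:
    """Same batch, but existing rows get their value overwritten."""
    return build_plain_insert(n_rows) + " ON DUPLICATE KEY UPDATE value = VALUES(value)"


def insert_with_retry(execute: Callable[[str, List[object]], None],
                      rows: Sequence[Row]) -> str:
    """Try the plain batch first; on a duplicate key, resend it as an upsert."""
    params = [value for row in rows for value in row]
    try:
        execute(build_plain_insert(len(rows)), params)
        return "insert"
    except DuplicateKeyError:
        execute(build_upsert(len(rows)), params)
        return "upsert"
```

    Trying the plain insert first keeps the common path cheap. The upsert is sent only when a duplicate key is reported, for instance when spooled agent data overwrites backfilled points.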


    Batch Aggregate Inserter

    Queue metric data from separate agent reports

    Minimize number of insert statements, connections, and CPU load

    Maximize workload efficiency

    Optimal configuration for 700 agents

        Workers: 3

        BatchSize: 2000

        QueueSize: 4000000

    Peak at 2.2 million metric data inserts per minute
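
    The queueing idea can be sketched with the Python standard library. The class below is a simplified illustration, not HQ's implementation. Its name and the `insert_batch` callback are assumptions, and only the default Workers, BatchSize, and QueueSize values come from the slide.

```python
import queue
import threading

_STOP = object()  # sentinel telling one worker to exit


class BatchAggregateInserter:
    """Queue data points from many reports and insert them in large batches."""

    def __init__(self, insert_batch, workers=3, batch_size=2000, queue_size=4_000_000):
        self._insert_batch = insert_batch  # callable that receives a list of points
        self._batch_size = batch_size
        self._queue = queue.Queue(maxsize=queue_size)
        self._threads = [threading.Thread(target=self._run, daemon=True)
                         for _ in range(workers)]
        for thread in self._threads:
            thread.start()

    def offer(self, points):
        """Enqueue the points of one agent report (blocks while the queue is full)."""
        for point in points:
            self._queue.put(point)

    def _run(self):
        while True:
            item = self._queue.get()
            if item is _STOP:
                return
            batch = [item]
            stopping = False
            # Drain whatever is already queued, up to the batch size.
            while len(batch) < self._batch_size:
                try:
                    nxt = self._queue.get_nowait()
                except queue.Empty:
                    break
                if nxt is _STOP:
                    stopping = True
                    break
                batch.append(nxt)
            self._insert_batch(batch)
            if stopping:
                return

    def close(self):
        """Flush remaining points and stop all workers."""
        for _ in self._threads:
            self._queue.put(_STOP)
        for thread in self._threads:
            thread.join()
```

    Points from separate reports end up in the same batch, so the number of insert statements depends on the queue depth rather than on the number of agents. Error handling in the workers is omitted here.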


    Data Consolidation

    Lower resolution tables track min, avg, and max

    Table storing all collected data points (most activity) capped at 2 days worth

    Data compression runs hourly

    Inspired by RRDtool - an open source Round Robin Database to store and display time series data
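
    A roll-up into a lower resolution table amounts to grouping raw points into fixed time buckets and keeping min, avg, and max per bucket. The function below is a minimal in-memory sketch of that step. Its name and return shape are assumptions, and the real compression job would work against the database tables.

```python
from collections import defaultdict

HOUR_MS = 3_600_000


def roll_up(points, bucket_ms=HOUR_MS):
    """Aggregate (measurement_id, timestamp_ms, value) points per time bucket.

    Returns {(measurement_id, bucket_start_ms): (avg, min, max)}.
    """
    grouped = defaultdict(list)
    for measurement_id, timestamp_ms, value in points:
        bucket_start = timestamp_ms - timestamp_ms % bucket_ms
        grouped[(measurement_id, bucket_start)].append(value)
    return {
        key: (sum(values) / len(values), min(values), max(values))
        for key, values in grouped.items()
    }
```

    Calling it again with a larger `bucket_ms` gives coarser resolutions. Averaging hourly averages is only exact when every bucket holds the same number of points.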


    Limit Table Growth

    MEASUREMENT_DATA (MEASUREMENT_ID, TIMESTAMP, VALUE): Size Limit 2 Days

    MEASUREMENT_DATA_1H (MEASUREMENT_ID, TIMESTAMP, VALUE, MIN, MAX): Size Limit 14 Days

    MEASUREMENT_DATA_6H (MEASUREMENT_ID, TIMESTAMP, VALUE, MIN, MAX): Size Limit 31 Days

    MEASUREMENT_DATA_1D (limit N years)


    Software Partitioning

    MEASUREMENT_DATA split into 18 tables, representing 9 days (2 per day)

    Application calculates which table to insert into/select from

    Tables are truncated after roll-up rather than having their rows deleted
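
    One way the application could map a timestamp to one of the 18 tables is sketched below. The `HQ_METRIC_DATA_<day>D_<half>S` naming pattern follows the table names used in the queries later in this deck. The slot arithmetic (half-days aligned to UTC and counted from the epoch) is an assumption for illustration and will not necessarily match HQ's own assignment.

```python
HALF_DAY_MS = 12 * 60 * 60 * 1000
DAYS = 9
SLOTS_PER_DAY = 2


def table_for(timestamp_ms: int) -> str:
    """Return the table that would hold a data point with this timestamp."""
    slot = (timestamp_ms // HALF_DAY_MS) % (DAYS * SLOTS_PER_DAY)
    day, half = divmod(slot, SLOTS_PER_DAY)
    return f"HQ_METRIC_DATA_{day}D_{half}S"


def tables_for_range(begin_ms: int, end_ms: int) -> list:
    """Return the distinct tables covering [begin_ms, end_ms], oldest first."""
    if end_ms < begin_ms:
        raise ValueError("end_ms must not precede begin_ms")
    names = []
    for slot in range(begin_ms // HALF_DAY_MS, end_ms // HALF_DAY_MS + 1):
        name = table_for(slot * HALF_DAY_MS)
        if name not in names:
            names.append(name)
    return names
```

    Because the slots wrap around after 9 days, a table has to be rolled up and truncated before its slot comes around again.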


    Truncation vs. Deletion

    Deletion causes contention on rows in table, impacting any concurrent SQL operation

    Truncation reallocates space for the table object, instead of leaving it fragmented

    Truncation drops and recreates the table - faster operation (DDL operation)


    Indexes

    Every InnoDB table has a special index called the clustered index (based on primary key) where the physical data for the rows is stored

    Advantages

        Selects faster - row data is on the same page where the index search leads

        Inserts in (timestamp) order - avoid page splits and fragmentation

        Fewer Indexes - less space, less maintenance overhead


    Clustered Index

    InnoDB by default creates clustered index based on the primary key

    Physically ordered pages on disk

    Selects have advantage of fewer I/O operations


    Anatomy of a Data Query

    Query to select aggregate data from metric tables for a specific metric

    EAM_MEASUREMENT_DATA is a union view of all metric tables

    SELECT begin AS timestamp, AVG(value) AS value, MAX(value) AS peak, MIN(value) AS low
    FROM EAM_MEASUREMENT_DATA,
      (SELECT 1207631340000 + (2880000 * i) AS begin FROM EAM_NUMBERS WHERE i < 60) n
    WHERE timestamp BETWEEN begin AND begin + 2879999 AND measurement_id = 600332
    GROUP BY begin ORDER BY begin;
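
    The inner select over EAM_NUMBERS generates the start of each time bucket. The same arithmetic is shown in Python below, using the constants from the query. It assumes that EAM_NUMBERS holds consecutive integers starting at 0, which the slide does not state. The function names are illustrative only.

```python
BEGIN_MS = 1207631340000   # start of the requested window
INTERVAL_MS = 2880000      # width of one bucket (48 minutes)
BUCKETS = 60


def bucket_starts(begin_ms: int, interval_ms: int, count: int) -> list:
    """Mirror of: SELECT begin_ms + (interval_ms * i) ... WHERE i < count."""
    return [begin_ms + interval_ms * i for i in range(count)]


def bucket_for(timestamp_ms: int, starts: list, interval_ms: int):
    """Return the bucket start whose inclusive range contains timestamp_ms, or None."""
    if not starts:
        return None
    index = (timestamp_ms - starts[0]) // interval_ms
    if 0 <= index < len(starts):
        return starts[index]
    return None


starts = bucket_starts(BEGIN_MS, INTERVAL_MS, BUCKETS)
window_ms = starts[-1] + INTERVAL_MS - starts[0]
```

    Sixty buckets of 2,880,000 ms cover 172,800,000 ms, which is exactly two days. Each bucket's inclusive upper bound is its start plus 2,879,999, which matches the BETWEEN clause in the query.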


    MySQL View Shortcomings

    Query optimizer does not apply WHERE condition to inner select, causing entire tables to be selected serially before query condition is applied

    Sequential table scan

    Temp table space unreasonably large

    Performance suffers


    Fewer Tables

    Explicitly select only on tables based on time range

    WHERE clause still not applied to individual selects in union, but less data selected

    SELECT begin AS timestamp, AVG(value) AS value, MAX(value) AS peak, MIN(value) AS low
    FROM
      (SELECT * FROM HQ_METRIC_DATA_2D_1S UNION ALL
       SELECT * FROM HQ_METRIC_DATA_2D_0S UNION ALL
       SELECT * FROM HQ_METRIC_DATA_1D_1S UNION ALL
       SELECT * FROM HQ_METRIC_DATA_1D_0S UNION ALL
       SELECT * FROM HQ_METRIC_DATA_0D_1S) EAM_MEASUREMENT_DATA,
      (SELECT 1207631340000 + (2880000 * i) AS begin FROM EAM_NUMBERS WHERE i < 60) n
    WHERE timestamp BETWEEN begin AND begin + 2879999 AND measurement_id = 600332
    GROUP BY begin ORDER BY begin;


    Best Performance

    SELECT begin AS timestamp, AVG(value) AS value, MAX(value) AS peak, MIN(value) AS low
    FROM
      (SELECT 1207631340000 + (2880000 * i) AS begin FROM EAM_NUMBERS WHERE i < 60) n,
      (SELECT * FROM HQ_METRIC_DATA_2D_1S
         WHERE timestamp BETWEEN 1207767600000 AND 1207804140000 AND measurement_id = 600332
       UNION ALL
       SELECT * FROM HQ_METRIC_DATA_2D_0S
         WHERE timestamp BETWEEN 1207724400000 AND 1207767599999 AND measurement_id = 600332
       UNION ALL
       SELECT * FROM HQ_METRIC_DATA_1D_1S
         WHERE timestamp BETWEEN 1207681200000 AND 1207724399999 AND measurement_id = 600332
       UNION ALL
       SELECT * FROM HQ_METRIC_DATA_1D_0S
         WHERE timestamp BETWEEN 1207638000000 AND 1207681199999 AND measurement_id = 600332
       UNION ALL
       SELECT * FROM HQ_METRIC_DATA_0D_1S
         WHERE timestamp BETWEEN 1207631340000 AND 1207637999999 AND measurement_id = 600332
      ) EAM_MEASUREMENT_DATA
    WHERE timestamp BETWEEN begin AND begin + 2879999 AND measurement_id = 600332
    GROUP BY begin ORDER BY begin;
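
    Since the optimizer does not push the outer condition into the union, the application has to write the predicate into every branch itself. A hedged sketch of such a query builder follows. The function name and argument shapes are assumptions. Table names are interpolated directly, so they must come from a trusted, application-generated list.

```python
def build_pushdown_query(ranges, measurement_id, begin_ms, interval_ms, buckets):
    """Build the aggregate query with the filter repeated in each UNION ALL branch.

    ranges: list of (table_name, from_ms, to_ms), one entry per partition table.
    """
    if not ranges:
        raise ValueError("at least one table range is required")
    mid = int(measurement_id)
    branches = [
        f"SELECT * FROM {table} "
        f"WHERE timestamp BETWEEN {int(lo)} AND {int(hi)} AND measurement_id = {mid}"
        for table, lo, hi in ranges
    ]
    union = " UNION ALL ".join(branches)
    return (
        "SELECT begin AS timestamp, AVG(value) AS value, "
        "MAX(value) AS peak, MIN(value) AS low "
        f"FROM (SELECT {int(begin_ms)} + ({int(interval_ms)} * i) AS begin "
        f"FROM EAM_NUMBERS WHERE i < {int(buckets)}) n, "
        f"({union}) EAM_MEASUREMENT_DATA "
        f"WHERE timestamp BETWEEN begin AND begin + {int(interval_ms) - 1} "
        "GROUP BY begin ORDER BY begin"
    )
```

    With the five table ranges from the slide, this produces a statement of the same shape as the query above. The one difference is that the outer `measurement_id` filter is dropped, because every branch already applies it.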


    ID Generator Requirements

    Rows need to be populated in schema initialization with hard-coded IDs

    Start sequential IDs at 10001 to reserve space for hard-coded IDs

    MySQL's auto-incrementing does not allow either


    Sequences Table and Function

    -- One row per named sequence
    CREATE TABLE `hq_sequence` (
      `seq_name` char(50) NOT NULL PRIMARY KEY,
      `seq_val` int(11) DEFAULT NULL
    );

    CREATE FUNCTION nextseqval(iname CHAR(50))
    RETURNS INT
    DETERMINISTIC
    BEGIN
      SET @new_seq_val = 0;
      -- Increment the stored value and capture the new value in one statement
      UPDATE hq_sequence SET seq_val = @new_seq_val := seq_val + 1
      WHERE seq_name = iname;
      RETURN @new_seq_val;
    END;


    Using Sequences in MySQL

    Original Solution - InnoDB Sequence Table results in lock timeout and deadlock issues from contention

    Buffer Using In-Memory/Heap Table - locking issues


    MyISAM Sequence Table

    Change HQ_SEQUENCE to MyISAM rather than InnoDB

    MyISAM - non-transactional database table

    Inconsistent state resulting from server crashes


    Hibernate Hi-Lo

    Hibernate Hi-Lo sequence generator

    Back to using HQ_SEQUENCE with InnoDB

    Hibernate buffers a block of 100 IDs in memory (low value) and increments the high value when the block is used up

    Uses separate connection that does not participate in transactions

    Big performance benefit

    PostgreSQL & Oracle use native sequence generators, so more roundtrips to database

    HQ startup time cut down up to 30% (1 min)
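
    The hi/lo idea can be illustrated with a simplified generator. This is a sketch of the general technique, not Hibernate's implementation. `next_hi` is a hypothetical callable that stands in for the single database round trip, for example a call to the sequence function on its own connection.

```python
class HiLoGenerator:
    """Hand out IDs from an in-memory block, fetching a new block when it runs out."""

    def __init__(self, next_hi, block_size=100):
        self._next_hi = next_hi        # callable returning the next high value
        self._block_size = block_size
        self._hi = None
        self._lo = block_size          # forces a fetch on the first call

    def next_id(self) -> int:
        if self._lo >= self._block_size:
            # Block exhausted: one round trip reserves the next block of IDs.
            self._hi = self._next_hi()
            self._lo = 0
        value = self._hi * self._block_size + self._lo
        self._lo += 1
        return value
```

    With a block size of 100, generating 250 IDs costs three round trips instead of 250. The sketch is not thread-safe, so a real generator would guard `next_id` with a lock.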


    Performance Statistics

    HQ Hardware

        2 Quad Core 2 GHz CPUs, 16 GB RAM, 4 GB JVM Heap

    MySQL Hardware

        2 Quad Core 1.6 GHz CPUs, 8 GB RAM, 4.5 GB InnoDB Buffer Pool

    Both on CentOS 5.x

    Sustained Load

        Between 200,000 - 300,000 metrics / min, peaked at 2.2 million metrics / min

    Load Avg

        HQ ~ 2, Peaked at 8

        MySQL ~ 1.5, Peaked at 2.5

    CPU Usage

        HQ and MySQL 10 - 20 %


    Recommended Server Options

    innodb_buffer_pool_size

    innodb_flush_log_at_trx_commit

    tmp_table_size, max_heap_table_size, and max_tmp_tables

    innodb_flush_method

    query_cache_size

    More information at http://support.hyperic.com

