Introduction of pg_statsinfo and pg_stats_reporter ~ Statistics Reporting Tool for DBA ~
DESCRIPTION
How do you analyze database performance and spot signs of trouble? PostgreSQL exposes many useful statistics and database activities through system views and contrib modules, but it is difficult to interpret the details and to see the overall condition of the database. pg_statsinfo and pg_stats_reporter, developed by NTT Corporation, are open source statistics reporting tools for DBAs. They provide more useful statistics and visual reports. In this session, I introduce the architecture, installation, and use cases of these tools.
TRANSCRIPT
Copyright(c)2013 NTT Corp. All Rights Reserved.
Introduction of pg_statsinfo and pg_stats_reporter
~ Statistics Reporting Tool for DBA ~
NTT Open Source Software Center
Mitsumasa KONDO
About Me
• Official company name
  • Nippon Telegraph and Telephone Corporation
• My affiliation
  • Service Innovation Laboratory, Software Innovation Center; Researcher
• My work
  • Middleware development for PostgreSQL
    • pg_statsinfo, pg_stats_reporter
    • High-availability PostgreSQL cluster using replication with Pacemaker
  • PostgreSQL community development
    • Improving disk I/O bottlenecks
• Past work
  • Data mining, natural language processing, machine learning, recommendation, information retrieval
  • I am still better at these than at databases :)
• Hobbies
  • Photography
  • Pure audio
Software Introduced Today
• pg_statsinfo
  • Monitors and collects PostgreSQL statistics and activities
• pg_stats_reporter
  • Visualizes the PostgreSQL statistics and activities collected by pg_statsinfo
[Figure: pg_statsinfo on DB servers A, B, and C stores database statistics and activity in the pg_statsinfo repository database; pg_stats_reporter creates reports from it. A sample report created by pg_stats_reporter is shown.]
Contents
• pg_statsinfo ~ Monitor and Collect DB Statistics and Activities ~
  • What is pg_statsinfo?
  • Feature introduction
  • Demo
• pg_stats_reporter ~ Visualize DB Statistics and Activities ~
  • What is pg_stats_reporter?
  • Feature introduction
  • Demo
• Visualizing the DBT-2 Benchmark using pg_statsinfo and pg_stats_reporter
  • Introduction to DBT-2
  • DBT-2 visualized by pg_stats_reporter
  • For more performance
What is pg_statsinfo?
• Monitors and collects PostgreSQL statistics and activities
  • Collects statistics and activities from:
    • All tables in the pg_catalog schema
    • pg_log information
    • OS resources
• Other features
  • Report creation from the command line
  • Alert and monitoring function
  • Log management function
  • Automatic repository database management
• Other related information
  • BSD License
  • Latest version is 2.5.0
    • http://pgfoundry.org/frs/?group_id=1000422
  • Works on PostgreSQL 9.3!
  • The online manual is here
    • http://pgstatsinfo.projects.pgfoundry.org/pg_statsinfo-ja.html
[Figure: Collective database statistics]
Architecture of pg_statsinfo
• Programming language
  • C
• Startup and pre-configuration
  • pg_statsinfo starts via shared_preload_libraries
  • Add the pg_statsinfo configuration to postgresql.conf; it then starts together with PostgreSQL as usual.
• System configuration
  • Install pg_statsinfo on the monitored instance
    • It does not need to be installed on the repository database instance
    • The monitored instance and the repository database can be the same instance
[Figure: On the monitored instance, pg_statsinfod collects database statistics (snapshots) from pg_catalog, OS resources, and pg_log, and sends them to the repository database.]
Features of pg_statsinfo 1/5
• Collects statistics and activities in PostgreSQL
  • Gathers all information from PostgreSQL's statistics collector (e.g. pg_catalog)
    • For details of the statistics collector, please see the PostgreSQL documentation :)
    • http://www.postgresql.jp/document/9.2/html/monitoring-stats.html
  • Takes statistics as snapshots at regular intervals
    • Every 10 minutes by default
• Analyzes pg_log and extracts activities from the logs
  • Captures activities that are only written to pg_log
    • Checkpoint activities
    • VACUUM activities
• Gets OS resource information from /proc
  • Samples every 5 seconds; when a snapshot is taken, the averages of the sampled values are inserted
    • CPU usage (idle, iowait, system, user, load average)
    • Memory usage (memfree, buffers, cached, swap, dirty)
    • Disk usage (I/O size, I/O time, disk space used)
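The sampling-then-averaging step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not pg_statsinfo's actual C implementation; the function names and the metric keys ("iowait", "idle") are assumptions made for the example.

```python
from statistics import mean

SAMPLE_INTERVAL_SEC = 5      # sampling period stated on the slide
SNAPSHOT_INTERVAL_SEC = 600  # default snapshot period (10 minutes)


def samples_per_snapshot(snapshot_sec=SNAPSHOT_INTERVAL_SEC,
                         sample_sec=SAMPLE_INTERVAL_SEC):
    """Number of samples that accumulate between two snapshots."""
    return snapshot_sec // sample_sec


def summarize_samples(samples):
    """Average each metric over the samples gathered since the last snapshot.

    `samples` is a list of dicts such as {"iowait": 12.0, "idle": 80.0};
    the metric names are hypothetical and used for illustration only.
    """
    if not samples:
        return {}
    keys = samples[0].keys()
    return {k: mean(s[k] for s in samples) for k in keys}
```

With the default settings, 120 samples are folded into each snapshot, so short spikes are smoothed out in the stored values.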
Features of pg_statsinfo 2/5
• Creates reports from the command line
  • Outputs a text-format report on the command line
    • Example: for a database administrator or SQL engineer who wants to view database statistics
  • Covers almost all report items produced by pg_stats_reporter
Command example: create a report for all monitored instances from 2013-10-01 to now
$ pg_statsinfo -U postgres -B 2013-10-01 -r ALL | less
Features of pg_statsinfo 3/5
• Automatic repository database maintenance
  • Automatically deletes old statistics stored in the repository database
    • pg_statsinfo stores its data in tables partitioned by day
    • So it can use TRUNCATE to delete old data
    • Deletion is faster and cheaper
  • Note
    • When multiple monitored instances share a repository, the shortest configured retention period takes priority
  • The default retention period for stored data is 1 week
[Figure: pg_statsinfo on DB server A (retention period set to 1 week) and on DB server B (retention period set to 2 weeks) collects database statistics and sends them to the repository database, where they are stored.]
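The retention rule above (daily partitions, shortest configured period wins) can be illustrated with a small Python sketch. The function names and the exact cutoff rule (a partition expires once it is strictly older than the retention window) are assumptions for the example, not pg_statsinfo's actual logic.

```python
from datetime import date, timedelta


def effective_retention_days(configured_days):
    """Shortest configured retention wins when instances share a repository."""
    return min(configured_days)


def expired_partitions(partition_days, today, retention_days):
    """Return the daily partitions old enough to be emptied with TRUNCATE.

    Assumption: a partition expires when its day is strictly earlier
    than `today - retention_days`.
    """
    cutoff = today - timedelta(days=retention_days)
    return [d for d in partition_days if d < cutoff]
```

For the figure's setup (1 week on server A, 2 weeks on server B), `effective_retention_days([7, 14])` gives 7, so data from both servers would be kept for one week.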
Features of pg_statsinfo 4/5
• Log management feature
  • Makes PostgreSQL's logs easy to manage
  • Log filtering
    • A log level can be set in pg_statsinfo, which means we can have two log levels
      • Example: set PostgreSQL's log level low to keep detailed information, and set pg_statsinfo's log level higher so the log is easy to read day to day
    • The log file name can be fixed (e.g. postgresql.log), so log monitoring software can watch it
  • Multiple log outputs
    • Can output to both syslog and pg_log
  • Log level rewriting
    • The log level of specific log messages can be changed
      • Example: change INFO to LOG for specific log messages
  • Log compression and management
    • Compresses and manages old logs automatically
[Figure: Flow of extracting statistics from pg_log: pg_statsinfod reads pg_log (CSV format), formats the log, and writes the log produced by statsinfo (postgresql.log).]
Features of pg_statsinfo 5/5
• Alert and monitoring function (trigger function)
  • Writes an alert to the log when an alert threshold is exceeded in the database
    • Usage: watch for alert log entries with monitoring software
  • The alert function is executed at every snapshot
  • The default settings are shown below; set values appropriate for your server
    • Settings are changed with an UPDATE statement on the statsrepo.alert table
Alert configuration table
column name            default  explanation
instid                 -        Target instance ID
rollback_tps           100      Number of rollbacks per second
commit_tps             1000     Number of commits per second
garbage_size           20000    Garbage record size in the table (%)
garbage_percent        30       Garbage record percentage in the database (%)
garbage_percent_table  30       Garbage record percentage in the table (%)
response_avg           10       Average query response time (sec)
response_worst         60       Worst query response time (sec)
enable_alert           true     Enable the alert function
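As a rough illustration of how a per-snapshot threshold check could work, here is a minimal Python sketch using the default values from the table above. The function name, the metrics dictionary, and the message format are hypothetical; the real alert function runs inside the repository database and its log text is not reproduced here.

```python
# Default thresholds taken from the alert configuration table.
DEFAULT_THRESHOLDS = {
    "rollback_tps": 100,
    "commit_tps": 1000,
    "garbage_percent": 30,
    "response_avg": 10,
    "response_worst": 60,
}


def check_alerts(metrics, thresholds=DEFAULT_THRESHOLDS, enable_alert=True):
    """Return one message per metric that exceeds its threshold.

    `metrics` maps a column name to the value observed in the latest
    snapshot; metrics that are missing are skipped. The message format
    is an assumption for this sketch.
    """
    if not enable_alert:
        return []
    alerts = []
    for name, limit in thresholds.items():
        value = metrics.get(name)
        if value is not None and value > limit:
            alerts.append(f"ALERT: {name} = {value} exceeds threshold {limit}")
    return alerts
```

Monitoring software would then only need to match on the alert lines, rather than evaluate the statistics itself.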
How to install pg_statsinfo?
1. Install the RPM file
$ su
# rpm -ivh pg_statsinfo-2.50-1.pg93.rhel6.x86_64.rpm
2. Add configuration to postgresql.conf
# minimum configuration
shared_preload_libraries = 'pg_statsinfo'  # pre-load library setting
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'  # log file name setting (required)
3. Start PostgreSQL as usual
$ pg_ctl -D data start
4. If you see the following log messages, the installation succeeded!
server starting
LOG: loaded library "pg_statsinfo"
LOG: pg_statsinfo launcher started
LOG: start
LOG: installing schema: statsinfo
LOG: installing schema: statsrepo_partition
Installation instructions for pg_statsinfo are in the web manual! :)
http://pgstatsinfo.projects.pgfoundry.org/pg_statsinfo-ja.html#install
Demo of pg_statsinfo
1. Install
2. Confirm the installation
3. Collect database statistics and activities (snapshot)
4. Create a report
TIPS of pg_statsinfo
• One snapshot is 300 kB to 800 kB
  • Be careful not to fill the disk with snapshots!
• Installing the software causes almost no performance degradation
  • But there is a little: in the DBT-2 benchmark, we measured a 2% degradation.
• If you would like to use a separate repository server, set "pg_statsinfo.repository_server" in postgresql.conf.
  • The default setting is 'host=localhost port=5432'
  • If the repository database requires a password, set it in /var/lib/pgsql/.pgpass
    • pg_statsinfo runs as the postgres user
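To get a feel for the disk-full warning, the snapshot size can be combined with the default 10-minute interval and the default 1-week retention mentioned earlier. The Python sketch below is back-of-the-envelope arithmetic only; it assumes 1 MB = 1000 kB and ignores indexes, WAL, and any other overhead in the repository database.

```python
def snapshots_per_day(interval_minutes=10):
    """Snapshots taken per day at a fixed interval."""
    return (24 * 60) // interval_minutes


def repository_size_mb(days, snapshot_kb, interval_minutes=10, instances=1):
    """Rough size in MB (1 MB = 1000 kB) of the snapshots retained for `days`."""
    count = snapshots_per_day(interval_minutes) * days * instances
    return count * snapshot_kb / 1000
```

Under these assumptions, one instance produces 144 snapshots a day, so a week of retention comes to roughly 300 MB to 800 MB per monitored instance.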
What is pg_stats_reporter?
• Visualizes the PostgreSQL statistics and activities collected by pg_statsinfo
  • Report items
    • Transaction status
    • Database size
    • OS resources
    • Amount of WAL output
    • Replication state
    • Deadlock information
• Successor to pg_reporter
• Extra information
  • BSD License
  • Latest version is 2.0.0
    • http://pgfoundry.org/frs/?group_id=1000422
  • The detailed online manual is here
    • http://pgstatsinfo.projects.pgfoundry.org/pg_stats_reporter-ja.html
[Figure: Report of pg_stats_reporter]
Architecture of pg_stats_reporter
• Software
  • Apache + PHP + PostgreSQL
    • PHP + PostgreSQL alone is also OK
    • Requires PostgreSQL 8.3 or later
• Programming language
  • PHP + JavaScript + SQL
• Libraries used
  • PHP framework
    • Smarty
  • User interface
    • jQuery, jQuery UI, tablesorter, Superfish
  • Graph drawing
    • dygraphs, jqPlot
How to Create a Report? 1/2
• From a web browser
  • Only a few clicks to create a report.
① Select the database instance for the report
② Push the "create new report" button
③ Set the period and time of the report
How to Create a Report? 2/2
• From the command line
  • It runs in PHP's standalone mode.
  • Use cases
    • Creating a report from the command line.
    • Creating reports at regular intervals with crond.
    • If you only use command-line mode, Apache is not needed
      • For example, when your security policy does not allow installing Apache
    • When reports must be kept for a long term
      • The repository database keeps data only for a certain period
      • Created reports are not erased.
Command usage: create a report for 10/1 to 10/8 in report_dir
$ pg_stats_reporter -B 2013-10-01 -E 2013-10-08 -O report_dir
[LOG] Report file created: sample_localhost_5432_1_20131008-1419_20131008-1945.html
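For the crond use case, a small wrapper can compute the date range and assemble the command shown above. The Python sketch below only builds the argument list from the -B, -E, and -O options that appear in the slide; the function name and the one-week default window are assumptions for the example.

```python
from datetime import date, timedelta


def weekly_report_command(end_day, report_dir, days=7):
    """Build the pg_stats_reporter argument list for the `days` before end_day.

    Only the -B (begin), -E (end), and -O (output directory) options
    from the slide are used.
    """
    begin_day = end_day - timedelta(days=days)
    return [
        "pg_stats_reporter",
        "-B", begin_day.isoformat(),
        "-E", end_day.isoformat(),
        "-O", report_dir,
    ]
```

A script started by crond could pass the result of `weekly_report_command(date.today(), "report_dir")` to `subprocess.run(cmd, check=True)`, so that reports accumulate in the directory even after the repository data has been deleted.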
How to Create a Report? 2/2
• Report index feature
  • Creates reports and an index of the reports in the report directory
  • It makes reports easy to browse and sort
[Figure: The report directory contains Index.html (the index of reports), Report HTML 1, Report HTML 2, and so on (reports created in the past), and the pg_stats_reporter library.]
How to install pg_stats_reporter?
1. Install the pg_stats_reporter RPM and its dependency RPMs
$ su
# rpm -ivh httpd-2.2.15-15.el6_2.1.x86_64.rpm \
    php-5.3.3-3.el6_2.8.x86_64.rpm \
    php-common-5.3.3-3.el6_2.8.x86_64.rpm \
    php-pgsql-5.3.3-3.el6_2.8.x86_64.rpm \
    php-intl-5.3.3-3.el6_2.8.x86_64.rpm \
    pg_stats_reporter-1.0.0-1.el6.noarch.rpm
2. Edit pg_stats_reporter.ini (the configuration file); the default settings are shown below
# vim /etc/pg_stats_reporter.ini
----- configuration of repository database -----
host = localhost
port = 5432
dbname = postgres
username = postgres
password =
3. Start the Apache HTTP server
# service httpd start
4. Access the following URL
http://localhost/pg_stats_reporter/pg_stats_reporter.php
Installation instructions for pg_stats_reporter are in the web manual! :)
http://pgstatsinfo.projects.pgfoundry.org/pg_stats_reporter-ja.html#install
Please disable SELinux!!
Demo of pg_stats_reporter
1. Install
2. Confirm the installation
3. Create a report
TIPS of pg_stats_reporter
• Ready for Android and iPad
• It is based on the jQuery UI library, so the interface design (mostly colors) is easy to change
  • The logo image can also be changed by replacing the file
• Report items can be selected
  • To do so, list the report items you need in /etc/pg_stats_reporter.ini
• For security
  • We can use .htaccess
  • Apache's security techniques can be used as usual
What is DBT-2?
• TPC-C benchmark software developed by Open Source Development Labs (OSDL)
  • Simulates ordering at a parts wholesaler
  • http://www.tpc.org/tpcc/
  • The benchmark score is calculated only from responses returned within a fixed time
    • Response time is very important!
    • I/O-bottlenecked benchmark
• Main benchmark parameters
  • warehouse
    • Database size parameter
    • Each increment of 1 adds one hundred thousand records
    • Mainly used to adjust the database size
  • TPW
    • Transactions per warehouse
    • Number of clients prepared per warehouse; default 10
    • If we set a lower TPW, it becomes a CPU-bottlenecked benchmark
Transaction Tendency in DBT-2
• Main bottleneck
  • Random read/write
    • Almost all SQL plans are index scans
    • Random read/write performance and cache/buffer replacement performance are important
  • Parallel execution performance is also important
    • PostgreSQL is better than other RDBMSs :)
• Other features
  • SQL plans are very simple
    • Most SQL statements only use index scan access.
  • An ideal benchmark score exists
    • If the DB responds to all transactions within the time limit, the score is ideal
    • The performance limit is where twice the memory equals the database size.
  • The amount of WAL output is smaller than with pgbench, so WAL is not the bottleneck.
Test Server and Settings of postgresql.conf
Server     HP DL360 G7
CPU        Xeon E5640 2.66GHz (1P/4C)
Memory     DDR3-10600R-9 18GB
RAID card  P410i / 256MB cache
Disk       4 x 146GB (1.5krpm) RAID 1 + 0
postgresql.conf (mainly changed parameters)
max_connections = 300
shared_buffers = 2458MB
work_mem = 1MB
maintenance_work_mem = 64MB
fsync = on
wal_sync_method = fdatasync
full_page_writes = on
wal_buffers = -1
archive_mode = on
checkpoint_segments = 300
checkpoint_timeout = 15min
checkpoint_completion_target = 0.7
random_page_cost = 2.0
effective_cache_size = 9GB
default_statistics_target = 10
log_destination = 'syslog'
autovacuum = on
Warehouse size = 320 (database size is about 40GB) and TPW = 10
Visualizing DBT-2 by pg_stats_reporter 1/5
• Transaction status
  • Transaction throughput fluctuated, owing to some benchmark specifications and some implementation-dependent behavior in PostgreSQL
    • Performance drops while a CHECKPOINT is executing
    • CHECKPOINTs were mainly triggered by checkpoint_timeout
      • postgresql.conf sets checkpoint_timeout = 15min and checkpoint_segments = 300
Visualizing DBT-2 by pg_stats_reporter 2/5
• Amount of WAL output
  • 4.6GB of WAL was output from the data load to the end of the benchmark
  • During the data load, the maximum WAL speed is 54MB/sec
  • During the benchmark test, the maximum WAL speed is 12MB/sec
    • When a CHECKPOINT starts, the WAL speed rises because of full-page writes.
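The WAL figures above can be related to the checkpoint settings used in this test (checkpoint_segments = 300, checkpoint_timeout = 15min) with some simple arithmetic. The Python sketch below assumes PostgreSQL's default 16MB WAL segment size and treats checkpoint_segments as the number of segments written between segment-triggered checkpoints; it is a rough model, not a reproduction of the server's scheduling.

```python
WAL_SEGMENT_MB = 16  # default WAL segment size in PostgreSQL


def wal_between_forced_checkpoints_mb(checkpoint_segments):
    """WAL volume that must accumulate before a segment-triggered checkpoint."""
    return checkpoint_segments * WAL_SEGMENT_MB


def breakeven_wal_rate_mb_per_sec(checkpoint_segments, checkpoint_timeout_sec):
    """Average WAL rate above which segments, not the timeout, trigger checkpoints."""
    volume_mb = wal_between_forced_checkpoints_mb(checkpoint_segments)
    return volume_mb / checkpoint_timeout_sec
```

With 300 segments and a 900-second timeout, the break-even rate is about 5.3MB/sec. The 12MB/sec above is a peak value, so an average rate below that break-even would be consistent with checkpoints being triggered mainly by checkpoint_timeout.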
Visualizing DBT-2 by pg_stats_reporter 3/5
• CPU usage
  • iowait is the largest share, followed by idle (this indicates an I/O bottleneck)
  • The final part of a CHECKPOINT causes a high load average
    • This is because of the ugly run of consecutive fsync() calls.
    • PostgreSQL's CHECKPOINT logic is not good :(
Visualizing DBT-2 by pg_stats_reporter 4/5
• Updated and heavily accessed tables
  • HOT (Heap-Only Tuples) is working well!
  • The order_line and stock tables are accessed heavily
  • Each table's cache hit rate is very high, but... (is it really? :( )
Visualizing DBT-2 by pg_stats_reporter 5/5
• Query execution status
  • Queries with complicated filter conditions are slow
  • Unexpectedly, COMMIT takes a long time!
    • This is because the COMMIT of a long transaction has to write a lot of WAL (WAL buffer writing)
  • The fsync() phase of the final CHECKPOINT makes queries slower
For More Performance
• Use direct_cp as the archive copy command
  • In archive mode, the cp command wastes a large amount of file cache, which lowers performance
  • BSD-licensed software
  • http://directcp.projects.pgfoundry.org/index.html
• Use SSDs
  • In general, the database bottleneck is random access. SSDs are 10 times faster at random access than magnetic disks (MD)
  • If you need large disks or have a limited budget, you can use a tablespace for only the hot tables, which is very efficient.
• Use a RAID card with a large cache
  • PostgreSQL's CHECKPOINT does not schedule fsync() at all. This causes very heavy disk writes and can lead to failover :(
  • A RAID card with a large cache may mitigate this a little.
Summary
• pg_statsinfo
  • Monitors and collects PostgreSQL statistics and activities as time series
    • BSD License
    • http://pgstatsinfo.projects.pgfoundry.org/pg_statsinfo-ja.html
  • Collects all the statistics and activities a DB admin needs
    • If you would like another new report, write reporting SQL against the collected information
• pg_stats_reporter
  • Visualizes the PostgreSQL statistics and activities collected by pg_statsinfo
    • BSD License
    • http://pgstatsinfo.projects.pgfoundry.org/pg_stats_reporter-ja.html
  • Useful jQuery-based interface
    • The report index feature is also useful
  • The software is easy to improve because it is written in PHP + JavaScript
    • It is also easy to submit a patch :)