Introduction to VACUUM, freezing, and XID wraparound

Post on 23-Jan-2018

Copyright © 2016 NTT DATA Corporation

03/17/2016 NTT DATA Corporation Masahiko Sawada

Introduction VACUUM, FREEZING, XID wraparound

A little about me

•  Masahiko Sawada

•  twitter : @sawada_masahiko

•  NTT DATA Corporation

•  Database engineer

•  PostgreSQL Hacker

•  Core feature

•  pg_bigm (Multi-byte full text search module for PostgreSQL)

Contents

•  VACUUM

•  Visibility Map

•  Freezing Tuple

•  XID wraparound

•  New VACUUM feature for 9.6

What is VACUUM?

VACUUM

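[Figure: a table holding rows 1 AAA, 2 BBB and 3 CCC. UPDATE : BBB->bbb adds a new row version 2 bbb and leaves the old version 2 BBB behind as garbage. VACUUM starts and runs concurrently with INSERT/DELETE/UPDATE (row 4 DDD is inserted meanwhile). When VACUUM is done, the old 2 BBB version is gone and its space is recorded in the FSM.]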

•  Postgres garbage collection feature

•  Acquires a ShareUpdateExclusiveLock on the table

Why do we need to VACUUM?

•  Recover or reuse the disk space occupied by dead tuples

•  Update data statistics

•  Update the visibility map to speed up Index-Only Scans

•  Protect against loss of very old data due to XID wraparound

Evolution history of VACUUM

•  v8.1 (2005) : autovacuum

•  v8.4 (2009) : Visibility Map, Free Space Map

•  v9.5 (2016) : vacuumdb parallel option

•  v9.6 : !?

VACUUM Syntax

-- VACUUM the whole database
=# VACUUM;

-- Multiple options, analyzing only the col1 column
=# VACUUM FREEZE VERBOSE ANALYZE hoge (col1);

-- Multiple options with parentheses
=# VACUUM (FULL, ANALYZE, VERBOSE) hoge;

Visibility Map

•  Introduced in 8.4.

•  A bitmap for each table (1 bit per page).

•  Each table relation can have a visibility map.

•  Keeps track of which pages are all-visible.

•  Pages that are not marked all-visible may contain garbage.

•  For a 500GB table, the Visibility Map is less than 10MB.
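The size claim can be checked with a little arithmetic. The sketch below assumes the default 8 kB page size and ignores the page headers inside the visibility map file itself; `visibility_map_bytes` is a hypothetical helper name, not a PostgreSQL function.

```python
BLOCK_SIZE = 8192  # assumption: default PostgreSQL page size in bytes


def visibility_map_bytes(table_bytes: int, bits_per_page: int = 1) -> int:
    """Approximate visibility map payload for a table of the given size."""
    pages = -(-table_bytes // BLOCK_SIZE)      # ceiling division
    return -(-pages * bits_per_page // 8)      # bits -> bytes, rounded up


size = visibility_map_bytes(500 * 1024**3)     # a 500GB table
print(f"{size / 1024**2:.1f} MiB")
```

With one bit per page, a 500GB table has 65,536,000 pages, so the map needs roughly 7.8 MiB, which is consistent with the "less than 10MB" figure on the slide.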

[Figure: the table file (base/XXX/1234) is divided into Block 0 to Block 4. Its Visibility Map (base/XXX/1234_vm) holds one bit per block, e.g. 11001…]

State transition of Visibility Map bit

[Figure: the bit changes from 0 (NOT all-visible) to 1 (all-visible) by VACUUM, and back from 1 to 0 by INSERT, UPDATE, DELETE.]

How does VACUUM actually work?

•  VACUUM works in two phases:

1.  Scan table to collect TID

2.  Reclaim garbage (Table, Index)
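The two phases can be pictured with a toy model. This is only a sketch under simplifying assumptions: the heap is a dictionary, the index is a set of TIDs, and `max_dead_tids` stands in for the number of TIDs that fit into maintenance_work_mem. When that budget fills up, the model runs a reclaim pass and then continues scanning. All names are hypothetical.

```python
def lazy_vacuum(heap, index, max_dead_tids):
    """Toy model of one VACUUM run.

    heap:  dict mapping TID -> "live" or "dead"
    index: set of TIDs that still have an index entry
    max_dead_tids: how many dead TIDs can be remembered at once
    Returns the number of reclaim passes that were needed.
    """
    dead_tids = []
    passes = 0

    def reclaim():
        # 2nd phase: remove index entries first, then the heap tuples.
        nonlocal passes
        if not dead_tids:
            return
        index.difference_update(dead_tids)
        for tid in dead_tids:
            del heap[tid]
        dead_tids.clear()
        passes += 1

    # 1st phase: scan the table in TID order and collect garbage TIDs.
    for tid in sorted(heap):
        if heap[tid] == "dead":
            dead_tids.append(tid)
            if len(dead_tids) >= max_dead_tids:
                reclaim()
    reclaim()
    return passes


heap = {(0, 1): "live", (0, 2): "dead", (0, 3): "dead", (1, 1): "dead",
        (1, 2): "live", (2, 1): "dead", (2, 2): "dead"}
index = set(heap)
passes = lazy_vacuum(heap, index, max_dead_tids=2)
print(passes, sorted(heap))
```

In this model a smaller memory budget means more reclaim passes, each of which has to visit the index again.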

[Figure: 1st Phase: Scan Table and collect garbage TIDs into maintenance_work_mem. 2nd Phase: Reclaim garbage from the Index and the Table.]

Performance improvement point of VACUUM

•  VACUUM scans table pages one by one.

•  VACUUM can skip pages only if there are more than 32 consecutive all-visible pages.

•  Garbage tuple IDs are stored in maintenance_work_mem.
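The skipping rule can be sketched as follows. This is a simplified model that applies only the rule stated on the slide (runs of more than 32 all-visible pages are skipped); the real implementation has further conditions. `pages_to_scan` is a hypothetical name.

```python
SKIP_PAGES_THRESHOLD = 32  # the threshold quoted on the slide


def pages_to_scan(all_visible):
    """Return the page numbers a normal VACUUM would read.

    all_visible: list of booleans, one visibility map bit per page.
    """
    scan = []
    n = len(all_visible)
    i = 0
    while i < n:
        if not all_visible[i]:
            scan.append(i)
            i += 1
            continue
        # Measure the run of consecutive all-visible pages starting at i.
        j = i
        while j < n and all_visible[j]:
            j += 1
        if j - i <= SKIP_PAGES_THRESHOLD:
            scan.extend(range(i, j))   # run too short: read it anyway
        i = j                          # long run: skipped entirely
    return scan


vm = [True] * 40 + [False] + [True] * 10 + [False]
print(pages_to_scan(vm))
```

In the example the first 40 all-visible pages are skipped, while the later run of 10 all-visible pages is still read because it is too short to be worth skipping.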

[Figure: two rows of blocks, with a legend for all-visible and not all-visible blocks. FAST: VACUUM can skip blocks and scan efficiently. SLOW: VACUUM needs to scan all pages.]

XID wraparound and freezing tuple

What is a transaction ID (XID)?

•  Every tuple has two transaction IDs.

•  xmin : XID of the transaction that inserted the tuple

•  xmax : XID of the transaction that deleted or updated the tuple (0 if none)

 xmin | xmax | col
------+------+-----
 1810 | 1820 | AAA
 1812 |    0 | BBB
 1814 | 1830 | CCC
 1820 |    0 | XXX

In the REPEATABLE READ transaction isolation level:

•  Transaction 1815 can see ‘AAA’, ‘BBB’ and ‘CCC’.

•  Transaction 1821 can see ‘BBB’, ‘CCC’ and ‘XXX’.

•  Transaction 1831 can see ‘BBB’ and ‘XXX’.
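The three visibility results can be reproduced with a deliberately simplified rule: a row is visible if it was inserted before the reader's XID and not deleted before it. The sketch assumes every transaction in the example has committed, nothing is still in progress, and no wraparound has occurred. `visible_rows` is a hypothetical helper.

```python
ROWS = [
    (1810, 1820, "AAA"),
    (1812, 0, "BBB"),
    (1814, 1830, "CCC"),
    (1820, 0, "XXX"),
]


def visible_rows(reader_xid, rows=ROWS):
    """Return the col values a transaction with reader_xid can see."""
    result = []
    for xmin, xmax, col in rows:
        inserted_before = xmin < reader_xid
        deleted_before = xmax != 0 and xmax < reader_xid
        if inserted_before and not deleted_before:
            result.append(col)
    return result


for xid in (1815, 1821, 1831):
    print(xid, visible_rows(xid))
```

Real visibility checks also consult commit status and the snapshot's list of in-progress transactions, which this model leaves out.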

What is a transaction ID (XID)?

•  Can represent up to 4 billion transactions (uint32).

•  XID space is circular with no endpoint.

•  There are 2 billion XIDs that are “older”, 2 billion XIDs that are “newer”.
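A circular comparison of this kind can be modelled with modulo-2^32 arithmetic: XID a is "older" than XID b when the signed 32-bit difference a - b is negative. The sketch below is a model of that idea for normal XIDs only and ignores the reserved special XIDs. `xid_precedes` is a hypothetical name.

```python
def xid_precedes(a: int, b: int) -> bool:
    """True if XID a is older than XID b on the 32-bit circle."""
    diff = (a - b) & 0xFFFFFFFF     # wrap the difference into uint32
    return diff >= 0x80000000       # high bit set means negative as int32


print(xid_precedes(100, 200))                        # plain case
print(xid_precedes(2**32 - 1, 5))                    # comparison across the wrap point
print(xid_precedes(100, (100 + 2**31 + 1) % 2**32))  # more than 2^31 apart
```

The last call shows the hazard: once two XIDs are more than 2^31 apart, the older one no longer compares as older.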

[Figure: the XID space drawn as a circle from 0 to 2^32 - 1. Relative to a given XID, one half of the circle is labeled "Older (Not visible)" and the other half "Newer (Visible)".]

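What is XID wraparound?

[Figure: three snapshots of the XID circle for a tuple inserted by XID=100, each with an Older half and a Newer half marked Visible or Not visible. At first XID 100 is visible. As the current XID advances, it is still visible. After the current XID has advanced far enough around the circle, XID 100 becomes not visible.]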

•  Postgres could lose very old data due to XID wraparound.

•  This can happen once a tuple is more than 2 billion transactions old.

•  On a 200 TPS system, 2 billion transactions pass in about 120 days.

•  Note that it can happen even on an INSERT-only table.
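The 120-day figure is easy to recompute. The sketch assumes that every transaction consumes exactly one XID and that the rate is constant. `days_until_wraparound` is a hypothetical helper.

```python
SECONDS_PER_DAY = 86_400


def days_until_wraparound(tps: float, horizon: int = 2**31) -> float:
    """Days needed to consume `horizon` XIDs at `tps` transactions per second."""
    return horizon / tps / SECONDS_PER_DAY


print(round(days_until_wraparound(200), 1))                 # horizon of 2^31 XIDs
print(round(days_until_wraparound(200, 2_000_000_000), 1))  # horizon of a round 2 billion
```

Depending on whether the horizon is taken as 2^31 or a round 2 billion, the result is roughly 124 or 116 days, which brackets the slide's "120 days".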

Freezing tuple

•  Mark the tuple as “Frozen”.

•  A “frozen” tuple appears to be “in the past” to all transactions.

•  Old tuples must be frozen *before* the XID counter advances 2 billion beyond them.

[Figure: the same three snapshots of the XID circle, but the tuple with XID=100 is marked as FREEZE. XID 100 is visible at first, is still visible as the current XID advances, and the tuple stays visible after wraparound because it is marked as ‘FREEZE’.]

To prevent old data loss due to XID wraparound

•  Emits a WARNING log when 10 million transactions remain.

•  Refuses to generate new XIDs when 1 million transactions remain.

•  Runs anti-wraparound VACUUM automatically.

Anti-wraparound VACUUM

•  Every table has a pg_class.relfrozenxid value.

•  All tuples inserted by XIDs older than relfrozenxid have been marked as “Frozen”.

•  Anti-wraparound VACUUM is the same as a forcibly executed VACUUM *FREEZE*.

[Figure: XID timeline running from pg_class.relfrozenxid to the XID wraparound point at relfrozenxid + 2 billion, with the current XID in between. Beyond vacuum_freeze_table_age (default 150 million), VACUUM could do a whole table scan. Beyond autovacuum_freeze_max_age (default 200 million), anti-wraparound VACUUM is launched forcibly.]

Anti-wraparound VACUUM

At this XID, lazy VACUUM is executed.

[Figure: the same timeline. The current XID has not yet reached vacuum_freeze_table_age (default 150 million), so a VACUUM run here is a lazy VACUUM.]

Anti-wraparound VACUUM

If you execute VACUUM at this XID, anti-wraparound VACUUM will be executed.

[Figure: the same timeline. The current XID lies between vacuum_freeze_table_age (default 150 million) and autovacuum_freeze_max_age (default 200 million), the range where VACUUM could do a whole table scan.]

Anti-wraparound VACUUM

Once the current XID exceeds autovacuum_freeze_max_age (default 200 million), anti-wraparound VACUUM is launched forcibly by autovacuum.

[Figure: the same timeline. The current XID lies beyond autovacuum_freeze_max_age, and an anti-wraparound auto VACUUM is running.]

Anti-wraparound VACUUM

After anti-wraparound VACUUM, the relfrozenxid value is updated.

[Figure: pg_class.relfrozenxid has advanced and now lies vacuum_freeze_min_age (default 50 million) behind the current XID.]
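The timeline slides boil down to comparing the age of relfrozenxid (current XID minus relfrozenxid) against two thresholds. The sketch below uses the default values quoted on the slides and the actual PostgreSQL setting name autovacuum_freeze_max_age for the 200 million default. The function name and the exact boundary handling (>= versus >) are assumptions made for illustration.

```python
VACUUM_FREEZE_TABLE_AGE = 150_000_000     # default quoted on the slides
AUTOVACUUM_FREEZE_MAX_AGE = 200_000_000   # default quoted on the slides


def vacuum_behaviour(relfrozenxid_age: int) -> str:
    """Classify what happens at a given age of pg_class.relfrozenxid."""
    if relfrozenxid_age >= AUTOVACUUM_FREEZE_MAX_AGE:
        return "autovacuum forcibly launches anti-wraparound VACUUM"
    if relfrozenxid_age >= VACUUM_FREEZE_TABLE_AGE:
        return "a VACUUM run now does a whole table scan"
    return "a VACUUM run now is a lazy VACUUM"


for age in (10_000_000, 160_000_000, 210_000_000):
    print(f"{age:>11,}: {vacuum_behaviour(age)}")
```

On a live system the age itself can be read with the SQL function age() applied to pg_class.relfrozenxid.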

Anti-wraparound VACUUM is too slow

•  Scanning the whole table is always required to advance relfrozenxid.

•  This is because lazy VACUUM may skip pages that hold visible but not yet frozen tuples.

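 Visibility Map | Block # | xmin
----------------+---------+----------------
 0              | 0       | FREEZE, FREEZE
 1              | 1       | FREEZE, FREEZE
 1              | 2       | 101, 102, 103
 0              | 3       | Garbage, 104

[Figure: markers labeled Normal VACUUM and Anti-wraparound VACUUM show which blocks each one reads. Normal VACUUM can skip blocks whose bit is 1, while Anti-wraparound VACUUM scans every block.]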

How can we improve anti-wraparound VACUUM?

Approaches

•  Freeze Map

•  Track pages which are necessary to be frozen.

•  64bit XID

•  Change size of XID from 32bit to 64bit.

•  LSN to XID map

•  Mapping XID to LSN.

Freeze Map

•  New feature for 9.6.

•  Improves the performance of VACUUM FREEZE and anti-wraparound VACUUM.

•  Brings PostgreSQL closer to the functionality needed for very large databases (VLDB).

Idea - Add an additional bit

•  No new map is added.

•  An additional bit is added to the Visibility Map.

•  The additional bit tracks which pages are all-frozen.

•  An all-frozen page must be all-visible as well.

[Figure: Visibility Map bits 10110010, read as pairs of (all-visible, all-frozen) bits per page.]

State transition of two bits

[Figure: states of the two bits (all-visible, all-frozen): 00, 10 and 11.]

•  00 -> 10 : VACUUM

•  00 -> 11, 10 -> 11 : VACUUM FREEZE

•  10 -> 00, 11 -> 00 : UPDATE/DELETE/INSERT
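The state diagram can be written down as a small transition table. The arrows below are read off the labels in the diagram, so treat them as an interpretation of the slide rather than a statement about the implementation. State strings are written as (all-visible, all-frozen), and all names are hypothetical.

```python
MODIFY = "UPDATE/DELETE/INSERT"

# (state, event) -> next state; anything not listed leaves the state unchanged.
TRANSITIONS = {
    ("00", "VACUUM"): "10",
    ("00", "VACUUM FREEZE"): "11",
    ("10", "VACUUM FREEZE"): "11",
    ("10", MODIFY): "00",
    ("11", MODIFY): "00",
}


def next_state(state: str, event: str) -> str:
    return TRANSITIONS.get((state, event), state)


def run(events, state="00"):
    """Apply a sequence of events to a page that starts in `state`."""
    for event in events:
        state = next_state(state, event)
    return state


print(run(["VACUUM"]))
print(run(["VACUUM", "VACUUM FREEZE"]))
print(run(["VACUUM FREEZE", MODIFY]))
```

Note that the state 01 never appears, matching the rule that an all-frozen page must be all-visible as well.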

Idea - Improve anti-wraparound performance

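•  VACUUM can skip all-frozen pages even if anti-wraparound VACUUM is required.

Visibility Map

 visible | frozen | Block # | xmin
---------+--------+---------+----------------
 1       | 0      | 0       | FREEZE, FREEZE
 1       | 1      | 1       | FREEZE, FREEZE
 1       | 0      | 2       | 101, 102, 103
 0       | 0      | 3       | Garbage, 104

[Figure: markers labeled Normal VACUUM and Anti-wraparound VACUUM show which blocks each one reads.]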
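Using the four example blocks from this slide, the effect of the second bit can be sketched as a simple filter. The model assumes that a normal VACUUM skips every all-visible block and that an anti-wraparound VACUUM skips only all-frozen blocks; it ignores the 32-consecutive-pages rule. `blocks_to_scan` is a hypothetical name.

```python
# (block number, all-visible bit, all-frozen bit), as on the slide
VM = [
    (0, 1, 0),
    (1, 1, 1),
    (2, 1, 0),
    (3, 0, 0),
]


def blocks_to_scan(vm, anti_wraparound: bool):
    """Return the block numbers that have to be read."""
    if anti_wraparound:
        return [blk for blk, visible, frozen in vm if not frozen]
    return [blk for blk, visible, frozen in vm if not visible]


print("normal VACUUM:         ", blocks_to_scan(VM, anti_wraparound=False))
print("anti-wraparound VACUUM:", blocks_to_scan(VM, anti_wraparound=True))
```

Without the all-frozen bit, the anti-wraparound case would have to read all four blocks; with it, block 1 can be skipped.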

Pros/Cons

•  Pros

•  Dramatic performance improvement for VACUUM FREEZE.

•  Read-only tables. (future)

•  Cons

•  Doubles the size of the Visibility Map.

No More Full-Table Vacuums

http://rhaas.blogspot.jp/2016/03/no-more-full-table-vacuums.html#comment-form

Other work

Vacuum Progress Checker

•  New feature for 9.6. (under review)

•  Reports progress information of VACUUM via a system view.

Idea

•  Add a new system view.

•  Report meaningful, detailed progress information for each process running VACUUM.

postgres(1)=# SELECT * FROM pg_stat_vacuum_progress ;
-[ RECORD 1 ]-------+--------------
pid                 | 55513
relid               | 16384
phase               | Scanning Heap
total_heap_blks     | 451372
current_heap_blkno  | 77729
total_index_pages   | 559364
scanned_index_pages | 559364
index_scan_count    | 1
percent_complete    | 17

Future works

•  Read Only Table

•  Report progress information of other maintenance commands.

PostgreSQL git repository

git://git.postgresql.org/git/postgresql.git

VERBOSE option

=# VACUUM VERBOSE hoge;
INFO: vacuuming "public.hoge"
INFO: scanned index "hoge_idx1" to remove 1000 row versions
DETAIL: CPU 0.00s/0.01u sec elapsed 0.01 sec.
INFO: "hoge": removed 1000 row versions in 443 pages
DETAIL: CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "hoge_idx1" now contains 100000 row versions in 276 pages
DETAIL: 1000 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "hoge": found 1000 removable, 100000 nonremovable row versions in 447 out of 447 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins.
0 pages are entirely empty.
CPU 0.00s/0.05u sec elapsed 0.05 sec.
VACUUM

FREEZE option

•  Freezes tuples aggressively

•  Same as running a normal VACUUM with vacuum_freeze_min_age = 0 and vacuum_freeze_table_age = 0

•  Always scans the whole table

ANALYZE option

•  Do ANALYZE after VACUUM

•  Update data statistics used by planner

-- VACUUM and analyze with VERBOSE option
=# VACUUM ANALYZE VERBOSE hoge;
INFO: vacuuming "public.hoge"
:
INFO: analyzing "public.hoge"
INFO: "hoge": scanned 452 of 452 pages, containing 100000 live rows and 0 dead rows; 30000 rows in sample, 100000 estimated total rows
VACUUM

FULL option

•  Completely different from lazy VACUUM

•  Similar to CLUSTER

•  Acquires an AccessExclusiveLock

•  Takes much longer than lazy VACUUM

•  Needs extra disk space: up to twice the table size in total

•  Rebuilds the table and its indexes

•  Freezes tuples during VACUUM FULL (9.3~)