Copyright © 2016 NTT DATA Corporation

03/17/2016 NTT DATA Corporation Masahiko Sawada

Introduction VACUUM, FREEZING, XID wraparound

A little about me

•  Masahiko Sawada
   •  Twitter: @sawada_masahiko
•  NTT DATA Corporation
   •  Database engineer
•  PostgreSQL Hacker
   •  Core feature
   •  pg_bigm (Multi-byte full text search module for PostgreSQL)

Contents

•  VACUUM

•  Visibility Map

•  Freezing Tuple

•  XID wraparound

•  New VACUUM feature for 9.6

What is VACUUM?

VACUUM

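[Figure: a table holding rows 1 AAA, 2 BBB, 3 CCC. After UPDATE : BBB->bbb, the old row version 2 BBB remains next to the new 2 bbb. VACUUM starts while INSERT/DELETE/UPDATE continue concurrently (row 4 DDD is added). When VACUUM is done, 2 BBB has been removed and the freed space is recorded in the FSM.]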

•  Postgres garbage collection feature

•  Acquires a ShareUpdateExclusiveLock

Why do we need to VACUUM?

•  Recover or reuse disk space occupied by updated or deleted rows

•  Update data statistics

•  Update visibility map to speed up Index-Only Scan.

•  Protect against loss of very old data due to XID wraparound

Evolution history of VACUUM

•  v8.1 (2005): autovacuum
•  v8.4 (2009): Visibility Map, Free Space Map
•  v9.5 (2016): vacuumdb parallel option
•  v9.6: !?

VACUUM Syntax

-- VACUUM whole database
=# VACUUM;

-- Multiple options, analyzing only the col1 column
=# VACUUM FREEZE VERBOSE ANALYZE hoge (col1);

-- Multiple options with parentheses
=# VACUUM (FULL, ANALYZE, VERBOSE) hoge;

Visibility Map

Visibility Map

•  Introduced in 8.4
•  A bitmap for each table (1 bit per page)
•  Each table relation can have a visibility map.
•  Keeps track of which pages are all-visible.
•  In other words, it tells VACUUM which pages may contain garbage.
•  For a 500GB table, the Visibility Map is less than 10MB.

[Figure: Table (base/XXX/1234) made of Block 0, Block 1, Block 2, Block 3, Block 4, and its Visibility Map (base/XXX/1234_vm) holding one bit per block: 11001…]
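The size figure on this slide can be checked with a little arithmetic. The sketch below assumes the default 8KB block size and ignores the small page-header overhead inside the map file; the function name is made up for illustration.

```python
BLOCK_SIZE = 8192  # assumed default PostgreSQL block size, in bytes


def visibility_map_bytes(table_bytes: int, bits_per_page: int = 1) -> int:
    """Approximate visibility map size for a table of the given size."""
    pages = -(-table_bytes // BLOCK_SIZE)       # ceiling division
    return -(-pages * bits_per_page // 8)       # bits to bytes, rounded up


table = 500 * 1024**3                           # a 500GB table
vm = visibility_map_bytes(table)
print(f"{vm / 1024**2:.1f} MB")                 # prints "7.8 MB"
```

Passing bits_per_page=2 gives the size once a second bit per page is stored, which is still well under 20MB for the same table.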

State transition of Visibility Map bit

•  0 (NOT all-visible) -> 1 (all-visible): VACUUM
•  1 (all-visible) -> 0 (NOT all-visible): INSERT, UPDATE, DELETE

How does VACUUM actually work?

•  VACUUM works in two phases:

1.  Scan the table to collect the TIDs of garbage tuples

2.  Reclaim garbage (Table, Index)

[Figure: 1st Phase: scan the Table and collect garbage TIDs into maintenance_work_mem. 2nd Phase: reclaim garbage from the Index and the Table.]
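The two phases can be illustrated with a toy model. This is only a sketch over in-memory Python structures, not how PostgreSQL stores data: the heap is a list of pages, the index is a dict keyed by TID, and max_dead_tids is a hypothetical stand-in for the maintenance_work_mem limit.

```python
def lazy_vacuum(heap, index, max_dead_tids):
    """Toy two-phase VACUUM.

    heap:  list of pages; each page is a list of (value, is_dead) items,
           or None for a slot that has already been reclaimed
    index: dict mapping TID (page_no, slot_no) -> value
    Returns the number of reclaim passes that were needed.
    """
    dead_tids = []
    passes = 0

    def reclaim():
        nonlocal passes
        if not dead_tids:
            return
        # 2nd phase: remove index entries first, then the heap tuples.
        for tid in dead_tids:
            index.pop(tid, None)
        for page_no, slot_no in dead_tids:
            heap[page_no][slot_no] = None
        dead_tids.clear()
        passes += 1

    # 1st phase: scan the table and remember the TIDs of dead tuples.
    for page_no, page in enumerate(heap):
        for slot_no, item in enumerate(page):
            if item is not None and item[1]:
                dead_tids.append((page_no, slot_no))
                if len(dead_tids) >= max_dead_tids:
                    reclaim()   # memory is full: reclaim, then keep scanning
    reclaim()
    return passes


heap = [[("AAA", False), ("BBB", True)], [("CCC", False), ("bbb", False)]]
index = {(p, s): item[0] for p, page in enumerate(heap) for s, item in enumerate(page)}
passes = lazy_vacuum(heap, index, max_dead_tids=100)
```

With a small max_dead_tids the same table needs several reclaim passes, which is why a larger maintenance_work_mem can reduce the number of index scans.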

Performance improvement point of VACUUM

•  VACUUM scans table pages one by one.
•  VACUUM can skip pages only if there are more than 32 consecutive all-visible pages.
•  Garbage tuple IDs are stored in memory, up to maintenance_work_mem.

[Figure: rows of all-visible and not all-visible blocks. FAST: where all-visible blocks form long runs, VACUUM can skip them efficiently. SLOW: where not all-visible blocks are scattered, VACUUM needs to scan all pages.]
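The skipping rule can be sketched as follows. The threshold of 32 and the "more than" comparison are taken from the slide; the function name and the list-of-booleans input are assumptions for illustration, and other details of the real scan are left out.

```python
SKIP_PAGES_THRESHOLD = 32  # run length quoted on the slide


def pages_to_scan(all_visible):
    """Return the block numbers a lazy VACUUM would read.

    all_visible: list of booleans, one visibility map bit per block.
    A run of all-visible blocks is skipped only when it is longer than
    SKIP_PAGES_THRESHOLD; shorter runs are read anyway.
    """
    scanned = []
    blkno = 0
    n = len(all_visible)
    while blkno < n:
        if not all_visible[blkno]:
            scanned.append(blkno)
            blkno += 1
            continue
        # Measure the run of consecutive all-visible blocks.
        end = blkno
        while end < n and all_visible[end]:
            end += 1
        if end - blkno <= SKIP_PAGES_THRESHOLD:
            scanned.extend(range(blkno, end))   # short run: read it anyway
        blkno = end                             # long run: skipped entirely
    return scanned


# Block 0 is dirty, blocks 1-40 are all-visible, block 41 is dirty,
# blocks 42-46 are all-visible, block 47 is dirty.
vm = [False] + [True] * 40 + [False] + [True] * 5 + [False]
print(pages_to_scan(vm))   # the 40-block run is skipped, the 5-block run is not
```

Reading short runs anyway keeps the I/O sequential, which is usually cheaper than skipping a handful of pages.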

XID wraparound and freezing tuple

What is the transaction ID (XID)?

•  Every tuple has two transaction IDs.
   •  xmin : Inserted XID
   •  xmax : Deleted/Updated XID

 xmin | xmax | col
------+------+-----
 1810 | 1820 | AAA
 1812 |    0 | BBB
 1814 | 1830 | CCC
 1820 |    0 | XXX

In REPEATABLE READ transaction isolation level,
•  Transaction 1815 can see ‘AAA’, ‘BBB’ and ‘CCC’.
•  Transaction 1821 can see ‘BBB’, ‘CCC’ and ‘XXX’.
•  Transaction 1831 can see ‘BBB’ and ‘XXX’.
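The three visibility results can be reproduced with a simplified rule. This sketch assumes that every XID in the table has committed, that xmax = 0 means "not deleted", and that plain integer comparison is enough (no wraparound); real visibility checks consider much more.

```python
rows = [
    (1810, 1820, "AAA"),
    (1812, 0, "BBB"),
    (1814, 1830, "CCC"),
    (1820, 0, "XXX"),
]


def visible_to(xid, rows):
    """Values a transaction with the given XID can see (simplified)."""
    result = []
    for xmin, xmax, value in rows:
        inserted_before = xmin < xid
        deleted_before = xmax != 0 and xmax < xid
        if inserted_before and not deleted_before:
            result.append(value)
    return result


for xid in (1815, 1821, 1831):
    print(xid, visible_to(xid, rows))
```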

What is the transaction ID (XID)?

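•  Can represent up to 4 billion transactions (uint32).

•  XID space is circular with no endpoint.

•  For any given XID, there are 2 billion XIDs that are “older” and 2 billion XIDs that are “newer”.

[Figure: the XID space drawn as a circle running from 0 to 2^32-1, with one half labeled "Older (Not visible)" and the other half labeled "Newer (Visible)".]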
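The "older"/"newer" split on a circle can be expressed with modular arithmetic. The sketch below is modeled loosely on a signed 32-bit comparison of the difference between two XIDs; it ignores the special reserved XIDs, and the function name is an assumption.

```python
def xid_precedes(a: int, b: int) -> bool:
    """True if XID a is "older" than XID b on the 32-bit circle."""
    diff = (a - b) & 0xFFFFFFFF      # difference modulo 2^32
    return diff >= 0x80000000        # negative when read as a signed 32-bit value


print(xid_precedes(100, 200))               # True: 100 is older than 200
print(xid_precedes(4294967290, 5))          # True: 5 was assigned after the counter wrapped
print(xid_precedes(100, 100 + 2**31 + 1))   # False: 100 no longer looks older
```

The last call shows the problem the following slides describe: once another XID is more than 2 billion ahead, XID 100 stops comparing as "older".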

What is XID wraparound?

[Figure: three views of the circular XID space as the current XID advances. In the first two, the tuple with XID=100 is visible ("XID 100 is visible", "Still visible"). In the third, the current XID has advanced so far that XID=100 falls on the other side of the circle, and XID 100 becomes not visible.]

•  Postgres could lose very old data due to XID wraparound.
•  This can happen once a tuple is more than 2 billion transactions old.
•  On a system running 200 TPS, that point is reached in about 120 days.
•  Note that it can happen even on an INSERT-only table.
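The 120-day figure follows from dividing the 2 billion XID horizon by the transaction rate. This sketch assumes that every transaction consumes exactly one XID and that the rate is constant; the names are illustrative.

```python
XID_HORIZON = 2**31            # about 2 billion XIDs in each direction
SECONDS_PER_DAY = 86_400


def days_until_wraparound(tps: float) -> float:
    """Days needed to consume 2^31 XIDs at a steady rate of tps."""
    return XID_HORIZON / tps / SECONDS_PER_DAY


print(round(days_until_wraparound(200)))   # prints 124, i.e. roughly 120 days
```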

Freezing tuple

•  Mark the tuple as “Frozen”.
•  A “frozen” tuple appears to be “in the past” to all transactions.
•  Old tuples must be frozen *before* the XID advances by 2 billion.

[Figure: three views of the circular XID space as the current XID advances, with the tuple marked as XID=100 (FREEZE). The tuple is visible in all three views: even after the current XID has passed the point where XID 100 would have become not visible, it is still visible because the tuple is marked as ‘FREEZE’.]

To prevent old data loss due to XID wraparound

•  Emits a WARNING log when 10 million transactions remain.

•  Refuses to generate new XIDs when 1 million transactions remain.

•  Runs anti-wraparound VACUUM automatically.
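The two thresholds can be read as a simple classification of how many XIDs are left before wraparound. The sketch uses only the 10 million and 1 million figures from the slide; the function name, the return strings, and the way the remaining count is derived from the age of the oldest unfrozen XID are assumptions.

```python
WRAP_LIMIT = 2**31             # XIDs available before the oldest data wraps
WARN_REMAINING = 10_000_000    # WARNING threshold quoted on the slide
STOP_REMAINING = 1_000_000     # stop threshold quoted on the slide


def xid_assignment_state(oldest_xid_age: int) -> str:
    """Classify the system by the age of its oldest unfrozen XID."""
    remaining = WRAP_LIMIT - oldest_xid_age
    if remaining <= STOP_REMAINING:
        return "refuse new XIDs"
    if remaining <= WARN_REMAINING:
        return "emit WARNING"
    return "ok"


print(xid_assignment_state(200_000_000))          # ok
print(xid_assignment_state(2**31 - 5_000_000))    # emit WARNING
print(xid_assignment_state(2**31 - 500_000))      # refuse new XIDs
```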

Anti-wraparound VACUUM

•  Every table has a pg_class.relfrozenxid value.
•  All tuples inserted by XIDs older than relfrozenxid have been marked as “Frozen”.
•  Anti-wraparound VACUUM works like a forcibly executed VACUUM *FREEZE*.

[Figure: XID timeline that starts at pg_class.relfrozenxid, with the Current XID marked on it.
   relfrozenxid + vacuum_freeze_table_age (default 150 million): from here, VACUUM could do a whole table scan.
   relfrozenxid + autovacuum_freeze_max_age (default 200 million): from here, anti-wraparound VACUUM is launched forcibly.
   relfrozenxid + 2 billion: XID wraparound.]
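The timeline can be summarized as a decision on the age of relfrozenxid, that is, the current XID minus pg_class.relfrozenxid. This is a minimal sketch using the 150 million and 200 million defaults quoted on the slide; the function name and return strings are made up, and any interaction between the two settings is ignored.

```python
VACUUM_FREEZE_TABLE_AGE = 150_000_000      # default quoted on the slide
AUTOVACUUM_FREEZE_MAX_AGE = 200_000_000    # default quoted on the slide


def vacuum_behaviour(relfrozenxid_age: int) -> str:
    """Describe what happens for a table whose relfrozenxid has this age."""
    if relfrozenxid_age > AUTOVACUUM_FREEZE_MAX_AGE:
        return "autovacuum forcibly launches an anti-wraparound VACUUM"
    if relfrozenxid_age > VACUUM_FREEZE_TABLE_AGE:
        return "a VACUUM run now scans the whole table"
    return "a VACUUM run now is a lazy VACUUM"


for age in (100_000_000, 160_000_000, 250_000_000):
    print(age, "->", vacuum_behaviour(age))
```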

Anti-wraparound VACUUM

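At this XID, lazy VACUUM is executed.

[Figure: XID timeline from pg_class.relfrozenxid to XID wraparound (+ 2 billion). The Current XID has not yet reached relfrozenxid + vacuum_freeze_table_age (default 150 million), so a VACUUM run here is a lazy VACUUM. Further along are the range where VACUUM could do a whole table scan and relfrozenxid + autovacuum_freeze_max_age (default 200 million), where anti-wraparound VACUUM is launched forcibly.]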

Anti-wraparound VACUUM

If you execute VACUUM at this XID, anti-wraparound VACUUM will be executed.

[Figure: XID timeline from pg_class.relfrozenxid to XID wraparound (+ 2 billion). The Current XID lies between relfrozenxid + vacuum_freeze_table_age (default 150 million) and relfrozenxid + autovacuum_freeze_max_age (default 200 million), the range where VACUUM could do a whole table scan. Beyond that range, anti-wraparound VACUUM is launched forcibly.]

Anti-wraparound VACUUM

Once the current XID exceeds relfrozenxid + autovacuum_freeze_max_age, anti-wraparound VACUUM is launched forcibly by autovacuum.

[Figure: XID timeline from pg_class.relfrozenxid to XID wraparound (+ 2 billion). The Current XID has passed both relfrozenxid + vacuum_freeze_table_age (default 150 million), where VACUUM could do a whole table scan, and relfrozenxid + autovacuum_freeze_max_age (default 200 million), so an anti-wraparound auto VACUUM runs.]

Anti-wraparound VACUUM

After anti-wraparound VACUUM, the relfrozenxid value is updated.

[Figure: XID timeline on which pg_class.relfrozenxid now sits vacuum_freeze_min_age (default 50 million) behind the Current XID.]

anti-wraparound VACUUM is too slow

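•  Scanning the whole table is always required to advance relfrozenxid.
•  This is because lazy VACUUM could skip a page that holds visible but not yet frozen tuples.

 Visibility Map | Block # | xmin
----------------+---------+----------------
              0 |       0 | FREEZE, FREEZE
              1 |       1 | FREEZE, FREEZE
              1 |       2 | 101, 102, 103
              0 |       3 | Garbage, 104

•  Normal VACUUM: can skip blocks 1 and 2, whose Visibility Map bit is 1.
•  Anti-wraparound VACUUM: scans all blocks, because block 2 holds unfrozen tuples.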

How can we improve anti-wraparound VACUUM?

Approaches

•  Freeze Map
   •  Track which pages need to be frozen.
•  64-bit XID
   •  Change the size of the XID from 32 bits to 64 bits.
•  LSN to XID map
   •  Map XIDs to LSNs.

Freeze Map

•  New feature for 9.6.

•  Improves the performance of VACUUM FREEZE and anti-wraparound VACUUM.

•  Brings us functionality for very large databases (VLDB).

Idea - Add an additional bit

•  Does not add a new map.
•  Adds an additional bit to the Visibility Map.
•  The additional bit tracks which pages are all-frozen.
•  An all-frozen page should be all-visible as well.

[Figure: Visibility Map bits 10110010, where each page has a pair of bits: all-visible and all-frozen.]

State transition of two bits

Each state is written as two bits: all-visible, all-frozen.

•  00 -> 10: VACUUM
•  00 -> 11: VACUUM FREEZE
•  10 -> 11: VACUUM FREEZE
•  10 -> 00: UPDATE/DELETE/INSERT
•  11 -> 00: UPDATE/DELETE/INSERT
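The transitions between the three states can be written as a small state machine. The bit values follow the slide's notation (left bit all-visible, right bit all-frozen), not any on-disk layout, and the sketch assumes that VACUUM leaves no garbage or recently modified tuples on the page; the function and constant names are illustrative.

```python
ALL_VISIBLE = 0b10   # left bit in the slide's notation
ALL_FROZEN = 0b01    # right bit in the slide's notation


def next_bits(bits: int, event: str) -> int:
    """Return a page's two visibility map bits after the given event."""
    if event in ("INSERT", "UPDATE", "DELETE"):
        return 0b00                        # any modification clears both bits
    if event == "VACUUM":
        return bits | ALL_VISIBLE          # 00 -> 10; 10 and 11 are unchanged
    if event == "VACUUM FREEZE":
        return ALL_VISIBLE | ALL_FROZEN    # 00 or 10 -> 11
    raise ValueError(f"unknown event: {event}")


state = 0b00
for event in ("VACUUM", "VACUUM FREEZE", "UPDATE"):
    state = next_bits(state, event)
    print(f"{event}: {state:02b}")
```

Note that the combination 01 (all-frozen but not all-visible) is never produced, which matches the rule that an all-frozen page is also all-visible.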

Idea - Improve anti-wraparound performance

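•  VACUUM can skip all-frozen pages even if anti-wraparound VACUUM is required.

 visible | frozen | Block # | xmin
---------+--------+---------+----------------
       1 |      0 |       0 | FREEZE, FREEZE
       1 |      1 |       1 | FREEZE, FREEZE
       1 |      0 |       2 | 101, 102, 103
       0 |      0 |       3 | Garbage, 104

(visible and frozen are the two Visibility Map bits of each block.)

•  Normal VACUUM: can skip all-visible blocks.
•  Anti-wraparound VACUUM: can skip the all-frozen block 1.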
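The effect of the second bit on each kind of VACUUM can be sketched as below. The bit pairs are read from the slide's table as (visible, frozen) for blocks 0 to 3, which is an interpretation of the extracted figure; the function name is made up and the 32-block skip threshold is ignored.

```python
# (all_visible, all_frozen) for blocks 0..3, as read from the slide's table
vm_bits = [(1, 0), (1, 1), (1, 0), (0, 0)]


def blocks_to_scan(vm_bits, anti_wraparound):
    """Blocks that must be read.

    anti_wraparound=False: normal VACUUM, may skip all-visible blocks.
    anti_wraparound=True:  may skip only all-frozen blocks.
    """
    scan = []
    for blkno, (visible, frozen) in enumerate(vm_bits):
        skippable = frozen if anti_wraparound else visible
        if not skippable:
            scan.append(blkno)
    return scan


print(blocks_to_scan(vm_bits, anti_wraparound=False))   # [3]
print(blocks_to_scan(vm_bits, anti_wraparound=True))    # [0, 2, 3]
```

Without the all-frozen bit, the anti-wraparound case would have to return every block.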

Pros/Cons

•  Pros
   •  Dramatic performance improvement for VACUUM FREEZE.
   •  Read-only tables. (future)
•  Cons
   •  Doubles the size of the Visibility Map.

No More Full-Table Vacuums

http://rhaas.blogspot.jp/2016/03/no-more-full-table-vacuums.html#comment-form

Another work

Vacuum Progress Checker

•  New feature for 9.6. (under review)

•  Reports VACUUM progress information via a system view.

Idea

•  Add a new system view.

•  Report detailed, meaningful progress information for each process running VACUUM.

postgres(1)=# SELECT * FROM pg_stat_vacuum_progress ;
-[ RECORD 1 ]-------+--------------
pid                 | 55513
relid               | 16384
phase               | Scanning Heap
total_heap_blks     | 451372
current_heap_blkno  | 77729
total_index_pages   | 559364
scanned_index_pages | 559364
index_scan_count    | 1
percent_complete    | 17

Future works

•  Read Only Table

•  Report progress information for other maintenance commands.

PostgreSQL git repository

git://git.postgresql.org/git/postgresql.git

VERBOSE option

=# VACUUM VERBOSE hoge;
INFO: vacuuming "public.hoge"
INFO: scanned index "hoge_idx1" to remove 1000 row versions
DETAIL: CPU 0.00s/0.01u sec elapsed 0.01 sec.
INFO: "hoge": removed 1000 row versions in 443 pages
DETAIL: CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "hoge_idx1" now contains 100000 row versions in 276 pages
DETAIL: 1000 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "hoge": found 1000 removable, 100000 nonremovable row versions in 447 out of 447 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins.
0 pages are entirely empty.
CPU 0.00s/0.05u sec elapsed 0.05 sec.
VACUUM

FREEZE option

•  Aggressive freezing of tuples
•  Same as running normal VACUUM with vacuum_freeze_min_age = 0 and vacuum_freeze_table_age = 0
•  Always scans the whole table

ANALYZE option

•  Do ANALYZE after VACUUM
•  Update data statistics used by planner

-- VACUUM and ANALYZE with VERBOSE option
=# VACUUM ANALYZE VERBOSE hoge;
INFO: vacuuming "public.hoge"
:
INFO: analyzing "public.hoge"
INFO: "hoge": scanned 452 of 452 pages, containing 100000 live rows and 0 dead rows; 30000 rows in sample, 100000 estimated total rows
VACUUM

FULL option

•  Completely different from lazy VACUUM

•  Similar to CLUSTER

•  Acquires an AccessExclusiveLock

•  Takes much longer than lazy VACUUM

•  Needs extra disk space, up to twice the table size.

•  Rebuilds the table and its indexes

•  Freezes tuples during VACUUM FULL (9.3~)