PostGIS Indexes: Which one is the right one?

Felix Kunde · FOSSGIS 2019 · 2019-03-23


CREATE INDEX pts_spx ON point_table USING GIST (geom);

BRIN (since PostGIS 2.3)

SPGIST (since PostGIS 2.5)

GiST

It's a framework
PostGIS > R-Tree
Stores BBox

[Figure sequence: R-Tree structure shown step by step: leaves, nodes, root, level 1 and level 2]

ST_Intersects in ms, by table size (rows)

test       100 k   1 M     10 M   100 M   1 Bn
no index   18.00   87.00   670    6473    135529
bulk       0.21    14.00   19     146     1568
online     0.18    15.00   29     163     1672
vacuum     0.16    0.87    18     145     1551
cluster    0.13    0.64    16     32      214
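The slides do not spell out what the test names in the table above mean. A minimal sketch of one plausible reading follows: "bulk" builds the index after loading, "online" maintains it during loading, and "vacuum" and "cluster" are maintenance steps run afterwards. The staging table point_source, the table and index names, SRID 4326 and the query window are all assumptions made for illustration. The actual test scripts are in the repo listed on the last slide and may differ.

```sql
-- Assumption: point_source is a hypothetical staging table with a geometry
-- column "geom" (SRID 4326). It does not appear in the original slides.

-- "bulk" (as interpreted here): load all rows, then build the index once.
CREATE TABLE pts_bulk AS SELECT * FROM point_source;
CREATE INDEX pts_bulk_spx ON pts_bulk USING GIST (geom);

-- "online" (as interpreted here): the index exists before the rows arrive
-- and is maintained on every insert.
CREATE TABLE pts_online (LIKE point_source);
CREATE INDEX pts_online_spx ON pts_online USING GIST (geom);
INSERT INTO pts_online SELECT * FROM point_source;

-- "vacuum": clean up and refresh planner statistics.
-- Note: VACUUM cannot run inside a transaction block.
VACUUM ANALYZE pts_online;

-- "cluster": rewrite the table in index order, then refresh statistics.
CLUSTER pts_online USING pts_online_spx;
ANALYZE pts_online;

-- Time a window query; the envelope coordinates are arbitrary.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM pts_online
WHERE ST_Intersects(geom, ST_MakeEnvelope(13.7, 51.0, 13.8, 51.1, 4326));
```

Running the same EXPLAIN after each step gives numbers comparable in spirit to the rows of the table, though absolute timings depend on hardware and caching.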

It gets big!
Impact on writes (x4-16 for points, x1.3 for lines)
Fastest index

BRIN

...  time        geom
...  2019-03-13  POINT(13.8 50)
...  2019-03-13  POINT(13.8 50)
...  ...
     2019-03-14  POINT(13.8 51)
     2019-03-16  POINT(13.7 51)
...  ...
...  2019-03-16  POINT(13.7 51)
...  2019-03-16  POINT(13.7 51)

Block Range Index examples (one summary per block range)

time (min, max)                              ... (min, max)                   geom
(2019-03-13 09:00:00, 2019-03-14 11:00:00)   (A-Weg, Grunaer Str.)            BBOX
(2019-03-14 11:15:00, 2019-03-15 18:00:00)   (Grunaer Weg, Nürnberger Str.)   BBOX
(2019-03-16 09:00:00, 2019-03-16 17:30:00)   (Oberauer Str., Zwinglistr.)     BBOX

Block ranges. That's all.
Data must be sorted!
PostGIS > BBox per block
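As a minimal sketch of the points above, a BRIN index on a geometry column could be created like this. The index name pts_brin is made up for illustration, and pages_per_range = 128 is simply the PostgreSQL default written out, not a value taken from the talk.

```sql
-- Assumption: point_table was loaded in a spatially sorted order;
-- otherwise the per-range bounding boxes overlap heavily and filter little.
CREATE INDEX pts_brin ON point_table
  USING BRIN (geom)
  WITH (pages_per_range = 128);  -- 128 heap pages per summarised block range

-- Rows appended after index creation are only summarised later
-- (by VACUUM or explicitly):
SELECT brin_summarize_new_values('pts_brin');
```

A smaller pages_per_range gives tighter bounding boxes at the cost of a larger index.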


ORDER BY geom
ORDER BY ST_GeoHash(geom)
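One way to get the physical ordering a BRIN index needs is to rewrite the table sorted by geohash. The sketch below assumes the geometries are in lon/lat (SRID 4326), which ST_GeoHash expects. The names point_table_sorted and pts_sorted_brin are hypothetical.

```sql
-- Rewrite the table so that spatially close points land in neighbouring blocks.
-- For data in another SRID, ST_GeoHash(ST_Transform(geom, 4326)) could be used.
CREATE TABLE point_table_sorted AS
SELECT *
FROM point_table
ORDER BY ST_GeoHash(geom);

-- Build the BRIN index on the sorted copy and refresh planner statistics.
CREATE INDEX pts_sorted_brin ON point_table_sorted USING BRIN (geom);
ANALYZE point_table_sorted;
```

The ordering is only guaranteed at load time; later inserts and updates gradually degrade it until the table is rewritten again.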

SET enable_seqscan = false;
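The setting above discourages the planner from choosing a sequential scan, which is presumably how the BRIN index was forced into use for the tests. A minimal sketch of using it for a single experiment follows; the query window is arbitrary and SRID 4326 is an assumption.

```sql
-- Session-level switch, meant for experiments rather than production use.
SET enable_seqscan = false;

-- Check in the plan output whether the index is used
-- (a BRIN index shows up as a bitmap index scan).
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM point_table
WHERE geom && ST_MakeEnvelope(13.7, 51.0, 13.8, 51.1, 4326);

-- Restore the default afterwards.
RESET enable_seqscan;
```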

BRIN vs. GiST, by table size (rows)

test                100 k    1 M       10 M     100 M    1 Bn
create gist         700 ms   8 sec     2 min    23 min   6 hrs
create brin         24 ms    0.2 sec   2 sec    18 sec   90 sec
size gist           5 MB     50 MB     500 MB   5 GB     50 GB
size brin           24 KB    24 KB     48 KB    376 KB   3.6 MB
duration vs. GiST   x25      x23       x1.4     x1.6     x1.1
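Index sizes like those in the table can be read from the system catalogs. The following is a minimal sketch, assuming the indexes were built on point_table as on the opening slide; it is not the measurement script used for the talk.

```sql
-- List every index on point_table with its access method and on-disk size.
SELECT c.relname                               AS index_name,
       am.amname                               AS method,
       pg_size_pretty(pg_relation_size(c.oid)) AS size
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
JOIN pg_am am   ON am.oid = c.relam
WHERE i.indrelid = 'point_table'::regclass
ORDER BY pg_relation_size(c.oid) DESC;
```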

Super small
Build in seconds
Slower, but ok

sp-GiST

Framework like GiST
Unbalanced tree
No overlaps & prefixes
PostGIS > BBox in 4D
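For comparison with the GiST statement on the opening slide, an sp-GiST index is created the same way with a different access method. This sketch assumes PostgreSQL 11 and PostGIS 2.5 or later, the versions named in the deck's configuration; the index name pts_spgist is made up.

```sql
-- Same table and column as the GiST example, different access method.
CREATE INDEX pts_spgist ON point_table USING SPGIST (geom);
```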

kd-Tree, Quadtree

[Figure: adaptive k-d tree. Source: https://www.researchgate.net/figure/Adaptive-k-d-tree_fig9_2334587]

Trick: No overlap via multiple dimensions

Each point you see on the map is in fact 4 bounding boxes, which are the prefixes of the sp-GiST tree defining the bounds of the child quadrants.

sp-GiST vs. GiST, by table size (rows)

test                100 k    1 M       10 M     100 M    1 Bn
create gist         700 ms   8 sec     2 min    23 min   6 hrs
create spgist       344 ms   3.7 sec   50 sec   11 min   8 hrs
size gist           5 MB     50 MB     500 MB   5 GB     50 GB
size spgist         4.5 MB   44 MB     440 MB   4.3 GB   43 GB
duration vs. GiST   x0.85    x1        x1       x1       x1.1

12% smaller than GiST
2x faster for writes
Less predictable

Conclusion

[Figure: decision tree with yes/no branches. Questions: Static dataset? > available RAM? Overlaps? Outcomes: GiST, BRIN, sp-GiST]

General advice

When to index? > pg_stat_statements
Clean up bloat > VACUUM, pg_repack
Update statistics > ANALYZE
Table rewrite > CLUSTER
What is queried? > Partial indexes
Index-only scans > Covering indexes
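The tools named above can be sketched as plain SQL. This is an illustrative sequence rather than a recommended maintenance script. Assumptions: the table and index names come from the opening slide, the "time" column from the BRIN example is taken to be of type timestamp without time zone, and the id column and the new index names are hypothetical. pg_repack is an external tool and is not shown.

```sql
-- When to index? Look at what actually runs.
-- Requires pg_stat_statements in shared_preload_libraries.
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
SELECT query, calls
FROM pg_stat_statements
ORDER BY calls DESC
LIMIT 10;

-- Clean up bloat (VACUUM cannot run inside a transaction block).
VACUUM (VERBOSE) point_table;

-- Update statistics.
ANALYZE point_table;

-- Table rewrite in index order. Takes an exclusive lock while it runs.
CLUSTER point_table USING pts_spx;

-- Partial index: only index the rows that are actually queried.
CREATE INDEX pts_recent_spx ON point_table
  USING GIST (geom)
  WHERE "time" >= TIMESTAMP '2019-03-16 00:00:00';

-- Covering index: extra columns stored in the index for index-only scans.
-- INCLUDE works for B-tree from PostgreSQL 11; GiST needs PostgreSQL 12+.
CREATE INDEX pts_time_cov ON point_table ("time") INCLUDE (id);
```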

That's it. Questions?

https://slides.com/fxku/postgis-indexing

Used hardware

Tuxedo Infinity Book 13
Intel i7-8550U CPU, 1.80 GHz
Quad-core, 8 CPUs
32 GB RAM
500 GB SSD

PostgreSQL config

PostgreSQL 11 & PostGIS 2.5
shared_buffers = 16 GB
work_mem = 128 MB
maintenance_work_mem = 4 GB
min/max_wal_size = 16/4 GB
checkpoint_timeout = 30 min
checkpoint_completion_target = 0.9
random_page_cost = 1.1
cpu_tuple_cost = 0.001
cpu_index_tuple_cost = 0.001
effective_cache_size = 24 GB
default_statistics_target = 500

Repo

github.com/FxKu/postgis_indexing