PostGIS Indexes · 2019-03-23
TRANSCRIPT
PostGIS Indexes: Which one is the right one?
Felix Kunde, FOSSGIS 2019
CREATE INDEX pts_spx ON point_table USING GIST (geom)
Alternatives to GIST in the USING clause:
BRIN (since PostGIS 2.3)
SPGIST (since PostGIS 2.5)

GiST
It's a framework
PostGIS > R-Tree
Stores BBox
[Figures: R-Tree built from bounding boxes, shown step by step: leaves, nodes, root (level 1, level 2)]
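As a rough illustration of "Stores BBox": the GiST index only knows bounding boxes, so an index-assisted search is a bounding-box comparison first and an exact geometry test second. The following is a minimal sketch that reuses point_table and geom from the CREATE INDEX slide; the envelope coordinates and SRID 4326 are assumptions made for illustration only.

```sql
-- Bounding-box-only filter: && compares 2D bounding boxes and can use the GiST index.
EXPLAIN ANALYZE
SELECT count(*)
FROM point_table
WHERE geom && ST_MakeEnvelope(13.7, 51.0, 13.8, 51.1, 4326);

-- Exact test: ST_Intersects applies the bounding-box filter via the index,
-- then rechecks the real geometries of the remaining candidates.
EXPLAIN ANALYZE
SELECT count(*)
FROM point_table
WHERE ST_Intersects(geom, ST_MakeEnvelope(13.7, 51.0, 13.8, 51.1, 4326));
```

For point data both queries should return the same count, since a point's bounding box is the point itself.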
ST_Intersects in ms
tests      100 k   1 Mio   10 Mio   100 Mio     1 Bn
no index   18.00   87.00      670      6473   135529
bulk        0.21   14.00       19       146     1568
online      0.18   15.00       29       163     1672
vacuum      0.16    0.87       18       145     1551
cluster     0.13    0.64       16        32      214
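The slide does not explain the row names. Assuming "vacuum" and "cluster" mean that VACUUM and CLUSTER were run before measuring, the preparation steps could look roughly like this. point_table and pts_spx come from the CREATE INDEX slide; everything else is a sketch, not the author's benchmark script.

```sql
-- Assumed meaning of the "vacuum" row: clean up dead tuples and refresh statistics.
VACUUM ANALYZE point_table;

-- Assumed meaning of the "cluster" row: physically rewrite the table in index order.
-- CLUSTER takes an ACCESS EXCLUSIVE lock and needs free disk space for the rewritten copy.
CLUSTER point_table USING pts_spx;

-- The rewrite changes the physical layout, so refresh planner statistics afterwards.
ANALYZE point_table;
```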
It gets big!
Impact on writes (x4-16 points, 1.3 lines)
Fastest index
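To see how big "big" is on a given system, the index size can be compared with the table size. A small sketch, again assuming the names pts_spx and point_table from the CREATE INDEX slide:

```sql
-- pg_relation_size reports the on-disk size of the index itself;
-- pg_table_size reports the table including its TOAST data, but without indexes.
SELECT pg_size_pretty(pg_relation_size('pts_spx'))  AS index_size,
       pg_size_pretty(pg_table_size('point_table')) AS table_size;
```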

BRIN
...   time         geom
...   2019-03-13   POINT(13.8 50)
...   2019-03-13   POINT(13.8 50)
...   ...
      2019-03-14   POINT(13.8 51)
      2019-03-16   POINT(13.7 51)
...   ...
...   2019-03-16   POINT(13.7 51)
...   2019-03-16   POINT(13.7 51)
Block Range Index examples (one entry per block range):
(2019-03-13 09:00:00, 2019-03-14 11:00:00)   (A-Weg, Grunaer Str.)            BBOX
(2019-03-14 11:15:00, 2019-03-15 18:00:00)   (Grunaer Weg, Nürnberger Str.)   BBOX
(2019-03-16 09:00:00, 2019-03-16 17:30:00)   (Oberauer Str., Zwinglistr.)     BBOX
Block ranges. That's all.
Data must be sorted!
PostGIS > BBox per block
ORDER BY geom
ORDER BY ST_GeoHash(geom)
SET enable_seqscan = false;
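BRIN only pays off when rows that sit next to each other on disk are also close in space. One possible way to get there, sketched below, is to rewrite the table sorted by geohash and then build the BRIN index on the copy. The names point_table_sorted and pts_brin and the envelope coordinates are hypothetical. ST_GeoHash expects longitude/latitude coordinates, so the sketch assumes geom is stored in SRID 4326.

```sql
-- Rewrite the table so that spatially close points end up in the same blocks.
CREATE TABLE point_table_sorted AS
SELECT *
FROM point_table
ORDER BY ST_GeoHash(geom);

-- One bounding box per block range; 128 pages is PostgreSQL's default range size.
CREATE INDEX pts_brin ON point_table_sorted USING BRIN (geom)
  WITH (pages_per_range = 128);

ANALYZE point_table_sorted;

-- For testing only: discourage sequential scans so the planner considers the BRIN index.
SET enable_seqscan = off;

EXPLAIN ANALYZE
SELECT count(*)
FROM point_table_sorted
WHERE geom && ST_MakeEnvelope(13.7, 51.0, 13.8, 51.1, 4326);

-- Restore the default planner behaviour for the rest of the session.
RESET enable_seqscan;
```

Rows inserted later are appended in arrival order, so the sort only holds for the data present at rewrite time.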
BRIN vs. GiST
tests         100 k    1 Mio     10 Mio   100 Mio   1 Bn
create gist   700 ms   8 sec     2 min    23 min    6 hrs
create brin   24 ms    0.2 sec   2 sec    18 sec    90 sec
size gist     5 MB     50 MB     500 MB   5 GB      50 GB
size brin     24 KB    24 KB     48 KB    376 KB    3.6 MB
duration      x25      x23       x1.4     x1.6      x1.1
Super small
Build in seconds
Slower, but ok

sp-GiST
Framework like GiST
Unbalanced tree
No overlaps & prefixes
PostGIS > BBox in 4D
kd-Tree, Quadtree
Figure source: https://www.researchgate.net/figure/Adaptive-k-d-tree_fig9_2334587
Trick: No overlap via multiple dimensions
Each point you see on the map is in fact 4 bounding boxes, which are the prefixes of the sp-GiST tree defining the bounds of child quadrants.
sp-GiST vs. GiST
tests           100 k    1 Mio     10 Mio   100 Mio   1 Bn
create gist     700 ms   8 sec     2 min    23 min    6 hrs
create spgist   344 ms   3.7 sec   50 sec   11 min    8 hrs
size gist       5 MB     50 MB     500 MB   5 GB      50 GB
size spgist     4.5 MB   44 MB     440 MB   4.3 GB    43 GB
duration        x0.85    x1        x1       x1        x1.1
12% smaller than GiST
2x faster for writes
Less predictable
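The size difference can be checked on your own data by building both index types on the same column and comparing them. A minimal sketch, assuming the GiST index pts_spx from the CREATE INDEX slide already exists; pts_spgist is a hypothetical name.

```sql
-- SP-GiST index on the same geometry column (PostGIS 2.5 or later).
CREATE INDEX pts_spgist ON point_table USING SPGIST (geom);

-- Compare the on-disk size of both indexes side by side.
SELECT pg_size_pretty(pg_relation_size('pts_spx'))    AS gist_size,
       pg_size_pretty(pg_relation_size('pts_spgist')) AS spgist_size;
```

With both indexes in place the planner picks one of them per query, so drop the one you do not want before measuring query times.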

Conclusion
[Decision tree, flattened in extraction. Questions: Static dataset? > available RAM? Overlaps? The yes/no branches lead to GiST, BRIN, or sp-GiST.]

General advice
When to index? > pg_stat_statements
Clean up bloat > VACUUM, pg_repack
Update statistics > ANALYZE
Table rewrite > CLUSTER
What is queried? > Partial indexes
Index-only scans > Covering indexes
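A few of these hints translate directly into SQL. The sketch below is illustrative only: the column names "time" and id, the cutoff date, and the index names are hypothetical, and pg_stat_statements must be listed in shared_preload_libraries and created as an extension before it can be queried.

```sql
-- When to index? Look at which statements are actually executed most often.
SELECT query, calls, rows
FROM pg_stat_statements
ORDER BY calls DESC
LIMIT 10;

-- What is queried? A partial index only contains the rows that queries ask for.
-- Hypothetical: a "time" column and a cutoff date.
CREATE INDEX pts_recent_spx ON point_table USING GIST (geom)
  WHERE "time" >= '2019-01-01';

-- Index-only scans: a covering index carries extra columns as payload.
-- A B-tree is used here because INCLUDE for GiST only arrived in PostgreSQL 12.
-- Hypothetical: an id column.
CREATE INDEX pts_time_idx ON point_table ("time") INCLUDE (id);
```

A query only uses the partial index if its WHERE clause implies the index predicate, for example a filter on "time" that lies entirely after the cutoff date.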

That's it. Questions?
https://slides.com/fxku/postgis-indexing

Used hardware
Tuxedo Infinity Book 13
Intel i7-8550U CPU 1.80GHz
Quadcore, 8 CPUs
32 GB RAM
500GB SSD disk

PostgreSQL config
PostgreSQL 11 & PostGIS 2.5
shared_buffers = 16 GB
work_mem = 128 MB
maintenance_work_mem = 4 GB
min/max_wal_size = 16/4 GB
checkpoint_timeout = 30 min
checkpoint_completion_target = 0.9
random_page_cost = 1.1
cpu_tuple_cost = 0.001
cpu_index_tuple_cost = 0.001
effective_cache_size = 24 GB
default_statistics_target = 500

Repo
github.com/FxKu/postgis_indexing