[email protected] (855) 232-0320
Using the PostgreSQL Extension Ecosystem for Advanced Analytics
Agenda
- The problem
  - The prevailing view vs. the practical reality
- A possible solution
  - Or just building blocks?
- Nearness
  - Near at hand, near to our skill set, near to our capabilities
- A more complete solution
  - The PostgreSQL extension ecosystem
The Problem: The Prevailing View vs. The Practical Reality
The Prevailing View - Logical

| Dimension | Relational | Non-Relational |
| --- | --- | --- |
| Schema objects | Structured rows and columns; schema on write; referential integrity; painful migrations | Unstructured files, docs, etc.; schema on read; no referential integrity; no migrations |
| Query languages | SQL; declarative; easy enough for non-tech users | Various; procedural; requires some programming skills |
| Exploratory analysis | Native support for joins; interactive/low execution overhead | No native support for joins; OLAP - batch processing |
| Data science and ML | Only descriptive statistics; requires exporting dumps/samples | Robust ecosystem; does not require exports |
The Prevailing View - Physical

| Dimension | Relational | Non-Relational |
| --- | --- | --- |
| Parallel query processing | Single node system; single process per query | Multiple node system; multiple processes per query |
| Concurrency | High concurrency; single process per connection | OLAP - low concurrency/high scheduling overhead |
| High Availability & Replication | Async and sync replication; HA may not be native | Async and sync replication; HA likely to be native |
| Sharding | Sharding may not be native; difficult to manage | Sharding likely to be native; easy to manage |
The Prevailing View - Summary
- RDBMS have nice properties for producing rich data
  - ACID, relational integrity, constraints, strong data types
  - Easier for non-tech users and exploratory analysis
- Probably don’t meet the needs of today’s analysts
  - Data science & Machine Learning
  - Parallel processing
- Definitely don’t meet the needs of today’s apps
  - Schema migrations
  - Replication and sharding
The Practical Reality
But we still want more advanced functionality.
A Possible Solution, or Just Building Blocks?
Modern SQL
- Many people still think of SQL in terms of SQL-92
- Since then we’ve had: SQL:1999, SQL:2003, SQL:2006, SQL:2008, SQL:2011
- http://use-the-index-luke.com/blog/2015-02/modern-sql
- Common Table Expressions (CTEs) / Recursive CTEs
- Window Functions
- Ordered-set Aggregates
- Lateral joins
- Temporal support
- The list goes on...
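To make one of the listed features concrete, here is a minimal sketch of a recursive CTE. It is an illustration only: the `employees` table and its rows are hypothetical, and Python's built-in `sqlite3` module stands in for PostgreSQL so the snippet runs without a server. The `WITH RECURSIVE` query uses only standard syntax and should read the same way in PostgreSQL.

```python
import sqlite3

# Hypothetical org chart; table and column names are illustrative only.
# SQLite is a stand-in here, not the database the slides are about.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER);
    INSERT INTO employees VALUES
        (1, 'Ada', NULL),
        (2, 'Ben', 1),
        (3, 'Cy', 1),
        (4, 'Dee', 2);
""")

# Recursive CTE: start at the root (no manager) and repeatedly join
# direct reports onto the rows found so far, tracking the depth.
rows = conn.execute("""
    WITH RECURSIVE chain(id, name, depth) AS (
        SELECT id, name, 0 FROM employees WHERE manager_id IS NULL
        UNION ALL
        SELECT e.id, e.name, c.depth + 1
        FROM employees AS e
        JOIN chain AS c ON e.manager_id = c.id
    )
    SELECT name, depth FROM chain ORDER BY depth, name
""").fetchall()

print(rows)  # [('Ada', 0), ('Ben', 1), ('Cy', 1), ('Dee', 2)]
```

The hierarchy walk happens inside a single declarative query, with no procedural loop in the client.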
Procedural Languages
- Native: PL/pgSQL, PL/Tcl, PL/Perl, PL/Python
- Community: Java, PHP, R, JavaScript, Ruby, Scheme, sh
Building Blocks
These solve some problems. For others, they are just building blocks.
Nearness: Near at Hand, Near to Our Skill Set, Near to Our Capabilities
Nearness Drives Adoption
- Near at hand
  - Easily installable
- Near to our skill set
  - Familiar tool/language/abstraction
  - Modular and composable
- Near to our capabilities
  - Capable of solving a problem in our domain
A More Complete Solution: The PostgreSQL Extension Ecosystem
Postgres Extension Ecosystem Examples
- PostgreSQL Extension Network: http://pgxn.org/
- UDFs & operators: https://github.com/eulerto/pg_similarity
- UDAs & data types: https://github.com/aggregateknowledge/postgresql-hll
- Foreign Data Wrappers: http://multicorn.org/, https://github.com/shish/pgosquery
- Indexes: https://github.com/zombodb/zombodb
- Composing Extension Methods: http://doc.madlib.net/
- MPP: https://www.citusdata.com/, https://github.com/greenplum-db/gpdb
- Composing Extensions
- Custom Background Workers: https://github.com/no0p/alps
- Record linking: http://no0p.github.io/2015/10/20/record_linking.html#/
The PostgreSQL Extension Network
- Package Manager: pgxn
- Index/Network: http://pgxn.org/
- Comparable to PyPI, RubyGems, CPAN, CRAN
- Near at hand
- pgxn search semver
- pgxn info semver
- pgxn install semver
- pgxn load -d somedb semver
- pgxn unload -d somedb semver
- pgxn uninstall semver
- Search GitHub? Google? Mailing list?
- GitHub README?
- git clone; make; make install;
- psql -c "CREATE EXTENSION IF NOT EXISTS"
- psql -c "DROP EXTENSION IF EXISTS"
- make uninstall?
UDFs & Operators: pg_similarity
- Near to our capabilities
- Similarity coefficient algorithms
- L1 Distance
- Cosine Distance
- Dice Coefficient
- Euclidean Distance
- Hamming Distance
- Jaccard Coefficient
- Jaro Distance
- Jaro-Winkler Distance
- Levenshtein Distance
- Matching Coefficient
- Monge-Elkan Coefficient
- Needleman-Wunsch Coefficient
- Overlap Coefficient
- Q-Gram Distance
- Smith-Waterman Coefficient
- Smith-Waterman-Gotoh Coefficient
- Soundex Distance
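For readers unfamiliar with these measures, the sketch below shows what two of them compute, written in plain Python. It is not pg_similarity's implementation: the function names, the whitespace tokenization, and the lower-casing are assumptions made for illustration, and the extension's own tokenizers and normalization may differ.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions or substitutions."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,          # delete ca
                current[j - 1] + 1,       # insert cb
                previous[j - 1] + cost,   # substitute (or match)
            ))
        previous = current
    return previous[-1]


def jaccard(a: str, b: str) -> float:
    """Size of the intersection over the size of the union of the token sets."""
    tokens_a = set(a.lower().split())
    tokens_b = set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # two empty strings are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


print(levenshtein("kitten", "sitting"))  # 3
print(jaccard("postgres extension network", "postgres extension ecosystem"))  # 0.5
```

Levenshtein works on characters and returns a count, while Jaccard works on token sets and returns a ratio between 0 and 1, which is one reason a library offers many measures rather than one.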
UDAs & Data Types: postgresql-hll
- Near to our capabilities & near to our skill set
- Data type
- Estimate count distinct with tunable precision
- 1,280 bytes can estimate tens of billions of distinct values with a few percent error
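The idea behind the data type can be sketched with a minimal HyperLogLog in plain Python. This is an illustration of the general algorithm, not postgresql-hll's code or storage format. The class name, the SHA-1 hash, and the default of 2,048 registers are assumptions: 2,048 five-bit registers would occupy 2048 × 5 / 8 = 1,280 bytes, which is one plausible reading of the figure above. The Python list used here is not bit-packed, so it takes more memory than that.

```python
import hashlib
import math


class HyperLogLog:
    """Minimal HyperLogLog sketch (hypothetical class, for illustration only)."""

    def __init__(self, p: int = 11):
        # p index bits give m = 2**p registers; the alpha formula below
        # assumes m >= 128, i.e. p >= 7.
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, value) -> None:
        digest = hashlib.sha1(str(value).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")      # 64-bit hash of the value
        idx = h >> (64 - self.p)                   # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)      # remaining 64 - p bits
        # Rank: 1-based position of the leftmost 1-bit in `rest`.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def count(self) -> float:
        alpha = 0.7213 / (1 + 1.079 / self.m)      # bias-correction constant
        estimate = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if estimate <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            estimate = self.m * math.log(self.m / zeros)
        return estimate


hll = HyperLogLog()
for _ in range(2):                 # add every value twice
    for i in range(100_000):
        hll.add(f"user-{i}")       # repeats leave the registers unchanged
estimate = hll.count()
print(round(estimate))             # close to 100000, typically within a few percent
```

Memory stays fixed at the register array however many values are added, and precision is tuned by changing `p`: more registers cost more bytes and give a smaller expected error.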
Indexes: ZomboDB
- Index Access Method API
- http://www.postgresql.org/docs/9.4/static/indexam.html
Parallel Processing
- Parallel sequential scan - http://rhaas.blogspot.com/2015/11/parallel-sequential-scan-is-committed.html
- Columnar FDW:
- https://github.com/citusdata/cstore_fdw
Beyond Analytics
- Web app framework
- http://blog.aquameta.com/
- REST API
- https://github.com/begriffs/postgrest
- Unit testing framework
- http://pgtap.org/
- Firewall
- https://github.com/uptimejp/sql_firewall
- More every week!
Conclusion
- With PostgreSQL, you get
- more than rows and columns
- more than SELECT, FROM, WHERE, GROUP BY, ORDER BY
- more than a single machine
- Make sure you get the full return on your investment!