PostgreSQL as an Alternative to MSSQL
TRANSCRIPT
![Page 1: PostgreSQL as an Alternative to MSSQL](https://reader036.vdocuments.site/reader036/viewer/2022062412/587332501a28ab596c8b6dc9/html5/thumbnails/1.jpg)
Alexei Krasner, Nov 2015
PostgreSQL as MSSQL Alternative
![Page 2: PostgreSQL as an Alternative to MSSQL](https://reader036.vdocuments.site/reader036/viewer/2022062412/587332501a28ab596c8b6dc9/html5/thumbnails/2.jpg)
What is PostgreSQL
▪ Powerful, open source object-relational database system.
▪ More than 15 years of active development and a strong reputation.
▪ Runs on all major operating systems (Linux, Unix, Mac OS, Windows…).
▪ Enterprise class database.
▪ Large and responsive community.
▪ Winner of the 2015 Database Trends and Applications Readers' Choice:
  – The most advanced open source database.
  – Best relational database.
![Page 3: PostgreSQL as an Alternative to MSSQL](https://reader036.vdocuments.site/reader036/viewer/2022062412/587332501a28ab596c8b6dc9/html5/thumbnails/3.jpg)
Let's Start With Standards
▪ Fully ACID compliant.
▪ Includes most of the SQL:2008 data types, along with storage of binary objects.
▪ Conforms to the ANSI-SQL:2008 standard:
  – Full support for subqueries (including sub-selects).
  – Read Committed and Serializable transaction isolation levels.
  – Full support for Primary Keys, Foreign Keys, Joins, Views, Triggers, Stored Procedures, Constraints (check, unique and not null) and Cascading.
  – Fully relational system catalog – multiple schemas per database.
▪ Native programming interfaces: Java, .NET, C/C++, Perl, Python, ODBC
![Page 4: PostgreSQL as an Alternative to MSSQL](https://reader036.vdocuments.site/reader036/viewer/2022062412/587332501a28ab596c8b6dc9/html5/thumbnails/4.jpg)
Continue With a Little Splurging
▪ Multi-Version Concurrency Control (MVCC).
▪ Asynchronous Replication, Load Balancing and Online/Hot Backups with Point in Time Recovery.
▪ Write Ahead Logging – fault tolerance.
▪ Performance:
  – Sophisticated Query Planner/Optimizer.
  – Compound, Unique, Partial and Functional indexes.
▪ Supports:
  – International character sets, multi-byte encodings, Unicode, locale awareness.
  – Built-in Types – Geospatial, XML, JSON/JSONB, Ranges and Arrays!
  – NoSQL – Key-Value store with incredible performance and Full Text Search.
▪ Highly customizable and extensible.
![Page 5: PostgreSQL as an Alternative to MSSQL](https://reader036.vdocuments.site/reader036/viewer/2022062412/587332501a28ab596c8b6dc9/html5/thumbnails/5.jpg)
Before We Dive – Generalized Search Tree (GiST)
▪ Advanced indexing system – different sorting and searching algorithms:
  – B-tree, B+-tree, R-tree, Partial Sum trees, ranked B+-trees etc.
  – API for creating custom data types and extensible query methods for search.
▪ Decide WHAT to persist, HOW to persist it and a way to SEARCH for it.
▪ Goes beyond the general search algorithms of standard B-trees and R-trees.
▪ Foundation for many public projects – OpenFTS and PostGIS
![Page 6: PostgreSQL as an Alternative to MSSQL](https://reader036.vdocuments.site/reader036/viewer/2022062412/587332501a28ab596c8b6dc9/html5/thumbnails/6.jpg)
Features Deep Dive
▪ MVCC
▪ Partitioning
▪ Useful Data Types
  – Date and Time
  – Interval
  – Array
  – Ranges
  – JSON
  – HSTORE
  – XML
▪ PostGIS – Geographic
▪ Full Text Search
▪ Server Side Programming
▪ Backup and Restore
▪ High Availability, Load Balancing and Replication
  – Sharding
▪ Big Data Readiness
![Page 7: PostgreSQL as an Alternative to MSSQL](https://reader036.vdocuments.site/reader036/viewer/2022062412/587332501a28ab596c8b6dc9/html5/thumbnails/7.jpg)
Multi Version Concurrency Control - MVCC
▪ Reads should never block writes and vice versa.
▪ Each transaction sees a snapshot of the data (a version).
  – Protects against viewing inconsistent data – transaction isolation.
▪ Avoids explicit locking solutions – minimizes lock contention.
▪ Table/Row level locking is still available – although proper MVCC usage will provide performance benefits.
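The snapshot idea above can be illustrated with a toy in-memory model. This is a minimal sketch, not PostgreSQL's implementation: the names `ToyMVCCStore`, `Transaction` and `RowVersion` are hypothetical, write conflicts and aborts are ignored, and the snapshot is taken once at transaction start (PostgreSQL's default Read Committed level takes a new snapshot per statement).

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional, Set


@dataclass
class RowVersion:
    value: str
    created_by: int                   # id of the transaction that wrote this version
    deleted_by: Optional[int] = None  # id of the transaction that replaced it


@dataclass(frozen=True)
class Transaction:
    txid: int
    snapshot: FrozenSet[int]  # transactions already committed when this one began


class ToyMVCCStore:
    """Hypothetical toy store: every write adds a version, nothing is overwritten."""

    def __init__(self) -> None:
        self._next_txid = 1
        self._committed: Set[int] = set()
        self._versions: Dict[str, List[RowVersion]] = {}

    def begin(self) -> Transaction:
        tx = Transaction(self._next_txid, frozenset(self._committed))
        self._next_txid += 1
        return tx

    def commit(self, tx: Transaction) -> None:
        self._committed.add(tx.txid)

    def _sees(self, tx: Transaction, txid: Optional[int]) -> bool:
        # A transaction sees its own work and anything committed before it began.
        return txid is not None and (txid == tx.txid or txid in tx.snapshot)

    def _visible(self, tx: Transaction, version: RowVersion) -> bool:
        return self._sees(tx, version.created_by) and not self._sees(tx, version.deleted_by)

    def read(self, tx: Transaction, key: str) -> Optional[str]:
        for version in self._versions.get(key, []):
            if self._visible(tx, version):
                return version.value
        return None

    def write(self, tx: Transaction, key: str, value: str) -> None:
        versions = self._versions.setdefault(key, [])
        for version in versions:
            if self._visible(tx, version):
                version.deleted_by = tx.txid  # old version stays for older snapshots
        versions.append(RowVersion(value, created_by=tx.txid))


store = ToyMVCCStore()
setup = store.begin()
store.write(setup, "balance", "100")
store.commit(setup)

reader = store.begin()   # snapshot taken here
writer = store.begin()
store.write(writer, "balance", "150")  # does not block the reader
store.commit(writer)
late_reader = store.begin()

print(store.read(reader, "balance"))       # the reader keeps seeing its snapshot
print(store.read(late_reader, "balance"))  # a later transaction sees the new value
```

The reader never waits for the writer: the old version stays in place until no snapshot needs it, which is the garbage that vacuuming later removes.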
![Page 8: PostgreSQL as an Alternative to MSSQL](https://reader036.vdocuments.site/reader036/viewer/2022062412/587332501a28ab596c8b6dc9/html5/thumbnails/8.jpg)
Partitioning – Table Inheritance
▪ Support of basic table partitioning via the table inheritance concept.
  – Includes known partitioning benefits:
    ▪ Improved query performance under heavy load (when queries hit a single partition).
    ▪ Sequential scan of a partition instead of index usage.
    ▪ Bulk loads and deletes accomplished by adding or removing partitions.
    ▪ Infrequently used data can be migrated to a cheaper/slower storage solution.
  – Range Partitioning:
    ▪ Table partitioned into “ranges” defined by a key column or set of columns (e.g. dates).
  – List Partitioning:
    ▪ Table partitioned by a list of discrete values used as partitioning keys.
  – Up to about a hundred partitions is acceptable; thousands of partitions will severely harm performance.
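Range partitioning by date can be sketched as a lookup from a key value to a partition. This is a minimal sketch only: the partition names and bounds are hypothetical, and in PostgreSQL with table inheritance the routing is typically done by triggers or rules while the planner skips partitions through constraint exclusion.

```python
from bisect import bisect_right
from datetime import date
from typing import List, Tuple

# Hypothetical partitions: (inclusive lower bound, partition name), sorted by bound.
PARTITIONS: List[Tuple[date, str]] = [
    (date(2015, 1, 1), "measurements_2015_q1"),
    (date(2015, 4, 1), "measurements_2015_q2"),
    (date(2015, 7, 1), "measurements_2015_q3"),
    (date(2015, 10, 1), "measurements_2015_q4"),
]
UPPER_BOUND = date(2016, 1, 1)  # exclusive upper bound of the last partition


def partition_for(day: date) -> str:
    """Pick the single partition that stores a row with this date."""
    if not (PARTITIONS[0][0] <= day < UPPER_BOUND):
        raise ValueError(f"no partition covers {day}")
    bounds = [lower for lower, _ in PARTITIONS]
    return PARTITIONS[bisect_right(bounds, day) - 1][1]


def partitions_for_range(start: date, end: date) -> List[str]:
    """Partitions that may hold rows with start <= day < end; the rest are skipped."""
    selected = []
    for i, (lower, name) in enumerate(PARTITIONS):
        upper = PARTITIONS[i + 1][0] if i + 1 < len(PARTITIONS) else UPPER_BOUND
        if lower < end and start < upper:
            selected.append(name)
    return selected


print(partition_for(date(2015, 11, 15)))
print(partitions_for_range(date(2015, 3, 15), date(2015, 4, 15)))
```

A query restricted to a narrow date range touches only the partitions returned by `partitions_for_range`, which is where the single-partition performance benefit comes from.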
![Page 9: PostgreSQL as an Alternative to MSSQL](https://reader036.vdocuments.site/reader036/viewer/2022062412/587332501a28ab596c8b6dc9/html5/thumbnails/9.jpg)
Useful Data Types
▪ Date and Time – Date, Time, Timestamp and Timestamp with time zone.
  – Converted to and from Unix time.
  – Supports the INTERVAL type.
  – Very convenient casting and conversion to text.
  – Efficient searching and sorting algorithms (including zone/offset).
▪ INTERVAL – representation of a period of time.
  – Negative interval values are possible (e.g. a year ago).
  – Intuitive arithmetic and persistence of time durations.
  – Easy casting and converting to relevant types.
  – Efficient searching and sorting algorithms on intervals.
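The behaviour listed above has a close analogue in Python's standard library, which makes for a quick illustration. This is only an analogy under stated assumptions: `datetime` and `timedelta` stand in for the timestamp and INTERVAL types, and unlike INTERVAL a `timedelta` has no month or year component.

```python
from datetime import datetime, timedelta, timezone

utc = timezone.utc
moment = datetime(2015, 11, 1, 12, 0, tzinfo=utc)

# Round trip through Unix time (seconds since 1970-01-01 UTC).
unix_seconds = moment.timestamp()
round_trip = datetime.fromtimestamp(unix_seconds, tz=utc)

# The same instant written with a +02:00 offset compares equal.
same_instant = datetime(2015, 11, 1, 14, 0, tzinfo=timezone(timedelta(hours=2)))

# Durations support arithmetic, negation and ordering.
week_ago = moment - timedelta(weeks=1)
elapsed = datetime(2015, 11, 3, 18, 30, tzinfo=utc) - moment
negative = -elapsed
shortest_first = sorted([timedelta(hours=5), negative, elapsed])

print(round_trip == moment, same_instant == moment)
print(week_ago.isoformat(), elapsed, negative)
```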
![Page 10: PostgreSQL as an Alternative to MSSQL](https://reader036.vdocuments.site/reader036/viewer/2022062412/587332501a28ab596c8b6dc9/html5/thumbnails/10.jpg)
Useful Data Types Cont.
▪ Array – supported as a first-class datatype (an actual field in a table).
  – Can contain any datatype (sub-arrays too).
  – Arrays can be passed as parameters to functions.
  – Usages – function results, aggregations, passing arrays of data to and from the application.
▪ Range – supported as a first-class datatype.
  – Put a range on TIME, INT or NUMERIC as a single data value.
  – Dedicated indexes can support queries that use ranges.
  – Exposed methods to define custom ranges.
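Treating a range as a single value mostly comes down to two questions: does it contain a point, and does it overlap another range. The sketch below models that with a hypothetical `IntRange` class using an inclusive lower and exclusive upper bound; it is an illustration of the concept, not PostgreSQL's range implementation, and it does not model empty or unbounded ranges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntRange:
    """Hypothetical half-open integer range: lower inclusive, upper exclusive."""
    lower: int
    upper: int

    def __post_init__(self) -> None:
        if self.lower >= self.upper:
            raise ValueError("this sketch does not model empty ranges")

    def contains(self, value: int) -> bool:
        return self.lower <= value < self.upper

    def overlaps(self, other: "IntRange") -> bool:
        return self.lower < other.upper and other.lower < self.upper


booking = IntRange(10, 20)
print(booking.contains(10), booking.contains(20))
print(booking.overlaps(IntRange(15, 25)))  # shares 15..19
print(booking.overlaps(IntRange(20, 30)))  # adjacent ranges do not overlap
```

With half-open bounds, back-to-back bookings such as 10 to 20 and 20 to 30 do not collide, which is why that convention is common for scheduling data.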
![Page 11: PostgreSQL as an Alternative to MSSQL](https://reader036.vdocuments.site/reader036/viewer/2022062412/587332501a28ab596c8b6dc9/html5/thumbnails/11.jpg)
Useful Data Types Cont.
▪ JSON – full support along with a large dedicated set of utility functions.
  – Known JSON/JSONB benefits – data transfer and integration standard.
  – Transformation from/to types and tables.
  – Retrieval and construction of JSON data.
  – Parsing, casting and conversion.
▪ HSTORE – fast key-value store as a datatype.
  – NoSQL capabilities – flexibility of a schema-less data store.
  – Still ACID compliant.
  – Interchange data between JSON and HSTORE.
![Page 12: PostgreSQL as an Alternative to MSSQL](https://reader036.vdocuments.site/reader036/viewer/2022062412/587332501a28ab596c8b6dc9/html5/thumbnails/12.jpg)
Useful Data Types Cont.
▪ XML – supported as a first-class datatype.
  – Well-formedness checks + type-safe operations.
  – Querying using XPath.
  – Producing XML content, Predicates, Processing, Mapping tables to XML etc.
![Page 13: PostgreSQL as an Alternative to MSSQL](https://reader036.vdocuments.site/reader036/viewer/2022062412/587332501a28ab596c8b6dc9/html5/thumbnails/13.jpg)
PostGIS
▪ Fully featured, reliable geospatial database project based on GiST (following ISO/OGC standards).
▪ SQL types and functions to manage vector geometries (spatial data).
▪ Capabilities:
  – Support for three dimensional data.
  – Support for geospatial formats (KML, GeoJSON).
  – Processing and analytics functions for vector and raster data.
  – Map rasterization and geo queries.
  – Geo searches and reverse geo searches.
▪ Hugely popular and well-respected extension module – often compared to ArcGIS.
![Page 14: PostgreSQL as an Alternative to MSSQL](https://reader036.vdocuments.site/reader036/viewer/2022062412/587332501a28ab596c8b6dc9/html5/thumbnails/14.jpg)
Full Text Search
▪ Online indexing of data and relevance ranking for database searches.
▪ Good Enough:
  – Stemming
  – Ranking
  – Multilingual
  – Fuzzy searches (misspellings) and accent handling.
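Stemming and ranking can be shown with a toy inverted index. This is a minimal sketch under loud assumptions: the suffix list is a crude stand-in for a real stemmer, the score is a plain term count rather than PostgreSQL's ranking functions, and `ToyTextIndex` is a hypothetical name.

```python
import re
from collections import Counter, defaultdict
from typing import Dict, List

SUFFIXES = ("ing", "ed", "es", "s")  # crude, illustrative suffix list


def stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(text: str) -> List[str]:
    return [stem(word) for word in re.findall(r"[a-z]+", text.lower())]


class ToyTextIndex:
    """Hypothetical inverted index: stemmed term -> per-document occurrence counts."""

    def __init__(self) -> None:
        self._postings: Dict[str, Counter] = defaultdict(Counter)

    def add(self, doc_id: int, text: str) -> None:
        for term in tokenize(text):
            self._postings[term][doc_id] += 1

    def search(self, query: str) -> List[int]:
        scores: Counter = Counter()
        for term in tokenize(query):
            for doc_id, count in self._postings.get(term, {}).items():
                scores[doc_id] += count
        # Highest score first; ties broken by document id for stable output.
        return [doc for doc, _ in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))]


index = ToyTextIndex()
index.add(1, "Indexing tables speeds up searches")
index.add(2, "Backups and restores")
index.add(3, "Search indexes: an index speeds searching")

print(index.search("indexed search"))
```

Because both documents and queries pass through the same `stem` function, "indexed" matches "indexing" and "indexes", and the document that mentions the terms more often is ranked first.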
![Page 15: PostgreSQL as an Alternative to MSSQL](https://reader036.vdocuments.site/reader036/viewer/2022062412/587332501a28ab596c8b6dc9/html5/thumbnails/15.jpg)
Server Side Programming
▪ Super Extensible – functions, data types, procedural languages, operators, aggregates etc.
  – Embedding functions and stored procedures using procedural languages:
  – PL/pgSQL, PL/Tcl, PL/Perl, PL/Python
▪ Triggers – on tables, views and foreign tables.
▪ Event Triggers – database-wide triggers.
▪ Rule System – query modification based on given rules.
![Page 16: PostgreSQL as an Alternative to MSSQL](https://reader036.vdocuments.site/reader036/viewer/2022062412/587332501a28ab596c8b6dc9/html5/thumbnails/16.jpg)
Backup and Restore
▪ Extremely flexible dump utility – migration, replication and backups become more reliable, controllable and configurable.
  – Compressed format or plain SQL (human readable).
  – Single table or whole database cluster.
▪ Approaches:
  – SQL Dump – a file with generated SQL commands. On restore, the backed-up commands are replayed.
  – File system level backup – a direct copy of the PostgreSQL data files. Restore involves reattaching the data files.
  – Continuous archiving – backing up Write Ahead Log (WAL) files. On restore, the logged changes are replayed.
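The continuous archiving approach, a base backup plus replayed log records, can be sketched as follows. This is a toy model only: real WAL records describe low-level changes to data files, not key/value assignments, and `replay` and `LogRecord` are hypothetical names.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical log record: (sequence number, key, new value or None for a delete).
LogRecord = Tuple[int, str, Optional[str]]


def replay(base: Dict[str, str], log: List[LogRecord], stop_after: int) -> Dict[str, str]:
    """Rebuild state from a base backup by replaying log records up to stop_after."""
    state = dict(base)  # never mutate the backup itself
    for sequence, key, value in log:
        if sequence > stop_after:
            break  # point-in-time recovery: ignore everything after the target
        if value is None:
            state.pop(key, None)
        else:
            state[key] = value
    return state


base_backup = {"a": "1"}
archived_log: List[LogRecord] = [(1, "b", "2"), (2, "a", None), (3, "a", "9")]

print(replay(base_backup, archived_log, stop_after=2))
print(replay(base_backup, archived_log, stop_after=3))
```

Choosing a different `stop_after` value restores the data as it was at a different moment, which is the idea behind Point in Time Recovery.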
![Page 17: PostgreSQL as an Alternative to MSSQL](https://reader036.vdocuments.site/reader036/viewer/2022062412/587332501a28ab596c8b6dc9/html5/thumbnails/17.jpg)
High Availability, Load Balancing and Replication

| Feature | Shared Disk Failover | File System Replication | Transaction Log Shipping | Trigger-Based Master-Standby Replication | Statement-Based Replication Middleware | Asynchronous Multimaster Replication | Synchronous Multimaster Replication |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Most Common Implementation | NAS | DRBD | Streaming Repl. | Slony | pgpool-II | Bucardo | |
| Communication Method | shared disk | disk blocks | WAL | table rows | SQL | table rows | table rows and row locks |
| No special hardware required | | X | X | X | X | X | X |
| Allows multiple master servers | | | | | X | X | X |
| No master server overhead | X | | X | | X | | |
| No waiting for multiple servers | X | | with sync off | X | | X | |
| Master failure will never lose data | X | X | with sync on | | X | | X |
| Standby accept read-only queries | | | with hot standby | X | X | X | X |
| Per-table granularity | | | | X | | X | X |
| No conflict resolution necessary | X | X | X | X | | | X |
![Page 18: PostgreSQL as an Alternative to MSSQL](https://reader036.vdocuments.site/reader036/viewer/2022062412/587332501a28ab596c8b6dc9/html5/thumbnails/18.jpg)
Sharding and Replication
▪ Pure Sharding:
  – pg_shard – popular sharding extension for PostgreSQL.
    ▪ Runs on Linux!
  – BDR/UDR Project – Bi-Directional Replication, which adds multi-master replication to PostgreSQL.
    ▪ Runs on Linux! A Windows port is not expected in the near future.
    ▪ Forked from the main PostgreSQL source.
  – Postgres-XL – all-purpose, fully ACID, open source scale-out database solution.
    ▪ Runs on Linux!
    ▪ Forked from the main PostgreSQL source.
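The core of pure sharding is a stable mapping from a row's key to a shard and from a shard to a server. The sketch below shows generic hash-based routing; it is not the algorithm of any of the projects above, and the shard count, worker names and function names are all hypothetical.

```python
import hashlib
from typing import List

SHARD_COUNT = 8                                  # hypothetical number of shards
WORKERS: List[str] = ["worker-a", "worker-b"]    # hypothetical server names


def shard_for(key: str) -> int:
    """Map a distribution key to a shard using a hash that is stable across runs."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % SHARD_COUNT


def worker_for(key: str) -> str:
    """Place shards on workers round-robin by shard number."""
    return WORKERS[shard_for(key) % len(WORKERS)]


for customer in ("customer-1", "customer-2", "customer-42"):
    print(customer, shard_for(customer), worker_for(customer))
```

A cryptographic hash is used here only because it is deterministic across processes, unlike Python's built-in `hash` for strings.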
![Page 19: PostgreSQL as an Alternative to MSSQL](https://reader036.vdocuments.site/reader036/viewer/2022062412/587332501a28ab596c8b6dc9/html5/thumbnails/19.jpg)
Sharding and Replication Cont.
▪ Via Replication:
  – Hot Standby – moving read load from the Master to the slaves (horizontal scale).
  – Streaming (or Bucardo, or another available option) replication to the slaves.
  – Load balancing: “write” queries go to the Master, “read” queries go to the slaves.
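The read/write split described above can be sketched as a small router. This is a minimal sketch with hypothetical host names and a hypothetical `ToyQueryRouter` class; it classifies a statement by its first keyword only, so it would misroute cases such as a SELECT that calls a writing function or uses FOR UPDATE.

```python
from itertools import cycle
from typing import List


class ToyQueryRouter:
    """Hypothetical router: reads rotate across standbys, everything else hits the master."""

    def __init__(self, master: str, standbys: List[str]) -> None:
        self._master = master
        # With no standbys, reads fall back to the master.
        self._read_targets = cycle(standbys if standbys else [master])

    def route(self, sql: str) -> str:
        words = sql.split()
        first_word = words[0].upper() if words else ""
        if first_word == "SELECT":
            return next(self._read_targets)
        return self._master


router = ToyQueryRouter("master:5432", ["standby1:5432", "standby2:5432"])
for statement in (
    "SELECT * FROM orders",
    "select count(*) from orders",
    "INSERT INTO orders VALUES (1)",
    "SELECT 1",
):
    print(router.route(statement), "<-", statement)
```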
![Page 20: PostgreSQL as an Alternative to MSSQL](https://reader036.vdocuments.site/reader036/viewer/2022062412/587332501a28ab596c8b6dc9/html5/thumbnails/20.jpg)
PostgreSQL and Big Data
▪ PostgreSQL was used for large data volumes and complex analytics a decade before Hadoop launched (as the only pure open source option).
▪ Today it is heavily used in mid-sized warehouses and data marts (1-10 TB).
▪ Source code base for many big data systems:
  – Netezza (IBM).
  – Greenplum (Pivotal) – Open Source Massively Parallel Data Warehouse.
  – PipelineDB – open source, runs SQL queries continuously on streaming data.
  – EnterpriseDB and CitusDB (commercial license) – fully scaled out Postgres.
  – Redshift (Amazon).
▪ The PostgreSQL project continuously provides new features and better performance to support big data usage.
![Page 21: PostgreSQL as an Alternative to MSSQL](https://reader036.vdocuments.site/reader036/viewer/2022062412/587332501a28ab596c8b6dc9/html5/thumbnails/21.jpg)
PostgreSQL and Big Data – Features
▪ Serious NoSQL database competitor.
  – Advanced JSON/JSONB features and an ongoing, massive development plan.
  – Extensions that provide a NoSQL-like API.
▪ Faster Sorts – text and long numeric sorting improvements.
▪ TABLESAMPLE – returns a pseudo-random sample of rows, giving a glimpse of the data for further analysis.
▪ Cubes, Rollups and Grouping Sets – summarizing and exploring huge data sets the OLAP way.
▪ BRIN indexes – much faster, suited to TB-sized tables on fields with incrementally increasing values (like timestamps or integers).
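Why BRIN suits incrementally increasing fields can be seen in a toy model: keep only the minimum and maximum value per block of rows, then skip every block whose span cannot contain the target. This is a minimal sketch with hypothetical names and a tiny block size, not PostgreSQL's on-disk format.

```python
from typing import List, Tuple

BLOCK_SIZE = 4  # rows per block range; tiny on purpose for the demo


def build_summary(values: List[int]) -> List[Tuple[int, int]]:
    """One (min, max) pair per block of rows: the whole 'index'."""
    blocks = (values[i:i + BLOCK_SIZE] for i in range(0, len(values), BLOCK_SIZE))
    return [(min(block), max(block)) for block in blocks]


def search(values: List[int], summary: List[Tuple[int, int]], target: int) -> Tuple[List[int], int]:
    """Return matching row positions and the number of blocks actually scanned."""
    matches: List[int] = []
    scanned = 0
    for block_no, (low, high) in enumerate(summary):
        if low <= target <= high:  # only blocks whose span can contain the target
            scanned += 1
            start = block_no * BLOCK_SIZE
            for offset, value in enumerate(values[start:start + BLOCK_SIZE]):
                if value == target:
                    matches.append(start + offset)
    return matches, scanned


ordered = list(range(100, 120))  # incrementally increasing, like a timestamp column
summary = build_summary(ordered)
print(len(summary), "summary entries for", len(ordered), "rows")
print(search(ordered, summary, 113))
```

On ordered data the block spans do not overlap, so one lookup scans a single block. On randomly ordered data most spans would cover the target and the summary would filter out very little.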
![Page 22: PostgreSQL as an Alternative to MSSQL](https://reader036.vdocuments.site/reader036/viewer/2022062412/587332501a28ab596c8b6dc9/html5/thumbnails/22.jpg)
PostgreSQL and Big Data – Features Cont.
▪ Foreign Data Wrappers – linking external data (queried as if it were local) for hybrid solutions.
  – Foreign schema import.
  – JOIN pushdowns.
▪ Vacuum (garbage collection – removing dead rows) – can now run in parallel in multi-process mode (maintaining several large tables at once).
▪ Scaling UP – multicore scalability improvements.
![Page 23: PostgreSQL as an Alternative to MSSQL](https://reader036.vdocuments.site/reader036/viewer/2022062412/587332501a28ab596c8b6dc9/html5/thumbnails/23.jpg)
Enterprise Wise
▪ Open Source
▪ Reliability
▪ Authentication
▪ Logging
▪ Documentation
▪ Support
▪ Maintenance
![Page 24: PostgreSQL as an Alternative to MSSQL](https://reader036.vdocuments.site/reader036/viewer/2022062412/587332501a28ab596c8b6dc9/html5/thumbnails/24.jpg)
Open Source
▪ Available under an open source license – the PostgreSQL License.
▪ Can be used, modified and distributed in any open or closed form.
▪ The relational database can be extended and patched per project, client etc.
▪ Variety of modules, extensions and tools made possible by its open source license.
![Page 25: PostgreSQL as an Alternative to MSSQL](https://reader036.vdocuments.site/reader036/viewer/2022062412/587332501a28ab596c8b6dc9/html5/thumbnails/25.jpg)
Reliability
▪ PostgreSQL is relatively bug-free (compared to MSSQL).
▪ Very large community reporting bugs and providing fixes and workarounds.
▪ Constantly growing community.
![Page 26: PostgreSQL as an Alternative to MSSQL](https://reader036.vdocuments.site/reader036/viewer/2022062412/587332501a28ab596c8b6dc9/html5/thumbnails/26.jpg)
Authentication
▪ Trust Authentication.
▪ Password Authentication.
▪ GSSAPI/SSPI Authentication – using Kerberos.
▪ Ident Authentication.
▪ Peer Authentication.
▪ LDAP Authentication.
▪ RADIUS Authentication.
▪ Certificate Authentication.
▪ Pluggable Authentication Modules.
![Page 27: PostgreSQL as an Alternative to MSSQL](https://reader036.vdocuments.site/reader036/viewer/2022062412/587332501a28ab596c8b6dc9/html5/thumbnails/27.jpg)
Logging
▪ Logs in one place.
  – Unlike MSSQL – error logs, event log, profiler log, agent log…
▪ Easily configurable logging level.
▪ Easily redirected to CSV files and shipped to tables.
▪ Easily redirected to the System Log or Windows Event Log.
▪ Logs are human readable and of great value to sysadmins.
![Page 28: PostgreSQL as an Alternative to MSSQL](https://reader036.vdocuments.site/reader036/viewer/2022062412/587332501a28ab596c8b6dc9/html5/thumbnails/28.jpg)
Documentation
▪ There is nothing more to add than a link: http://www.postgresql.org/docs/
![Page 29: PostgreSQL as an Alternative to MSSQL](https://reader036.vdocuments.site/reader036/viewer/2022062412/587332501a28ab596c8b6dc9/html5/thumbnails/29.jpg)
Support
▪ Community-based support – and apparently a fast one too.
▪ Numerous companies specialize in enterprise support: http://www.postgresql.org/support/professional_support/
▪ Enterprise database management companies like EnterpriseDB.
▪ Total Cost of Ownership is significantly lower even with enterprise support (based on reports, e.g. Gartner 2015).
![Page 30: PostgreSQL as an Alternative to MSSQL](https://reader036.vdocuments.site/reader036/viewer/2022062412/587332501a28ab596c8b6dc9/html5/thumbnails/30.jpg)
vs. MySQL
▪ Fully ACID compliant!
▪ Subqueries and Joins.
▪ Better locking mechanism.
▪ JSON/JSONB support.
▪ NoSQL and Key-Value store.
▪ Advanced GIS abilities.
▪ Full Text Search abilities.
▪ Advanced and attractive data types.
▪ Far better and more useful extensibility patterns.
▪ Licensing issues.
![Page 31: PostgreSQL as an Alternative to MSSQL](https://reader036.vdocuments.site/reader036/viewer/2022062412/587332501a28ab596c8b6dc9/html5/thumbnails/31.jpg)
vs. PostgreSQL
▪ Partitioning is based on table inheritance (pros and cons).
▪ Can be overkill for simple read-heavy operations (improved in newer versions).
▪ Replication and Clustering (especially multi-master): not “there” yet, but on the right track.
▪ Popularity – not as popular as MySQL (for example), but constantly gaining popularity, as opposed to MySQL.
▪ Expertise issues – different syntax and administration (compared to MSSQL).
![Page 32: PostgreSQL as an Alternative to MSSQL](https://reader036.vdocuments.site/reader036/viewer/2022062412/587332501a28ab596c8b6dc9/html5/thumbnails/32.jpg)
THANK YOU