Introduction to Greenplum Database, January 2016

TRANSCRIPT

Page 1: Introduction to Greenplum

1 © 2016 Pivotal Software, Inc. All rights reserved.

Introduction to Greenplum

Database January, 2016

Page 2: Introduction to Greenplum

2 © 2016 Pivotal Software, Inc. All rights reserved.

Forward Looking Statements

This presentation contains “forward-looking statements” as defined under the Federal Securities Laws. Actual results could differ materially from those projected in the forward-looking statements as a result of certain risk factors, including but not limited to: (i) adverse changes in general economic or market conditions; (ii) delays or reductions in information technology spending; (iii) the relative and varying rates of product price and component cost declines and the volume and mixture of product and services revenues; (iv) competitive factors, including but not limited to pricing pressures and new product introductions; (v) component and product quality and availability; (vi) fluctuations in VMware, Inc.’s operating results and risks associated with trading of VMware stock; (vii) the transition to new products, the uncertainty of customer acceptance of new product offerings and rapid technological and market change; (viii) risks associated with managing the growth of our business, including risks associated with acquisitions and investments and the challenges and costs of integration, restructuring and achieving anticipated synergies; (ix) the ability to attract and retain highly qualified employees; (x) insufficient, excess or obsolete inventory; (xi) fluctuating currency exchange rates; (xii) threats and other disruptions to our secure data centers and networks; (xiii) our ability to protect our proprietary technology; (xiv) war or acts of terrorism; and (xv) other one-time events and other important factors disclosed previously and from time to time in the filings of EMC Corporation, the parent company of Pivotal, with the U.S. Securities and Exchange Commission. EMC and Pivotal disclaim any obligation to update any such forward-looking statements after the date of this release.

Page 3: Introduction to Greenplum

3 © 2016 Pivotal Software, Inc. All rights reserved.

Greenplum Database Mission & Strategy

•  Relational database system for big data

•  Mission-critical, system-of-record product with supporting tools and ecosystem

•  Fully open source with a global community of developers and users

•  Implements the world's leading database research across all components
   –  Optimizer, Query Execution
   –  Transaction Processing, Database Storage, Compression, High Availability
   –  Embedded Programming Languages (Python, R, Java, etc.)
   –  In-Database analytics in domains (e.g. Geospatial, Text, Machine Learning, Mathematics, etc.)

•  Performance tuned for multiple workload profiles
   –  Analytics, long-running queries, short-running queries, mixed workloads

•  Focused on large industrial deployments
   –  Financial, Government, Telecom, Retail, Manufacturing, Oil & Gas, etc.

Page 4: Introduction to Greenplum

4 © 2016 Pivotal Software, Inc. All rights reserved.

Greenplum Open Source

•  An ambitious project
   –  10 years in the making
   –  Investment of hundreds of millions of dollars
   –  Potential to define a new market and disrupt traditional EDW vendors

•  www.greenplum.org
   –  GitHub code
   –  Mailing lists / community engagement
   –  Global project w/ external contributors

•  Pivotal Greenplum
   –  Enterprise software distribution & release management
   –  Pivotal expertise
   –  24-hour global support
   –  5.0 release in early Q2 2016

Page 5: Introduction to Greenplum

5 © 2016 Pivotal Software, Inc. All rights reserved.

PostgreSQL Compatibility

Roadmap

•  Strategic backporting of key features from PostgreSQL to Greenplum: JSONB, UUID, variadic functions, default function arguments, etc.

•  Consistent backporting of PostgreSQL patches into Greenplum, with an initial goal of reaching 9.0

Page 6: Introduction to Greenplum

6 © 2016 Pivotal Software, Inc. All rights reserved.

GPDB Architecture Overview

Page 7: Introduction to Greenplum

7 © 2016 Pivotal Software, Inc. All rights reserved.

MPP Shared-Nothing Architecture

•  SQL enters through the Master Host; a Standby Master Host provides redundancy. The master coordinates work with the Segment Hosts.

•  Each Segment Host runs one or more Segment Instances. Segment Instances process queries in parallel; performance comes from Segment Instance parallelism.

•  Segment Hosts have their own CPU, disk and memory (shared nothing).

•  A high-speed interconnect provides continuous pipelining of data processing.

[Diagram: Master Host and Standby Master connected over the Interconnect to Segment Hosts node1, node2, node3, …, nodeN, each running multiple Segment Instances]

Page 8: Introduction to Greenplum

8 © 2016 Pivotal Software, Inc. All rights reserved.

Master Host

The Master Segment on the Master Host contains the Parser, Query Optimizer, Query Dispatcher, Query Executor, Distributed TM and Catalog.

•  Client: accepts client connections and incoming user requests, and performs authentication.

•  Parser: enforces syntax and semantics and produces a parse tree.

[Diagram: Master Host with the Master Segment and its components]

Page 9: Introduction to Greenplum

9 © 2016 Pivotal Software, Inc. All rights reserved.

Pivotal Query Optimizer

•  Consumes the parse tree and produces the query plan.

•  The query execution plan describes how the query is executed.

[Diagram: Master Host (Parser, Query Optimizer, Dispatcher, Query Executor, Distributed TM, Catalog, Local Storage) connected over the Interconnect to Segment Hosts, where each Segment Instance has its own Query Executor, Local TM, Catalog and Local Storage]

Page 10: Introduction to Greenplum

10 © 2016 Pivotal Software, Inc. All rights reserved.

Query Dispatcher

•  Responsible for communicating the query plan to the segments.

•  Allocates the cluster resources required to perform the job and accumulates/presents the final results.

[Diagram: Master Host dispatching the plan over the Interconnect to the Segment Instances on each Segment Host]

Page 11: Introduction to Greenplum

11 © 2016 Pivotal Software, Inc. All rights reserved.

Query Executor

•  Responsible for executing the steps in the plan (e.g. open file, iterate over tuples).

•  Communicates its intermediate results to other executor processes.

[Diagram: Query Executor processes running in every Segment Instance and on the Master Host]

Page 12: Introduction to Greenplum

12 © 2016 Pivotal Software, Inc. All rights reserved.

Interconnect

•  Responsible for serving tuples from one segment to another (motion operations) to perform joins, etc.

•  Uses UDP for optimal performance and scalability.

[Diagram: the Interconnect linking the Master Host and all Segment Instances]

Page 13: Introduction to Greenplum

13 © 2016 Pivotal Software, Inc. All rights reserved.

System Catalog

•  Stores and manages metadata for databases, tables, columns, etc.

•  The master keeps a copy of the metadata coordinated on every segment host.

[Diagram: Catalog present in the Master Segment and in every Segment Instance]

Page 14: Introduction to Greenplum

14 © 2016 Pivotal Software, Inc. All rights reserved.

Distributed Transaction Management

•  Segments have their own commit and replay logs and decide when to commit or abort their own transactions.

•  The DTM resides on the master and coordinates the commit and abort actions of the segments.

[Diagram: Distributed TM on the Master Segment coordinating the Local TM in every Segment Instance]

Page 15: Introduction to Greenplum

15 © 2016 Pivotal Software, Inc. All rights reserved.

GPDB High Availability

•  Master Host mirroring
   –  Warm Standby Master Host
      ▪  Replica of the Master Host system catalogs
   –  Eliminates a single point of failure
   –  A synchronization process runs between the Master Host and the Standby Master Host
      ▪  Uses PostgreSQL WAL replication

•  Segment mirroring
   –  Creates a mirror segment for every primary segment
      ▪  Uses a custom file block replication process
   –  If a primary segment becomes unavailable, failover to the mirror is automatic (mirroring status can be checked as sketched below)
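For day-to-day checks, the gpstate utility reports mirroring health. A rough sketch of the commonly used flags; confirm behavior against the utility reference for your release:

$ gpstate -f    # standby master details and synchronization status
$ gpstate -m    # list mirror segments and their status
$ gpstate -e    # show segments with mirroring or resynchronization issues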

Page 16: Introduction to Greenplum

16 © 2016 Pivotal Software, Inc. All rights reserved.

Fault Detection and Recovery

•  The ftsprobe fault detection process monitors and scans segments and database processes at configurable intervals.

•  Query the gp_segment_configuration catalog table for detailed information about a failed segment:
   $ psql -c "SELECT * FROM gp_segment_configuration WHERE status='d';"

•  When ftsprobe cannot connect to a segment, it marks it as down.
   –  The segment remains down until an administrator manually recovers it with the gprecoverseg utility (see the sketch below).

•  Failover to the mirror segment is automatic.
   –  Subsequent connection requests are switched to the mirror segment.
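For reference, a typical recovery sequence with gprecoverseg might look like the following sketch; flag behavior should be confirmed against the documentation for your release:

$ gprecoverseg        # incremental recovery of failed segments in place
$ gprecoverseg -F     # full recovery: recopy the segment's data from its mirror
$ gprecoverseg -r     # after recovery, rebalance segments back to their preferred roles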

Page 17: Introduction to Greenplum

17 © 2016 Pivotal Software, Inc. All rights reserved.

CREATE TABLE: Define Data Distributions

•  One of the most important aspects of GPDB!

•  Every table has a distribution method.

•  DISTRIBUTED BY (column)
   –  Uses a hash distribution

•  DISTRIBUTED RANDOMLY
   –  Uses a random distribution, which is not guaranteed to be perfectly even

•  Explicitly define a column or random distribution for all tables (see the sketch below).
   –  Do not rely on the default.
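As an illustration (the table and column names are hypothetical, though customer and c_customer_id echo later slides), the two distribution methods look like this:

-- Hash distribution on a chosen column
CREATE TABLE customer (
    c_customer_id integer,
    c_name        text,
    c_statekey    char(2)
) DISTRIBUTED BY (c_customer_id);

-- Random distribution when no suitable hash key exists
CREATE TABLE audit_log (
    event_time timestamp,
    detail     text
) DISTRIBUTED RANDOMLY;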

Page 18: Introduction to Greenplum

18 © 2016 Pivotal Software, Inc. All rights reserved.

DISTRIBUTED BY (column_name)

•  Use a single column that will distribute data evenly across all segments.

•  For large tables, significant performance gains can be obtained with local (co-located) joins.
   –  Distribute commonly joined tables on the same column (see the sketch below).

•  A co-located join is performed within the segment.
   –  Each segment operates independently of the other segments.

•  A co-located join eliminates or minimizes motion operations.
   –  Broadcast motion or redistribute motion.
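A minimal sketch of co-location, reusing the hypothetical customer table above and the freq_shopper table named on the following slides: both tables hash on the customer id, so the join below needs no motion.

CREATE TABLE freq_shopper (
    f_customer_id  integer,
    f_trans_number integer
) DISTRIBUTED BY (f_customer_id);

-- customer is distributed by c_customer_id (previous sketch), so matching
-- rows already live on the same segment: a local, co-located join.
SELECT c.c_customer_id, count(*)
FROM customer c
JOIN freq_shopper f ON f.f_customer_id = c.c_customer_id
GROUP BY c.c_customer_id;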

Page 19: Introduction to Greenplum

19 © 2016 Pivotal Software, Inc. All rights reserved.

Use the Same Distribution Key for Commonly Joined Tables

Distribute on the same key used in the join to obtain local joins.

[Diagram: Segment 1A and Segment 2A each join their local customer (c_customer_id) rows with their local freq_shopper (f_customer_id) rows]

Page 20: Introduction to Greenplum

20 © 2016 Pivotal Software, Inc. All rights reserved.

Redistribution Motion

WHERE customer.c_customer_id = freq_shopper.f_customer_id

The freq_shopper table (distributed on f_trans_number) is dynamically redistributed on f_customer_id.

[Diagram: Segments 1A, 2A and 3A; freq_shopper rows are moved so that, for example, customer_id=102 and customer_id=745 land on the segments that hold the matching customer rows]

Page 21: Introduction to Greenplum

21 © 2016 Pivotal Software, Inc. All rights reserved.

Broadcast Motion

WHERE customer.c_statekey = state.s_statekey

The state table is dynamically broadcast to all segments.

[Diagram: Segments 1A, 2A and 3A each hold their share of customer (c_customer_id) plus a full copy of state (s_statekey): AK, AL, AZ, CA, …]
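To see which motion the planner chose for a given join, EXPLAIN it; Greenplum plans label the data movement explicitly. A sketch, assuming customer and state tables like those above exist (the exact plan text varies by version and data):

EXPLAIN
SELECT c.c_customer_id, s.s_statekey
FROM customer c
JOIN state s ON c.c_statekey = s.s_statekey;

-- The plan typically contains a motion node such as:
--   ->  Broadcast Motion 3:3  (slice1; segments: 3)
--         ->  Seq Scan on state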

Page 22: Introduction to Greenplum

22 © 2016 Pivotal Software, Inc. All rights reserved.

Data Distribution: The Key to Parallelism

The primary strategy and goal is to spread data evenly across all segment instances. This is most important in an MPP shared-nothing architecture!

Order table rows, spread across segment instances:

  Order #   Order Date    Customer ID
  43        Oct 20 2005   12
  64        Oct 20 2005   111
  45        Oct 20 2005   42
  46        Oct 20 2005   64
  77        Oct 20 2005   32
  48        Oct 20 2005   12

  Order #   Order Date    Customer ID
  50        Oct 20 2005   34
  56        Oct 20 2005   213
  63        Oct 20 2005   15
  44        Oct 20 2005   102
  53        Oct 20 2005   82
  55        Oct 20 2005   55

Page 23: Introduction to Greenplum

23 © 2016 Pivotal Software, Inc. All rights reserved.

Parallel Data Scans Across All Segments

SELECT COUNT(*) FROM orders
WHERE order_date >= 'Oct 20 2007' AND order_date < 'Oct 27 2007'

Result: 4,423,323

1.  The master develops the query plan
2.  The master sends the plan to the segments
3.  Each segment scans its data simultaneously, in parallel
4.  The segments return results to the master
5.  The master returns the result to the client

[Diagram: Master above Segments 1A through 3D, each scanning in parallel]

Page 24: Introduction to Greenplum

24 © 2016 Pivotal Software, Inc. All rights reserved.

CREATE TABLE: Define Partitioning

•  Reduces the amount of data to be scanned by reading only the relevant data needed to satisfy a query.
   –  The only goal of partitioning is to achieve partition elimination, a.k.a. partition pruning.

•  Partitioning is not a substitute for distribution.
   –  A good distribution strategy plus partitioning that achieves partition elimination unlocks performance magic (see the sketch below).

•  Uses table inheritance and constraints.
   –  Persistent relationship between parent and child tables.
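A minimal sketch of a range-partitioned orders table (names and dates are illustrative); a query restricted on order_date, like the COUNT(*) on the earlier slide, then scans only the partitions that can contain matching rows:

CREATE TABLE orders (
    order_id    bigint,
    order_date  date,
    customer_id integer
)
DISTRIBUTED BY (order_id)
PARTITION BY RANGE (order_date)
(
    START (date '2007-01-01') INCLUSIVE
    END   (date '2008-01-01') EXCLUSIVE
    EVERY (INTERVAL '1 month')
);

-- Partition elimination: only the October 2007 partition is scanned
SELECT COUNT(*) FROM orders
WHERE order_date >= '2007-10-20' AND order_date < '2007-10-27';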

Page 25: Introduction to Greenplum

25 © 2016 Pivotal Software, Inc. All rights reserved.

Distributions and Partitioning

SELECT COUNT(*) FROM orders
WHERE order_date >= 'Oct 20 2007' AND order_date < 'Oct 27 2007'

Evenly distribute the orders data across all segments & only scan the relevant order partitions.

[Diagram: the grid of Segments 1A through 3D repeated for each date partition; on every segment, only the partitions covering the queried date range are scanned]

Page 26: Introduction to Greenplum

26 © 2016 Pivotal Software, Inc. All rights reserved.

CREATE TABLE: Define the Storage Model

•  Heap tables versus append-optimized (AO) tables

•  Row-oriented storage versus column-oriented storage

•  Compression (see the sketch below)
   –  Table-level compression applied to the entire table
   –  Column-level compression applied to a specific column (with columnar storage)
   –  zlib compression levels, with optional run-length encoding (RLE)
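For illustration, a column-oriented, compressed AO table might be declared as below; the table, columns and compression settings are arbitrary examples:

CREATE TABLE sales_fact (
    sale_id   bigint,
    sale_date date ENCODING (compresstype=rle_type),   -- column-level RLE
    store_id  integer,
    amount    numeric(12,2)
)
WITH (appendonly=true, orientation=column,
      compresstype=zlib, compresslevel=5)              -- table-level zlib
DISTRIBUTED BY (sale_id);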

Page 27: Introduction to Greenplum

27 © 2016 Pivotal Software, Inc. All rights reserved.

Heap Tables or AO Tables

•  Use heap storage for tables and partitions that will receive singleton UPDATE, DELETE and INSERT operations.

•  Use heap storage for tables and partitions that will receive concurrent UPDATE, DELETE and INSERT operations.

•  Use AO storage for tables and partitions that are updated infrequently after the initial load and where subsequent inserts or updates are performed only in large batch operations.

Page 28: Introduction to Greenplum

28 © 2016 Pivotal Software, Inc. All rights reserved.

GPDB Data Loading Options

Loading method: INSERT
  Common uses:  Operational workloads; ODBC/JDBC interfaces
  Example:      INSERT INTO performers (name, specialty) VALUES ('Sinatra', 'Singer');

Loading method: COPY
  Common uses:  Quick and easy data in; legacy PostgreSQL applications; output sample results from SQL statements
  Example:      COPY performers FROM '/tmp/comedians.dat' WITH DELIMITER '|';

Loading method: External Tables
  Common uses:  High-speed bulk loads; parallel loading using the gpfdist protocol; local file, remote file, HTTP or HDFS based sources
  Example:      INSERT INTO craps_bets
                SELECT g.bet_type, g.bet_dttm, g.bt_amt
                FROM x_allbets b JOIN games g ON (g.id = b.game_id)
                WHERE g.name = 'CRAPS';

Loading method: GPLOAD
  Common uses:  Simplifies the external table method (YAML wrapper); supports insert, merge & update
  Example:      gpload -f blackjack_bets.yml

Page 29: Introduction to Greenplum

29 © 2016 Pivotal Software, Inc. All rights reserved.

Example Load Architectures

•  Singleton INSERT statements and COPY statements flow through the Master Instance on the Master Host.

•  INSERTs via external tables or gpload flow from the ETL hosts directly to the Segment Instances in parallel; each ETL Host serves its data files through gpfdist processes.

[Diagram: Master Host and Segment Hosts, each with multiple Segment Instances; two ETL Hosts, each holding data files and running two gpfdist processes]

Page 30: Introduction to Greenplum

30 © 2016 Pivotal Software, Inc. All rights reserved.

Load Using Regular External Tables

•  File based (flat files)
   –  gpfdist provides the best performance

=# CREATE EXTERNAL TABLE ext_expenses
     (name text, date date, amount float4, category text, description text)
   LOCATION ('gpfdist://etlhost:8081/*.txt', 'gpfdist://etlhost:8082/*.txt')
   FORMAT 'TEXT' (DELIMITER '|');

$ gpfdist -d /var/load_files1/expenses -p 8081 -l /home/gpadmin/log1 &

$ gpfdist -d /var/load_files2/expenses -p 8082 -l /home/gpadmin/log2 &
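Once the external table is defined and the gpfdist processes are running, the load itself is a parallel INSERT ... SELECT into a regular table (the expenses target table here is a hypothetical example):

CREATE TABLE expenses (
    name text, date date, amount float4, category text, description text
) DISTRIBUTED BY (name);

-- Each segment pulls its share of the files from the gpfdist servers in parallel
INSERT INTO expenses SELECT * FROM ext_expenses;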

Page 31: Introduction to Greenplum

31 © 2016 Pivotal Software, Inc. All rights reserved.

ANALYZEDB and Database Statistics

•  Accurate statistics are critical for the query optimizer to generate optimal query plans.
   –  When a table is analyzed, information about its data is stored in system catalog tables.

•  Always update statistics after loading data (see the sketch below).

•  Always update statistics after CREATE INDEX operations.

•  Always update statistics after INSERT, UPDATE and DELETE operations that significantly change the underlying data.
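For example, statistics can be refreshed with plain ANALYZE after a load, reusing the hypothetical sales_fact table from the storage sketch above:

-- Refresh statistics for a whole table after a large load
ANALYZE sales_fact;

-- Or restrict the scan to the columns that actually changed
ANALYZE sales_fact (store_id, sale_date);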

Page 32: Introduction to Greenplum

32 © 2016 Pivotal Software, Inc. All rights reserved.

ANALYZEDB: Parallel ANALYZE Sessions

•  Invokes concurrent ANALYZE sessions

•  Each session works at the individual table/partition level

•  For example: analyzedb -d myDB -t public.big_fact_table -p 4

•  The parallel level -p is between 1 and 10; the default is 5

•  Tune the parallel level according to system load

•  In general, a 3-5x speedup over a single session

Page 33: Introduction to Greenplum

33 © 2016 Pivotal Software, Inc. All rights reserved.

ANALYZEDB: Incremental ANALYZE

•  If a table/partition has not changed (DML, DDL) since the last run of ANALYZEDB, it is skipped automatically

•  ANALYZEDB keeps an on-disk record of which tables have up-to-date stats after a run, in $MASTER_DATA_DIRECTORY/db_analyze

•  ANALYZEDB compares the current catalog with the state files of the last run to determine the incremental set of tables to analyze

•  ANALYZEDB captures statistics on the root partition table, as required by the Pivotal Query Optimizer (PQO)

Page 34: Introduction to Greenplum

34 © 2016 Pivotal Software, Inc. All rights reserved.

ANALYZEDB Details

•  Incremental analyze does not apply to heap tables

•  Heap tables are always analyzed

•  Catalog tables, views and external tables are automatically skipped

Page 35: Introduction to Greenplum

35 © 2016 Pivotal Software, Inc. All rights reserved.

ANALYZEDB: Miscellaneous

•  analyzedb can be killed gently with Ctrl+C or SIGINT; it resumes where it left off when restarted

•  Prints a progress report while running

•  Automatically refreshes root partition stats for the Pivotal Query Optimizer

•  Analyzes tables in descending OID order

•  Use analyzedb -? for other options (config file, include/exclude columns, dry run, force non-incremental, quiet mode)

Page 36: Introduction to Greenplum

36 © 2016 Pivotal Software, Inc. All rights reserved.

Greenplum source code: major differences w/ PostgreSQL

https://github.com/greenplum-db/gpdb/tree/master/gpMgmt
  Python cluster management code

https://github.com/greenplum-db/gpdb/tree/master/gpAux/gpperfmon
  Performance and system management code

https://github.com/greenplum-db/gpdb/tree/master/src/backend/access/appendonly
  Append-optimized and columnar tables

https://github.com/greenplum-db/gpdb/tree/master/src/backend/access/external
  External tables

https://github.com/greenplum-db/gpdb/tree/master/src/backend/cdb
  Main cluster database code, such as mirroring etc.

https://github.com/greenplum-db/gpdb/tree/master/src/backend/cdb/motion
  Interconnect between nodes

Page 37: Introduction to Greenplum

37 © 2016 Pivotal Software, Inc. All rights reserved.

Live Demo

Page 38: Introduction to Greenplum

38 © 2016 Pivotal Software, Inc. All rights reserved.

Core Greenplum Engine

UDP Interconnect Flow Control

Roadmap

•  Replicated Tables; High Performance Temp Tables

•  Faster Query Dispatch; Short Query Performance

•  Query Plan Code Generation

•  Small Material Aggregates

•  Refactor Analyze for Performance Gains

Page 39: Introduction to Greenplum

39 © 2016 Pivotal Software, Inc. All rights reserved.

Polymorphic Storage™: User-Definable Storage Layout

Column-oriented
•  Columnar storage compresses better
•  Optimized for retrieving a subset of the columns when querying
•  Compression can be set differently per column: gzip (1-9), quicklz, delta, RLE

Row-oriented
•  Faster when returning all columns
•  HEAP for many updates and deletes
•  Use indexes for drill-through queries

External HDFS
•  Keep less-accessed partitions on HDFS as external partitions to seamlessly query all data
•  Text, CSV, Binary, Avro, Parquet formats
•  All major HDP distros

[Diagram: TABLE 'SALES' with monthly partitions (Jun through Dec) stored row- or column-oriented as appropriate, and Year -1 / Year -2 partitions kept as external partitions on HDFS]

A single table can mix these storage options across its partitions (see the sketch below).

Roadmap
•  GPHDFS Predicate Pushdown
•  S3 Object Store External Tables
•  GPDB to GPDB External Tables
•  HAWQ External Tables
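A rough sketch of how per-partition storage choices can be declared; the partition names, dates and options are illustrative, and the exact syntax should be checked against the CREATE TABLE reference for your release:

CREATE TABLE sales (
    sale_id   bigint,
    sale_date date,
    amount    numeric(12,2)
)
WITH (appendonly=true, orientation=row)
DISTRIBUTED BY (sale_id)
PARTITION BY RANGE (sale_date)
(
    -- older data: column-oriented and compressed
    PARTITION year_minus_1 START (date '2015-01-01') END (date '2015-10-01')
        WITH (appendonly=true, orientation=column, compresstype=zlib),
    -- recent data: row-oriented for faster full-row access
    PARTITION recent START (date '2015-10-01') END (date '2016-01-01')
);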

Page 40: Introduction to Greenplum

40 © 2016 Pivotal Software, Inc. All rights reserved.

Pivotal Greenplum Roadmap Highlights

●  S3 External Tables
●  Performance tuned for AWS
●  Dynamic Code Generation using LLVM
●  Short running query performance enhancements
●  Faster analyze
●  WAL Replication Segment Mirroring
●  Incremental restore MVP
●  Disk space full warnings
●  Snapshot Backup
●  Anaconda Python Modules: NLTK, etc.
●  Time Series Gap Filling
●  Complex Numbers
●  PostGIS Raster Support
●  Geospatial Trajectories
●  Path analytics
●  Enhanced SVM module
●  Py-MADlib
●  Lock Free Backup

Page 41: Introduction to Greenplum

41 © 2016 Pivotal Software, Inc. All rights reserved.

Highlighted Greenplum successes

•  Government detection of benefits that should not be paid
•  Government detection of tax fraud
•  Government economic statistics research database
•  Commercial banking wealth management data science and product development
•  Commercial clearing corporation's risk and trade repositories reporting
•  Pharmaceutical company vaccine potency prediction based on manufacturing sensors
•  401K providers' analytics on investment choices
•  Auto manufacturer's analytics on predictive maintenance
•  Corporate/financial internal email and communication surveillance and reporting
•  Oil drilling equipment predictive maintenance
•  Mobile telephone company enterprise data warehouse
•  Retail store chain customer purchase analytics
•  Airline loyalty program analytics
•  Telecom company network performance and availability analytics
•  Corporate network anomalous behavior and intrusion detection
•  Semiconductor fab sensor analytics and reporting