pivotal greenplum database · 2020. 7. 1. · chapter 2: pivotal greenplum 5.10.0 release...

1612
PRODUCT DOCUMENTATION Pivotal Greenplum Database ® Version 5.10.0 Pivotal Greenplum Database Documentation Rev: A01 © 2018 Pivotal Software, Inc.

Upload: others

Post on 26-Jan-2021

35 views

Category:

Documents


0 download

TRANSCRIPT

  • PRODUCT DOCUMENTATION

    Pivotal™ GreenplumDatabase®Version 5.10.0

    Pivotal Greenplum DatabaseDocumentationRev: A01

    © 2018 Pivotal Software, Inc.

  • Copyright OpenTopic

    2

    Notice

    Copyright

    Privacy Policy | Terms of Use

    Copyright © 2018 Pivotal Software, Inc. All rights reserved.

    Pivotal Software, Inc. believes the information in this publication is accurate as of its publication date. Theinformation is subject to change without notice. THE INFORMATION IN THIS PUBLICATION IS PROVIDED"AS IS." PIVOTAL SOFTWARE, INC. ("Pivotal") MAKES NO REPRESENTATIONS OR WARRANTIES OF ANYKIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMSIMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

    Use, copying, and distribution of any Pivotal software described in this publication requires an applicablesoftware license.

    All trademarks used herein are the property of Pivotal or their respective owners.

    Revised July 2018 (5.10.0)

    http://pivotal.io/privacy-policyhttp://pivotal.io/terms-of-use

  • Contents OpenTopic

    3

    Contents

    Chapter 2: Pivotal Greenplum 5.10.0 Release Notes............................. 14Welcome to Pivotal Greenplum 5.10.0............................................................................................. 15New Features.................................................................................................................................... 16

    gpcopy Enhancements........................................................................................................... 16Bypass Resource Group Concurrent Transaction Limits....................................................... 16gpload Performance Enhancement........................................................................................ 17gpbackup S3 Plugin Enhancements...................................................................................... 17Storage Plugin API Execution Scope.....................................................................................17Filter Pushdown for External Table Protocols........................................................................18Pivotal Greenplum-Kafka Connector (Experimental)..............................................................18

    Changed Features.............................................................................................................................19Experimental Features...................................................................................................................... 20Differences Compared to Open Source Greenplum Database.........................................................21Supported Platforms..........................................................................................................................22

    Veritas NetBackup.................................................................................................................. 24Supported Platform Notes...................................................................................................... 24

    Pivotal Greenplum Tools and Extensions Compatibility................................................................... 26Client Tools.............................................................................................................................26Extensions...............................................................................................................................27Pivotal Greenplum Data Connectors......................................................................................28Pivotal GPText Compatibility.................................................................................................. 28Pivotal Greenplum Command Center.................................................................................... 28

    Hadoop Distribution Compatibility..................................................................................................... 29Upgrading to Greenplum Database 5.10.0....................................................................................... 30

    Upgrading from 5.x to 5.10.0................................................................................................. 31Troubleshooting a Failed Upgrade.........................................................................................32

    Migrating Data to Pivotal Greenplum 5.x..........................................................................................33Pivotal Greenplum on DCA Systems................................................................................................35

    Installing the Pivotal Greenplum 5.10.0 Software Binaries on DCA Systems........................ 35Upgrading from 5.x to 5.10.0 on DCA Systems.....................................................................35

    Resolved Issues................................................................................................................................ 37Known Issues and Limitations.......................................................................................................... 41Update for gp_toolkit.gp_bloat_diag Issue........................................................................................49

    Chapter 4: Greenplum Database Installation Guide...............................51Introduction to Greenplum.................................................................................................................52

    The Greenplum Master...........................................................................................................52The Segments........................................................................................................................ 53The Interconnect.....................................................................................................................54ETL Hosts for Data Loading.................................................................................................. 55Greenplum Performance Monitoring...................................................................................... 56

    Estimating Storage Capacity............................................................................................................. 57Calculating Usable Disk Capacity.......................................................................................... 57Calculating User Data Size.................................................................................................... 57Calculating Space Requirements for Metadata and Logs......................................................58

    Configuring Your Systems and Installing Greenplum....................................................................... 59System Requirements.............................................................................................................59Setting the Greenplum Recommended OS Parameters........................................................ 61

  • Contents OpenTopic

    4

    Creating the Greenplum Database Administrative User Account.......................................... 65Installing the Greenplum Database Software.........................................................................66Installing and Configuring Greenplum on all Hosts................................................................67Installing Oracle Compatibility Functions............................................................................... 69Installing Optional Modules.................................................................................................... 69Installing Greenplum Database Extensions........................................................................... 70Installing and Configuring the Greenplum Platform Extension Framework (PXF)..................70Creating the Data Storage Areas...........................................................................................70Synchronizing System Clocks................................................................................................ 72Enabling iptables.................................................................................................................... 72Amazon EC2 Configuration (Amazon Web Services)............................................................75Next Steps.............................................................................................................................. 80

    Installing the Data Science Packages.............................................................................................. 81Python Data Science Module Package..................................................................................81R Data Science Library Package........................................................................................... 83

    Validating Your Systems................................................................................................................... 86Validating OS Settings............................................................................................................86Validating Hardware Performance..........................................................................................86Validating Disk I/O and Memory Bandwidth...........................................................................87

    Initializing a Greenplum Database System.......................................................................................89Overview................................................................................................................................. 89Initializing Greenplum Database.............................................................................................89Setting Greenplum Environment Variables............................................................................ 92Next Steps.............................................................................................................................. 93

    Configuring Timezone and Localization Settings..............................................................................95Configuring the Timezone...................................................................................................... 95About Locale Support in Greenplum Database..................................................................... 95Character Set Support............................................................................................................97Setting the Character Set.......................................................................................................99Character Set Conversion Between Server and Client........................................................100

    About Implicit Text Casting in Greenplum Database......................................................................103Workaround: Manually Creating Missing Operators.............................................................104

    Installation Management Utilities.....................................................................................................107Greenplum Environment Variables................................................................................................. 108

    Required Environment Variables..........................................................................................108Optional Environment Variables........................................................................................... 108

    Chapter 6: Greenplum Database Administrator Guide........................ 110Greenplum Database Concepts...................................................................................................... 111

    About the Greenplum Architecture....................................................................................... 111About Management and Monitoring Utilities........................................................................ 114About Concurrency Control in Greenplum Database...........................................................115About Parallel Data Loading................................................................................................ 123About Redundancy and Failover in Greenplum Database...................................................123About Database Statistics in Greenplum Database............................................................. 126

    Managing a Greenplum System..................................................................................................... 132About the Greenplum Database Release Version Number................................................. 132Starting and Stopping Greenplum Database....................................................................... 132Accessing the Database.......................................................................................................135Configuring the Greenplum Database System.....................................................................143Enabling High Availability and Data Consistency Features................................................. 154Backing Up and Restoring Databases................................................................................. 172Expanding a Greenplum System..........................................................................................226Migrating Data...................................................................................................................... 241Monitoring a Greenplum System..........................................................................................252

  • Contents OpenTopic

    5

    Routine System Maintenance Tasks....................................................................................272Recommended Monitoring and Maintenance Tasks............................................................ 276

    Managing Greenplum Database Access.........................................................................................284Configuring Client Authentication......................................................................................... 284Managing Roles and Privileges............................................................................................310

    Defining Database Objects..............................................................................................................317Creating and Managing Databases......................................................................................317Creating and Managing Tablespaces...................................................................................319Creating and Managing Schemas........................................................................................322Creating and Managing Tables............................................................................................ 323Choosing the Table Storage Model..................................................................................... 326Partitioning Large Tables......................................................................................................337Creating and Using Sequences........................................................................................... 349Using Indexes in Greenplum Database............................................................................... 352Creating and Managing Views............................................................................................. 355

    Distribution and Skew..................................................................................................................... 356Local (Co-located) Joins.......................................................................................................356Data Skew............................................................................................................................ 356Processing Skew.................................................................................................................. 357

    Inserting, Updating, and Deleting Data...........................................................................................360About Concurrency Control in Greenplum Database...........................................................360Inserting Rows...................................................................................................................... 361Updating Existing Rows........................................................................................................362Deleting Rows.......................................................................................................................362Working With Transactions...................................................................................................362Vacuuming the Database..................................................................................................... 364

    Querying Data................................................................................................................................. 365About Greenplum Query Processing....................................................................................365About GPORCA....................................................................................................................368Defining Queries................................................................................................................... 381WITH Queries (Common Table Expressions)......................................................................391Using Functions and Operators............................................................................................394Working with JSON Data..................................................................................................... 403Working with XML Data........................................................................................................407Query Performance.............................................................................................................. 419Managing Spill Files Generated by Queries........................................................................ 419Query Profiling...................................................................................................................... 419

    Working with External Data.............................................................................................................425Defining External Tables...................................................................................................... 425Accessing External Data with PXF...................................................................................... 443Accessing HDFS Data with gphdfs...................................................................................... 444Using the Greenplum Parallel File Server (gpfdist)..............................................................467

    Loading and Unloading Data.......................................................................................................... 471Loading Data Using an External Table................................................................................ 472Loading and Writing Non-HDFS Custom Data.................................................................... 472Handling Load Errors............................................................................................................475Loading Data with gpload.....................................................................................................477Transforming External Data with gpfdist and gpload........................................................... 478Loading Data with COPY..................................................................................................... 488Running COPY in Single Row Error Isolation Mode............................................................488Optimizing Data Load and Query Performance................................................................... 488Unloading Data from Greenplum Database......................................................................... 489Formatting Data Files........................................................................................................... 491Example Custom Data Access Protocol.............................................................................. 494

    Managing Performance................................................................................................................... 501Defining Database Performance.......................................................................................... 501

  • Contents OpenTopic

    6

    Common Causes of Performance Issues............................................................................ 502Greenplum Database Memory Overview............................................................................. 505Managing Resources............................................................................................................508Investigating a Performance Problem.................................................................................. 536

    Chapter 8: Greenplum Database Security Configuration Guide......... 539Securing the Database....................................................................................................................540Greenplum Database Ports and Protocols..................................................................................... 541Configuring Client Authentication.................................................................................................... 545

    Allowing Connections to Greenplum Database....................................................................545Editing the pg_hba.conf File.................................................................................................546Authentication Methods........................................................................................................ 547SSL Client Authentication.....................................................................................................550PAM Based Authentication...................................................................................................552Radius Authentication...........................................................................................................552Limiting Concurrent Connections......................................................................................... 552Encrypting Client/Server Connections..................................................................................553

    Configuring Database Authorization................................................................................................555Access Permissions and Roles............................................................................................555Managing Object Privileges..................................................................................................555Using SSH-256 Encryption...................................................................................................556Restricting Access by Time..................................................................................................558Dropping a Time-based Restriction.................................................................................... 560

    Greenplum Command Center Security........................................................................................... 561Auditing............................................................................................................................................ 564Encrypting Data and Database Connections.................................................................................. 569

    Encrypting gpfdist Connections............................................................................................ 569Encrypting Data at Rest with pgcrypto.................................................................................570

    Enabling gphdfs Authentication with a Kerberos-secured Hadoop Cluster.....................................578Prerequisites......................................................................................................................... 578Configuring the Greenplum Cluster......................................................................................578Creating and Installing Keytab Files.................................................................................... 579Configuring gphdfs for Kerberos.......................................................................................... 580Testing Greenplum Database Access to HDFS...................................................................581Troubleshooting HDFS with Kerberos..................................................................................582

    Security Best Practices................................................................................................................... 584

    Chapter 10: Greenplum Database Best Practices................................ 588Best Practices Summary.................................................................................................................589System Configuration...................................................................................................................... 595Schema Design............................................................................................................................... 600

    Data Types........................................................................................................................... 600Storage Model...................................................................................................................... 600Compression......................................................................................................................... 601Distributions.......................................................................................................................... 602Partitioning............................................................................................................................ 605Indexes..................................................................................................................................607Column Sequence and Byte Alignment............................................................................... 607

    Memory and Resource Management with Resource Queues........................................................ 609System Monitoring and Maintenance..............................................................................................613

    Monitoring............................................................................................................................. 613Updating Statistics with ANALYZE.......................................................................................614Managing Bloat in the Database..........................................................................................615Monitoring Greenplum Database Log Files..........................................................................619

  • Contents OpenTopic

    7

    Loading Data................................................................................................................................... 621INSERT Statement with Column Values..............................................................................621COPY Statement.................................................................................................................. 621External Tables.....................................................................................................................621External Tables with Gpfdist................................................................................................ 621Gpload...................................................................................................................................622Best Practices.......................................................................................................................623

    Migrating Data with gptransfer........................................................................................................ 624Security............................................................................................................................................ 630Encrypting Data and Database Connections.................................................................................. 633Tuning SQL Queries....................................................................................................................... 642

    How to Generate Explain Plans........................................................................................... 642How to Read Explain Plans................................................................................................. 642Optimizing Greenplum Queries............................................................................................ 644

    High Availability............................................................................................................................... 646Disk Storage......................................................................................................................... 646Master Mirroring....................................................................................................................646Segment Mirroring................................................................................................................ 647Dual Clusters........................................................................................................................ 648Backup and Restore.............................................................................................................648Detecting Failed Master and Segment Instances................................................................ 649Segment Mirroring Configuration..........................................................................................650

    Chapter 12: Greenplum Database Utility Guide................................... 656Management Utility Reference........................................................................................................ 657

    Backend Server Programs................................................................................................... 658analyzedb..............................................................................................................................659gpactivatestandby................................................................................................................. 663gpaddmirrors......................................................................................................................... 665gpbackup...............................................................................................................................668gpcheck.................................................................................................................................673gpcheckcat............................................................................................................................ 675gpcheckperf...........................................................................................................................677gpconfig.................................................................................................................................680gpcrondump.......................................................................................................................... 684gpdbrestore........................................................................................................................... 697gpcopy...................................................................................................................................705gpdeletesystem..................................................................................................................... 713gpexpand.............................................................................................................................. 714gpfdist....................................................................................................................................717gpfilespace............................................................................................................................ 721gpinitstandby......................................................................................................................... 724gpinitsystem.......................................................................................................................... 726gpload................................................................................................................................... 733gplogfilter...............................................................................................................................743gpmapreduce........................................................................................................................ 746gpmfr..................................................................................................................................... 747gpmovemirrors...................................................................................................................... 751gpperfmon_install..................................................................................................................752gppkg.................................................................................................................................... 756gprecoverseg........................................................................................................................ 758gpreload................................................................................................................................ 763gprestore............................................................................................................................... 765gpscp.....................................................................................................................................769gpseginstall........................................................................................................................... 771

  • Contents OpenTopic

    8

    gpssh.....................................................................................................................................773gpssh-exkeys........................................................................................................................ 775gpstart................................................................................................................................... 778gpstate.................................................................................................................................. 780gpstop................................................................................................................................... 783gpsys1...................................................................................................................................786gptransfer.............................................................................................................................. 787pgbouncer............................................................................................................................. 799pgbouncer.ini.........................................................................................................................800pgbouncer-admin.................................................................................................................. 812

    Client Utility Reference....................................................................................................................822Client Utility Summary.......................................................................................................... 822

    Additional Supplied Modules........................................................................................................... 872citext Data Type................................................................................................................... 872dblink Functions....................................................................................................................874hstore Functions................................................................................................................... 875Oracle Compatibility Functions.............................................................................................878passwordcheck..................................................................................................................... 899

    Chapter 14: Greenplum Database Reference Guide............................ 901SQL Command Reference..............................................................................................................902

    SQL Syntax Summary..........................................................................................................904ABORT..................................................................................................................................931ALTER AGGREGATE...........................................................................................................932ALTER CONVERSION......................................................................................................... 933ALTER DATABASE.............................................................................................................. 934ALTER DOMAIN...................................................................................................................935ALTER EXTENSION.............................................................................................................937ALTER EXTERNAL TABLE..................................................................................................939ALTER FILESPACE............................................................................................................. 941ALTER FUNCTION...............................................................................................................942ALTER GROUP.................................................................................................................... 944ALTER INDEX...................................................................................................................... 945ALTER LANGUAGE............................................................................................................. 946ALTER OPERATOR............................................................................................................. 947ALTER OPERATOR CLASS................................................................................................ 948ALTER OPERATOR FAMILY...............................................................................................948ALTER PROTOCOL............................................................................................................. 951ALTER RESOURCE GROUP.............................................................................................. 952ALTER RESOURCE QUEUE...............................................................................................954ALTER ROLE....................................................................................................................... 957ALTER SCHEMA..................................................................................................................960ALTER SEQUENCE............................................................................................................. 961ALTER TABLE......................................................................................................................963ALTER TABLESPACE..........................................................................................................973ALTER TRIGGER.................................................................................................................974ALTER TYPE........................................................................................................................975ALTER USER....................................................................................................................... 975ALTER VIEW........................................................................................................................ 976ANALYZE..............................................................................................................................977BEGIN................................................................................................................................... 980CHECKPOINT.......................................................................................................................982CLOSE.................................................................................................................................. 982CLUSTER............................................................................................................................. 983COMMENT............................................................................................................................984

  • Contents OpenTopic

    9

    COMMIT................................................................................................................................986COPY.................................................................................................................................... 987CREATE AGGREGATE........................................................................................................997CREATE CAST...................................................................................................................1001CREATE CONVERSION.................................................................................................... 1004CREATE DATABASE......................................................................................................... 1005CREATE DOMAIN..............................................................................................................1006CREATE EXTENSION........................................................................................................1008CREATE EXTERNAL TABLE.............................................................................................1009CREATE FUNCTION..........................................................................................................1018CREATE GROUP............................................................................................................... 1024CREATE INDEX................................................................................................................. 1025CREATE LANGUAGE........................................................................................................ 1028CREATE OPERATOR........................................................................................................ 1030CREATE OPERATOR CLASS........................................................................................... 1034CREATE OPERATOR FAMILY..........................................................................................1038CREATE PROTOCOL........................................................................................................ 1038CREATE RESOURCE GROUP......................................................................................... 1040CREATE RESOURCE QUEUE..........................................................................................1042CREATE ROLE.................................................................................................................. 1046CREATE RULE...................................................................................................................1050CREATE SCHEMA.............................................................................................................1052CREATE SEQUENCE........................................................................................................ 1053CREATE TABLE.................................................................................................................1056CREATE TABLE AS...........................................................................................................1067CREATE TABLESPACE.....................................................................................................1071CREATE TRIGGER............................................................................................................1072CREATE TYPE...................................................................................................................1074CREATE USER.................................................................................................................. 1079CREATE VIEW................................................................................................................... 1080DEALLOCATE.................................................................................................................... 1082DECLARE........................................................................................................................... 1082DELETE.............................................................................................................................. 1085DISCARD............................................................................................................................ 1087DO.......................................................................................................................................1087DROP AGGREGATE..........................................................................................................1089DROP CAST.......................................................................................................................1090DROP CONVERSION........................................................................................................ 1091DROP DATABASE............................................................................................................. 1091DROP DOMAIN.................................................................................................................. 1092DROP EXTENSION............................................................................................................1093DROP EXTERNAL TABLE.................................................................................................1094DROP FILESPACE.............................................................................................................1094DROP FUNCTION..............................................................................................................1095DROP GROUP................................................................................................................... 1096DROP INDEX..................................................................................................................... 1096DROP LANGUAGE.............................................................................................................1097DROP OPERATOR............................................................................................................ 1098DROP OPERATOR CLASS............................................................................................... 1099DROP OPERATOR FAMILY.............................................................................................. 1100DROP OWNED...................................................................................................................1101DROP PROTOCOL............................................................................................................ 1101DROP RESOURCE GROUP..............................................................................................1102DROP RESOURCE QUEUE.............................................................................................. 1103DROP ROLE.......................................................................................................................1104DROP RULE.......................................................................................................................1105

  • Contents OpenTopic

    10

    DROP SCHEMA................................................................................................................. 1106DROP SEQUENCE............................................................................................................ 1106DROP TABLE..................................................................................................................... 1107DROP TABLESPACE.........................................................................................................1108DROP TRIGGER................................................................................................................ 1109DROP TYPE....................................................................................................................... 1109DROP USER...................................................................................................................... 1110DROP VIEW....................................................................................................................... 1111END.....................................................................................................................................1111EXECUTE........................................................................................................................... 1112EXPLAIN............................................................................................................................. 1113FETCH................................................................................................................................ 1115GRANT................................................................................................................................1118INSERT............................................................................................................................... 1122LOAD.................................................................................................................................. 1124LOCK.................................................................................................................................. 1124MOVE..................................................................................................................................1127PREPARE........................................................................................................................... 1128REASSIGN OWNED...........................................................................................................1130REINDEX............................................................................................................................ 1131RELEASE SAVEPOINT......................................................................................................1132RESET................................................................................................................................ 1133REVOKE............................................................................................................................. 1134ROLLBACK......................................................................................................................... 1136ROLLBACK TO SAVEPOINT.............................................................................................1136SAVEPOINT........................................................................................................................1137SELECT.............................................................................................................................. 1139SELECT INTO.................................................................................................................... 1153SET..................................................................................................................................... 1154SET ROLE.......................................................................................................................... 1156SET SESSION AUTHORIZATION..................................................................................... 1157SET TRANSACTION.......................................................................................................... 1158SHOW................................................................................................................................. 1160START TRANSACTION..................................................................................................... 1161TRUNCATE.........................................................................................................................1162UPDATE..............................................................................................................................1163VACUUM.............................................................................................................................1166VALUES.............................................................................................................................. 1169

    SQL 2008 Optional Feature Compliance......................................................................................1171Greenplum Environment Variables............................................................................................... 1200

    Required Environment Variables........................................................................................1200Optional Environment Variables......................................................................................... 1200

    System Catalog Reference........................................................................................................... 1202System Tables.................................................................................................................... 1202System Views..................................................................................................................... 1203System Catalogs Definitions...............................................................................................1204

    The gp_toolkit Administrative Schema..........................................................................................1289Checking for Tables that Need Routine Maintenance........................................................1289Checking for Locks.............................................................................................................1290Checking Append-Optimized Tables.................................................................................. 1292Viewing Greenplum Database Server Log Files................................................................ 1296Checking Server Configuration Files..................................................................................1299Checking for Failed Segments........................................................................................... 1300Checking Resource Group Activity and Status.................................................................. 1301Checking Resource Queue Activity and Status................................................................. 1303Checking Query Disk Spill Space Usage...........................................................................1305

  • Contents OpenTopic

    11

    Viewing Users and Groups (Roles)....................................................................................1307Checking Database Object Sizes and Disk Space............................................................ 1308Checking for Uneven Data Distribution.............................................................................. 1312

    The gpperfmon Database..............................................................................................................1313database_*.........................................................................................................................1315diskspace_*....................................................................................................................... 1316interface_stats_*................................................................................................................ 1316log_alert_*..........................................................................................................................1318queries_*............................................................................................................................. 1319segment_*..........................................................................................................................1321socket_stats_*.....................................................................................................................1322system_*............................................................................................................................. 1323dynamic_memory_info........................................................................................................ 1325memory_info...................................................................................................................... 1325

    Greenplum Database Data Types.................................................................................................1327Character Set Support...................................................................................................................1331

    Setting the Character Set...................................................................................................1333Character Set Conversion Between Server and Client...................................................... 1333

    Server Configuration Parameters..................................................................................................1336Parameter Types and Values.............................................................................................1336Setting Parameters............................................................................................................. 1336Parameter Categories.........................................................................................................1337Configuration Parameters...................................................................................................1347

    Summary of Built-in Functions...................................................................................................... 1431Greenplum Database Function Types................................................................................1431Built-in Functions and Operators........................................................................................1432JSON Functions and Operators......................................................................................... 1435Window Functions.............................................................................................................. 1438Advanced Aggregate Functions......................................................................................... 1440

    Greenplum MapReduce Specification...........................................................................................1442Greenplum MapReduce Document Format........................................................................1442Greenplum MapReduce Document Schema......................................................................1443Example Greenplum MapReduce Document..................................................................... 1450

    Greenplum PL/pgSQL Procedural Language............................................................................... 1456About Greenplum Database PL/pgSQL............................................................................. 1456PL/pgSQL Plan Caching.....................................................................................................1458PL/pgSQL Examples...........................................................................................................1458References..........................................................................................................................1462

    Greenplum PostGIS Extension..................................................................................................... 1463About PostGIS.................................................................................................................... 1463Enabling and Removing PostGIS Support......................................................................... 1464Usage..................................................................................................................................1465PostGIS Extension Support and Limitations...................................................................... 1466PostGIS Support Scripts.....................................................................................................1467

    Greenplum PL/R Language Extension..........................................................................................1470About Greenplum Database PL/R......................................................................................1470

    Greenplum PL/Python Language Extension................................................................................. 1476About Greenplum PL/Python..............................................................................................1476Enabling and Removing PL/Python support...................................................................... 1476Developing Functions with PL/Python................................................................................1477Installing Python Modules...................................................................................................1480Examples............................................................................................................................ 1483References..........................................................................................................................1485

    Greenplum PL/Container Language Extension.............................................................................1486About the PL/Container Language Extension.................................................................... 1486About PL/Container Resource Management......................................................................1487

  • Contents OpenTopic

    12

    PL/Container Docker Images............................................................................................. 1489Prerequisites....................................................................................................................... 1489Installing the PL/Container Language Extension............................................................... 1490Installing PL/Container Docker Images.............................................................................. 1493Uninstalling PL/Container................................................................................................... 1494Using PL/Container Functions............................................................................................1495About PL/Container Running PL/Python............................................................................ 1497About PL/Container Running PL/R.....................................................................................1498Configuring PL/Container....................................................................................................1498Installing Docker................................................................................................................. 1508References..........................................................................................................................1510

    Greenplum PL/Java Language Extension.....................................................................................1511About PL/Java.................................................................................................................... 1511About Greenplum Database PL/Java.................................................................................1512Installing PL/Java................................................................................................................1513Uninstalling PL/Java........................................................................................................... 1514Enabling PL/Java and Installing JAR Files........................................................................ 1515Writing PL/Java functions................................................................................................... 1515Using JDBC........................................................................................................................ 1521Exception Handling.............................................................................................................1521Savepoints.......................................................................................................................... 1521Logging............................................................................................................................... 1522Security............................................................................................................................... 1522Some PL/Java Issues and Solutions..................................................................................1523Example.............................................................................................................................. 1524References..........................................................................................................................1525

    Greenplum PL/Perl Language Extension......................................................................................1526About Greenplum PL/Perl...................................................................................................1526Greenplum Database PL/Perl Limitations.......................................................................... 1526Trusted/Untrusted Language.............................................................................................. 1526Enabling and Removing PL/Perl Support...........................................................................1527Developing Functions with PL/Perl.....................................................................................1527

    Greenplum MADlib Extension for Analytics.................................................................................. 1531About MADlib......................................................................................................................1531Installing MADlib................................................................................................................. 1531Upgrading MADlib...............................................................................................................1532Uninstalling MADlib.............................................................................................................1533Examples............................................................................................................................ 1533References..........................................................................................................................1539

    Greenplum Partner Connector API............................................................................................... 1541Using the GPPC API..........................................................................................................1541Building a GPPC Shared Library with PGXS.....................................................................1553Registering a GPPC Function with Greenplum Database................................................. 1553Packaging and Deployment Considerations.......................................................................1554GPPC Text Function Example........................................................................................... 1555GPPC Set-Returning Function Example............................................................................ 1557

    Greenplum Fuzzy String Match Extension....................................................................................1561Soundex Functions............................................................................................................. 1561Levenshtein Functions........................................................................................................1562Metaphone Functions......................................................................................................... 1562Double Metaphone Functions.............................................................................................1563Installing and Uninstalling the Fuzzy String Match Functions............................................ 1563

    Summary of Greenplum Features.................................................................................................1564Greenplum SQL Standard Conformance........................................................................... 1564Greenplum and PostgreSQL Compatibility.........................................................................1566

  • Contents OpenTopic

    13

    Chapter 16: Greenplum Database UNIX Client Documentation........ 1575Greenplum Database Client Tools for UNIX.................................................................................1576

    Installing the Greenplum Client Tools................................................................................ 1576Client Tools Reference.......................................................................................................1579

    Greenplum Database Load Tools for UNIX..................................................................................1580Installing the Greenplum Load Tools................................................................................. 1580Load Tools Reference........................................................................................................ 1581

    Chapter 17: Greenplum Database Windows Client Documentation..1583Greenplum Database Client Tools for Windows...........................................................................1584

    Installing the Greenplum Client Tools................................................................................ 1584Running the Greenplum Client Tools.................................................................................1587Client Tools Reference.......................................................................................................1588

    Greenplum Database Load Tools for Windows............................................................................ 1590Installing Greenplum Loader.............................................................................................. 1590Running Greenplum Loader............................................................................................... 1592Running gpfdist as a Windows Service..............................................................................1596Loader Program Reference................................................................................................ 1597

    Chapter 19: DataDirect ODBC Drivers for Pivotal Greenplum...........1598Prerequisites.................................................................................................................................. 1599Supported Client Platforms........................................................................................................... 1600Installing on Linux Systems.......................................................................................................... 1601

    Configuring the Driver on Linux......................................................................................... 1602Testing the Driver Connection on Linux.............................................................................1603

    Installing on Windows Systems.................................................................................................... 1604Verifying the Version on Windows..................................................................................... 1604Configuring and Testing the Driver on Windows................................................................1604

    DataDirect Driver Documentation..................................................................................................1606

    Chapter 20: DataDirect JDBC Driver for Pivotal Greenplum............. 1607Prerequisites.................................................................................................................................. 1608Downloading the DataDirect JDBC Driver.................................................................................... 1609Obtaining Version Details for the Driver....................................................................................... 1610Usage Information......................................................................................................................... 1611DataDirect Driver Documentation..................................................................................................1612

  • Pivotal Greenplum 5.10.0 Release Notes OpenTopic

    14

    Chapter 2

    Pivotal Greenplum 5.10.0 Release Notes

    Updated: July, 2018

    • Welcome to Pivotal Greenplum 5.10.0• New Features• Changed Features• Experimental Features• Differences Compared to Open Source Greenplum Database• Supported Platforms• Pivotal Greenplum Tools and Extensions Compatibility• Hadoop Distribution Compatibility• Upgrading to Greenplum Database 5.10.0• Migrating Data to Pivotal Greenplum 5.x• Pivotal Greenplum on DCA Systems• Resolved Issues• Known Issues and Limitations• Update for gp_toolkit.gp_bloat_diag Issue

  • Pivotal Greenplum 5.10.0 Release Notes OpenTopic

    15

    Welcome to Pivotal Greenplum 5.10.0Pivotal Greenplum Database is a massively parallel processing (MPP) database server that supports nextgeneration data warehousing and large-scale analytics processing. By automatically partitioning dataand running parallel queries, it allows a cluster of servers to operate as a single database supercomputerperforming tens or hundreds times faster than a traditional database. It supports SQL, MapReduce parallelprocessing, and data volumes ranging from hundreds of gigabytes, to hundreds of terabytes.

    This document contains pertinent release information about Pivotal Greenplum Database 5.10.0. Forprevious versions of the release notes for Greenplum Database, go to Pivotal Greenplum DatabaseDocumentation. For information about Greenplum Database end of life, see the Pivotal Support LifecyclePolicy.

    Pivotal Greenplum 5.x software is available for download from the Pivotal Greenplum page on PivotalNetwork.

    Pivotal Greenplum 5.x is based on the open source Greenplum Database project code.

    Important: Pivotal Support does not provide support for open source versions of GreenplumDatabase. Only Pivotal Greenplum Database is supported by Pivotal Support.

    Pivotal Greenplum 5.10.0 is a minor release that adds and changes several features and resolves someissues.

    https://gpdb.docs.pivotal.io/https://gpdb.docs.pivotal.io/https://pivotal.io/support/lifecycle_policyhttps://pivotal.io/support/lifecycle_policyhttps://network.pivotal.io/products/pivotal-gpdbhttps://network.pivotal.io/products/pivotal-gpdbhttp://greenplum.org/

  • Pivotal Greenplum 5.10.0 Release Notes OpenTopic

    16

    New FeaturesGreenplum Database 5.10.0 includes these new features and enhancements.

    • gpcopy Enhancements• Bypass Resource Group Concurrent Transaction Limits• gpload Performance Enhancement• gpbackup S3 Plugin Enhancements• Storage Plugin API Execution Scope• Filter Pushdown for External Table Protocols• Pivotal Greenplum-Kafka Connector (Experimental)

    gpcopy EnhancementsIn Greenplum 5.10.0, the gpcopy utility supports these options.

    • --dry-run - When you specify this option, gpcopy generates a list of the migration operations thatwould have been performed with the gpcopy options you specify. The data is not migrated.

    The information is displayed at the command line and written to the log file.• --no-distribution-check - Specify this option to disable table data distribution checking. In

    Greenplum 5.10.0, gpcopy performs data distribution checking to ensure data is distributed to segmentinstances correctly.

    Note: The utility does not support table data distribution checking when copying a partitionedtable that is defined with a leaf table that is an external table or if a leaf table is defined with adistribution policy that is different from the root partitioned table.

    Warning: Before you perform a gpcopy operation with the --no-distribution-checkoption, ensure that you have a backup of the destination database and that the distributionpolicies of the tables that are being copied are the same in the source and destination database.Copying data into segment instances with incorrect data distribution can cause incorrect queryresults and can cause database corruption.

    The gpcopy utility copies objects from databases in a source Greenplum Database system to databasesin a destination Greenplum Database system. For information about gpcopy, see gpcopy in the PivotalGreenplum Database Documentation.

    Bypass Resource Group Concurrent Transaction LimitsThe Greenplum Database server configuration parameter gp_resource_group_bypass enablesor disables the enforcement of resource group concurrent transaction limits on Greenplum Databaseresources. The default value is false, enforce resource group transaction limits. Resource groupsmanage resources such as CPU, memory, and the number of concurrent transactions that are used byqueries and external components such as PL/Container.

    Note: This server configuration parameter is enforced only when resource group-based resourcemanagement is active.

    You can set this parameter to true to bypass resource group concurrent transaction limitations so that aquery can run immediately. For example, you can set the parameter to true for a session to run a systemcatalog query or a similar query that requires a minimal amount of resources.

    When you set this parameter to true and a run a query, the query runs in this environment:

    • The query runs inside a resource group. The resource group assignment for the query does notchange.

    https://gpdb.docs.pivotal.io/https://gpdb.docs.pivotal.io/

  • Pivotal Greenplum 5.10.0 Release Notes OpenTopic

    17

    • The query memory quota is approximately 10 MB per query. The memory is allocated from resourcegroup shared memory or global shared memory. The query fails if there is not enough shared memoryavailable to fulfill the memory allocation request.

    This parameter can be set for a session. The parameter cannot be set within a transaction or a function.

    Value Range Default Set Classifications

    Boolean false session

    gpload Performance EnhancementThe Greenplum Database gpload data loading utility supports the configuration file parameterFAST_MATCH that can improve gpload performance. When FAST_MATCH is set to true and the utilityreuses external tables, gpload only searches Greenplum Database for matching external tables for reuse.The utility does not check the external table column names and column types to ensure that the table canbe used for a gpload operation. This can improve gpload performance when the utility reuses externaltable and the database catalog table pg_attribute contains a large number of rows.

    To reuse external table objects and staging table objects , REUSE_TABLES: true must also be specifiedin the gpload configuration file. If REUSE_TABLES is false or not specified and FAST_MATCH: true isspecified, gpload returns a warning message.

    The FAST_MATCH default value is false, the utility checks the external table definition column names andcolumn types. The utility returns an error and quits if the column definitions are not compatible.

    Note: If fast_match: true is specified in the gpload configuration file, the utility ignores thevalue of SCHEMA in the EXTERNAL section if the SCHEMA value is specified in the file. The utilityuses the Greenplum Database default schema. The SCHEMA value specifies the schema of theexternal table database objects created by gpload.

    For information about gpload, see gpload in the Pivotal Greenplum Database Documentation.

    gpbackup S3 Plugin EnhancementsThe gpbackup S3 plugin (an experimental feature) supports these new configuration options for S3compatible data sources:

    • endpoint - Required if connecting to an S3 compatible service. Specify this option to connect to an S3compatible service such as ECS. The plugin connects to the specified S3 endpoint (hostname or IPaddress) to access the S3 compatible data store.

    If this option is specified, the plugin ignores the region option and does not use AWS to resolve theendpoint. When this option is not specified, the plugin uses the region to determine the AWS S3endpoint.

    • encryption Optional. Enable or disable the use of Secure Sockets Layer (SSL) when connecting to anS3 location. The default value is on, use connections that are secured with SSL. Set this option to offto connect to an S3 compatible service that is not configured to use SSL.

    Any value other than off is treated as on.

    For information about the options, see Using the S3 Storage Plugin with gpbackup and gprestore in thePivotal Greenplum Database Documentation.

    Storage Plugin API Execution ScopeThe Storage plugin framework API (an experimental feature) provides new arguments with a plugin setupor cleanup command that specify the execution scope (master host, segment host, or segment instance).The scope can be one of these values.

    • master - Execute the plugin command once on the master host.

    https://gpdb.docs.pivotal.io/https://gpdb.docs.pivotal.io/

  • Pivotal Greenplum 5.10.0 Release Notes OpenTopic

    18

    • segment_host - Execute the plugin command once on each of the segment hosts.• segment - Execute the plugin command once for each active segment instance on the host running the

    segment instance.

    The Greenplum Database hosts and segment instances are based on the Greenplum Databaseconfiguration when the back up started.

    For information about the API, see Backup/Restore Storage Plugin API in the Pivotal Greenplum DatabaseDocumentation.

    Filter Pushdown for External Table ProtocolsGreenplum Database 5.10.0 introduces the gp_external_enable_filter_pushdown serverconfiguration parameter to enable or disable predicate filter pushdown for external table protocols, such aspxf. Filter pushdown can improve query performance by reducing the amount of data transferred betweenthe external data source and Greenplum Database. This parameter is set to off by default; set it to on toenable filter pushdown.

    Keep in mind that some data sources do not support filter pushdown. Also, filter pushdown may not besupported with certain data types or operators. If a query accesses a data source that does not supportfilter push-down for the query constraints, the query is instead executed without filter pushdown (the data isfirst transferred to Greenplum Database and then filtered).

    Pivotal Greenplum Database supports filter pushdown only with the PXF Hive connector.

    For more information about filter pushdown, see Using PXF to Read and Write External Data in the PivotalGreenplum Database Documentation.

    Pivotal Greenplum-Kafka Connector (Experimental)Greenplum Database 5.10 provides integration with the new Pivotal Greenplum-Kafka Connector(experimental). The Pivotal Greenplum-Kafka Connector provides high speed, parallel data transfer froma Kafka cluster to a Pivotal Greenplum Database cluster for batch and streaming ETL operations. Referto the Pivotal Greenplum-Kafka Connector (Experimental) documentation for more information about thisfeature.

    https://gpdb.docs.pivotal.io/https://gpdb.docs.pivotal.io/../pxf/using_pxf.htmlhttps://gpdb.docs.pivotal.io/https://gpdb.docs.pivotal.io/../greenplum-kafka/intro.html

  • Pivotal Greenplum 5.10.0 Release Notes OpenTopic

    19

    Changed FeaturesGreenplum Database 5.10.0 includes these changed features.

    • By default, the gpcopy utility performs data distribution checking to ensure data is distributed tosegment instances correctly when copying data from a source Greenplum Database system to adestination Greenplum Database system. If distribution checking fails, the table copy fails. The gpcopyoption --no-distribution-check disables distribution checking. See New Features.

    • The Greenplum Database gphdfs protocol was tested to confirm support for Parquet versions 1.7.0and later. Previous versions of the documentation incorrectly stated that only Parquet version 1.7.0 wassupported.

    • The Pivotal Greenplum-Informatica Connector is no longer an experimental feature as of version 1.0.3.The connector supports high speed data transfer from an Informatica PowerCenter cluster to a PivotalGreenplum Database cluster for batch and streaming ETL operations. See the Pivotal Greenplum-Informatica Connector documentation.

    • The PgBouncer connection pooler utility that ships with Greenplum Database 5.10.0 has been updatedto resolve a known issue. See resolved issue 29347. PgBouncer 1.8.1 provides native TLS and PAMsupport and pg_hba.conf-compatible configuration. For information about PgBouncer, see Using thePgBouncer Connection Pooler.

    • Concurrent VACUUM operations on the same append-optimized table are blocked. In previous 5.xreleases, concurrent VACUUM operations on the same append-optimized table are allowed.

    https://greenplum-informatica.docs.pivotal.iohttps://greenplum-informatica.docs.pivotal.io

  • Pivotal Greenplum 5.10.0 Release Notes OpenTopic

    20

    Experimental FeaturesBecause Pivotal Greenplum Database is based on the open source Greenplum Database project code,it includes several experimental features to allow interested developers to experiment with their use ondevelopment systems. Feedback will help drive development of these features, and they may becomesupported in future versions of the product.

    Warning: Experimental features are not recommended or supported for production deployments.These features may change in or be removed from future versions of the product based on furthertesting and feedback. Moreover, any features that may be visible in the open source code butthat are not described in the product documentation should be considered experimental andunsupported for production use.

    Greenplum Database 5.10.0 includes these experimental features:

    • Storage plugins for gpbackup and gprestore.

    • The DD Boost storage plugin. You can specify the --plugin-config option to store a backup ona Dell EMC Data Domain storage appliance, and restore the data from the appliance. You can alsoreplicate a backup on a separate, remote Data Domain system for disaster recovery.

    • The S3 storage plugin. You can specify the --plugin-config option to store a backup on anAmazon Web Services S3 location, and restore the data from the S3 location. For GreenplumDatabase 5.10.0 the S3 plugin also supports S3 compatible data stores. See New Features.

    • Storage plugin framework API. Partners, customers, and OSS developers can develop pluginsto use in conjunction with gpbackup and gprestore. For Greenplum Database 5.10.0 the APIsupports a new argument for execution scope. See New Features.

    For information about storage plugins and the storage plugin API, see Using gpbacku