oracle 11g partitioning for developers - database...

70
Lewis Cunningham Senior Solutions Architect EnterpriseDB [email protected] Oracle 11g Partitioning for Developers

Upload: lyliem

Post on 28-Jul-2018

242 views

Category:

Documents


0 download

TRANSCRIPT

Lewis Cunningham

Senior Solutions Architect

EnterpriseDB

[email protected]

Oracle 11g Partitioning

for Developers

Agenda

• Audience

• What is Partitioning?

• When Should I use Partitioning?

• How do I use Partitioning?

• Summary

Audience

Audience

• The changing nature of database

administration and development

• Designer = Developer?

• Anyone who needs performance on large

data sets

• Performance vs Maintenance and

Availability

What is Partitioning?

What is Partitioning?

• A partition physically separates data

• Tables AND indexes can be partitioned

• Partitions improve performance(*)

• Partitioning is transparent to applications

What is Partitioning?

• This presentation covers Oracle only

• Oracle offers extremely robust partitioning

• The optimizer recognizes partitioned

objects

What is Partitioning?

• Data file

– Physical OS disk file that stores data

– All data in an Oracle database ends up in a

data file

– A tablespace may be made up of 1 or more

data files

– This allows data to be spread across multiple

physical or logical disks for performance

What is Partitioning?

• Segment

– A segment is a logical container for an object

in an Oracle database

– A normal table (i.e. non partitioned) is exactly

one segment

– A regular index is exactly one a segment

– A partitioned table will be made up of as

many segments as it has partitions

– The same is true of partitioned indexes

What is Partitioning?

• Tablespace

– A tablespace is a logical container for

segments.

– A tablespace may be empty but it will most

likely hold one or more segments.

– So, a tablespace is logically made up of

segments and physically made up of data files

What is Partitioning?

• Data Blocks & Extents

– Not really pertinent to a discussion of

partitioning.

– A data block is a physical chunk of data

stored in a data file and an extent is a set of

contiguous data blocks

– A segment is made up of extents.

– A segment can span multiple data files but an

extent must be contained within a single data

file

What is Partitioning?

• A normal table (non-partitioned) is a single

segment

• A partitioned table stores one partition per

segment

• A normal index is a single segment

• A partitioned index stores one partition per

segment

• Each segment may exist in a different

tablespace

What is Partitioning?

• How does all of this help me?

– A scan, of a table or an index, scans the

entire segment

– 1 Billion rows in a table means 1 billion rows

to scan

– 1 Billion rows partitioned in 10 partitions

*MIGHT* mean only 100 million rows to be

scanned

– You can have more than 10 partitions for

even better performance (maybe)

What is Partitioning?

True or False – Partitioning is only useful if

each partition is in its own tablespace?

False - This is a myth. Putting each in its

own tablespace can help administration

and management but you can still get

serious gains without that.

What is Partitioning?

• Partitioning is an important aspect of

designing your application

• Not so long ago, 1 million records was a

lot

• 1 billion records is becoming normal

• The 10 GB database of yesterday is the

10TB database of today

• Yahoo has an active 150PB database

What is Partitioning?

• Hybrid databases, OLTP + Reporting, are

becoming more common rather than less

• Is this a good idea? Doesn't matter. We

need to deal with it

• Partitioning is useful in OLTP databases

What is Partitioning?

• How does Oracle use partitioning?

– Partition Pruning is the Key

• Partition pruning let's Oracle only scan those

partitions that actually hold the data requested

• Partition pruning is based on a Partition Key

• The partition key identifies the partition where data

will – does – reside

• Identifying the correct partition key is critical to

useful partitioning

What is Partitioning?

• Partition Keys

– Can be, but rarely is, the primary key

– A primary key uniquely identified a row of data

– A partition key identifies a category of data

– Partition keys should be important to the

majority of data requests against a table

– In addition to the partition type, a partition key

helps decide the number of partitions

What is Partitioning?

How to identify a partition key

• How does the business look at the data?

• What type of partitioning makes the most

sense?

• In retail, a sale_date might make sense

• If geography is important, a region,

country or state might be a good choice

What is Partitioning?

Partition key = sale_date

On Insert, Oracle peeks at

the sale_date value and

inserts the row into the

correct partition.

On select, the optimizer

peeks at the sale_date and

compares that to the

partition key, only partitions

where the data COULD

reside will be scanned

What is Partitioning?

• Identifying the keys means you must

identify key data

• If you choose region, you must identify

which regions will live in which partition

• If you choose sale_date, will it be the year,

month-year, which years, etc

• Your partition key and partition values are

tightly couple to partition types

What is Partitioning?

Note: Oracle 11g allows you to use Virtual

Columns as partition keys, prior versions

did not.

A virtual column is an expression rather than

a specific column in the table. For

example, you can add to dates, or

consolidate using last_day(order_date) or

even substring out a piece of a column

and that is a virtual column.

What is Partitioning?

• Partition Types in 11g

– List Partition

– Range Partition

– Hash Partition

– Reference Partition (new and very cool!)

– System Partition

– Interval Partition

– Composite Partition

What is Partitioning?

• List Partition

– Very simple to implement and understand

– If you data is list or would make a good

lookup value

– Examples: state, region, sic code, color,

product type, industry code, make, model

– You must enumerate the list

– Values must be unambiguous

What is Partitioning?

List Partition Pseudo Code

Partition by LIST (state_column)

Partition 1 stores ('AZ', 'AK', 'AL')

Partition 2 stores ('NM', 'NY', 'MD')

Partition 3 stores ('LA', 'TX', 'MS')

Partition 4 stores (NULL VALUES) Partition 5 stores (ALL OTHER VALUES)

What is Partitioning?

List Partition Real Code

PARTITION BY LIST (state_column)

(

PARTITION state1 VALUES ('AZ', 'AK', 'AL'),

PARTITION state2 VALUES ('NM', 'NY', 'MD'),

PARTITION state3 VALUES ('LA', 'TX', 'MS'),

PARTITION null_states VALUES (NULL),

PARTITION other_states VALUES (DEFAULT)

)

What is Partitioning?

• Range Partition

– Very common partitioning type

– Like a between clause

– Like a hierarchy of < (less than)

• <5

• <10

• <20 etc

– All values must be enumerated

– Values must be unambiguous

What is Partitioning?

Range Partitioning Pseudo Code

Partition by RANGE (sale_date)

Partition 2005 stores sale_date < 01-jan-2005

Partition 2006 stores sale_date < 01-jan-2006

Partition 2007 stores sale_date < 01-jan-2007

Partition 2008 stores sale_date < 01-jan-2008

Partition infinity stores sale_date < INFINITY

• Order is important

What is Partitioning?

Range Partition Real Code PARTITION BY RANGE (sale_date)

(

PARTITION y2005 VALUES LESS THAN

(TO_DATE('01/01/2006', 'DD/MM/YYYY')),

PARTITION y2006 VALUES LESS THAN

(TO_DATE('01/01/2007', 'DD/MM/YYYY')),

PARTITION y2007 VALUES LESS THAN

(TO_DATE('01/01/2008', 'DD/MM/YYYY')),

PARTITION y2008 VALUES LESS THAN

(TO_DATE('01/01/2009', 'DD/MM/YYYY')),

PARTITION yInfinite VALUES LESS THAN (MAXVALUE)

);

What is Partitioning?

• Hash Partition

– Tables with no obvious partition key

– No logical range or list available

– Sequence could be a hash partition key

• Think Customer_id, product_id, etc

• Not as important now that reference partitioning is

available

– Balances data evenly across the number of

partitions defined

– DBAs should like it (for admin reasons)

What is Partitioning?

Hash Partitioning Pseudo Code Partition by hash (seq_pk)

create 3 Partitions

Hash Partitioning Real Code Partition by hash (seq_pk)

partitions 5;

What is Partitioning?

• Reference Partition

– New in 11g

– Foreign Key based partitioning

• Active, enforced FK -> PK required

– Child table is automatically by the parent

partition key, even when the column does not

exist in the child table

– Fast, fast, fast joins

What is Partitioning?

Reference Partitioning Pseudo Code Child_Table Constraint foreign_key_name

Foreign Key relates to parent_table (key)

Partitition by reference (foreign_key_name)

Reference Partitioning Real Code CONSTRAINT order_fk

FOREIGN KEY (order_Id)

REFERENCES orders(order_id) )

PARTITION BY REFERENCE (order_fk);

What is Partitioning?

• System Partition

– Programmatic partitioning

– You have to write all of the code

– Specialized third-party tools or applications

– Not really a day to day usage of partitioning

What is Partitioning?

• Interval Partition

– Special case of Range Partition

– Let Oracle create your partitions for you

– Range partitions can be converted to Interval

• ALTER TABLE SET INTERVAL

– Fix it and Forget it!

What is Partitioning?

Interval Partitioning Pseudo Code

Partition by RANGE (sale_date)

Interval of 1 year

Start with partition 1

sale_date < 01-jan-2005

What is Partitioning?

Interval Partitioning Real Code

Partition by RANGE (sale_date)

INTERVAL (NUMTOYMINTERVAL(1,'year'))

(PARTITION p_year VALUES

LESS THAN TO_DATE('01-JAN-2005') ) );

What is Partitioning?

• Composite Partition

– A composite is just a mix of two partition types

– First partition by high level order

– Follow with inner partition

– Plenty of composite types

• Range Hash

• Range List

• Range Range

• List List

• List Hash

• List Range

What is Partitioning?

Composite (Range Hash) Partitioning

Psuedo Code Partition by Range (sale_date)

Partition 2005 stores sale_date < 01-jan-2005

Subpartition by HASH (sic_cd)

create 5 subpartitions

Partition 2006 stores sale_date < 01-jan-2006

Subpartition by HASH (sic_cd)

create 5 subpartitions

What is Partitioning?

Composite (Range Hash) Partitioning Real

Code Partition by Range (sale_date)

Subpartition by HASH (sic_cd)

subpartitions 5

(

PARTITION y2005 VALUES LESS THAN

(TO_DATE('01/01/2006', 'DD/MM/YYYY')),

PARTITION y2006 VALUES LESS THAN

(TO_DATE('01/01/2007', 'DD/MM/YYYY'))

)

What is Partitioning?

• Indexing Partitions

– Indexes can be partitioned

– Partitioned indexes don't always equate to

better performance

– Three types of indexes

• Global non-partitioned index

• Global partitioned index

• Local index

What is Partitioning?

• Global Non-Partitioned Index

– Index is a single segment

– Your PK is probably a global non-partitioned

index

• If you PK is part of your partition key, it may not be

– This is a regular index like on a non-

partitioned table

– Useful for OLTP but sacrifices management

and availability

What is Partitioning?

• Global Partitioned Index

– Index exists in multiple segments

– Can only be a hash or range

– Partition key is different than table partition

key

– Range Partitioned Index is useful in OLTP

– Hash Partitioned Index is useful for

administration (spreads data out)

What is Partitioning?

• Local Index

– Partitioned by the same partition key as the

table

– Offer excellent through put (DSS and OLAP)

– Offer excellent availability

– Can be unique if partition key is part of unique

index key

– Better for OLAP but can be useful in OLTP

What is Partitioning?

• Oracle's Guidance to choosing an Index

– If the table partitioning column is a subset of

the index keys, use a local index

– If the index is unique and does not include the

partitioning key columns, then use a global

index

– If your priority is manageability, use a local

index

– If the application is an OLTP one and users

need quick response times, use a global

index

When Should I use Partitioning?

When Should I use Partitioning?

• Use partitioning when:

– A table is > 2GB (from Oracle)

– A Table is > 100 Million rows (from me)

• Think about it if table is > 1 million rows

– Partitioning is not free

• Balance your performance needs against your

budget

– If OLTP, can you remove some of the data?

• Data warehouse maybe?

When Should I use Partitioning?

• Use partitioning when:

– If OLAP or Reporting, seriously consider

paying the Partitioning license fee

– Is your data or customer base expected to

grow rapidly?

– Are you running on many CPUs?

• Partitioning can assist in parallelizing many

operations

How do I use Partitioning?

Here comes some code!

How do I use Partitioning?

• I'm going to use an Order Entry system as

an example

• Powerpoint sucks for code display

• I'll do my best

How do I use Partitioning?

How do I use Partitioning?

List Partition CREATE TABLE customers (

cust_id NUMBER NOT NULL PRIMARY KEY,

cust_name VARCHAR2(30),

region_code VARCHAR2(3) )

PARTITION BY LIST (region_code) (

PARTITION asia_region

VALUES ('AS', 'CN', 'JP', 'RS', 'CK'),

PARTITION euro_region

VALUES ('GB', 'FR', 'GR', 'IT', 'SC'),

PARTITION amer_region

VALUES ('US', 'MX', 'CD', 'GU', 'BR'),

PARTITION null_region

VALUES (NULL),

PARTITION error_region

VALUES (DEFAULT) );

How do I use Partitioning?

View the partitions created

SQL> select partition_name, high_value

2 from user_tab_partitions

3 where table_name = 'CUSTOMERS';

PARTITION_NAME HIGH_VALUE

------------------------------ ------------------------------

ASIA_REGION 'AS', 'CN', 'JP', 'RS', 'CK'

EURO_REGION 'GB', 'FR', 'GR', 'IT', 'SC'

AMER_REGION 'US', 'MX', 'CD', 'GU', 'BR'

NULL_REGION NULL

ERROR_REGION DEFAULT

How do I use Partitioning?

Range Partition CREATE TABLE orders (

order_id NUMBER NOT NULL PRIMARY KEY,

order_date DATE NOT NULL,

cust_id NUMBER,

extra_column VARCHAR2(10),

CONSTRAINT ordfk1

FOREIGN KEY (cust_id)

REFERENCES customers(cust_id) )

PARTITION BY RANGE (order_date) (

PARTITION y2005 VALUES LESS THAN (TO_DATE('01/01/2006', 'DD/MM/YYYY')),

PARTITION y2006 VALUES LESS THAN (TO_DATE('01/01/2007', 'DD/MM/YYYY')),

PARTITION y2007 VALUES LESS THAN (TO_DATE('01/01/2008', 'DD/MM/YYYY')),

PARTITION y2008 VALUES LESS THAN (TO_DATE('01/01/2009', 'DD/MM/YYYY')),

PARTITION yInfinite VALUES LESS THAN (MAXVALUE) );

How do I use Partitioning?

SQL> select partition_name, high_value

2 from user_tab_partitions

3 where table_name = 'ORDERS';

PARTITION_NAME HIGH_VALUE

------------------------------ ------------------------------

Y2006 TO_DATE(' 2007-01-01 00:00:00'

, 'SYYYY-MM-DD HH24:MI:SS', 'N

LS_CALENDAR=GREGORIA

Y2007 TO_DATE(' 2008-01-01 00:00:00'

, 'SYYYY-MM-DD HH24:MI:SS', 'N

LS_CALENDAR=GREGORIA

Y2008 TO_DATE(' 2009-01-01 00:00:00'

, 'SYYYY-MM-DD HH24:MI:SS', 'N

LS_CALENDAR=GREGORIA

YINFINITE MAXVALUE

Y2005 TO_DATE(' 2006-01-01 00:00:00'

, 'SYYYY-MM-DD HH24:MI:SS', 'N

LS_CALENDAR=GREGORIA

How do I use Partitioning?

Reference Partition CREATE TABLE order_lines (

order_id NUMBER NOT NULL,

line_num NUMBER NOT NULL,

line_amt NUMBER NOT NULL,

CONSTRAINT ol_pk

PRIMARY KEY (order_id, line_num),

CONSTRAINT order_fk

FOREIGN KEY (order_Id)

REFERENCES orders(order_id)

)

PARTITION BY REFERENCE (order_fk);

How do I use Partitioning?

SQL> select partition_name, high_value

2 from user_tab_partitions

3 where table_name = 'ORDER_LINES';

PARTITION_NAME HIGH_VALUE

------------------------------ --------------

Y2005

Y2006

Y2007

Y2008

YINFINITE

How do I use Partitioning?

• Orders is Partitioned by order_date

• Order_lines does not contain an

order_date field

• The reference partition ensures that the

order_lines table is also partioned by

order_date

How do I use Partitioning?

• Creating Indexes

– We already created global indexes when we

created the primary keys on the table

– Here are two local indexes:

CREATE INDEX ord_date_ind1 ON orders

(order_date) LOCAL;

CREATE INDEX extra_col_ind1 ON orders

(extra_column) LOCAL;

How do I use Partitioning?

SQL> select index_name, partition_name, high_value

2 from user_ind_partitions

3 order by index_name, partition_name;

INDEX_NAME PARTITION_ HIGH_VALUE

--------------- ---------- ----------------------------------------

EXTRA_COL_IND1 Y2005 TO_DATE(' 2006-01-01 00:00:00', 'SYYYY-M

M-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA

EXTRA_COL_IND1 Y2006 TO_DATE(' 2007-01-01 00:00:00', 'SYYYY-M

M-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA

EXTRA_COL_IND1 Y2007 TO_DATE(' 2008-01-01 00:00:00', 'SYYYY-M M-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA

EXTRA_COL_IND1 Y2008 TO_DATE(' 2009-01-01 00:00:00', 'SYYYY-M

M-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA

EXTRA_COL_IND1 YINFINITE MAXVALUE

ORD_DATE_IND1 Y2005 TO_DATE(' 2006-01-01 00:00:00', 'SYYYY-M

M-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA

ORD_DATE_IND1 Y2006 TO_DATE(' 2007-01-01 00:00:00', 'SYYYY-M

M-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA

ORD_DATE_IND1 Y2007 TO_DATE(' 2008-01-01 00:00:00', 'SYYYY-M

M-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA

ORD_DATE_IND1 Y2008 TO_DATE(' 2009-01-01 00:00:00', 'SYYYY-M

M-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA

ORD_DATE_IND1 YINFINITE MAXVALUE

10 rows selected.

How do I use Partitioning?

• Two local indexes created 10 partitions

• One index partition per table partition

How do I use Partitioning?

• Partition Maintenance

– Mainly a DBA task

– You add partitions, merge partitions, drop

partitions, etc

– Remember that index maintenance can affect

table availability

– Remember that table maintenance may affect

indexes

How do I use Partitioning?

• Querying Partitions

– In addition to querying a partitioned table, you

can also query individual partitions

– Normally don't need to worry about this

– Might want to see individual partition data

when the optimizer isn't being "smart"

– Add the partition keyword and a partition

name to query an individual partition

How do I use Partitioning?

Force a partition selction:

SELECT *

FROM orders

PARTITION (y2007)

WHERE order_date BETWEEN

to_date('01/02/2007', 'DD/MM/YYYY')

AND

to_date('15/02/2007', 'DD/MM/YYYY');

How do I use Partitioning?

• Tuning

– Way out of scope for this presentation

– See the Oracle 11g Performance Tuning

Guide

• One Note: Make sure to use

DBMS_STATS and be sure you analyze

partitions and indexes

How do I use Partitioning?

• Partition Advisor

– New in 11g

– Part of the SQL Access Advisor

– Fairly complex to use

– Since it's new, it's not the place to start

Summary

Summary

• Oracle Partitioning is robust

• Oracle Partitioning is performant

• Oracle Partitioning is useful in OLTP and

OLAP databases

• Oracle Partitioning is not free

• Choosing the right Partition Key is critical

Lewis Cunningham

Senior Solutions Architect

EnterpriseDB

[email protected]

Oracle 11g Partitioning

for Developers