Exadata, Oracle Data Integrator and Parallel Data Load: A Real-World Case Study

Kellyn Pot’Vin, Sr. Technical Consultant

Who I am

• Westminster, CO
• Oracle ACE Director
• Sr. Technical Consultant, Enkitec
• Blog at http://dbakevlar.com
• Board of Directors and Director of the RMOUG Training Days Conference in Denver, CO
• Database Track Lead for KSCOPE 2014 in Seattle, WA
• Tweet @DBAKevlar
• Author: Expert Enterprise Manager 12c, Pro SQL Server 2012, Apress
• WIT (Women In Technology) advocate

Agenda for this Session

• Discuss ODI Requirements
• Discuss Exadata Features
• Discuss “Old School” Limitations to ETL Design
• Solution to Create Speed in Reporting

Environment

• Multiple Exadata environments.
• Oracle Data Integrator is a new feature, introduced as part of a recent consolidation effort.
• 15-20 consolidated databases on each development Exadata.
• 10 currently on the production Exadata, with more being consolidated monthly.
• Monitored by EM12c.
• GoldenGate was originally used for the consolidations and is now commissioned to support ODI for the new project.

Why Oracle Data Integrator

Oracle Data Integrator (ODI)

• Supports enterprise platforms with its open and integrated E-LT architecture.
• Simple mapping wizards and a user-friendly interface.
• Integrates with Enterprise Manager 12c.
• Also integrates with WebLogic, SAP, APIs and other advanced features.

ODI in Action…

*Courtesy of Oracle.com

What is Change Data Capture (CDC)?

• Originally self-contained, then part of Streams.
• Not all CDC is created (built) the same. A good CDC implementation is:
  • Transaction “aware”
  • Recoverable
  • Able to handle data integrity
  • Scalable
  • Flexible and robust
• This is your goal; practice (testing) makes perfect!

Designing in Oracle Data Integrator

CDC and/or just ODI?

• CDC is part of the GoldenGate piece.
• GoldenGate offers real-time “continuous capture and delivery” of changes to your warehouse.
• Once synchronized with the source, GoldenGate can be integrated into ODI.
• GoldenGate requires a term license from Oracle.

Where Does Exadata Come In?

• Smart Scans: Offloading large table scans to the cell nodes is an impressive enhancement to performance.
• Storage Indexes: If a report gives Exadata the opportunity to create a storage index that assists performance, this feature will benefit reporting.
• HCC (Hybrid Columnar Compression): Compression applies if data is loaded appropriately (APPEND should be used), and there must be NO UPDATING of data in compressed objects!
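A minimal sketch of an HCC-friendly load, in Oracle SQL. The table names SALES_STAGE and SALES_FACT_HCC are hypothetical, and QUERY HIGH is only one of the available HCC levels; neither comes from the case study.

```sql
-- Hypothetical tables; the compression level is an assumption.
CREATE TABLE sales_fact_hcc
  COMPRESS FOR QUERY HIGH
AS
SELECT * FROM sales_stage WHERE 1 = 0;   -- copy the structure only

-- Direct-path insert: the APPEND hint loads above the high-water mark,
-- which is the load style HCC expects.
INSERT /*+ APPEND */ INTO sales_fact_hcc
SELECT * FROM sales_stage;

-- A direct-path insert must be committed before the same session
-- can query the table again.
COMMIT;

-- Confirm how the table is defined to compress.
SELECT table_name, compression, compress_for
FROM   user_tables
WHERE  table_name = 'SALES_FACT_HCC';
```

Later changes to such a table are best handled by reloading or exchanging partitions rather than by UPDATE, in line with the "no updating" rule above.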

Out of the Starting Gate with ODI…

Almost identical performance issues on each statement.

Red is concurrency- but what is the concurrency on?

Performance Tanked…

Did we Offload?

SQL Monitor Confirms Easily…

So why aren’t we seeing great performance??

From the Execution Plan Perspective

Hasn’t Finished!

What did the SQL look like that was created through ODI?

insert into TABLE1...
SELECT /*+parallel(4)*/
       SUM(col1) ….,
       CASE WHEN col3 IS NULL THEN 0 ELSE col3 END, 0,
       CASE WHEN col7 IS NULL THEN 0 ELSE col7 END, 0
FROM   SOURCE_TBL1, SOURCE_TBL2, SOURCE_TBL3, SOURCE_TBL4,
       (SELECT (lots of columns and sorting and grouping) from SOURCE_TBL3 and SOURCE_TBL4)
WHERE  (1=1)
AND    col5 > 0
AND    (col8=col12(+))
AND    DT_col1 between DT_col2(+) AND DT_col9(+)
GROUP BY col6, col4, col3, col9, DT_col1;

Issue #1- Performance and High Temp Usage

The Cost of Temp Waits to ODI Processing

Event                        Waits     Time(s)   Pct    Type
---------------------------  --------  --------  -----  --------
DB CPU                                 4,314     42.0
direct path read temp        545,690   3,389     33.0   User I/O
direct path write temp       156,464   1,296     12.6   User I/O
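In the figures above, the two temp events together account for 33.0 + 12.6 = 45.6 percent of the time, more than DB CPU at 42.0 percent. As a rough sketch of checking the same two events outside of AWR, the standard V$SYSTEM_EVENT view can be queried directly. Its counters are cumulative since instance startup, so the useful number is the difference between two samples taken around the load.

```sql
-- Cumulative since instance startup: sample before and after the
-- ODI load and compare, rather than reading the absolute values.
SELECT event,
       total_waits,
       ROUND(time_waited_micro / 1000000) AS time_waited_s,
       wait_class
FROM   v$system_event
WHERE  event IN ('direct path read temp', 'direct path write temp')
ORDER  BY time_waited_micro DESC;
```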

Why the Temp Usage?

• Large hash joins following offloaded table scans.
• Summing of the data in the query.
• Sorting the data in the query.
• DOP (Degree of Parallelism) set at the table level or as a hint.

Data was not aggregated or stored in a format easily utilized for reporting.
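One way to see which kind of operation is holding temp space while a load runs is to group the current temp segments by type. This is a sketch against the standard V$TEMPSEG_USAGE and DBA_TABLESPACES views, not a query from the case study. It only shows segments that exist at the moment it runs.

```sql
-- Current temp consumers by user and segment type (SORT, HASH, ...).
SELECT u.username,
       u.segtype,
       u.tablespace,
       ROUND(SUM(u.blocks * t.block_size) / 1024 / 1024) AS temp_mb
FROM   v$tempseg_usage u
       JOIN dba_tablespaces t
         ON t.tablespace_name = u.tablespace
GROUP  BY u.username, u.segtype, u.tablespace
ORDER  BY temp_mb DESC;
```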

Let’s Talk About Performance, PGA and Limits…

Two Types of PGA:
• Non-limited: outside of Oracle’s control, often used for PL/SQL tables, etc.
• Limited: allocated per process within Oracle for hashing, sorting, etc.

• Depending on release, process type, etc., there are approximate percentages that can be expected for the limits on PGA allocated to sort and hash processing.

When You Work Outside of PGA…

[Diagram: a tempfile, with processes writing to temp and processes reading from temp]

Types of PGA Processing

Optimal- All fits within the PGA allocation per process.

Single Pass- Written once to temp tablespace.

Multi-Pass- Written multiple times to the temp tablespace to achieve results (least desirable).
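The three categories above can be counted on a running instance. The sketch below uses the standard V$SYSSTAT and V$SQL_WORKAREA_HISTOGRAM views; the size conversion to KB is just for readability.

```sql
-- Instance-wide counts of optimal, one-pass and multi-pass work areas.
SELECT name, value
FROM   v$sysstat
WHERE  name LIKE 'workarea executions%';

-- The same counts broken down by work area size, to show how large
-- the operations that spill to temp actually are.
SELECT low_optimal_size  / 1024 AS low_kb,
       high_optimal_size / 1024 AS high_kb,
       optimal_executions,
       onepass_executions,
       multipasses_executions
FROM   v$sql_workarea_histogram
WHERE  total_executions > 0
ORDER  BY low_optimal_size;
```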

Disk is Slow…

What is “* Cache Hit”?

When something does not “fit”, we know it’s gone back to disk to perform the task and disk is slow.

If PGA allocation is surpassed, the processing is then performed in the Temp Tablespace.

Where do the tempfiles for the Temp Tablespace reside?

Temp is disk, disk is SLOW!
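Assuming the “* Cache Hit” on this slide refers to the PGA cache hit percentage (the original screenshot is not reproduced here), the figure can be read from the standard V$PGASTAT view. Oracle derives it from the bytes processed in memory versus the extra bytes read and written through temp.

```sql
-- PGA health at a glance; 'extra bytes read/written' is the volume
-- that went through temp in one-pass and multi-pass operations.
SELECT name, value, unit
FROM   v$pgastat
WHERE  name IN ('aggregate PGA target parameter',
                'total PGA allocated',
                'bytes processed',
                'extra bytes read/written',
                'cache hit percentage',
                'over allocation count');
```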

Tuning these Limits

Increasing PGA will of course increase the amount available per process for some types of sorting and hashing, but there is still a limit.

Parallel can offer some assistance by spreading the PGA load across multiple processes- but there is a cost to “knitting” the results back together.
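Before raising the PGA target, it can help to ask the instance what a larger value is estimated to buy. The sketch below uses the standard V$PGA_TARGET_ADVICE view; it is an illustration, not a step taken from the case study.

```sql
-- Estimated effect of different PGA_AGGREGATE_TARGET settings.
-- pga_target_factor = 1 is the current setting.
SELECT ROUND(pga_target_for_estimate / 1024 / 1024) AS target_mb,
       pga_target_factor,
       estd_pga_cache_hit_percentage,
       estd_extra_bytes_rw,
       estd_overalloc_count
FROM   v$pga_target_advice
ORDER  BY pga_target_for_estimate;
```

If the estimated cache hit percentage stops improving well before the largest target, more PGA alone is unlikely to remove the temp usage.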

What you should not do:
• Do not start playing with "_pga_max_size", "_smm_max_size" and "_smm_px_max_size".

Stop Using Temp

Sorting and Hashing should be done as much as possible within PGA.

Limit any “swapping” to temp.

Ensure you are viewing temp usage in your explain/execution plans!
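A sketch of how temp usage can be viewed for one statement, using the standard DBMS_XPLAN package and the V$SQL_WORKAREA view. Here sql_id is a SQL*Plus substitution variable to be supplied by the reader, not a value from the case study.

```sql
-- Execution plan with memory statistics for the last execution.
-- Operations that spilled to temp show a value under Used-Tmp.
SELECT *
FROM   TABLE(DBMS_XPLAN.DISPLAY_CURSOR('&sql_id', NULL, 'ALLSTATS LAST'));

-- The same information per work area: LAST_EXECUTION reports
-- OPTIMAL, 1 PASS, or the number of passes.
SELECT operation_type,
       last_execution,
       ROUND(last_memory_used  / 1024 / 1024) AS last_mem_mb,
       ROUND(last_tempseg_size / 1024 / 1024) AS last_temp_mb
FROM   v$sql_workarea
WHERE  sql_id = '&sql_id'
ORDER  BY operation_id;
```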

How Do You Know?

An AWR report covering a long range of snapshots is a good place to start…
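As a complement to the AWR report, the retained ASH history can be searched for the statements that waited most on temp. This sketch assumes the Diagnostics Pack is licensed; the seven-day window is an arbitrary choice.

```sql
-- Statements with the most sampled temp waits over the last 7 days.
-- Rows in this view are samples, so counts are relative, not exact.
SELECT sql_id,
       event,
       COUNT(*) AS samples
FROM   dba_hist_active_sess_history
WHERE  event IN ('direct path read temp', 'direct path write temp')
AND    sample_time > SYSTIMESTAMP - INTERVAL '7' DAY
GROUP  BY sql_id, event
ORDER  BY samples DESC;
```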

Searching SQL History with EM12c

Issue #2- So What is Causing our Changes now?

11.2.0.2, PSU Jan 2013+: changes to when dynamic sampling is initiated for parallel statements.

The AWR SQL-specific report (awrsqrpt.sql) shows that dynamic sampling did occur.

First Step: Consistent Behavior

• Statistics verification at each step of the data load process.
• Parallel is needed, so how do we address dynamic sampling?
  • ALTER SYSTEM SET optimizer_dynamic_sampling=0
  • ODI allows hints to be introduced at universe: add the hint “dynamic_sampling(0)”
• Parallel controlled to verify consistent DOP.
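The three steps above might look like the following in Oracle SQL. The table and column names are borrowed from the anonymized ODI statement purely for illustration, and the column list is hypothetical.

```sql
-- Session-level alternative to the system-wide setting.
ALTER SESSION SET optimizer_dynamic_sampling = 0;

-- Verify statistics before a load step (table names are placeholders).
SELECT table_name, num_rows, last_analyzed, stale_stats
FROM   user_tab_statistics
WHERE  table_name IN ('SOURCE_TBL1', 'SOURCE_TBL2');

-- Statement-level hints: a fixed DOP of 4 and no dynamic sampling.
INSERT /*+ APPEND */ INTO table1
SELECT /*+ parallel(4) dynamic_sampling(0) */
       col6, col4, SUM(col1)
FROM   source_tbl1
GROUP  BY col6, col4;

COMMIT;
```

With dynamic sampling disabled, the optimizer depends entirely on stored statistics, which is why the verification step comes first.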

Performance Stable- We’ve Got it ALL COVERED!

• Identified high resource use, repeated sorts & sums.
• Created objects to limit the amount that has to be performed in PGA by introducing:
  • Rollup tables
  • Materialized views with a single nighttime refresh, post ETL load
  • Indexing
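A minimal sketch of the materialized view approach, assuming a hypothetical source table and columns named after the anonymized ODI statement. The object names, grouping columns and index are illustrative only.

```sql
-- Pre-aggregate once so that reports do not repeat the sort and sum.
CREATE MATERIALIZED VIEW rpt_daily_sums_mv
  BUILD IMMEDIATE
  REFRESH COMPLETE ON DEMAND
AS
SELECT col6,
       col4,
       TRUNC(dt_col1) AS load_day,
       SUM(col1)      AS col1_total,
       COUNT(*)       AS row_cnt
FROM   source_tbl1
GROUP  BY col6, col4, TRUNC(dt_col1);

-- Index the columns that reports filter on.
CREATE INDEX rpt_daily_sums_mv_ix
  ON rpt_daily_sums_mv (load_day, col6);

-- Single complete refresh, run once per night after the ETL load.
BEGIN
  DBMS_MVIEW.REFRESH(list => 'RPT_DAILY_SUMS_MV', method => 'C');
END;
/
```

Reports then read the much smaller pre-aggregated object, so the heavy hash and sort work happens once per night instead of on every report run.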

Elapsed Time, Improvements

Summary

• Identify WHAT is consuming time.
• Understand the limits on PGA/Temp usage: repeatedly the biggest hurdle in ETL projects.
• Understand that TEMP is disk and disk is SLOW, even on Exadata.
• Identify what data you are repeatedly aggregating and create objects to support reporting.

Questions?
