Dynamic Partitioning in Informatica 8.x


Upload: hotjobhunt

Post on 18-Nov-2014


TRANSCRIPT

Page 1: Dynamic Partitioning in Informatica 8.X

integration * intelligence * insight

Dynamic Partitioning

Page 2: Dynamic Partitioning in Informatica 8.X

AGENDA

• High Availability

• Grid Computing

• Dynamic Partitioning

Page 3: Dynamic Partitioning in Informatica 8.X

Introduction

PowerCenter Domains

PowerCenter introduces a service-oriented architecture.

PowerCenter introduces a domain, which serves as the primary unit of administration for the PowerCenter environment.

A domain is a collection of nodes and services in the PowerCenter environment.

The first time you install Informatica Services, you create a domain and add a node to the domain.

Page 4: Dynamic Partitioning in Informatica 8.X

Administration Console

• The Administration Console is a browser-based utility that enables you to view domain properties and perform basic domain administration tasks.

• The Navigator displays the following types of objects:

• Domain. You can view one domain in the Administration Console.

• Node. A node represents a machine in the domain.

• Grid. Create a grid to run the Integration Service on multiple nodes.

Page 5: Dynamic Partitioning in Informatica 8.X

Administration Console

Page 6: Dynamic Partitioning in Informatica 8.X

Administration Console contd.

Page 7: Dynamic Partitioning in Informatica 8.X

Administration Console contd.

Page 8: Dynamic Partitioning in Informatica 8.X

High Availability

• High availability is the PowerCenter option that eliminates a single point of failure in the PowerCenter environment.

• High availability provides the following functionality:

• Resilience.

• Failover.

• Recovery.

Page 9: Dynamic Partitioning in Informatica 8.X

The Partitioning Option

• The Partitioning Option increases PowerCenter’s performance through parallel data processing.

• When the Integration Service runs the session, it can achieve higher performance by partitioning the pipeline and performing the extract, transformation, and load for each partition in parallel.

• Partition types:

• Database partitioning.

• Hash auto-keys.

• Hash user keys.

• Key range.

• Pass-through.

• Round-robin.

Page 10: Dynamic Partitioning in Informatica 8.X

Configuring Partitioning

• Create or edit a session.

• Update partitioning information using the Partitions view on the Mapping tab of session properties.

• Add, delete, or edit partition points on the Partitions view of session properties.

Page 11: Dynamic Partitioning in Informatica 8.X

Configuring a Partition Point

• You can configure the following information when you edit or add a partition point:

• Specify the partition type at the partition point.

• Add and delete partitions.

• Enter a description for each partition.

Page 12: Dynamic Partitioning in Informatica 8.X

Hash user keys

• The Integration Service uses a hash function to group rows of data among partitions.

• Hash partitioning can improve session performance. The hash function usually processes numerical data more quickly than string data.

• For hash user keys partitioning, you specify the hash key.

• In our sample mapping (m_orders_scd3), the session runs in 37 seconds when partitioning is not configured.

Page 13: Dynamic Partitioning in Informatica 8.X

Hash user keys contd.

• With hash user keys partitioning, the session completes in 22 seconds, as shown in the figure below.
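The grouping idea behind hash user keys can be sketched in a few lines of Python. This is only an illustration, not PowerCenter's implementation: the function name hash_partition, the sample orders rows, and the use of CRC32 as the hash are all assumptions made for the example.

```python
import zlib


def hash_partition(rows, key_fields, num_partitions):
    """Assign each row to a partition by hashing the user-chosen key fields."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        # Build one stable string from the selected key fields.
        key = "|".join(str(row[field]) for field in key_fields)
        # CRC32 is a stand-in hash (assumption); the real function is internal to the product.
        index = zlib.crc32(key.encode("utf-8")) % num_partitions
        partitions[index].append(row)
    return partitions


# Hypothetical sample rows, keyed on customer_id.
orders = [
    {"order_id": 1, "customer_id": 10},
    {"order_id": 2, "customer_id": 11},
    {"order_id": 3, "customer_id": 10},
    {"order_id": 4, "customer_id": 12},
]
parts = hash_partition(orders, ["customer_id"], 3)
```

Rows that share the same key value always land in the same partition, which is the property the partition type relies on.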

Page 14: Dynamic Partitioning in Informatica 8.X

Key range partition

• With key range partitioning, the Integration Service distributes rows of data based on a port.

• You define a range of values for each partition.
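Key range routing can be pictured with the following Python sketch. It is a hedged illustration only: the function name key_range_partition, the item_id port, and the boundary values are hypothetical, and treating each range as inclusive at the start and exclusive at the end is an assumption of this example.

```python
from bisect import bisect_right


def key_range_partition(rows, port, upper_bounds):
    """Route rows by comparing row[port] with sorted, exclusive upper bounds.

    upper_bounds=[100, 200] gives three ranges (assumed boundaries):
    value < 100, 100 <= value < 200, and value >= 200.
    """
    partitions = [[] for _ in range(len(upper_bounds) + 1)]
    for row in rows:
        # bisect_right returns how many bounds are <= the value,
        # which is exactly the index of the range the value falls in.
        index = bisect_right(upper_bounds, row[port])
        partitions[index].append(row)
    return partitions


# Hypothetical rows keyed on an item_id port.
items = [{"item_id": v} for v in (50, 100, 150, 200, 250)]
ranges = key_range_partition(items, "item_id", [100, 200])
range_sizes = [len(p) for p in ranges]
```

Unlike hashing, the split is only as even as the ranges you choose, so skewed key values can leave one partition with most of the rows.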

Page 15: Dynamic Partitioning in Informatica 8.X

Key range partition contd.

• With key range partitioning, the session completes in 33 seconds, as shown in the figure below.

Page 16: Dynamic Partitioning in Informatica 8.X

Partition details

• Source/target statistics

Page 17: Dynamic Partitioning in Informatica 8.X

Hash auto-keys

• Use hash auto-keys partitioning at or before Rank, Sorter, Joiner, and unsorted Aggregator transformations.

• The Integration Service distributes rows to each partition according to group before they enter the Sorter and Aggregator transformations.
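Why grouping before the Aggregator matters can be shown with a small Python sketch. Everything here is illustrative: the helper names, the sales rows, and the CRC32 hash are assumptions, not the product's internals. The point is that when rows are distributed by group key, each partition can aggregate on its own and no group is split across partitions.

```python
import zlib
from collections import defaultdict


def partition_by_group(rows, group_field, num_partitions):
    """Send every row of a group to the same partition (stand-in CRC32 hash)."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        key = str(row[group_field]).encode("utf-8")
        partitions[zlib.crc32(key) % num_partitions].append(row)
    return partitions


def aggregate(rows, group_field, value_field):
    """Sum value_field per group, as an unsorted aggregation would."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[group_field]] += row[value_field]
    return dict(totals)


# Hypothetical input rows.
sales = [
    {"item": "A", "qty": 2},
    {"item": "B", "qty": 5},
    {"item": "A", "qty": 3},
    {"item": "C", "qty": 1},
]
per_partition = [aggregate(p, "item", "qty")
                 for p in partition_by_group(sales, "item", 2)]

# No group appears in two partitions, so the partial results merge without conflicts.
merged = {}
for totals in per_partition:
    merged.update(totals)
```

The merged totals match what a single unpartitioned aggregation would produce, which would not hold if rows of one group were spread across partitions.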

Page 18: Dynamic Partitioning in Informatica 8.X

Pass-Through Partition Type

• In pass-through partitioning, the Integration Service processes data without redistributing rows among partitions.

• Use pass-through partitioning to increase data throughput without increasing the number of partitions.

Page 19: Dynamic Partitioning in Informatica 8.X

Round-Robin Partition Type

• In round-robin partitioning, the Integration Service distributes rows of data evenly to all partitions.

• The session based on this mapping reads item information from three flat files of different sizes:

• Source file 1: 80,000 rows

• Source file 2: 5,000 rows

• Source file 3: 15,000 rows

• When the Integration Service reads the source data, the first partition begins processing 80% of the data, the second partition processes 5% of the data, and the third partition processes 15% of the data.

• To distribute the workload more evenly, set a partition point at the Filter transformation and set the partition type to round-robin. The Integration Service distributes the data so that each partition processes approximately one-third of the data.
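The effect of the round-robin partition point on the three source files can be checked with a short Python sketch. The function name round_robin and the use of plain integers as rows are assumptions made for illustration; only the row counts come from the slide.

```python
from itertools import chain


def round_robin(rows, num_partitions):
    """Deal rows out to partitions in turn, one row at a time."""
    partitions = [[] for _ in range(num_partitions)]
    for position, row in enumerate(rows):
        partitions[position % num_partitions].append(row)
    return partitions


# Row counts of the three flat files from the slide.
source_sizes = [80_000, 5_000, 15_000]
sources = [range(size) for size in source_sizes]  # integers stand in for rows

# Before the partition point: one partition per source file.
before = [len(source) for source in sources]

# After the round-robin partition point: rows are redistributed evenly.
after = [len(p) for p in round_robin(chain.from_iterable(sources), 3)]
```

Comparing before and after shows the 80/5/15 split turning into three partitions of roughly one-third each.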

Page 20: Dynamic Partitioning in Informatica 8.X

Dynamic Partitioning

• If the volume of data grows or you add more CPUs, you might need to adjust partitioning so the session run time does not increase.

• When you use dynamic partitioning, you can configure the partition information so the Integration Service determines the number of partitions to create at run time.

• The Integration Service scales the number of session partitions at run time based on factors such as source database partitions or the number of nodes in a grid.

Page 21: Dynamic Partitioning in Informatica 8.X

Configuring Dynamic Partitioning

Page 22: Dynamic Partitioning in Informatica 8.X

Configuring Dynamic Partitioning contd.

• Configure dynamic partitioning using one of the following methods:

• Disabled. Do not use dynamic partitioning. Defines the number of partitions on the Mapping tab.

• Based on number of partitions. Sets the partitions to a number that you define in the Number of Partitions attribute. Use the $DynamicPartitionCount session parameter, or enter a number greater than 1.

• Based on number of nodes in grid. Sets the partitions to the number of nodes in the grid running the session. If you configure this option for sessions that do not run on a grid, the session runs in one partition and logs a message in the session log.

• Based on source partitioning. Determines the number of partitions using database partition information. The number of partitions is the maximum of the number of partitions at the source.
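The four methods above amount to a small decision rule for the partition count, sketched here in Python. This is not an Informatica API: the function partition_count, its parameter names, and the method strings are hypothetical, and falling back to one partition when no source partition information is available is an assumption of the sketch.

```python
def partition_count(method, *, static_count=1, configured_count=None,
                    grid_nodes=None, source_partition_counts=()):
    """Return the number of partitions for a session run (illustrative only)."""
    if method == "disabled":
        # Partitions are whatever was defined on the Mapping tab.
        return static_count
    if method == "number_of_partitions":
        # The value of the Number of Partitions attribute, e.g. resolved
        # from $DynamicPartitionCount; the slide asks for a number > 1.
        if configured_count is None or configured_count <= 1:
            raise ValueError("enter a number greater than 1")
        return configured_count
    if method == "nodes_in_grid":
        # Not running on a grid: the session runs in one partition.
        return grid_nodes if grid_nodes else 1
    if method == "source_partitioning":
        # Maximum of the partition counts found at the sources;
        # default of 1 is an assumption for the empty case.
        return max(source_partition_counts, default=1)
    raise ValueError(f"unknown method: {method}")


by_number = partition_count("number_of_partitions", configured_count=3)
no_grid = partition_count("nodes_in_grid", grid_nodes=None)
by_source = partition_count("source_partitioning",
                            source_partition_counts=(2, 4, 3))
```

Keeping the rule in one function makes the fallback cases (no grid, no source partition information) explicit.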

Page 23: Dynamic Partitioning in Informatica 8.X

Based on number of partitions

• Edit the session task and go to the Config Object tab. Set dynamic partitioning to Based on number of partitions, and set Number of Partitions to 3.

Page 24: Dynamic Partitioning in Informatica 8.X

Based on number of partitions contd.

• With dynamic partitioning based on number of partitions, the session completes in 32 seconds, as shown in the figure below.

Page 25: Dynamic Partitioning in Informatica 8.X

Partition details

• Source/target statistics

Page 26: Dynamic Partitioning in Informatica 8.X

Based on number of nodes in grid

• Edit the session task and go to the Config Object tab. Set dynamic partitioning to Based on number of nodes in grid.

Page 27: Dynamic Partitioning in Informatica 8.X

Based on number of nodes in grid contd.

• With dynamic partitioning based on number of nodes in grid, the session completes in 25 seconds, as shown in the figure below.

Page 28: Dynamic Partitioning in Informatica 8.X

Based on source partitioning

• Edit the session task and go to the Config Object tab. Set dynamic partitioning to Based on source partitioning.

Page 29: Dynamic Partitioning in Informatica 8.X

Based on source partitioning contd.

• With dynamic partitioning based on source partitioning, the session completes in 20 seconds, as shown in the figure below.
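The run times quoted on the slides can be put side by side with a little arithmetic. The Python below is only a convenience sketch: the dictionary labels are our own, and each figure is a single reported run of the sample mapping, so the ratios are indicative rather than a benchmark.

```python
# Run time in seconds without partitioning, as reported on the slides.
baseline_seconds = 37

# Run times in seconds reported for each partitioning setup.
run_times = {
    "hash user keys": 22,
    "key range": 33,
    "dynamic: number of partitions": 32,
    "dynamic: nodes in grid": 25,
    "dynamic: source partitioning": 20,
}

# Speedup relative to the unpartitioned run (baseline / partitioned time).
speedups = {name: round(baseline_seconds / seconds, 2)
            for name, seconds in run_times.items()}

# Setup with the shortest reported run time.
fastest = min(run_times, key=run_times.get)

for name, ratio in speedups.items():
    print(f"{name}: {ratio}x")
```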

Page 30: Dynamic Partitioning in Informatica 8.X

Advantages of Dynamic Partition

• Session run time does not have to increase as the volume of data grows or as you add more CPUs, because the number of partitions is adjusted at run time.

• Scales cost-effectively to handle large data volumes.

• Enhances developer productivity.

• Optimizes system performance in response to changing business requirements.

• With grid computing, the session can still complete even if a system in the grid fails.

Page 31: Dynamic Partitioning in Informatica 8.X

LIMITATIONS OF DYNAMIC PARTITION

• You cannot use dynamic partitioning with XML sources and targets.

• You cannot use dynamic partitioning with the Debugger.

Page 32: Dynamic Partitioning in Informatica 8.X

Thanks