Informatica: Differences Between Versions 7 & 8

Informatica User Group
PowerCenter: Differences Between v7 & v8
Mark Murray, Senior Sales Consultant
October 19th, 2006

Informatica confidential. For discussion purposes only.



Goals for New Architecture

• Enterprise Deployment
  • Improved Service Orientation
  • High Availability
  • Grid Deployments

• Centralized Services
  • Administration
  • Logging & Auditing

• Single Point of Administration
  • Traditional Configuration
  • HA Configuration
  • Grid Configuration


What do customers want?

• High Availability and Failover was a top 10 request in the 2004 User Group surveys

• Database Pushdown Optimization was 10th out of 66 features in the 2005 Surveys

• Improved logging capabilities was 2nd out of over 60 feature requests in the 2004 surveys

• Looping support within the Designer


Informatica Data Integration Platform: Continually Raising the Bar

PowerCenter 8.1.1 (Now)

PowerCenter 7 Advanced Edition

Mission-Critical Enterprise Deployment

One Product, Single Install

On-Demand Platform for the Enterprise

Hercules (2007)


Informatica Delivers Continuous Innovation

Chart: 1 TB Transform and Load Test (Hr:Min), across releases V4.x, V5.x, V6.x, V7.x and V8.x. Values shown on the chart: 6:35, 3:36, 0:37, <18 min.

Capabilities added in each release (each release retains those of the releases before it):
• V4.x: Pipelining, ERP Connectivity, UNICODE
• V5.x: Partitioning, Debugger, XML, Metadata connectivity
• V6.x: Realtime, Workflow, Data quality, 3-tier architecture, Enterprise metadata
• V7.x: SOA, Web services, Grid, 64-bit, Team development, Enterprise security, Mainframe Data Server and CDC, Impact analysis
• V8.x: Session on Grid, Adaptive Load Balancing, High Availability, Dynamic Partitioning, Pushdown Optimization, Unstructured Data, Data Federation

“With PowerCenter continually leapfrogging on performance and scalability, we are never concerned about our ability to handle increasingly large data volumes in our data integration environment.”
- Kevin Smith, CRM Strategies Manager, AAA Carolina


What else is in the Informatica product family?

PowerCenter Options

Data Cleanse and Match

Data Federation (EII)

Enterprise Grid

Pushdown Optimization

High Availability

Unstructured Data

New

Mapping Generation

PowerCenter 8 Standard Edition

PowerCenter 8 Advanced Edition

Metadata Manager

Data Analyzer

Team Based Development

Data Profiling

Partitioning

Real-Time

Updated

Metadata Exchange

PowerCenter Connects

Broader


PowerCenter 8 Base Improvements: Delivering Value for Installed Base Customers

Reduce Time To Results
• Java transformation support
• User defined functions
• Extended expression library
• Mapping generation and templates
• Improved Data Profiling

Cost Effectively Scale
• Centralized administration web-based console
• Extended recovery options
• Connection resilience (RDBMS, Network, PC)
• Flat File Performance Optimization
• Enhanced, centralized logging
• Enhanced Team-Based Development
• Unicode repository option

Diagram labels: PowerCenter Standard Edition; PowerCenter Advanced Edition (Metadata Manager, Data Analyzer, Team Based Development)


PowerCenter 8 Release Themes

• Service Oriented Architecture
• 24x7 Availability of PowerCenter services
• Order of magnitude performance improvements
• Unlimited scalability
• Improved developer productivity


PowerCenter 8.x Update: Setting the Standard for Data Integration across the Enterprise

• Infrastructure and Server Enhancements
  • Services based Architecture
  • High Availability
  • Grid Enhancements
  • Easy Grid Configuration
  • Centralized administration web-based console
  • Centralized configuration

• Developer Enhancements
  • Functions and Expressions
  • User Defined Functions
  • Java Transformation
  • Dynamic Target Creation
  • Visio Template: mapping generation and templates
  • Upgrade Wizard

• Expand the definition of universal data access
  • Data Federation Option
  • Unstructured Data Option
  • Data Quality Option
  • Extended PowerExchange

• Performance Enhancements
  • Pushdown Optimization
  • Flat Files
  • Partitioning
  • Auto Cache
  • Connection resilience (RDBMS, Network, PC)


PowerCenter 8 Architecture


PowerCenter 6 and 7 Architecture

Diagram labels: Machine; Data Servers (pmserver); Repository Server; Web Services Hub; Repository Database; PowerCenter Connects; PowerExchange

Client Tools
• Repository Manager
• Designer
• Workflow Manager
• Workflow Monitor
• Repository Server Admin Console


PowerCenter 8 Architecture

Diagram labels: Node & Domain; Repository Database; PowerCenter Connects; PowerExchange

Client Tools
• Repository Manager
• Designer
• Workflow Manager
• Workflow Monitor
• Administration Console

Application Services
• Integration Service
• Web Services Hub
• Repository Service
• SAP BW Service

Core Services
• Log Service
• Repository Service
• Domain/Gateway Services
  • Administration & Authorization
  • Configuration
  • Domain
  • Licensing


PowerCenter 8 Terminology

• Services
  • A service is a resource that provides specialized functions.
  • PowerCenter has two types of services: Application Services and Core Services.
    • PowerCenter Application Services represent server-based functions such as the Repository, Integration, SAP BW, and Web Services Hub services.
    • PowerCenter Core Services represent functions that manage and maintain the environment in which PowerCenter operates.


Introducing PowerCenter 8 Terminology

• Node
  • A node is a logical representation of a physical machine. It has physical attributes such as a hostname and port number.
  • Each node runs a Service Manager, which is responsible for the application and core services.
  • The Service Manager is started when you start “Informatica Services”.

• Domain
  • A domain is the fundamental unit of PowerCenter Services administration.
  • A domain is a logical collection or set of nodes and services that you can group in a “folder like” deployment.


PowerCenter 8 Terminology

• Service Manager
  • On the gateway node, the Service Manager is responsible for:
    • Controlling the domain
    • Managing services running on the domain
    • Providing service lookup
  • On all nodes, the Service Manager:
    • Controls the core services and application services
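
The relationship between domain, nodes, gateway and service lookup described on the terminology slides can be pictured with a small data model. This is only an illustrative Python sketch: the class names, host names and port number are hypothetical, and nothing here is PowerCenter's actual API.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Logical representation of a machine: a hostname, a port, and the services it runs."""
    hostname: str
    port: int
    is_gateway: bool = False
    services: list = field(default_factory=list)  # names of services running on this node


@dataclass
class Domain:
    """A logical collection of nodes and the services running on them."""
    name: str
    nodes: list = field(default_factory=list)

    def gateway(self):
        """Return the gateway node, whose Service Manager controls the domain."""
        return next(node for node in self.nodes if node.is_gateway)

    def lookup(self, service_name):
        """Service lookup: return the node running the named service, or None."""
        for node in self.nodes:
            if service_name in node.services:
                return node
        return None


# Hypothetical two-node domain; 6001 is an arbitrary example port.
domain = Domain(
    name="Domain_Example",
    nodes=[
        Node("node01", 6001, is_gateway=True, services=["Repository Service"]),
        Node("node02", 6001, services=["Integration Service"]),
    ],
)
print(domain.gateway().hostname)
print(domain.lookup("Integration Service").hostname)
```

In this sketch the lookup simply scans the nodes; the point is only that clients ask the domain where a service lives instead of hard-coding a machine.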


PowerCenter Services Framework

Diagram labels: PowerCenter Domain; Master Gateway (Domain Controller); Repository Service; Integration Service; Repository Database; Domain Metadata; Logs; Checkpoint; Administration Console

Client Tools: Monitor, Workflow Manager, Repository Manager, Designer


High Availability (HA)


High Availability in PC8

• Failover
  • Restart for data integration, repository and other services
  • Primary and backup servers

• Recovery
  • Workflows and sessions will be recovered on running servers on the grid during server failure
  • Checkpoint recovery
  • Repository recovery

• Resilience
  • PowerCenter jobs will sustain transient failures
    • Network errors
    • DB connection failures


Resilience

• DB Connection Resilience
  • When connecting/disconnecting from a DB
  • Oracle, DB2, Sybase, SQL Server and Teradata
  • Retry interval based on timeout setting

• FTP Resilience
  • For connections to FTP server
  • Read/write will recover if connection lost based on timeout parameter

• Internal Resilience
  • PowerCenter components (integration service, clients etc.) resilient to Repository service failure
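
The retry-until-timeout idea behind connection resilience can be sketched in a few lines of Python. This is a minimal illustration, not PowerCenter code: `connect_with_resilience`, `make_flaky_connect`, `timeout` and `retry_interval` are hypothetical names chosen for the example.

```python
import time


def connect_with_resilience(connect, timeout, retry_interval,
                            sleep=time.sleep, clock=time.monotonic):
    """Call connect() until it succeeds or the resilience window (timeout seconds) ends.

    Returns (connection, number_of_attempts).
    """
    deadline = clock() + timeout
    attempts = 0
    while True:
        attempts += 1
        try:
            return connect(), attempts
        except ConnectionError:
            # Transient failure: give up only when another retry would pass the deadline.
            if clock() + retry_interval > deadline:
                raise
            sleep(retry_interval)


def make_flaky_connect(failures):
    """Hypothetical stand-in for a DB or FTP connection that fails a few times first."""
    state = {"left": failures}

    def connect():
        if state["left"] > 0:
            state["left"] -= 1
            raise ConnectionError("transient failure")
        return "connection"

    return connect


conn, attempts = connect_with_resilience(
    make_flaky_connect(2), timeout=5.0, retry_interval=0.01
)
print(conn, attempts)
```

A job that wraps its connection in this way survives a short outage and fails only when the outage outlasts the configured timeout.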


Simple High Availability/Failover Scenario

• Simple environment
  • 1 Domain which consists of:
    • 2 nodes for Integration Services
      • node01 - Primary
      • node02 - Backup
    • 1 server for repository.

Node01 (Int_Svc01)

Node02 (Int_Svc02)

Repository DB


Simple High Availability/Failover Scenario

• node01 Integration Service goes down

• node01 Integration Service “fails over” to node02

node01 (Int_Svc01)

node02 (Int_Svc02)

Repository DB

Diagram labels: Component Failure (HW/SW); Automatic Failover; Restart; Recovery
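
The primary/backup scenario from these two slides can be sketched as follows. This is a simplified Python model under stated assumptions: the `Node` and `IntegrationService` classes are hypothetical, and real failover also involves restart and recovery of running workflows, which is not modelled here.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    alive: bool = True


@dataclass
class IntegrationService:
    primary: Node
    backups: list = field(default_factory=list)

    def active_node(self):
        """Return the first live node, preferring the primary over the backups."""
        for node in [self.primary, *self.backups]:
            if node.alive:
                return node
        raise RuntimeError("no live node available for the service")


node01, node02 = Node("node01"), Node("node02")
service = IntegrationService(primary=node01, backups=[node02])

print(service.active_node().name)  # the primary, while it is up
node01.alive = False               # simulated hardware or software failure
print(service.active_node().name)  # the backup, after failover
```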


Grid Enhancements


Domain Overview Dashboard: Simplified, Web-based Administration

Screenshot callouts: Services; Domain; Nodes; Example Primary & Backup Repository Service

Services Configuration: Remember pmserver config file?


Mission-critical Enterprise Deployment: Cost-effective Scalability with PowerCenter on a Grid

Automatically recover, restart on live server

Distributed processing of sessions

Diagram labels: PowerCenter Domain on Server Grid; Failed Hardware Server; PowerCenter Domain Controller


Grid Enhancements

• Grid Object
  • Configured from admin console
  • Services can be assigned to grid
  • Workflows are assigned to be run by services

• Workflow distributed on Grid (WOnG)
  • Same as version 7
  • Distribute sessions of a workflow across multiple nodes

• Session distributed on Grid (SOnG)
  • New in version 8
  • Can partition sessions to run on multiple nodes

• Dynamic Partitioning
  • Number of partitions dynamically determined at runtime
  • Less configuration for users

• Resource Maps
  • Configure available resources on nodes in grid through admin console
  • Load balancer dispatches jobs based on resource availability on nodes


Grid – PC 7 vs. PC 8

PowerCenter 7
• ServerGrid is a collection of pmservers
• Work is directed to individual pmservers
• Work distributed across Grid in round-robin manner
• Session/task is lowest unit of work


Grid Capabilities in 7.x vs. 8.x

8.x
• Grid object
  • Collection of nodes
• Workflows assigned to Integration Service
• Integration Service assigned to Grid (can run on any node in grid)
• If one node fails, another Integration Service process on another node in grid takes over running the workflow
• A session can be partitioned across nodes
• Load balancer takes into account resource availability on nodes and resource requirements of sessions for dispatch.

7.x
• ServerGrid object
  • Collection of pmservers
• Workflows explicitly assigned to pmservers
• Pmservers belonging to a ServerGrid will dispatch to other pmservers
• Pmservers could fail, causing workflows to fail
• Can’t split sessions across multiple nodes
• Load balancer is round robin only
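
The difference between the two dispatch styles can be illustrated with a short Python sketch. Both functions are hypothetical simplifications: the resource-aware version uses free memory as the only resource and a "most free memory first" rule, which is an assumption made for the example and not a description of the product's actual load balancer.

```python
from itertools import cycle


def round_robin_dispatch(sessions, nodes):
    """7.x-style sketch: hand sessions to nodes in turn, ignoring load."""
    turn = cycle(nodes)
    return {session: next(turn) for session in sessions}


def resource_aware_dispatch(sessions, free_memory):
    """8.x-style sketch: place each session on the node with the most free
    memory that can still satisfy the session's requirement.

    sessions    -- dict of session name -> required memory (MB)
    free_memory -- dict of node name -> available memory (MB)
    """
    remaining = dict(free_memory)
    placement = {}
    for session, need in sessions.items():
        candidates = [n for n, free in remaining.items() if free >= need]
        if not candidates:
            placement[session] = None  # no node can take this session right now
            continue
        best = max(candidates, key=lambda n: remaining[n])
        remaining[best] -= need
        placement[session] = best
    return placement


sessions = {"s1": 400, "s2": 300, "s3": 300}
print(round_robin_dispatch(sessions, ["node01", "node02"]))
print(resource_aware_dispatch(sessions, {"node01": 500, "node02": 1000}))
```

With these inputs, round robin alternates between the nodes regardless of size, while the resource-aware version sends the first two sessions to the larger node and the third to the smaller one.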


Performance Improvements


Pushdown Optimization


Introduction

• What is pushdown optimization?
  • Push transformation processing to data sources and targets without moving data out

• Benefits
  • Reduce movement of data when source and target are the same database instance
  • Utilize database-specific processing that may be more optimal
  • Maintain metadata and lineage in PowerCenter


Pushdown Optimization

• Full Pushdown:
  • Source and target are in the same RDBMS
  • All transformations can be processed in database

• Partial Source:
  • One or more transformations can be processed in source database

• Partial Target:
  • One or more transformations can be processed in target database

• Generated SQL:
  • INSERT INTO t (…) VALUES (?+1, SOUNDEX(?))

Diagram labels: Source DB; Target DB; Extract, Transform, Load
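
To make the three modes concrete, the sketch below contrasts engine-side processing with target-side and full pushdown. It uses Python's built-in `sqlite3` module purely as a stand-in database; the table names and the `amount + 1` transformation are made up for the example, and SOUNDEX is left out because a default SQLite build does not provide it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE src (id INTEGER, amount INTEGER);
    CREATE TABLE tgt_engine (id INTEGER, amount_plus_one INTEGER);
    CREATE TABLE tgt_partial (id INTEGER, amount_plus_one INTEGER);
    CREATE TABLE tgt_full (id INTEGER, amount_plus_one INTEGER);
    INSERT INTO src VALUES (1, 10), (2, 20), (3, 30);
    """
)

# No pushdown: extract the rows, transform them in the engine, then load them.
rows = conn.execute("SELECT id, amount FROM src").fetchall()
transformed = [(row_id, amount + 1) for row_id, amount in rows]
conn.executemany("INSERT INTO tgt_engine VALUES (?, ?)", transformed)

# Partial target pushdown: the expression moves into the INSERT statement,
# in the same shape as the generated SQL shown on the slide.
conn.executemany("INSERT INTO tgt_partial VALUES (?, ? + 1)", rows)

# Full pushdown: a single statement, so the data never leaves the database.
conn.execute("INSERT INTO tgt_full SELECT id, amount + 1 FROM src")

results = {
    name: conn.execute(f"SELECT * FROM {name} ORDER BY id").fetchall()
    for name in ("tgt_engine", "tgt_partial", "tgt_full")
}
print(results["tgt_engine"] == results["tgt_partial"] == results["tgt_full"])
```

All three targets end up with identical rows; what changes is where the transformation runs and how much data crosses the wire.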


Example: Full Pushdown

SQL & Business Logic Maintained in Repository


Flat File Performance & Parameter and Variable Enhancements


Flat file enhancements

• FF Reader and Writer have been rewritten to optimize for performance
  • Delimited files with lots of decimal data will see the most significant performance improvements
  • Out-of-box performance improvements should be between 30% and 300%

• Append to flat file targets
  • Session output can be appended to existing flat file

• Flat file source/target command support
  • Sources: use a command to generate source data or a file list that references multiple source files.
  • Targets: use a command to process the target data or process data for all partitioned targets in a session.


Parameters and Variables Enhancements

• Parameter Enhancements
  • Table owner name for relational sources/targets
  • E-mail address
  • FTP remote file name

• Global section specification in parameter files for use across different workflows/sessions


Partitioning Enhancements


Partitioning Enhancements

• Flat File Partitioning
  • FF targets can now be partitioned
  • All partitions can write to a single file; alternatively, a merge file can be created, or a file list that contains the names of the individual files that were written

• Database Partitioning
  • Partitioned Oracle and DB2 sources can be read in parallel
  • No changes to targets. DB2 can be written to in parallel.

• Dynamic Partitioning
  • Based on the number of partitions in the database
  • Based on the number of nodes in a Grid
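
The flat file target options above (one file per partition, plus either a merge file or a file list) can be pictured with a short Python sketch. The function names, the file naming scheme and the round-robin row distribution are assumptions made for illustration; they are not taken from the product.

```python
import tempfile
from pathlib import Path


def write_partitions(rows, partitions, out_dir):
    """Write rows round-robin into one file per partition; return the file paths."""
    paths = [Path(out_dir) / f"target_part{i}.out" for i in range(partitions)]
    handles = [p.open("w") for p in paths]
    try:
        for index, row in enumerate(rows):
            handles[index % partitions].write(row + "\n")
    finally:
        for handle in handles:
            handle.close()
    return paths


def write_file_list(paths, list_path):
    """File-list option: record the names of the individual partition files."""
    Path(list_path).write_text("".join(f"{p}\n" for p in paths))


def merge_files(paths, merge_path):
    """Merge option: concatenate the partition files into a single file."""
    with Path(merge_path).open("w") as merged:
        for p in paths:
            merged.write(p.read_text())


with tempfile.TemporaryDirectory() as work_dir:
    part_paths = write_partitions(["r1", "r2", "r3", "r4", "r5"], 2, work_dir)
    merge_files(part_paths, Path(work_dir) / "target_merged.out")
    write_file_list(part_paths, Path(work_dir) / "target_files.lst")
    print((Path(work_dir) / "target_merged.out").read_text())
```

Note that the merged file groups rows by partition, so the original row order is not preserved.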


Auto Cache

© Informatica Corporation, 2006. All rights reserved.


AutoCache Overview

• Cache in PowerCenter v7
  • Default cache settings not adequate for all situations.
  • Default settings can underestimate new chip technologies.
  • Sometimes necessary to hand tune individual transformations.
  • Development did not always scale when deployed to different production machines.

• Auto Cache in PowerCenter v8.x
  • Automatically distribute session memory to transformations.
  • Automatically scale memory usage based on resources available.
  • Automatically scale memory usage based on mapping complexity.


Memory Attributes

• PowerCenter has two types of memory attributes:
  • Transformation Memory Attributes
  • Session Memory Attributes

• Transformation Memory Attributes are for individual transformations:
  • Lookup, Aggregator, Rank, Joiner
    • Index and Data Cache Size
  • Sorter Cache Size
  • XML Target Cache Size

• Session Memory Attributes are for the session:
  • Default Buffer Block Size
  • DTM Buffer Size


New Memory Attribute Specification

• Previously, only integer byte values were allowed for memory attributes, e.g. 1000000 or 2000000.

• Now the shortcuts “KB”, “MB”, and “GB” are also allowed, e.g. 100MB.

• The value “Auto” is also allowed
  • This indicates that the user wants PowerCenter to automatically find a good value for that memory attribute
  • “Auto” is supported for both session (e.g. DTM buffers/buffer block size) and transformation memory attributes (e.g. lookup caches)
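
The three accepted forms of a memory attribute (plain bytes, a value with a unit suffix, or “Auto”) could be interpreted along the following lines. This is a hypothetical Python parser written for illustration: the function name is invented, and the choice of 1 KB = 1024 bytes is an assumption, since the slide does not say which multiplier the product uses.

```python
import re

# Assumption for this sketch: binary multipliers (1 KB = 1024 bytes).
_UNITS = {"": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}


def parse_memory_attribute(value):
    """Return a byte count, or None when the value is "Auto"."""
    text = str(value).strip()
    if text.lower() == "auto":
        return None  # leave the calculation to the engine
    match = re.fullmatch(r"(\d+)\s*(KB|MB|GB)?", text, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognised memory attribute: {value!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[(unit or "").upper()]


for example in ("1000000", "100MB", "Auto"):
    print(example, "->", parse_memory_attribute(example))
```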


AutoCache

• Allows the user to leave the calculations to PowerCenter

• User specifies total amount of memory AutoCache is allowed to use

• Automatically computes a value for ALL memory attributes that have the value “Auto”

• Will NOT affect any memory attributes where the value is not “Auto”
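
The rules on this slide (a user-supplied total, values computed only for “Auto” attributes, explicit values left untouched) can be sketched as below. The equal split across “Auto” attributes is purely an assumption for the example; the slides do not describe how PowerCenter actually divides the memory, and the attribute names are hypothetical.

```python
def auto_cache(attributes, auto_memory_budget):
    """Fill in every "Auto" attribute from a shared budget; leave the rest alone.

    attributes         -- dict of attribute name -> byte count or the string "Auto"
    auto_memory_budget -- total bytes the "Auto" attributes may use together
    """
    auto_names = [name for name, value in attributes.items() if value == "Auto"]
    if not auto_names:
        return dict(attributes)
    share = auto_memory_budget // len(auto_names)  # assumption: equal split
    return {
        name: share if value == "Auto" else value
        for name, value in attributes.items()
    }


settings = {"lookup_index_cache": "Auto", "lookup_data_cache": "Auto", "sorter_cache": 8_000_000}
print(auto_cache(settings, auto_memory_budget=12_000_000))
```

Here the two “Auto” caches share the 12,000,000-byte budget, while the explicitly sized sorter cache keeps its value.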


Cache Calculator

• Click drop down

• Calculate based on the number of rows and the ports going into the object

• Value is propagated into the Cache value


Developer Improvements


Functions and Expressions


Function Enhancements

• Over 20 new functions added in the 8.x release
  • Financial functions, regular expression parsing/matching, IN(), compression, encryption, CRC, MD5 and more

• Custom Functions
  • Extend the functionality of the Expression Transformation via a C API
  • All 20+ functions above were added via this API


Function Enhancements

• User Defined Functions (UDF)
  • Ability for Designer users to create reusable functions entirely within the Expression Language
  • UDFs are folder-level objects
  • UDFs can use any valid functions (except aggregation functions) as well as other UDFs (in the same folder)


Java & SQL Transformations


Java Transformation Use Cases

• Looping over data

• Walking data hierarchies

• Calling third-party APIs (Java based)
  • Calling RMI/EJB etc.
  • Other Java packages

• Calling expression/UDF/unconnected widget (like lookup) from Custom Transformation

• Simple “Custom Transformation”


Improved Developer Productivity Java Inline Coding Sample


SQL Transformation Use Cases

• New SQL Transformation
  • Allows PowerCenter developers to execute SQL statements midstream in a mapping.
  • You can insert, delete, update, and retrieve rows from a database; the transformation returns database errors.
  • The SQL that is executed can be static, or dynamic, where the SQL statement is itself created on a row-by-row basis.

• The SQL transformation can also be used to execute SQL scripts from within a mapping, e.g. to leverage SQL scripts that already exist
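
The distinction between static and dynamic SQL executed per row can be sketched in Python with the built-in `sqlite3` module. This is an analogy only: the tables, the `static_lookup` and `dynamic_lookup` functions and the allow-list guard are hypothetical and do not represent how the SQL transformation is configured in the Designer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders_2005 (id INTEGER, total INTEGER);
    CREATE TABLE orders_2006 (id INTEGER, total INTEGER);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders_2005 VALUES (1, 50);
    INSERT INTO orders_2006 VALUES (2, 75);
    """
)


def static_lookup(row):
    """Static SQL: the statement text is fixed; only the bound value changes per row."""
    found = conn.execute(
        "SELECT name FROM customers WHERE id = ?", (row["id"],)
    ).fetchone()
    return found[0] if found else None


ALLOWED_TABLES = {"orders_2005", "orders_2006"}


def dynamic_lookup(row):
    """Dynamic SQL: the statement itself is built from data carried in the row.

    Returns (value, error) so that a database error travels with the row.
    """
    table = row["table"]
    if table not in ALLOWED_TABLES:  # guard, because the name is spliced into the SQL
        return None, f"unknown table: {table}"
    try:
        found = conn.execute(
            f"SELECT total FROM {table} WHERE id = ?", (row["id"],)
        ).fetchone()
        return (found[0] if found else None), None
    except sqlite3.Error as exc:
        return None, str(exc)


print(static_lookup({"id": 1}))
print(dynamic_lookup({"id": 2, "table": "orders_2006"}))
print(dynamic_lookup({"id": 1, "table": "orders_1999"}))
```

Dynamic SQL is the more flexible of the two, but because part of the statement comes from row data it needs validation of the kind shown by the allow-list.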


XML


XML Enhancements

• Filter data with query predicate

• Create a default namespace

• Import part of an XML schema

• Use anySimpleType


Metadata Enhancements


Metadata Exchange Enhancements

• New Data Model Support
  • Sybase Power Designer: bi-directional
  • Oracle Designer: bi-directional
  • ER Studio Design Tool: uni-directional (same as before)
  • CA Erwin: bi-directional

• Business Intelligence Support
  • Business Objects (bi-directional): added 6.5 & XI & XI R2 XConnects
  • Cognos ReportNet Framework Manager (bi-directional): added 2.0
  • Microstrategy (bi-directional): added 8.0


Dynamic Target Creation


Dynamic Target creation

• Ability to dynamically create a target based on a transformation in the workspace or navigator

• Right-click on a transformation in the workspace and select Create and Add Target

• Drag a transformation and drop it in the Target folder

• Has same port definitions as transformation from which it was created

• Target type is same as repository you are using

• Can edit the target definition to change type or ports

• Creation dialog will be added in an upcoming release


Improved Developer Productivity Target Generation

Simply Right-Click on an object…

…..Target is created! All you need to do is Auto link and you are ready to go


Mapping Generation Option: Visio Client for PowerCenter


Mapping Generation Option

• Bi-directional “engine” for automatically generating mappings from Visio templates or reverse engineering PowerCenter mappings into Visio templates

• Leverages the Informatica Data Stencil and Velocity templates for Visio


Visio Client for PowerCenter

Mapping Template

Template Inputs


Upgrade Wizard


PowerCenter Upgrade to 8.1

• A new Upgrade wizard in Admin Console
  • Integrated UI that takes the user through the various steps in the upgrade
  • Provides a detailed upgrade summary report at the end
  • Allows user to switch in and out of the Upgrade UI to perform any other administrative activities
  • Can handle multiple repositories (global/local) and multiple PowerCenter Servers in one shot
  • Live feedback during repository upgrade as user goes through the upgrade process

• A new post-upgrade reference guide


Summary


Summary - PC 7 vs. PC 8

PC 7.x
• 3 Tier Architecture
• Basic Grid Deployment
• Introduction to Profiling
• Added Transformations
  • Union
  • XML
• Web Services
• Team Based Development

PC 8.x
• Services Oriented Architecture
• Enhanced Grid Deployment
  • High Availability
  • Session on Grid
  • Resilience
• Enhanced Profiling
• Added Transformations
  • Java
  • SQL
• Enhanced Productivity
  • Mapping Generation
  • User Defined Functions


Thank You

Questions at the break