Informatica Training

Uploaded by sunilchopsey, 17-Dec-2015

TRANSCRIPT

  • Informatica

    Center Of Excellence-Data Warehousing

  • Agenda
      – Overview & Components
      – Informatica Server & Data Movement
      – Repository Server & Repository Manager
      – Designer
      – Transformations used in Informatica
      – Re-usable Transformations & Mapplets
      – Workflow Manager & Workflow Monitor
      – Performance Tuning & Troubleshooting

  • Overview & Components


  • Informatica Power Center Architecture


  • PowerCenter 8.x Architecture (diagram)
      – Sources and targets: packaged applications; standards, messaging, and web services; relational/flat files; mainframe/midrange (PowerCenter connects via PowerExchange)
      – Application services: Integration Service(s), Repository Service(s), SAP BW Service, Web Services Hub
      – Core services: Domain/Gateway Service, authentication, configuration, service management
      – Client tools

  • What is a Domain?
      – Unified, single point of administration/configuration for:
          Integration Service (Informatica Server)
          Repository Service (Repository Server)
          Web Services Hub Service (WSH)
          BW Integration Service
      – A domain consists of:
          A set of nodes
          A set of services
          Zero or more grids
          A set of resources

  • Gateway (Domain Controller) Node
      – Purpose of the Gateway node:
          Starts up and manages services running on the domain
          Manages configuration metadata
          Provides service lookup for clients
          Checks for service availability via heartbeats
          Coordinates failover of services
      – HA for the Gateway node:
          One or more nodes can be designated as Gateway nodes
          Only one Master Gateway node is active at a time
          An election process determines the new Master

  • Services
      – Application Service: configured by the end user and represents a key visible component (Integration Service, Repository Service, etc.); external clients interact directly with these services
      – Core Service: infrastructure (internal) service (Gateway Service, Logging Service, etc.)

  • HA Setup
      – One primary node and a list of backup nodes (Active/Passive mode) for Application Services and Core Services
      – Automatic failover from primary to backup; no automatic fail-back (fail-back is manual)
      – The Integration Service operates in Active-Active mode

  • Overview .. Informatica Repository
      – Stores the metadata created using the Informatica Client tools
      – Repository Manager creates the metadata tables in the database
      – Tasks in the Informatica Client application, such as creating users, analyzing sources, developing mappings or mapplets, or creating sessions, create metadata
      – The Informatica Server reads metadata created in the Client application when a session runs
      – Global and local repositories can be created to share metadata

  • Grid
      – A collection of nodes
      – An Integration Service can be assigned to a grid; the service runs on all nodes in the grid
      – The grid is leveraged for workflow distribution and session distribution (SonG)
      – Benefits: scalability, availability, advanced load balancing, resource mapping

  • PowerCenter Architecture: Data Flow (diagram)
      – Server platforms: UNIX (AIX, HP-UX, Solaris, Tru64), Windows NT 4.0/2000
      – Repository databases: Oracle, MS SQL Server, Sybase, DB2 UDB
      – Client tools: Windows 95, 98, NT 4.0, or 2000
      – Components shown: Object Repository, Repository Service, Sources, Targets

  • Overview .. Informatica Client Tools
      – Repository Manager: create and administer the metadata repository; create repository users and groups; assign privileges and permissions; manage folders and locks
      – Designer: add source and target definitions to the repository; create mappings that contain data transformation instructions
      – Workflow Manager & Workflow Monitor: create, schedule, execute, and monitor sessions

  • Overview .. Informatica Server
      – Reads mapping and session information from the repository
      – Extracts data from the mapping sources and holds it in memory while applying the transformation rules in the mapping
      – Loads the transformed data into the mapping targets
      – Platforms: Windows NT/2000, UNIX

  • Overview .. Sources
      – Relational: Oracle, Sybase, Informix, IBM DB2, Microsoft SQL Server, and Teradata
      – File: fixed-width and delimited flat files, COBOL files, and XML
      – Extended: PowerConnect products for PeopleSoft, SAP R/3, Siebel, and IBM MQSeries
      – Mainframe: PowerConnect for IBM DB2 on MVS
      – Other: Microsoft Excel and Access

  • Overview .. Targets
      – Relational: Oracle, Sybase, Sybase IQ, Informix, IBM DB2, Microsoft SQL Server, and Teradata
      – File: fixed-width and delimited flat files and XML
      – Extended: Integration server to load data into SAP BW; PowerConnect for IBM MQSeries to load data into IBM MQSeries message queues
      – Other: Microsoft Access
      – Loaded via ODBC or native drivers, FTP, or external loaders

  • Questions


  • Informatica Server & Data Movement


  • Informatica Server and Data Movement
      – The Informatica Server moves data from sources to targets based on mapping and session metadata stored in a repository database
      – A session is a set of instructions that describes how and when to move data from sources to targets
      – The Workflow Manager creates, manages, and executes sessions, worklets, and workflows
      – The Workflow Monitor is used to monitor sessions and debug errors

  • Informatica Server
      – When a session starts, the Informatica Server retrieves mapping and session metadata from the repository database through the Repository Server, which initiates a Repository Agent
      – The Informatica Server runs as a daemon on UNIX and as a service on Windows NT/2000
      – It uses the following processes to run a session:
          The Load Manager process: starts the session, creates the DTM process, and sends post-session email when the session completes
          The DTM process: creates threads to initialize the session; read, write, and transform data; and handle pre- and post-session operations

  • The Load Manager Process
    The Load Manager performs the following tasks:
      – Manages session, worklet, and workflow scheduling
      – Locks the session and reads session properties
      – Reads the parameter file
      – Expands the server and session variables and parameters
      – Verifies permissions and privileges
      – Validates source and target code pages
      – Creates the session log file
      – Creates the Data Transformation Manager (DTM) process, which executes the session

  • The Load Manager Process
      – The Load Manager and the repository communicate with each other using Unicode
      – To prevent loss of information during data transfer, the Informatica Server and the repository require compatible code pages
      – The Load Manager communicates with the repository in the following situations:
          When you start the Informatica Server
          When you configure a session
          When a session starts

  • Data Transformation Manager Process
      – The DTM process is the second process associated with a session run
      – Its primary purpose is to create and manage the threads that carry out the session tasks
      – The DTM allocates process memory for the session and divides it into buffers (also known as buffer memory)
      – It creates the main thread, called the master thread, which creates and manages all other threads
      – If you partition a session, the DTM creates a set of threads for each partition to allow concurrent processing
      – When the Informatica Server writes messages to the session log, it includes the thread type and thread ID

  • DTM Threads

    Thread Type                   Description
    Master Thread                 Main thread of the DTM process. Creates and manages all other threads. Handles stop and abort requests from the Load Manager.
    Mapping Thread                One thread for each session. Fetches session and mapping information. Compiles the mapping. Cleans up after session execution.
    Pre-/Post-Session Threads     One thread each to perform pre- and post-session operations.
    Reader Thread                 One thread for each partition for each source pipeline. Reads sources. Relational sources use relational threads; file sources use file threads.
    Writer Thread                 One thread for each partition, if a target exists in the source pipeline. Writes to targets.
    Transformation Thread         One or more transformation threads for each partition.

  • DTM Threads
      – Example: a pipeline contains one source and one target, and you configure two partitions in the session properties. The DTM creates the following threads to process the pipeline:
          Two reader threads, one for each partition
          Two writer threads, one for each partition
      – When the pipeline contains an Aggregator or Rank transformation, the DTM creates one additional set of threads for each Aggregator or Rank transformation

  • DTM Threads
      – When the Informatica Server processes a mapping with a Joiner transformation, it first reads the master source and builds caches based on the master rows
      – It then reads the detail source and processes the transformation based on the detail source data and the cache data
      – The pipeline for the master source ends at the Joiner transformation and may not have any targets
      – You cannot partition the master source for a Joiner transformation

  • Questions


  • Repository Server


  • Repository Server
      – Informatica client applications and the Informatica Server access the repository database tables through the Repository Server
      – Informatica clients connect to the Repository Server through its host name/IP address and port number
      – The Repository Server can manage multiple repositories on different machines on the network

  • Repository Server (Contd..)
      – For each repository database registered with the Repository Server, it configures and manages a Repository Agent process
      – The Repository Agent is a multi-threaded process that performs the actions needed to retrieve, insert, and update metadata in the repository database tables

  • Questions


  • Repository Manager


  • Repository Manager Window


  • Repository
      – The Informatica repository tables have an open architecture
      – Metadata can include information such as:
          Mappings, describing how to transform source data
          Sessions, indicating when you want the Informatica Server to perform the transformations
          Connect strings for sources and targets
      – The repository also stores administrative information such as:
          Usernames and passwords
          Permissions and privileges

  • Repository
    The following types of metadata can be created and stored in the repository:
      – Database connections
      – Global objects
      – Mappings
      – Mapplets
      – Multi-dimensional metadata
      – Reusable transformations
      – Sessions and batches
      – Shortcuts
      – Source definitions
      – Target definitions
      – Transformations

  • Repository Types
    There are three types of repositories:
      – Standalone repository
      – Global repository
      – Local repository

  • Repository Manager Tasks
      – Perform repository functions:
          Create, back up, copy, restore, upgrade, and delete repositories
          Promote a repository to a global repository
          Register and unregister local repositories with a global repository
      – Implement repository security:
          Create, edit, and delete repository users and user groups
          Assign and revoke repository privileges and folder permissions
          View locks and unlock objects

  • Repository Manager Tasks
      – Perform folder functions:
          Create, edit, and delete folders
          Copy a folder within the repository or to another repository
          Compare folders within a repository or in different repositories
          Add and remove repository reports
          Import and export repository connection information in the registry
          Analyze source/target, mapping, and shortcut dependencies

  • Dependency Window The Dependency window can display the following types of dependencies:

    Source-target dependencies - lists all sources or targets related to the selected object and relevant information

    Mapping dependencies - lists all mappings containing the selected object as well as relevant information

    Shortcut dependencies - lists all shortcuts to the selected object and relevant details


  • Copying and Backing Up a Repository
      – A repository can be copied from one database to another
      – If the target database already contains a repository, the Repository Manager deletes the existing repository before copying
      – Backing up a repository saves the entire repository in a file, which can be stored in any local directory
      – Data can be recovered from a repository backup file

  • Crystal Reports The Repository Manager includes four Crystal Reports that provide views of your metadata:

    Mapping report (map.rpt) - Lists source column and transformation details for each mapping

    Source and target dependencies report (S2t_dep.rpt)

    Target table report (Trg_tbl.rpt) - Provides target field transformation expressions, descriptions, comments for each target table

    Executed session report (sessions.rpt) - Provides information about executed sessions


  • Repository Security
      – Security can be planned and implemented using the following features: user groups, repository users, repository privileges, folder permissions, and locking
      – Users can be assigned to multiple groups
      – Privileges are assigned to groups
      – Privileges can also be assigned to individual usernames; each user must be assigned to at least one user group

  • Repository Security Viewing LocksCan view existing locks in the repository in the Repository ManagerThe Repository Manager provides two ways to view locks: Show locks


  • Types of Locks
    There are five kinds of locks on repository objects:
      – Read lock: created when you open a repository object in a folder for which you do not have write permission
      – Write lock: created when you create or edit a repository object
      – Execute lock: created when you start a session or batch
      – Fetch lock: created when the repository reads information about repository objects from the database
      – Save lock

  • FoldersFolders provide a way to organize and store all metadata in the repository, including mappings and sessions

    They are used to store sources, transformations, cubes, dimensions, mapplets, business components, targets, mappings, sessions and batches

    Can copy objects from one folder to another

    Can copy objects across repositories

    The Designer allows you to create multiple versions within a folder


  • Folders

    When a new version is created, the Designer creates a copy of all existing mapping metadata in the folder and places it into the new version

    Can copy a session within a folder, but you cannot copy an individual session to a different folder

    To copy all sessions within a folder to a different location, you can copy the entire folder


  • Folders
      – Any mapping in a folder can use only those source and target definitions or reusable transformations that are stored:
          In the same folder
          In a shared folder and accessed through a shortcut
      – The configurable folder properties are: folder permissions, folder owner, owner's group, and shared or not shared

  • FoldersFolders have the following permission types: Read permission Write permissionExecute permission

    Shared folders allow users to create shortcuts to objects in the folder

    Shortcuts inherit changes to their shared object

    Once you make a folder shared, you cannot reverse it


  • Copying Folders
    Each time you copy a folder, the Repository Manager copies the following:
      – Sources, transformations, mapplets, targets, mappings, and business components
      – Sessions and batches
      – Folder versions

  • Copying Folders
    When you copy a folder, the Repository Manager allows you to:
      – Re-establish shortcuts
      – Choose an Informatica Server
      – Copy connections
      – Copy persisted values
      – Compare folders
      – Replace folders

  • Comparing Folders
      – The Compare Folders Wizard allows you to perform the following comparisons:
          Compare objects between two folders in the same repository
          Compare objects between two folders in different repositories
          Compare objects between two folder versions in the same folder
      – Each comparison also allows you to specify the following criteria:
          Versions to compare
          Object types to compare
          Direction of comparison

  • Comparing Folders
      – Whether the Repository Manager notes a similarity or difference between two folders depends on the direction of the comparison
      – One-way comparisons check the selected objects of Folder1 against the objects in Folder2
      – Two-way comparisons check objects in Folder1 against those in Folder2 and also check objects in Folder2 against those in Folder1

  • Comparing Folders
      – The comparison wizard displays the following user-customized information: similarities between objects, differences between objects, and outdated objects
      – The result of the comparison can be edited and saved
      – The Repository Manager does not compare the field attributes of the objects in the folders when performing the comparison
      – A two-way comparison can sometimes reveal information a one-way comparison cannot: a one-way comparison does not note a difference if an object is present in the target folder but not in the source folder

  • Folder VersionsMaintaining different versions lets you revert to earlier work when needed

    When you save a version, you save all metadata at a particular point in development

    Later versions contain new or modified metadata, reflecting work that you have completed since the last version


  • Exporting and Importing Objects In the Designer and Workflow Manager, you can export repository objects to an XML file and then import repository objects from the XML file

    Can export the following repository objects: Sources Targets Transformations Mapplets Mappings Sessions

    Can share objects by exporting and importing objects between repositories with the same version


  • Questions


  • Designer


  • Screen Shot of Designer


  • Designer Workspace
      – Navigator
      – Workspace
      – Status bar
      – Output

  • Designer Tools
      – Source Analyzer: import or create source definitions for flat file, XML, COBOL, ERP, and relational sources
      – Warehouse Designer: import or create target definitions
      – Transformation Developer: create reusable transformations
      – Mapplet Designer: create mapplets
      – Mapping Designer: create mappings

  • Source Analyzer
    The following types of source definitions can be imported, created, or modified in the Source Analyzer:
      – Relational sources: tables, views, synonyms
      – Files: fixed-width or delimited flat files, COBOL files, Microsoft Excel sources
      – XML sources: XML files, DTD files, XML schema files
      – Data models using MX Data Model PowerPlug
      – SAP R/3, SAP BW, Siebel, and IBM MQSeries sources, via PowerConnect

  • Source Analyzer Importing Relational Source DefinitionsAfter importing a relational source definition, Business names for the table and columns can be entered


  • Source Analyzer Importing Relational Source DefinitionsThe source definition appears in the Source Analyzer. In the Navigator, the new source definition appears in the Sources node of the active repository folder, under the source database name


  • Source Analyzer: Flat File Sources
      – Supports delimited and fixed-width files
      – The Flat File Wizard prompts for the following file properties:
          File name and location
          File code page
          File type
          Column names and data types
          Number of header rows in the file
          Column size and null characters for fixed-width files
          Delimiter type, quote character, and escape character for delimited files

  • Source Analyzer: Flat File Sources
    Flat file properties in the Source Analyzer:
      – Table name, business purpose, owner, and description
      – File type
      – Null characters for fixed-width files
      – Delimiter type, quote character, and escape character for delimited files
      – Column names and datatypes
      – Comments
      – HTML links to business documentation

  • Warehouse Designer
    Used to create target definitions for file and relational sources:
      – Import the definition for an existing target from a relational database
      – Create a target definition based on a source definition (relational or flat file)
      – Manually create a target definition, or design several related targets at the same time

  • Warehouse Designer - Tasks
      – Edit target definitions; changes to a target definition propagate to the mappings using that target
      – Create relational tables in the target database; if the target tables do not exist, generate and execute the SQL needed to create them
      – Preview relational target data

  • Warehouse Designer: Create/Edit Target Definitions
      – The Table tab of a target definition allows editing of business names, constraints, creation options, description, and keywords
      – The Columns tab allows editing of column name, datatype, precision and scale, not null, key type, and business name

  • MappingMappings represent the data flow between sources and targets

    When the Informatica Server runs a session, it uses the instructions configured in the mapping to read, transform, and write data

    Every mapping must contain the following components:
      – Source definition
      – Transformation
      – Connectors

    A mapping can also contain one or more mapplets


  • Mapping Sample Mapping


  • Mapping - Invalidation
    On editing a mapping, the Designer invalidates sessions under the following circumstances:
      – Add or remove sources or targets
      – Remove mapplets or transformations
      – Replace a source, target, mapplet, or transformation while importing or copying objects
      – Add or remove Source Qualifiers or COBOL Normalizers, or change the list of associated sources for these transformations
      – Add or remove a Joiner or Update Strategy transformation
      – Add or remove transformations from a mapplet in the mapping
      – Change the database type for a source

  • Mapping - Components
      – Every mapping requires at least one transformation object that determines how the Informatica Server reads the source data:
          Source Qualifier transformation
          Normalizer transformation
          ERP Source Qualifier transformation
          XML Source Qualifier transformation
      – Transformations can be created for use in a single mapping, or as reusable transformations for use in multiple mappings

  • Mapping - Updates
      – By default, the Informatica Server updates targets based on key values
      – The default UPDATE statement for each target in a mapping can be overridden
      – For a mapping without an Update Strategy transformation, configure the session to mark source records as update
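As a sketch of what such a target update override might look like (the table and port names here are hypothetical; `:TU` is the PowerCenter qualifier that refers to values from the target instance's ports):

```
-- Hypothetical update override for a target named T_CUSTOMERS
UPDATE T_CUSTOMERS
SET    CUST_NAME = :TU.CUST_NAME,
       CITY      = :TU.CITY
WHERE  CUST_ID   = :TU.CUST_ID
```

The WHERE clause determines which key columns the server matches on, so it replaces the default key-based update behavior described above.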

  • Mapping - Validation
      – The Designer marks a mapping valid after the following checks:
          Connection validation: required ports are connected and all connections are valid
          Expression validation: all expressions are valid
          Object validation: the independent object definition matches the instance in the mapping
      – The Designer performs connection validation each time you connect ports in a mapping and each time you validate or save a mapping
      – An expression in a transformation can be validated while you are developing a mapping

  • Mapping - Validation


  • Questions


  • Transformations used in Informatica


  • TransformationsA transformation is a repository object that generates, modifies, or passes data

    The Designer provides a set of transformations that perform specific functions

    Transformations in a mapping represent the operations the Informatica Server performs on data

    Data passes into and out of transformations through ports that you connect in a mapping or mapplet

    Transformations can be active or passive


  • Transformation TypesAn active transformation can change the number of rows that pass through it

    A passive transformation does not change the number of rows that pass through it

    Transformations can be connected to the data flow, or they can be unconnected

    An unconnected transformation is not connected to other transformations in the mapping. It is called within another transformation, and returns a value to that transformation


  • Active Transformation Nodes
      – Advanced External Procedure: calls a procedure in a shared library or in the COM layer of Windows NT
      – Aggregator: performs aggregate calculations
      – ERP Source Qualifier: represents the rows that the Informatica Server reads from an ERP source when it runs a session
      – Filter: filters records
      – Joiner: joins records from different databases or flat file systems
      – Rank: limits records to a top or bottom range
      – Router: routes data into multiple transformations based on a group expression
      – Source Qualifier: represents the rows that the Informatica Server reads from a relational or flat file source when it runs a session
      – Update Strategy: determines whether to insert, delete, update, or reject records

  • Passive Transformation Nodes
      – Expression: calculates a value
      – External Procedure: calls a procedure in a shared library or in the COM layer of Windows NT
      – Input: defines mapplet input rows; available only in the Mapplet Designer
      – Lookup: looks up values
      – Output: defines mapplet output rows; available only in the Mapplet Designer
      – Sequence Generator: generates primary keys
      – Stored Procedure: calls a stored procedure
      – XML Source Qualifier: represents the rows that the Informatica Server reads from an XML source when it runs a session

  • Transformations - Properties
      – Port name:
          Copied ports inherit the name of the contributing port
          Copied ports with the same name are appended with a number
      – Data types:
          Transformations use internal data types
          Data types of input ports must be compatible with the data types of the feeding output ports
      – Port default values: can be set to handle nulls and errors
      – Description: port comments can be entered

  • Aggregator Transformation
      – Performs aggregate calculations
      – Components of the Aggregator transformation:
          Aggregate expression
          Group by port
          Sorted Input option
          Aggregate cache
      – The Aggregator is an active, connected transformation

  • Aggregator Transformation
    The following aggregate functions can be used within an Aggregator transformation:
    AVG, COUNT, FIRST, LAST, MAX, MEDIAN, MIN, PERCENTILE, STDDEV, SUM, VARIANCE
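A minimal sketch of an aggregate expression entered in an output port (port names here are hypothetical; the grouping is defined by whichever port is marked as a group by port, not inside the expression itself):

```
-- Output port TOTAL_SALES, with STORE_ID marked as the group by port:
SUM(SALES)

-- Aggregate functions also accept an optional filter condition,
-- e.g. summing only positive amounts:
SUM(SALES, SALES > 0)
```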

  • Expression Transformation
      – Used to perform non-aggregate calculations:
          Calculate values in a single row
          Test conditional statements before outputting results to target tables or other transformations
      – Ports that must be included in an Expression transformation:
          Input or input/output ports for each value used in the calculation
          An output port for the expression

  • Expression Transformation
      – Multiple expressions can be entered in a single Expression transformation
      – Only one expression can be entered per output port
      – Any number of output ports can be created in the transformation

  • Filter Transformation
      – Provides the means for filtering rows in a mapping
      – All ports in a Filter transformation are input/output
      – Only rows that meet the condition pass through it
      – Ports from more than one transformation cannot be concatenated into the Filter transformation
      – To maximize session performance, place the Filter transformation as close to the sources as possible in the mapping
      – Does not allow setting output default values
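As a sketch, the filter condition is a boolean expression over the transformation's ports (the port names here are hypothetical); only rows for which it evaluates to TRUE pass through:

```
-- Pass only East-region rows with positive sales:
SALES > 0 AND REGION = 'EAST'
```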

  • Joiner Transformation
      – Joins two related heterogeneous sources residing in different locations or file systems
      – Can be used to join:
          Two relational tables in separate databases
          Two flat files in potentially different file systems
          Two different ODBC sources
          Two instances of the same XML source
          A relational table and a flat file source
          A relational table and an XML source

  • Joiner Transformation
      – Use the Joiner transformation to join two sources with at least one matching port
      – It uses a condition that matches one or more pairs of ports between the two sources
      – Requires two input transformations from two separate data flows
      – Supported join types:
          Normal (default)
          Master Outer
          Detail Outer
          Full Outer

  • Lookup Transformation
      – Used to look up data in a relational table, view, or synonym
      – The Informatica Server queries the lookup table based on the lookup ports in the transformation
      – It compares Lookup transformation port values to lookup table column values based on the lookup condition
      – The Lookup transformation can be used to perform many tasks, including:
          Get a related value
          Perform a calculation
          Update slowly changing dimension tables

  • Connected & Unconnected LookupConnected Lookup TransformationReceives input values directly from another transformation in the pipelineFor each input row, the Informatica Server queries the lookup table or cache based on the lookup ports and the condition in the transformationPasses return values from the query to the next transformationUnconnected Lookup Transformation Receives input values from an expression using the :LKP (:LKP.lookup_transformation_name(argument, argument, ...)) reference qualifier to call the lookup and returns one value.Some common uses for unconnected lookups include: Testing the results of a lookup in an expression Filtering records based on the lookup results Marking records for update based on the result of a lookup (for example, updating slowly changing dimension tables) Calling the same lookup multiple times in one mapping


  • Lookup Transformation
    With unconnected Lookups, you can pass multiple input values into the transformation, but only one column of data out of the transformation.
    Use the return port to specify the return value in an unconnected Lookup transformation.


  • Lookup Caching

    Session performance can be improved by caching the lookup table.
    Caching can be static or dynamic; by default, the lookup cache is static and does not change during the session.
    Caching can also be persistent, so the cache is reused across sessions.


  • Router Transformation
    A Router transformation tests data for one or more conditions and gives the option to route rows that do not meet any of the conditions to a default output group.

    It has the following types of groups:
    - Input
    - Output

    There are two types of output groups:
    - User-defined groups
    - Default group

    Create one user-defined group for each condition that you want to specify
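The routing behavior above can be sketched as follows (a conceptual model, not Informatica code; note that a row is copied to every group whose condition it meets, and the group names and predicates are invented stand-ins for group filter conditions):

```python
# Conceptual sketch of Router behavior: each row is copied to EVERY
# user-defined group whose condition evaluates true; rows matching no
# condition land in the default group.

def route(rows, groups):
    """groups: list of (name, predicate) pairs -> {group_name: rows}."""
    out = {name: [] for name, _ in groups}
    out["DEFAULT"] = []
    for row in rows:
        matched = False
        for name, pred in groups:
            if pred(row):
                out[name].append(row)
                matched = True
        if not matched:
            out["DEFAULT"].append(row)
    return out
```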


  • Comparing Router & Filter Transformations


  • Sequence Generator Transformation
    Generates numeric values.
    It can be used to:
    - create unique primary key values
    - replace missing primary keys
    - cycle through a sequential range of numbers
    It provides two output ports, NEXTVAL and CURRVAL:
    - These ports cannot be edited or deleted
    - You cannot add ports to the Sequence Generator transformation
    When NEXTVAL is connected to the input port of another transformation, the Informatica Server generates a sequence of numbers.

    Center Of Excellence-Data Warehousing

  • Sequence Generator Transformation
    Connect the NEXTVAL port to a downstream transformation to generate the sequence based on the Current Value and Increment By properties.
    Connect the CURRVAL port only when the NEXTVAL port is already connected to a downstream transformation.


  • Source Qualifier Transformation
    The Source Qualifier represents the records that the Informatica Server reads when it runs a session.
    Can use the Source Qualifier to perform the following tasks:
    - Join data originating from the same source database
    - Filter records when the Informatica Server reads source data
    - Specify an outer join rather than the default inner join
    - Specify sorted ports
    - Select only distinct values from the source
    - Create a custom query to issue a special SELECT statement for the Informatica Server to read source data


  • Source Qualifier Transformation
    For relational sources, the Informatica Server generates a query for each Source Qualifier when it runs a session.
    The default query is a SELECT statement for each source column used in the mapping.
    The Informatica Server reads only those columns in the Source Qualifier that are connected to another transformation.


  • Update Strategy Transformation
    Determines whether to insert, update, delete, or reject records.
    Constants for each database operation:

    Operation   Constant    Numeric Value
    Insert      DD_INSERT   0
    Update      DD_UPDATE   1
    Delete      DD_DELETE   2
    Reject      DD_REJECT   3
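A sketch of how these constants flag each row, mirroring an Update Strategy expression such as IIF(status = 'D', DD_DELETE, IIF(is_new, DD_INSERT, DD_UPDATE)); the rule and field names are invented for illustration:

```python
# Conceptual sketch: the DD_* constants and an example row-flagging rule.

DD_INSERT, DD_UPDATE, DD_DELETE, DD_REJECT = 0, 1, 2, 3

def flag_row(row):
    if row.get("status") == "D":
        return DD_DELETE                # logically deleted upstream
    if row.get("is_new"):
        return DD_INSERT                # first time this key is seen
    return DD_UPDATE                    # default: update the target row
```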


  • Rank Transformation
    Allows you to select the top or bottom rank of data, not just one value.

    Can use it to return:
    - the largest or smallest numeric value in a port or group
    - the strings at the top or the bottom of a session sort order

    During the session, the Informatica Server caches input data until it can perform the rank calculations

    Can select only one port to define a rank


  • Rank Transformation
    When you create a Rank transformation, you can configure the following properties:
    - Enter a cache directory
    - Select the top or bottom rank
    - Select the input/output port that contains values used to determine the rank (you can select only one port to define a rank)
    - Select the number of rows falling within a rank
    - Define groups for ranks
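The cache-then-rank behavior can be sketched as follows (a conceptual model, not Informatica code; the field names are invented, and `n` models the "number of rows falling within a rank" property):

```python
# Conceptual sketch of Rank: cache all input, then emit the top (or
# bottom) N rows per rank group based on a single rank port.

def rank(rows, rank_port, top=True, n=1, group_by=None):
    groups = {}
    for row in rows:
        key = row.get(group_by) if group_by else None
        groups.setdefault(key, []).append(row)   # cache input per group
    out = []
    for members in groups.values():
        members.sort(key=lambda r: r[rank_port], reverse=top)
        out.extend(members[:n])                  # keep N rows in the rank
    return out
```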


  • Rank Transformation
    Rank transformation ports:
    - Variable port - Use to store values or calculations for use in an expression

    - Rank port - Use to designate the column for which you want to rank values


  • Stored Procedure Transformation
    A Stored Procedure transformation is an important tool for populating and maintaining databases:
    - A stored procedure is a precompiled collection of Transact-SQL statements and optional flow-control statements, similar to an executable script
    - The transformation is used to call a stored procedure

    The stored procedure must exist in the database before creating a Stored Procedure transformation

    One of the most useful features of stored procedures is the ability to send data to the stored procedure, and receive data from the stored procedure


  • Stored Procedure Transformation
    There are three types of data that pass between the Informatica Server and the stored procedure:
    Input/Output parameters - For many stored procedures, you provide a value and receive a value in return

    Return values - Most databases provide a return value after running a stored procedure

    Status codes - Status codes provide error handling for the Informatica Server during a session


  • Stored Procedure Transformation
    The following list describes the options for running a Stored Procedure transformation:
    - Normal - During a session, the stored procedure runs where the transformation exists in the mapping, on a row-by-row basis
    - Pre-load of the Source - Before the session retrieves data from the source, the stored procedure runs
    - Post-load of the Source - After the session retrieves data from the source, the stored procedure runs
    - Pre-load of the Target - Before the session sends data to the target, the stored procedure runs
    - Post-load of the Target - After the session sends data to the target, the stored procedure runs


  • Stored Procedure Transformation
    Can set up the Stored Procedure transformation in one of two modes: connected or unconnected.
    In connected mode, the flow of data through the mapping passes through the Stored Procedure transformation.
    Cannot run the same instance of a Stored Procedure transformation in both connected and unconnected mode in a mapping; you must create different instances of the transformation.


  • Stored Procedure Transformation
    The unconnected Stored Procedure transformation is not connected directly to the flow of the mapping.
    It either runs before or after the session, or is called by an expression in another transformation in the mapping.


  • Dynamic Lookup Transformation
    A Lookup transformation using a dynamic cache has the following properties that a Lookup transformation using a static cache does not have:
    - NewLookupRow
    - Associated Port


  • Dynamic Lookup Transformation
    You might want to configure the transformation to use a dynamic cache when the target table is also the lookup table. When you use a dynamic cache, the Informatica Server inserts rows into the cache as it passes rows to the target.
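A sketch of the dynamic-cache behavior (a conceptual model, not Informatica code; the 0/1/2 NewLookupRow-style codes for unchanged/inserted/updated rows should be treated as illustrative):

```python
# Conceptual sketch of a dynamic lookup cache: as each row passes toward
# the target, the cache is inserted into or updated in place, and a
# NewLookupRow-style code reports what happened.

class DynamicLookupCache:
    def __init__(self):
        self.cache = {}                 # lookup key -> associated port values

    def process(self, key, values):
        if key not in self.cache:
            self.cache[key] = values
            return 1                    # row was inserted into the cache
        if self.cache[key] != values:
            self.cache[key] = values
            return 2                    # existing cache row was updated
        return 0                        # cache unchanged
```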


  • Transformation Language
    The Designer provides a transformation language to help you write expressions to transform source data.

    With the transformation language, you can create a transformation expression that takes the data from a port and changes it

    Can write expressions in the following transformations: Aggregator, Expression, Filter, Rank, Router, and Update Strategy.


  • Transformation Language
    Expressions can consist of any combination of the following components:
    - Ports (input, input/output, variable)
    - String literals, numeric literals
    - Constants
    - Functions
    - Local and system variables
    - Mapping parameters and mapping variables
    - Operators
    - Return values


  • Transformation Language
    The functions available in PowerCenter are:
    - Aggregate functions, e.g. AVG, MIN, MAX
    - Character functions, e.g. CONCAT, LENGTH
    - Conversion functions, e.g. TO_CHAR, TO_DATE
    - Date functions, e.g. DATE_DIFF, LAST_DAY
    - Numeric functions, e.g. ABS, CEIL, LOG
    - Scientific functions, e.g. COS, SINH
    - Special functions, e.g. DECODE, IIF, ABORT
    - Test functions, e.g. ISNULL, IS_DATE
    - Variable functions, e.g. SETMAXVARIABLE


  • Questions


  • Reusable Transformations and Mapplets


  • Reusable Transformation
    A transformation is said to be reusable when multiple instances of the same transformation can be created.

    Reusable transformations can be used in multiple mappings.

    Creating reusable transformations:
    - Design one in the Transformation Developer
    - Promote a standard transformation from the Mapping Designer


  • Mapplet
    A mapplet is a reusable object that represents a set of transformations.

    It allows you to reuse transformation logic and can contain as many transformations as needed.

    Mapplets can:
    - Include source definitions
    - Accept data from sources in a mapping
    - Include multiple transformations
    - Pass data to multiple pipelines
    - Contain unused ports


  • Sample Mapplet in a Mapping


  • Mapplet - Components
    Each mapplet must include the following:
    - One Input transformation and/or Source Qualifier transformation
    - At least one Output transformation

    A mapplet should contain exactly one of the following:
    - An Input transformation with at least one port connected to a transformation in the mapplet
    - A Source Qualifier transformation with at least one port connected to a source definition


  • Mapplet


  • Expanded Mapplet
    For example, in the figure, the mapplet uses the Input transformation IN_CustID_FirstLastName to define mapplet input ports. The Input transformation is connected to one transformation, EXP_WorkaroundLookup, which passes data to two separate transformations.


  • Questions


  • Workflow Manager & Workflow Monitor


  • Workflow Manager & Workflow Monitor
    The Server Manager is replaced by two tools: the Workflow Manager and the Workflow Monitor.

    Workflow Manager tools:
    1. Task Developer
    2. Workflow Designer
    3. Worklet Designer

    Workflow Monitor views:
    1. Gantt Chart
    2. Task View


  • Workflow ManagerThe Workflow Manager replaces the version 5.x Server Manager. Instead of running sessions directly, you now create a process called a workflow in the Workflow Manager.

    A workflow is a set of instructions on how to execute tasks such as sessions, emails, and shell commands.

    A session is now one of the many tasks you can execute in the Workflow Manager.

    The Workflow Manager provides other tasks such as Assignment, Decision, and Events. You can also create branches with conditional links. In addition, you can batch workflows by creating worklets in the Workflow Manager.


  • Workflow Manager Screen Shot


  • Workflow Manager Tools

    - Task Developer. Use the Task Developer to create tasks you want to execute in the workflow.
    - Workflow Designer. Use the Workflow Designer to create a workflow by connecting tasks with links. You can also create tasks in the Workflow Designer as you develop the workflow.
    - Worklet Designer. Use the Worklet Designer to create a worklet.


  • Workflow Tasks

    - Command. Specifies a shell command run during the workflow.
    - Control. Stops or aborts the workflow.
    - Decision. Specifies a condition to evaluate.
    - Email. Sends email during the workflow.
    - Event-Raise. Notifies the Event-Wait task that an event has occurred.
    - Event-Wait. Waits for an event to occur before executing the next task.
    - Session. Runs a mapping you create in the Designer.
    - Assignment. Assigns a value to a workflow variable.
    - Timer. Waits for a timed event to trigger.


  • Create Task


  • Workflow MonitorPowerCenter 6.0 provides a new tool, the Workflow Monitor, to monitor workflows, worklets, and tasks.

    The Workflow Monitor displays information about workflows in two views: 1. Gantt Chart view 2. Task view. You can monitor workflows in online and offline mode.


  • Workflow Monitor Gantt Chart


  • Workflow Monitor Task View


  • Questions


  • Performance Tuning


  • Performance Tuning
    The first step in performance tuning is to identify the performance bottleneck, in the following order:
    1. Target
    2. Source
    3. Mapping
    4. Session
    5. System

    The most common performance bottleneck occurs when the Informatica Server writes to a target database.


  • Target Bottlenecks
    Identifying:
    - A target bottleneck can be identified by configuring the session to write to a flat file target.

    Optimizing:
    - Drop indexes and key constraints before loading
    - Increase commit intervals
    - Use bulk loading / external loading


  • Source Bottlenecks
    Identifying:
    - Add a Filter transformation after the Source Qualifier and set the filter condition to false, so that no data is processed past the Filter transformation. If the time it takes to run the new session remains about the same, there is a source bottleneck.
    - In a test mapping, remove all the transformations; if the performance is similar, there is a source bottleneck.

    Optimizing:
    - Optimize the query by using hints
    - Use Informatica conditional filters if the source system lacks indexes


  • Mapping Bottlenecks
    Identifying:
    - If there is no source bottleneck, add a Filter transformation in the mapping before each target definition. Set the filter condition to false so that no data is loaded into the target tables. If the time it takes to run the new session is the same as the original session, there is a mapping bottleneck.

    Optimizing:
    - Configure for single-pass reading
    - Avoid unnecessary datatype conversions
    - Avoid database reject errors
    - Use a shared cache / persistent cache


  • Session Bottlenecks
    Identifying:
    - If there is no source, target, or mapping bottleneck, then there may be a session bottleneck.
    - Use Collect Performance Details. Any value other than zero in the readfromdisk and writetodisk counters for Aggregator, Joiner, or Rank transformations indicates a session bottleneck. Low (0-20%) BufferInput_efficiency and BufferOutput_efficiency counter values also indicate a session bottleneck.

    Optimizing:
    - Increase the number of partitions
    - Tune session parameters:
      - DTM Buffer Size (6 MB - 128 MB)
      - Buffer Block Size (4 KB - 128 KB)
      - Data (2 MB - 24 MB) / Index (1 MB - 12 MB) cache sizes
    - Use incremental aggregation if possible


  • Session Bottlenecks - Memory
    Configure the index and data cache memory for the Aggregator, Rank, and Joiner transformations in the Configuration Parameters dialog box.
    The amount of memory you configure depends on partitioning, the transformation that requires the largest cache, and how much memory cache and disk cache you want to use.


  • Session Bottlenecks - Cache
    When you cache the Lookup transformation, the Informatica Server builds a cache in memory when it processes the first row of data in the transformation.
    It then queries the cache for each row that enters the transformation, and processes data according to the way you configure the Lookup transformation.


  • Incremental Aggregation
    The first run creates the index (.idx) and data (.dat) cache files.
    The second and later runs perform the following actions. For each input record, the Server checks historical information in the index file for a corresponding group, then:
    - If it finds a corresponding group, it performs the aggregate operation incrementally, using the aggregate data for that group, and saves the incremental change
    - If it does not find a corresponding group, it creates a new group and saves the record data
    When writing to the target, the Informatica Server:
    - Updates modified aggregate groups in the target
    - Inserts new aggregate data
    - Deletes removed aggregate data
    - Ignores unchanged aggregate data
    - Saves modified aggregate data in the index and data files
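The second-run logic can be sketched for a single SUM aggregate (a conceptual model of the cached aggregates, not the actual index/data file format; field names are invented for illustration):

```python
# Conceptual sketch of incremental aggregation: the cache (standing in
# for the index/data files) maps each group to its saved aggregate, and
# later runs fold in only the NEW rows instead of re-reading history.

def incremental_sum(cache, new_rows, group_key, value_key):
    for row in new_rows:
        group = row[group_key]
        if group in cache:
            cache[group] += row[value_key]   # existing group: incremental add
        else:
            cache[group] = row[value_key]    # new group: create an entry
    return cache
```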


  • Incremental Aggregation
    You can find options for incremental aggregation on the Transformations tab in the session properties.
    The Server Manager displays a warning indicating that the Informatica Server will overwrite the existing cache, and a reminder to clear this option after running the session.


  • System Bottlenecks
    Identifying:
    - If there is no source, target, mapping, or session bottleneck, then there may be a system bottleneck.
    - Use system tools to monitor CPU usage, memory usage, and paging.
    - On Windows: Task Manager. On UNIX systems: tools like sar and iostat. For example, sar -u reports user time, idle time, and I/O wait time.

    Optimizing:
    - Improve network speed
    - Improve CPU performance
    - Check hard disks on related machines
    - Reduce paging


  • PMCMD
    Can use the command line program pmcmd to communicate with the Informatica Server.
    Can perform the following actions with pmcmd:
    - Determine if the Informatica Server is running
    - Start sessions and batches
    - Stop sessions and batches
    - Recover sessions
    - Stop the Informatica Server
    Can configure repository usernames and passwords as environment variables with pmcmd.
    Can also customize the way pmcmd displays the date and time on the machine running the Informatica Server.
    pmcmd returns zero on success and non-zero on failure.
    You can use pmcmd with operating system scheduling tools like cron to schedule sessions, and you can embed pmcmd into shell scripts or Perl programs to run or schedule sessions.


  • PMCMD
    Need the following information to use pmcmd:
    - Repository username
    - Repository password
    - Connection type - The type of connection from the client machine to the Informatica Server
    - Port or connection - The TCP/IP port number or IPX/SPX connection (Windows NT/2000 only) to the Informatica Server
    - Host name - The machine hosting the Informatica Server
    - Session or batch name - The names of any sessions or batches you want to start or stop
    - Folder name - The folder names for those sessions or batches
    - Parameter file


  • Commit Points
    A commit interval is the interval at which the Informatica Server commits data to relational targets during a session.
    The commit point can be a factor of the commit interval, the commit interval type, and the size of the buffer blocks.
    The commit interval is the number of rows you want to use as a basis for the commit point; the commit interval type is the type of rows to use as that basis.
    Can choose between the following types of commit interval:
    - Target-based commit
    - Source-based commit
    During a source-based commit session, the Informatica Server commits data to the target based on the number of rows from an active source in a single pipeline.


  • Commit Points
    During a target-based commit session, the Informatica Server continues to fill the writer buffer after it reaches the commit interval.
    When the buffer block is filled, the Informatica Server issues a commit command.
    As a result, the amount of data committed at the commit point generally exceeds the commit interval.
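The interplay of commit interval and buffer block size can be sketched as follows (a conceptual model only; sizes here are counted in rows for illustration, whereas real buffer blocks are sized in bytes):

```python
# Conceptual sketch of target-based commit: rows are written in fixed-size
# buffer blocks, and a commit is issued only when the block that crosses
# the commit interval has been filled -- so the committed row count
# generally exceeds the interval.

def commit_points(total_rows, commit_interval, block_size):
    commits, written = [], 0
    next_target = commit_interval
    while written < total_rows:
        written = min(written + block_size, total_rows)  # fill one block
        if written >= next_target:
            commits.append(written)       # commit lands on a block boundary
            next_target += commit_interval
    return commits
```

For example, with a 25-row commit interval and 10-row blocks over 100 rows, commits land at 30, 50, 80, and 100 rows rather than at exact multiples of 25.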


  • Commit Points
    During a source-based commit session, the Informatica Server commits data to the target based on the number of rows from an active source in a single pipeline; these rows are referred to as source rows.
    A pipeline consists of a source qualifier and all the transformations and targets that receive data from the source qualifier.
    An active source can be any of the following active transformations:
    - Advanced External Procedure
    - Source Qualifier
    - Normalizer
    - Aggregator
    - Joiner
    - Rank
    - Mapplet, if it contains one of the above transformations


  • Commit Points
    When the Informatica Server runs a source-based commit session, it identifies the active source for each pipeline in the mapping.
    The Informatica Server generates a commit row from the active source at every commit interval.
    When each target in the pipeline receives the commit row, the Informatica Server performs the commit.


  • Commit Points


  • Multiple Servers
    You can register multiple PowerCenter Servers with a PowerCenter repository and run them at the same time.
    Can distribute the repository session load across available servers to improve overall performance.
    Can use the Server Manager to administer and monitor multiple servers.
    With multiple Informatica Servers, you need to decide on which server you want to run each session and batch.
    You can register and run only one PowerMart Server in a local repository; a PowerMart Server cannot start if it is registered in a local repository that has multiple servers registered to it.


  • Multiple Servers
    When attached to multiple servers, you can only view or monitor one Informatica Server at a time, but you have access to all the servers in the repository.


  • Questions


  • Debugger


  • Debugger
    Can debug a valid mapping to gain troubleshooting information about data and error conditions.

    To debug a mapping, you configure and run the Debugger from within the Mapping Designer

    When you run the Debugger, it pauses at breakpoints and allows you to view and edit transformation output data

    After you save a mapping, you can run some initial tests with a debug session before you configure and run a session in the Server Manager


  • Debugger


  • Debugger
    Can use the following process to debug a mapping:
    - Create breakpoints
    - Configure the Debugger
    - Run the Debugger
    - Monitor the Debugger via the debug log, session log, Target window, and Instance window
    - Modify data and breakpoints

    A breakpoint can consist of an instance name, a breakpoint type, and a condition


  • Debugger
    After you set the instance name, breakpoint type, and optional data condition, you can view each parameter in the Breakpoints section of the Breakpoint Editor.


  • Questions


    Global repository. The global repository is the hub of the domain. Use the global repository to store common objects that multiple developers can use through shortcuts. These objects may include operational or application source definitions, reusable transformations, mapplets, and mappings.

    Local repositories. A local repository is within a domain that is not the global repository. Use local repositories for development. From a local repository, you can create shortcuts to objects in shared folders in the global repository. These objects typically include source definitions, common dimensions and lookups, and enterprise standard transformations. You can also create copies of objects in non-shared folders.

    The PowerCenter Server connects to the following Informatica platform components: PowerCenter Client, Repository Server, Repository Agent, and source and target databases.

    PowerCenter Server and Repository Server: The PowerCenter Server connects to the Repository Server and Repository Agent to retrieve workflow and mapping metadata from the repository database. When the PowerCenter Server requests a repository connection from the Repository Server, the Repository Server starts and manages the Repository Agent, then redirects the PowerCenter Server to connect directly to the Repository Agent.

    PowerCenter Server and Workflow Manager: The Workflow Manager communicates directly with the PowerCenter Server over a TCP/IP connection each time you schedule or edit a workflow, display workflow details, and request workflow and session logs. Note: You create the connection by defining the port number in the Workflow Manager and the PowerCenter Server configuration. Use the Workflow Manager to register the PowerCenter Server in the repository.

    For a session, the PowerCenter Server holds the connection as long as it needs to read data from source tables or write data to target tables. To prevent loss of information during data transfer, the PowerCenter Server, PowerCenter Client, Repository Server, Repository Agent, and repository database must have compatible code pages.

    The figure shows the processing path between the Informatica Server, repository, source, and target. The Informatica Server can combine data from different platforms and source types. For example, you can join data from a flat file and an Oracle source, and write the transformed data to a Microsoft SQL Server database. When a session starts, the Informatica Server retrieves mapping and session metadata from the repository to extract data from the source, transform it, and load it into the target.

    The Load Manager is the primary Informatica Server process. The Load Manager holds the connection to the repository for the value set in the Informatica Server configuration, LMStayConnectToRepositDuration.
    - When you start the Informatica Server: the Load Manager launches and queries the repository for a list of sessions configured to run on the Informatica Server.
    - When you configure a session: when you add, update, or schedule a session in the Server Manager, the repository stores all the session metadata. The Load Manager maintains a list of sessions and session start times.
    - When a session starts: the Load Manager fetches the session information from the repository to perform the validations and verifications prior to starting the DTM process. The execute lock allows the Informatica Server to run the session and prevents you from starting the session again until it completes. If the session is already locked, the Informatica Server cannot start the session. A session may be locked if it is already running, or if an error occurred during the previous run that prevented the repository from releasing the execute lock.

    Four transformation threads. The DTM creates one transformation thread for each partition. The DTM creates an additional transformation thread for partition for each Aggregator or Rank transformation. So, the DTM creates four transformation threads to process the mapping in Figure above.Standalone repository. A repository that functions individually, unrelated and unconnected to other repositories. Global repository. (PowerCenter only.) The centralized repository in a domain, a group of connected repositories. Each domain can contain one global repository. The global repository can contain common objects to be shared throughout the domain through global shortcuts. Once created, you cannot change a global repository to a local repository. You can promote an existing local repository to a global repository. Local repository. (PowerCenter only.) A repository within a domain that is not the global repository. Each local repository in the domain can connect to the global repository and use objects in its shared folders. A folder in a local repository can be copied to other local repositories while keeping all local and global shortcuts intact.

    Customize the Repository Manager: Add, edit, and remove repositories in the Navigator Edit repository connection informationView or hide windows

    When restoring a repository, you must have a database available for the repository. You can restore the repository in a database that has a different code page from the original database, if the code pages are compatible. If a repository already exists at the location, the Repository Manager asks you to delete the repository before restoring a backup repository. If no repository exists, the Repository Manager creates a repository before restoring the backup repository.

    Source and target dependencies report (S2t_dep.rpt) - Shows the source and target dependencies as well as the transformations performed in each mapping

    Write lock - Created when you create or edit a repository object in a folder for which you have write permissionExecute lock - Created when you start a session or batch, or when the Informatica Server starts a scheduled session or batchSave lock - Created when you save information to the repositoryThe repository permits multiple read locks, one write lock, and one execute lock simultaneously on each repository object. This means that one user can edit a session while the Informatica Server runs the session, and another user can view the session properties at the same time. You can view existing locks in the repository in the Repository Manager. The Repository Manager provides two ways to view locks: Browse the repository. Use the Navigator and main windows to display the folders, versions, and objects in use. Show locks. Use a menu command to view all locks in the repository. This method provides more detailed information and allows you to sort your view of the locks.

    You must create a folder in a new repository before you can connect to the repository using the Designer or Workflow Manager.You can copy objects from one folder to another, so if you want to use an object in a non-shared folder, you can copy it into your working folder. If you work with multiple repositories, you can also copy objects across repositories. You can continue working in the new version, while preserving the older version. You might use versions to archive work while continuing with development.

    In your repository, you might create folders for each data warehouse development project, subject area, user, or type of metadata.If you can divide the data warehouse into different types of information, you might create a single folder for each type of data. For instance, when you set up the accounting data warehouse, you might create one folder for accounts payable and another for accounts receivable. You can create a folder for each repository user, designed to store work for that user only. If users work on separate projects, this technique avoids any problems that might occur if two people attempt to edit the same piece of metadata at the same time. You might create a different folder for each type of metadata (source definitions, target definitions, mappings, schemas, and reusable transformations) that you create through the Designer. Read permission - Allows you to view the folder as well as objects in the folderWrite permission - Allows you to create or edit objects in the folder Execute permission - Allows you to execute or schedule a session or batch in the folderWhen you create a folder, the Repository Manager makes you the owner of the folder by default, but you can select any user in the repository to be the folder owner. If the folder owner belongs to more than one repository group, you must then choose a repository group as the Owners Group. If the owner belongs to only one group, that group automatically becomes the Owners Group. If you have an object that you want to use in several mappings or across multiple folders, like an Expression transformation that calculates sales commissions, you can place the object in a shared folder. You can then use the object in other folders by creating a shortcut to the object. If, for example, you change the way your company calculates sales commissions, you can change the Expression transformation in the shared folder. Then all shortcuts to the Expression transformation inherit the changes.

    When you copy a folder from a global repository to a local repository in the same domain, the Repository Manager verifies whether a folder of the same name exists in the global repository. If it does not, the Repository Manager uses the folder name for the copied folder. If it does, the Repository Manager asks you to rename the folder. The Repository Manager preserves shortcuts to shared folders in the global repository, changing the local shortcuts to global shortcuts. When you copy both a shared folder and a non-shared folder with dependent shortcuts across repositories and then recopy the shared folder from the source repository, the shortcuts in the non-shared folder in the target repository point to the folder in the source repository. The shortcuts in the non-shared folder always point to the folder you select when you copy or replace a shared folder.

    When copying a folder, you can:
    Re-establish shortcuts - Maintain shortcuts to objects in shared folders. If the Repository Manager cannot re-establish shortcuts, it marks the affected mappings, mapplets, and sessions invalid in the repository and lists them in the Output window.
    Choose an Informatica Server - Use the selected Informatica Server to run all sessions and batches in the folder if a matching server does not exist in the target repository.
    Copy connections - Copy database, FTP, and external loader connection information if matching connection names do not exist in the target repository.
    Copy persisted values - Copy the saved persisted values for mapping variables used in a session.
    Compare folders - Compare folders to determine how they are related.
    Replace folders - Replace an existing folder, including all objects associated with it.

    The Repository Manager copies and replaces folders as a single transaction. If you cancel the copy before it completes, the Repository Manager rolls back all changes.

    You can control the following aspects of a folder comparison:
    Versions to compare - The wizard automatically selects pairs of versions with the same version number in each folder for comparison. You can also specify the versions to compare in each folder.
    Object types to compare - You can specify the object types to compare and display between folders. The Repository Manager compares objects based upon specific object attributes. See Table 6-3 for a list of compared attributes for each object type.
    Direction of comparison - The Repository Manager performs directional comparisons. A directional comparison checks the contents of one folder against the contents of the other. You can specify either one-way or two-way comparisons.

    The figure shows two folders in the same repository, Orders1 and Orders2. If you compare the folders using a one-way comparison, the source definition ORDER_ITEMS, present in Orders2 but not in Orders1, is not noted as a difference. If you compare the folders using a two-way comparison, the absence of ORDER_ITEMS in Orders1 is noted as a difference.
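
The directional behavior above can be illustrated with a small Python sketch. This is only an analogy using object names in Python sets; the real Repository Manager compares the object attributes listed in Table 6-3, not just names.

```python
# Two folders represented as sets of object names (illustrative only).
orders1 = {"ORDERS", "CUSTOMERS"}
orders2 = {"ORDERS", "CUSTOMERS", "ORDER_ITEMS"}

# One-way comparison: check the contents of Orders1 against Orders2 only.
# ORDER_ITEMS, missing from Orders1, is never examined in this direction.
one_way = orders1 - orders2            # empty set: no differences noted

# Two-way comparison: check both directions.
two_way = orders1 ^ orders2            # ORDER_ITEMS is noted as a difference
```

The symmetric difference (`^`) is what makes the two-way comparison catch objects missing from either folder.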

    Because sessions and batches are not associated with version numbers, the version pairs specified in the Versions to compare list do not affect a comparison of sessions or batches. If you want to compare only sessions and batches, you can accept the default version pairs without affecting the outcome of the comparison. The Repository Manager does not compare the field attributes of the objects in the folders when performing the comparison. For example, if two folders have matching source names and column or port names but differing port or column attributes, such as precision or datatype, the Repository Manager does not note these as differences.

    You can delete a folder version to remove unnecessary versions from the repository. By archiving the contents of a folder into a version each time you reach a development landmark, you can access those versions if later edits prove unsuccessful. For example, you can create a folder version after completing a version of a difficult mapping, then continue working on the mapping. If you are unhappy with the results of subsequent work, you can revert to the previous version, then create a new version to continue development. Thus you keep the landmark version intact, but available for regression. When working with multiple versions, make sure you have the appropriate version active. The repository saves version information by workspace, so if someone else uses your machine and changes the active version, that version remains active on your machine until changed.

    Exporting and importing an object is similar to copying an object from one folder or repository to another. When you copy objects between folders or repositories, you must be connected to both repositories simultaneously. However, when you export an object from one repository and import it into another, you do not need to connect to both repositories simultaneously. You might want to export an object in any of the following circumstances:
    You want to copy an object between two repositories, but you cannot connect to both repositories from the same client. Export the object and electronically transfer the XML file to the target machine. Then import the object from the XML file into the target repository.
    You previously copied a mapping or mapplet that uses a reusable transformation to another repository, and later changed the reusable transformation. Instead of copying the entire mapping or mapplet again, you can export and import just the reusable transformation.
    You want to export an object from your development repository and deploy it in the production repository.
    You have an invalid session that you need to troubleshoot. Export the invalid session and its associated mapping, then electronically transfer the XML file to someone else for troubleshooting.

    To import a source definition:
    1. In the Source Analyzer, choose Sources-Import from Database.
    2. Select the ODBC data source used to connect to the source database. If you need to create or modify an ODBC data source, click the Browse button to open the ODBC Administrator, create the appropriate data source, click OK, and select the new ODBC data source.
    3. Enter a database username and password to connect to the database. Note: The username must have the appropriate database permissions to view the object. You may need to specify the owner name for database objects you want to use as sources.
    4. Click Connect. If no table names appear, or if the table you want to import does not appear, click All.
    5. Drill down through the list of sources to find the source you want to import, then select the relational object or objects. You can hold down the Shift key to select blocks of sources within one folder, or hold down the Ctrl key to make non-consecutive selections within a folder. You can also select all tables within a folder by selecting the folder and clicking the Select All button. Use the Select None button to clear all highlighted selections.

    When you create a flat file source definition, you must define the properties of the file. The Source Analyzer provides a Flat File Wizard that prompts you for these file properties. You can import fixed-width and delimited flat file source definitions that do not contain binary data. When importing the definition, the source file must be in a directory local to the client machine. In addition, the Informatica Server must be able to access all source files during the session.

    You can create the overall relationship, called a schema, as well as the target definitions, through wizards in the Designer. The Cubes and Dimensions Wizards follow common principles of data warehouse design to simplify the process of designing related targets.

    Some changes to target definitions can invalidate mappings. If the changes invalidate a mapping, you must open and edit the mapping. If the invalidated mapping is used in a session, you must also validate the session. You can preview the data of relational target definitions in the Designer. This feature saves you time because you can browse the target data before you run a session or build a mapping. Edit target definitions to add comments or key relationships, or update them to reflect changes in the underlying target tables. When you change target definitions, the Designer propagates the changes to any mapping using that target.

    Connectors - Connect sources, targets, and transformations so the Informatica Server can move the data as it transforms it.

    A mapplet is a set of transformations that you build in the Mapplet Designer and can use in multiple mappings. When you edit and save a mapping, some changes cause the session to be invalid even though the mapping remains valid. The Informatica Server does not run invalid sessions. The Designer marks a mapping invalid when it detects errors that will prevent the Informatica Server from executing the mapping. The Designer performs connection validation each time you connect ports in a mapping and each time you validate or save a mapping. At least one mapplet input port and output port must be connected to the mapping. If the mapplet includes a Source Qualifier that uses a SQL override, the Designer prompts you to connect all mapplet output ports to the mapping. You can validate an expression in a transformation while you are developing a mapping. If you do not correct the errors, the Designer writes the error messages to the Output window when you save or validate the mapping. When you validate or save a mapping, the Designer verifies that the definitions of the independent objects, such as sources or mapplets, match the instances in the mapping. If any of the objects change while you configure the mapping, the mapping might contain errors.

    An example of an active transformation is a Filter transformation, which removes rows that do not meet the configured filter condition. An example of a passive transformation is an Expression transformation, which performs a calculation on data and passes all rows through the transformation. An unconnected transformation is not connected to other transformations in the mapping. It is called within another transformation, and returns a value to that transformation.

    The Informatica Server performs aggregate calculations as it reads, and stores the necessary group and row data in an aggregate cache.
    Aggregate expression - Entered in an output port; can include non-aggregate expressions and conditional clauses.
    Group by port - Indicates how to create groups; can be any input, input/output, output, or variable port.
    Sorted Input option - Use to improve session performance. To use Sorted Input, you must pass data to the Aggregator transformation sorted by the group by ports, in ascending or descending order.
    Aggregate cache - The Aggregator stores data in the aggregate cache until it completes the aggregate calculations. It stores group values in an index cache and row data in a data cache.

    You can enter multiple expressions in a single Expression transformation. As long as you enter only one expression for each output port, you can create any number of output ports in the transformation. In this way, you can use one Expression transformation rather than creating separate transformations for each calculation that requires the same set of data.
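
A minimal Python sketch of the grouping idea: the group-by value plays the role of the index-cache key and the running aggregate plays the role of the data-cache entry. The field names and the SUM aggregate are illustrative, not taken from any real mapping.

```python
from collections import defaultdict

def aggregate(rows, group_key, value_key):
    # key -> running sum: the key set mirrors the index cache,
    # the accumulated values mirror the data cache.
    cache = defaultdict(float)
    for row in rows:
        cache[row[group_key]] += row[value_key]
    return dict(cache)

rows = [
    {"region": "EAST", "sales": 100.0},
    {"region": "WEST", "sales": 50.0},
    {"region": "EAST", "sales": 25.0},
]
totals = aggregate(rows, "region", "sales")
```

With unsorted input, every group must stay cached until the last row arrives; this is why the Sorted Input option, which lets a group be emitted as soon as its key changes, can improve performance.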

    As an active transformation, the Filter transformation may change the number of rows passed through it. A filter condition returns TRUE or FALSE for each row that passes through the transformation, depending on whether a row meets the specified condition. Only rows that return TRUE pass through this transformation. Discarded rows do not appear in the session log or reject files. To maximize session performance, include the Filter transformation as close to the sources in the mapping as possible. Rather than passing rows you plan to discard through the mapping, you then filter out unwanted data early in the flow of data from sources to targets. You cannot concatenate ports from more than one transformation into the Filter transformation. The input ports for the filter must come from a single transformation. The Filter transformation does not allow setting output default values.
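
The pass/discard semantics can be sketched in a few lines of Python. The salary condition is purely illustrative; the point is that only rows for which the condition returns TRUE continue downstream, and discarded rows simply vanish with no reject file.

```python
def filter_rows(rows, condition):
    # Keep only rows whose condition evaluates to True;
    # discarded rows are not logged or written anywhere.
    return [row for row in rows if condition(row)]

rows = [{"salary": 40000}, {"salary": 25000}]
kept = filter_rows(rows, lambda r: r["salary"] > 30000)
```

Placing this step early in the pipeline means later, more expensive steps process fewer rows, which is exactly the advice above about keeping the Filter close to the sources.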

    Allows you to join sources that contain binary data. To join more than two sources in a mapping, add additional Joiner transformations. An input transformation is any transformation connected to the input ports of the current transformation. Specify one of the sources as the master source and the other as the detail source; you do this on the Ports tab of the transformation by checking the M column. When you add the ports of a transformation to a Joiner transformation, the ports from the first source are automatically set as detail sources. Adding the ports from the second transformation automatically sets them as master sources. The master/detail relation determines how the join treats data from those sources, based on the type of join. For example, you might want to join a flat file with in-house customer IDs and a relational database table that contains user-defined customer IDs. You could import the flat file into a temporary database table, then perform the join in the database. However, if you use the Joiner transformation, there is no need to import or create temporary tables.
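
A simplified Python sketch of a master/detail join: the master source is cached by the join key, then each detail row is matched against that cache. This models a normal (inner) join only; the key and field names are illustrative.

```python
def master_detail_join(master, detail, key):
    # Cache the master rows by join key, then stream the detail rows.
    cache = {row[key]: row for row in master}
    joined = []
    for row in detail:
        match = cache.get(row[key])
        if match is not None:          # normal join: unmatched detail rows drop out
            joined.append({**match, **row})
    return joined

master = [{"cust_id": 1, "name": "Acme"}]
detail = [{"cust_id": 1, "amount": 9.5}, {"cust_id": 2, "amount": 3.0}]
result = master_detail_join(master, detail, "cust_id")
```

This also suggests why the smaller source is usually made the master: it is the side that gets cached.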

    You can configure the Lookup transformation to be connected or unconnected, cached or uncached. If you call the unconnected Lookup from an update strategy or filter expression, you are generally checking for null values; in this case, the return port can be anything. If, however, you call the Lookup from an expression performing a calculation, the return value needs to be the value you want to include in the calculation.

    Connected and unconnected transformations receive input and send output in different ways. Sometimes you can improve session performance by caching the lookup table. If you cache the lookup table, you can choose to use a dynamic or static cache. By default, the lookup cache remains static and does not change during the session. With a dynamic cache, the Informatica Server inserts rows into the cache during the session. Informatica recommends that you cache the target table as the lookup. This enables you to look up values in the target and insert them if they do not exist.
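
The static-cache case can be sketched as reading the lookup table once into a dictionary keyed on the lookup condition column; the dictionary never changes during the run. Table contents and column names here are invented for illustration.

```python
def build_cache(lookup_rows, key):
    # Read the lookup table once; with a static cache this dict
    # is never modified for the rest of the session.
    return {row[key]: row for row in lookup_rows}

cache = build_cache([{"id": 10, "status": "GOLD"}], "id")
hit = cache.get(10)       # condition satisfied: row returned
miss = cache.get(99)      # no match: the lookup returns NULL (None here)
```

Caching pays off when the same keys are probed many times; an uncached lookup would instead issue a query against the database for every input row.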

    The Router transformation is more efficient both when you design a mapping and when you run a session. For example, to test data based on three conditions, you need only one Router transformation instead of three Filter transformations. Likewise, when you use a Router transformation in a mapping, the Informatica Server processes the incoming data only once. When you use multiple Filter transformations in a mapping, the Informatica Server processes the incoming data once for each transformation.
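
The single-pass behavior can be sketched in Python: each row is read once and tested against every group condition, and rows that match no group fall into a default group. The group names and conditions are illustrative; a row matching several conditions goes to every matching group.

```python
def route(rows, groups):
    out = {name: [] for name in groups}
    out["DEFAULT"] = []
    for row in rows:                       # one pass over the input
        matched = False
        for name, cond in groups.items():  # every group condition is tested
            if cond(row):
                out[name].append(row)
                matched = True
        if not matched:
            out["DEFAULT"].append(row)     # no condition met
    return out

rows = [{"qty": 5}, {"qty": 50}, {"qty": -1}]
routed = route(rows, {"SMALL": lambda r: 0 <= r["qty"] < 10,
                      "LARGE": lambda r: r["qty"] >= 10})
```

Three separate Filters would instead read all three rows three times, once per condition.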

    The Informatica Server generates a value each time a row enters a connected transformation, even if that value is not used. When NEXTVAL is connected to the input port of another transformation, the Informatica Server generates a sequence of numbers. When CURRVAL is connected to the input port of another transformation, the Informatica Server generates the NEXTVAL value plus the increment (1 by default).

    Connect NEXTVAL to multiple transformations to generate unique values for each row in each transformation. For example, you might connect NEXTVAL to two target tables in a mapping to generate unique primary key values. The Informatica Server creates a column of unique primary key values for each target table. If you want the same generated value to go to more than one target that receives data from a single preceding transformation, you can connect a Sequence Generator to that preceding transformation. This allows the Informatica Server to pass unique values to the transformation, then route rows from the transformation to targets.
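
A toy Python model of the NEXTVAL/CURRVAL relationship, assuming the default increment of 1: each call to nextval advances the sequence, and currval is modeled as the last NEXTVAL plus the increment, matching the description above. Class and method names are invented for illustration.

```python
class SequenceGenerator:
    def __init__(self, start=1, increment=1):
        self.increment = increment
        self.value = start - increment   # so the first nextval() returns start

    def nextval(self):
        self.value += self.increment     # a new value per row
        return self.value

    def currval(self):
        return self.value + self.increment

seq = SequenceGenerator()
first, second = seq.nextval(), seq.nextval()
```

Connecting NEXTVAL to two targets separately corresponds to calling nextval() once per target, which is why each target sees different values unless the rows are routed through a single preceding transformation.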

    The Source Qualifier displays the transformation datatypes. The transformation datatypes in the Source Qualifier determine how the source database binds data when you import it. Do not alter the datatypes in the Source Qualifier. If the datatypes in the source definition and Source Qualifier do not match, the Designer marks the mapping invalid when you save it.

    In the mapping shown above, although there are many columns in the source definition, only three columns are connected to another transformation. In this case, the Informatica Server generates a default query that selects only those three columns:

    SELECT CUSTOMERS.CUSTOMER_ID, CUSTOMERS.COMPANY, CUSTOMERS.FIRST_NAME FROM CUSTOMERS

    When generating the default query, the Designer delimits table and field names containing the slash character (/) with double quotes.
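
A toy Python illustration of both rules, selecting only connected columns and quoting names that contain a slash. This is not the Designer's actual implementation, just the same two behaviors reproduced; the table and column names are taken from the example above.

```python
def default_query(table, connected_cols):
    # Only columns connected to another transformation make it into
    # the SELECT list; names containing "/" are double-quoted.
    cols = ", ".join(
        f'{table}."{c}"' if "/" in c else f"{table}.{c}"
        for c in connected_cols
    )
    return f"SELECT {cols} FROM {table}"

sql = default_query("CUSTOMERS", ["CUSTOMER_ID", "COMPANY", "FIRST_NAME"])
quoted = default_query("T", ["A/B"])
```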

    The Update Strategy transformation determines how to handle changes to existing records. When you design your data warehouse, you need to decide what type of information to store in targets. As part of your target table design, you need to determine whether to maintain all the historic data or just the most recent changes. For example, you might have a target table, T_CUSTOMERS, that contains customer data. When a customer address changes, you may want to save the original address in the table instead of updating that portion of the customer record. In this case, you would create a new record containing the updated address, and preserve the original record with the old customer address. This illustrates how you might store historical information in a target table. However, if you want the T_CUSTOMERS table to be a snapshot of current customer data, you would update the existing customer record and lose the original address.
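
The two target designs for T_CUSTOMERS can be contrasted in a short Python sketch: the snapshot design updates the existing row in place (as an update strategy flagging DD_UPDATE would), while the history design inserts a new row and keeps the original (DD_INSERT). The table structure and field names are illustrative.

```python
def apply_change(table, change, keep_history):
    if keep_history:
        table.append(change)                         # insert; original row preserved
    else:
        for row in table:
            if row["cust_id"] == change["cust_id"]:
                row["address"] = change["address"]   # update in place; old address lost
    return table

old = {"cust_id": 1, "address": "Old St"}
new = {"cust_id": 1, "address": "New Ave"}
snapshot = apply_change([dict(old)], dict(new), keep_history=False)
history = apply_change([dict(old)], dict(new), keep_history=True)
```

After the change, the snapshot table still has one row (with the new address only), while the history table has two rows and retains the original address.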

    The Rank transformation differs from the transformation functions MAX and MIN in that it allows you to select a group of top or bottom values, not just one value. For example, you can use Rank to select the top 10 salespersons in a given territory. Or, to generate a financial report, you might use a Rank transformation to identify the three departments with the lowest expenses in salaries and overhead. While the SQL language provides many functions designed to handle groups of data, identifying top or bottom strata within a set of rows is not possible using standard SQL functions. The Rank transformation also allows you to create local variables and write non-aggregate expressions. During a session, the Informatica Server compares an input row with rows in the data cache. If the input row out-ranks a stored row, the Informatica Server replaces the stored row with the input row. If the Rank transformation is configured to rank across multiple groups, the Informatica Server ranks incrementally for each group it finds.
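
The top-N-per-group idea, which plain MAX/MIN cannot express, can be sketched in Python. The territory/sales data and the choice of top 2 are illustrative; heapq.nlargest plays the role of the rank cache that keeps only the rows that out-rank the rest of their group.

```python
import heapq
from collections import defaultdict

def top_n_per_group(rows, group_key, rank_key, n):
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row)
    # For each group, keep only the n highest-ranking rows,
    # ordered from best to worst.
    return {g: heapq.nlargest(n, members, key=lambda r: r[rank_key])
            for g, members in groups.items()}

rows = [{"terr": "N", "sales": 10}, {"terr": "N", "sales": 30},
        {"terr": "N", "sales": 20}, {"terr": "S", "sales": 5}]
ranked = top_n_per_group(rows, "terr", "sales", 2)
```

For bottom-N strata (such as the three lowest-expense departments), heapq.nsmallest would be used instead.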

    Variable ports cannot be input or output ports; they pass data within the transformation only. You can designate only one Rank port in a Rank transformation. The Rank port is an input/output port, and you must link it to another transformation.

    Stored procedures are stored and run within the database. Not all databases support stored procedures, and database implementations vary widely in their syntax; limitations also exist on passing data, depending on the database implementation. You might use stored procedures to:
    Drop and recreate indexes.
    Check the status of a target database before moving records into it.
    Determine if enough space exists in a database.
    Perform a specialized calculation.
    Database developers and programmers use stored procedures for various tasks within databases, since stored procedures allow greater flexibility than SQL statements. Stored procedures also provide the error handling and logging necessary for mission-critical tasks. Developers create stored procedures in the database using the client tools provided with the database.

    The stored procedure issues a status code that notifies whether or not the stored procedure completed successfully.

    You can run several Stored Procedure transformations in different modes in the same mapping. For example, a pre-load source stored procedure can check table integrity, a normal stored procedure can populate the table, and a post-load stored procedure can rebuild indexes in the database. However, you cannot run the same instance of a Stored Procedure transformation in both connected and unconnected mode in a mapping. You must create different instances of the transformation. If the mapping calls more than one source or target pre- or post-load stored procedure in a mapping, the Informatica Server executes the stored procedures in the execution order that you specify in the mapping.

    NewLookupRow - The Designer automatically adds this port to a Lookup transformation configured to use a dynamic cache. It indicates whether or not the row is in the lookup cache. To keep the lookup cache and the target table synchronized, pass rows to the target when the NewLookupRow value is equal to 1.
    Associated Port - Associate lookup ports with either an input/output port or a sequence ID. The Informatica Server uses the data specified in the associated ports to insert into the lookup cache when it does not find a row in the lookup cache. If you associate a sequence ID, the Informatica Server generates a primary key for the inserted row in the lookup cache.

    The Informatica Server builds the cache when it processes the first lookup request. It queries the cache based on the lookup condition for each row that passes into the transformation. When the Informatica Server receives a new row (a row that is not in the cache), it inserts the row into the cache. You can configure the transformation to insert rows into the cache based on input ports or generated sequence IDs. The Informatica Server flags the row as new. When the Informatica Server receives an existing row (a row that is in the cache), it flags the row as existing. The Informatica Server does not insert the row into the cache. Use a Router or Filter transformation with the dynamic Lookup transformation to route new rows to the cached target table. You can route existing rows to another target table, or you can drop them. When you partition a source that uses a dynamic lookup cache, the Informatica Server creates one memory cache and one disk cache for each transformation.
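
The insert-and-flag behavior can be sketched in Python: a cache miss inserts the row and flags it as new (NewLookupRow = 1), while a hit leaves the cache unchanged and flags the row as existing (0 here). This sketch ignores the update case (in the real transformation NewLookupRow can also be 2); the key column is illustrative.

```python
def dynamic_lookup(cache, row, key):
    # Returns a NewLookupRow-style flag: 1 = inserted into cache, 0 = already there.
    if row[key] in cache:
        return 0                 # existing row: cache unchanged
    cache[row[key]] = row        # new row: insert into the cache
    return 1

cache = {}
flags = [dynamic_lookup(cache, r, "id")
         for r in [{"id": 1}, {"id": 2}, {"id": 1}]]
```

Downstream, a Router or Filter on this flag is what sends new rows to the cached target and lets existing rows be redirected or dropped.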

    You can pass a value from a port, literal string or number, variable, Lookup transformation, Stored Procedure transformation, External Procedure transformation, or the results of another expression. Separate each argument in a function with a comma. Except for literals, the transformation language is not case-sensitive. Except for literals, the Designer and Informatica Server ignore spaces. The colon (:), comma (,), and period (.) have special meaning and should be used only to specify syntax. The Informatica Server treats a dash (-) as a minus operator. If you pass a literal va