ETL Informatica Concepts New

Upload: shashank-gangadharabhatla

Post on 04-Jun-2018

TRANSCRIPT

  • 8/13/2019 ETL Informatica Concepts New

    Fidelity Confidential Information

    Informatica Training

    Concepts

    Informatica Technical Training Developer level 1

    Informatica PowerCenter - Overview

    The Informatica PowerCenter architecture supports the extraction, transformation, and loading (ETL) of data.

    PowerCenter provides an environment that allows you to load data into a centralized location, such as a data mart, data warehouse, or operational data store (ODS).

    You can extract data from multiple sources, transform the data according to business logic you build in the client application, and load the transformed data into file and relational targets.
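
    The extract, transform, and load flow described above can be sketched in a few lines of Python. This is only an illustration of the pattern, not PowerCenter code: the sample sources, the function names, and the in-memory `warehouse` list are all assumptions made for the example.

```python
import csv
import io

# Hypothetical source data; real sources would be databases or files.
CSV_SOURCE = "id,name,amount\n1,alice,10.5\n2,bob,20\n"
DICT_SOURCE = [{"id": 3, "name": "carol", "amount": "7.25"}]


def extract():
    """Read rows from two illustrative sources into a common dict format."""
    rows = list(csv.DictReader(io.StringIO(CSV_SOURCE)))
    rows.extend(DICT_SOURCE)
    return rows


def transform(rows):
    """Apply sample business logic: normalize types and capitalize names."""
    return [
        {
            "id": int(r["id"]),
            "name": str(r["name"]).title(),
            "amount": float(r["amount"]),
        }
        for r in rows
    ]


def load(rows, target):
    """Append transformed rows to an in-memory target standing in for a table."""
    target.extend(rows)
    return len(rows)


warehouse = []  # stand-in for a relational or file target
loaded = load(transform(extract()), warehouse)
print(loaded, warehouse[0])
```

    In PowerCenter itself the same three stages are defined graphically as a mapping rather than written by hand.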

    Informatica PowerCenter - Components

    PowerCenter repository. The PowerCenter repository is at the center of the PowerCenter suite. You create a set of metadata tables within the repository database that the PowerCenter applications and tools access. The PowerCenter Client and Server access the repository to save and retrieve metadata.

    PowerCenter Repository Server. The PowerCenter Repository Server manages connections to the repository from client applications. It inserts, updates, and fetches objects from the repository database tables. It also maintains object consistency.

    PowerCenter Client. Use the PowerCenter Client to manage users, define sources and targets, build mappings and mapplets with the transformation logic, and create workflows to run the mapping logic. The PowerCenter Client has the following client applications: Repository Manager, Repository Server Administration Console, Designer, Workflow Manager, and Workflow Monitor.

    Informatica PowerCenter - Components

    PowerCenter Server. The PowerCenter Server reads mapping and session information from the repository. It extracts data from the mapping sources and stores the data in memory while it applies the transformation rules that you configure in the mapping. The PowerCenter Server loads the transformed data into the mapping targets.

    Metadata. The simplest definition of metadata is that it is data about data. An item of metadata may describe an individual data item or a collection of data items.
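
    As a small illustration of "data about data", the sketch below describes a table and its columns separately from any row the table holds. The class names, the `CUSTOMERS` table, and its columns are hypothetical and do not reflect the layout of the PowerCenter repository tables.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ColumnMetadata:
    """Describes an individual data item (one column)."""
    name: str
    datatype: str
    nullable: bool


@dataclass
class TableMetadata:
    """Describes a collection of data items (one table)."""
    name: str
    columns: List[ColumnMetadata] = field(default_factory=list)

    def column_names(self):
        return [c.name for c in self.columns]


# Metadata: what the data looks like.
customers = TableMetadata(
    "CUSTOMERS",
    [
        ColumnMetadata("CUSTOMER_ID", "integer", False),
        ColumnMetadata("EMAIL", "varchar(255)", True),
    ],
)

# Data: one actual row that the metadata describes.
data_row = {"CUSTOMER_ID": 42, "EMAIL": "a@example.com"}

print(customers.column_names())
```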

    Sources & Targets

    Relational. Oracle, Sybase, Informix, IBM DB2, Microsoft SQL Server, and Teradata.

    File. Fixed and delimited flat file, COBOL file, and XML.

    Application. You can purchase additional PowerCenter Connect products to access business sources, such as PeopleSoft, SAP R/3, Siebel, IBM MQSeries, and TIBCO.

    Other. Microsoft Excel and Access.

    Repository

    The PowerCenter repository resides on a relational database.

    The repository database tables contain the instructions required to extract, transform, and load data.

    PowerCenter Client applications access the repository database tables through the Repository Server.

    You add metadata to the repository tables when you perform tasks in the PowerCenter Client application, such as creating users, analyzing sources, developing mappings or mapplets, or creating workflows.

    The PowerCenter Server reads metadata created in the Client application when you run a workflow.

    The PowerCenter Server also creates metadata, such as start and finish times of a session or session status.

    Repository

    Global repository. The global repository is the hub of the domain. Use the global repository to store common objects that multiple developers can use through shortcuts. These objects may include operational or Application source definitions, reusable transformations, mapplets, and mappings.

    Local repositories. A local repository is any repository within a domain that is not the global repository. Use local repositories for development. From a local repository, you can create shortcuts to objects in shared folders in the global repository. These objects typically include source definitions, common dimensions and lookups, and enterprise standard transformations. You can also create copies of objects in non-shared folders.

    Version control. A versioned repository can store multiple copies, or versions, of an object. Each version is a separate object with unique properties. PowerCenter version control features allow you to efficiently develop, test, and deploy metadata into production.
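
    The idea of a versioned repository, where each check-in is kept as a separate object, can be sketched as follows. The `VersionedRepository` class, its method names, and the sample mapping name are assumptions for illustration only and are not part of any PowerCenter interface.

```python
class VersionedRepository:
    """Toy store that keeps every checked-in version of a named object."""

    def __init__(self):
        self._versions = {}  # object name -> list of property dicts

    def check_in(self, name, properties):
        """Store a new version and return its 1-based version number."""
        history = self._versions.setdefault(name, [])
        history.append(dict(properties))  # copy so later edits do not alter history
        return len(history)

    def get(self, name, version=None):
        """Return the requested version, or the latest if none is given."""
        history = self._versions[name]
        if version is None:
            return history[-1]
        if version < 1:
            raise ValueError("version numbers start at 1")
        return history[version - 1]


repo = VersionedRepository()
v1 = repo.check_in("m_orders", {"target": "ORDERS_STG"})
v2 = repo.check_in("m_orders", {"target": "ORDERS_DW"})

print(v1, v2, repo.get("m_orders"), repo.get("m_orders", 1))
```

    Because earlier versions stay available, a tested version can be promoted to production while development continues on a newer one.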

    Repository Server

    The Repository Server manages repository connection requests from client applications. For each repository database registered with the Repository Server, it configures and manages a Repository Agent process. The Repository Server also monitors the status of running Repository Agents, and sends repository object notification messages to client applications.

    The Repository Agent is a separate, multi-threaded process that retrieves, inserts, and updates metadata in the repository database tables. The Repository Agent ensures the consistency of metadata in the repository by employing object locking.
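
    Object locking of the kind mentioned above can be illustrated with a small lock table in which only one user at a time may hold a given object for writing. This is a simplified sketch under assumed names (`RepositoryObjectLocks`, `dev_a`, `dev_b`, `m_load_customers`); the source does not describe how the Repository Agent implements its locks.

```python
import threading


class RepositoryObjectLocks:
    """Toy lock table: at most one user may hold a given object at a time."""

    def __init__(self):
        self._guard = threading.Lock()  # protects the lock table itself
        self._owners = {}               # object name -> user holding the lock

    def acquire(self, obj, user):
        """Return True if `user` now holds the lock on `obj`, else False."""
        with self._guard:
            owner = self._owners.get(obj)
            if owner is None or owner == user:
                self._owners[obj] = user
                return True
            return False

    def release(self, obj, user):
        """Release the lock only if `user` is the current owner."""
        with self._guard:
            if self._owners.get(obj) == user:
                del self._owners[obj]
                return True
            return False


locks = RepositoryObjectLocks()
first = locks.acquire("m_load_customers", "dev_a")   # dev_a gets the lock
second = locks.acquire("m_load_customers", "dev_b")  # dev_b is refused
released = locks.release("m_load_customers", "dev_a")
third = locks.acquire("m_load_customers", "dev_b")   # now dev_b succeeds

print(first, second, released, third)
```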

    PowerCenter Client

    Repository Server Administration Console. Use the Repository Server Administration Console to administer the Repository Servers and repositories.

    Repository Manager. Use the Repository Manager to administer the metadata repository. You can create repository users and groups, assign privileges and permissions, and manage folders and locks.

    Designer. Use the Designer to create mappings that contain transformation instructions for the PowerCenter Server. Before you can create mappings, you must add source and target definitions to the repository. The Designer has five tools that you use to analyze sources, design target schemas, and build source-to-target mappings:

    Source Analyzer. Import or create source definitions.

    Warehouse Designer. Import or create target definitions.

    Transformation Developer. Develop reusable transformations to use in mappings.

    Mapplet Designer. Create sets of transformations to use in mappings.

    Mapping Designer. Create mappings that the PowerCenter Server uses to extract, transform, and load data.

    PowerCenter Client

    Workflow Manager. Use the Workflow Manager to create, schedule, and run workflows. A workflow is a set of instructions that describes how and when to run tasks related to extracting, transforming, and loading data. The PowerCenter Server runs workflow tasks according to the links connecting the tasks. You can run a task by placing it in a workflow.
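
    Running tasks "according to the links connecting the tasks" amounts to executing a dependency graph in order. The sketch below shows one common way to do that (a queue of tasks whose upstream tasks have all finished). The `run_workflow` function and the task names are hypothetical; the source does not say how the PowerCenter Server schedules tasks internally.

```python
from collections import deque


def run_workflow(tasks, links):
    """Run each task after all tasks linked into it have finished.

    tasks: dict of task name -> zero-argument callable
    links: list of (upstream, downstream) pairs
    """
    pending = {name: 0 for name in tasks}  # unfinished upstream tasks per task
    downstream = {name: [] for name in tasks}
    for src, dst in links:
        pending[dst] += 1
        downstream[src].append(dst)

    ready = deque(name for name, count in pending.items() if count == 0)
    order = []
    while ready:
        name = ready.popleft()
        tasks[name]()
        order.append(name)
        for nxt in downstream[name]:
            pending[nxt] -= 1
            if pending[nxt] == 0:
                ready.append(nxt)

    if len(order) != len(tasks):
        raise ValueError("links form a cycle; some tasks can never start")
    return order


log = []
tasks = {
    "start": lambda: log.append("start"),
    "s_stage": lambda: log.append("stage"),
    "s_load": lambda: log.append("load"),
    "email": lambda: log.append("email"),
}
links = [("start", "s_stage"), ("s_stage", "s_load"), ("s_load", "email")]

order = run_workflow(tasks, links)
print(order)
```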

    Workflow Monitor. Use the Workflow Monitor to monitor scheduled and running workflows for each PowerCenter Server. You can choose a Gantt Chart or Task view. You can also access details about those workflow runs.

    PowerCenter Server

    The PowerCenter Server reads mapping and session information from the repository. It extracts data from the mapping sources and stores the data in memory while it applies the transformation rules that you configure in the mapping. The PowerCenter Server loads the transformed data into the mapping targets.

    The PowerCenter Server can achieve high performance using symmetric multi-processing systems. The PowerCenter Server can start and run multiple workflows concurrently. It can also concurrently process partitions within a single session. When you create multiple partitions within a session, the PowerCenter Server creates multiple database connections to a single source and extracts a separate range of data for each connection, according to the properties you configure.
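
    The partitioning behavior described above, where each connection reads its own range of data, can be approximated with a thread pool in which every worker extracts one key range. This is a sketch only: the in-memory `SOURCE_ROWS` list stands in for a source table, and the even split in `key_ranges` is an assumption, since in PowerCenter the ranges come from the properties you configure.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for a source table keyed by an integer id (hypothetical data).
SOURCE_ROWS = [{"id": i, "value": i * 10} for i in range(1, 101)]


def key_ranges(low, high, partitions):
    """Split the inclusive key range [low, high] into contiguous sub-ranges."""
    size, extra = divmod(high - low + 1, partitions)
    ranges, start = [], low
    for p in range(partitions):
        end = start + size + (1 if p < extra else 0) - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def extract_range(bounds):
    """Simulate one connection reading only the rows in its key range."""
    lo, hi = bounds
    return [row for row in SOURCE_ROWS if lo <= row["id"] <= hi]


ranges = key_ranges(1, 100, 3)
with ThreadPoolExecutor(max_workers=len(ranges)) as pool:
    chunks = list(pool.map(extract_range, ranges))

total_rows = sum(len(chunk) for chunk in chunks)
print(ranges, total_rows)
```

    Each range is disjoint, so no row is read twice and the partitions together cover the whole source.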

    Repository Manager

    Use the Repository Manager to administer your repositories. The Repository Manager allows you to navigate through multiple folders and repositories, and perform the following tasks:

    Manage the repository. You can perform repository management functions, such as copying, creating, starting, and shutting down repositories. You launch the Repository Server Administration Console to perform these functions.

    Implement repository security. You can create, edit, and delete repository users and user groups. You can assign and revoke repository privileges and folder permissions.

    Perform folder functions. You can create, edit, copy, and delete folders. Work you perform in the Designer and Workflow Manager is stored in folders. If you want to share metadata, you can configure a folder to be shared.

    View metadata. You can analyze sources, targets, mappings, and shortcut dependencies, search by keyword, and view the properties of repository objects.

    Workflow Manager

    The Workflow Manager consists of three tools to help you develop a workflow:

    Task Developer. Create tasks you want to accomplish in the workflow in the Task Developer.

    Workflow Designer. Create a workflow by connecting tasks with links in the Workflow Designer. You can also create tasks in the Workflow Designer as you develop the workflow.

    Worklet Designer. Create a worklet in the Worklet Designer. A worklet is an object that groups a set of tasks. A worklet is similar to a workflow, but without scheduling information. You can nest multiple worklets inside a workflow.

    Before you create a workflow, you must configure the following connection information:

    PowerCenter Server connection. Register the PowerCenter Server with the repository before you can start it or create a session to run against it.

    Database connections. Create connections to source and target systems.

    Other connections. If you want to use external loaders or FTP, you configure these connections in the Workflow Manager.

    Workflow Monitor

    After you create a workflow, you run the workflow in the Workflow Manager and monitor it in the Workflow Monitor. The Workflow Monitor is a tool that displays details about workflow runs in two views, Gantt Chart view and Task view. You can monitor workflows in online and offline modes.

    The Workflow Monitor consists of the following windows:

    Navigator window. Displays monitored repositories, servers, and repository objects.

    Output window. Displays messages from the PowerCenter Server.

    Time window. Displays progress of workflow runs.

    Gantt Chart view. Displays details about workflow runs in chronological format.

    Task view. Displays details about workflow runs in a report format.

    Debugging Tools

    Log Reader Tool: Log Reader helps users extract meaningful and necessary information from PowerCenter log files (session logs, Repository Server and Repository Agent logs, pmserver logs) by color-coding the errors and informational messages.

    Performance Analyzer Tool: Performance Analyzer is a utility that analyzes the performance of mappings and sessions with the help of the mapping XMLs.
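
    The color coding that the Log Reader applies to errors and informational messages can be imitated with ANSI terminal colors. The sketch below is not the Log Reader itself: the `SEVERITY : message` line format and the sample lines are assumptions, and real PowerCenter log files are laid out differently.

```python
# ANSI escape codes for terminal colors.
RED, GREEN, RESET = "\033[31m", "\033[32m", "\033[0m"


def severity_of(line):
    """Return the severity prefix of a line in the assumed 'SEVERITY : message' format."""
    return line.split(":", 1)[0].strip().upper()


def colorize(line):
    """Wrap errors in red and informational messages in green; leave others plain."""
    severity = severity_of(line)
    if severity == "ERROR":
        return f"{RED}{line}{RESET}"
    if severity == "INFO":
        return f"{GREEN}{line}{RESET}"
    return line


# Hypothetical log lines, not copied from a real session log.
sample_log = [
    "INFO : Session started",
    "ERROR : Could not connect to target database",
    "DEBUG : Row buffer allocated",
]

for entry in sample_log:
    print(colorize(entry))

error_count = sum(1 for entry in sample_log if severity_of(entry) == "ERROR")
```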

    Informatica Resources

    www.informatica.com: Professional and Education Services

    my.informatica.com: Tech support / Knowledgebase

    devnet.informatica.com: discussion forums / webinars