IBM InfoSphere DataStage Introduction Online Training
Post on 12-Jan-2017
For more details please contact us: US: +1 718 819 9361, India: +91 8099776681, Email: sales@kerneltraining.com
Welcome to IBM Data Stage 9.1
http://kerneltraining.com/ibm-data-stage/
Data Warehouse
A data warehouse is a copy of transaction data specifically structured for querying and reporting.
An expanded definition of data warehousing includes business intelligence tools; tools to extract, transform, and load data into the repository; and tools to manage and retrieve metadata.
This definition of the data warehouse focuses on data storage.
A data warehouse can be normalized or denormalized. It can be a relational database, multidimensional database, flat file, hierarchical database, object database, etc.
Data warehouse data often gets changed, and data warehouses often focus on a specific activity or entity.
Reasons for Dirty Data
Dummy Values
Absence of Data
Multipurpose Fields
Cryptic Data
Contradicting Data
Inappropriate Use of Address Lines
Violation of Business Rules
Reused Primary Keys, Non-Unique Identifiers
Data Integration Problems
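Three of these problems (dummy values, absence of data, and non-unique identifiers) can be checked mechanically. The following is a minimal Python sketch, not a Data Stage feature: the function name `find_dirty_rows`, the `DUMMY_VALUES` set, and the sample customer rows are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical placeholder values that often stand in for missing data.
DUMMY_VALUES = {"N/A", "UNKNOWN", "999-99-9999", "00000"}


def find_dirty_rows(rows, key_field, required_fields):
    """Return (row_index, problem) pairs for three simple dirty-data checks."""
    problems = []
    # Count how often each key occurs so reused keys can be flagged.
    key_counts = Counter(row.get(key_field) for row in rows)
    for index, row in enumerate(rows):
        for field in required_fields:
            value = row.get(field)
            if value is None or str(value).strip() == "":
                problems.append((index, f"absence of data: {field}"))
            elif str(value).strip().upper() in DUMMY_VALUES:
                problems.append((index, f"dummy value: {field}"))
        if key_counts[row.get(key_field)] > 1:
            problems.append((index, f"non-unique identifier: {key_field}"))
    return problems


# Made-up sample data: row 1 has an empty name and a dummy zip code,
# and rows 0 and 2 share the same id.
customers = [
    {"id": 1, "name": "Ann Lee", "zip": "10001"},
    {"id": 2, "name": "", "zip": "00000"},
    {"id": 1, "name": "Bob Ray", "zip": "N/A"},
]

for index, problem in find_dirty_rows(customers, "id", ["name", "zip"]):
    print(index, problem)
```

Problems such as cryptic data or violated business rules usually need domain-specific rules and are not covered by this sketch.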
Data Cleansing
Source systems contain dirty data that must be cleansed
ETL software contains rudimentary data cleansing capabilities
Specialized data cleansing software is often used. It is important for performing name and address correction and householding functions
Leading data cleansing vendors include Vality (Integrity), Harte-Hanks (Trillium), and Firstlogic (i.d.Centric)
Data Stage
In its simplest form, Data Stage performs data transformation and movement from source systems to target systems, in batch and in real time. The data sources may include indexed files, sequential files, relational databases, archives, external data sources, enterprise applications, and message queues.
The Data Stage client components are:
Data Stage Administrator
Data Stage Designer
Data Stage Director
Data Stage Administrator
Use Data Stage Administrator to:
Specify general server defaults
Add and delete projects
Set project properties
Access the Data Stage Repository through a command interface
Data Stage Designer
Use Data Stage Designer to:
Specify how the data is extracted
Specify data transformations
Decode (denormalize) data going into the data mart using referenced lookups
Aggregate data
Split data into multiple outputs on the basis of defined constraints
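In Data Stage these operations are configured graphically in the Designer, not written as code. Purely to illustrate the ideas, here is a plain-Python sketch of a lookup-based decode, an aggregation, and a constraint-based split. The function names, the `REGION_LOOKUP` table, and the sample sales rows are hypothetical.

```python
from collections import defaultdict

# Hypothetical reference table used to decode region codes.
REGION_LOOKUP = {"N": "North", "S": "South"}


def decode(rows, lookup):
    """Replace each coded region with its description (lookup-based decode)."""
    return [{**row, "region": lookup.get(row["region"], "Unknown")} for row in rows]


def aggregate(rows):
    """Sum the amount per region."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)


def split(rows, constraints):
    """Route each row to every output whose constraint it satisfies."""
    outputs = {name: [] for name in constraints}
    for row in rows:
        for name, predicate in constraints.items():
            if predicate(row):
                outputs[name].append(row)
    return outputs


# Made-up input rows with coded regions.
sales = [
    {"region": "N", "amount": 120.0},
    {"region": "S", "amount": 40.0},
    {"region": "N", "amount": 15.0},
]

decoded = decode(sales, REGION_LOOKUP)
totals = aggregate(decoded)
outputs = split(decoded, {
    "large": lambda row: row["amount"] >= 100,
    "small": lambda row: row["amount"] < 100,
})

print(totals)
print({name: len(rows) for name, rows in outputs.items()})
```

The split routes a row to every output whose constraint holds, so constraints that overlap would send the same row to more than one output.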
Data Stage Director
Use Data Stage Director to run, schedule, and monitor your Data Stage jobs. You can also gather statistics as the job runs. Director is also used to view logs for debugging.
The Data Stage Director window is divided into two panes:
The Job Category pane lists all of the jobs in the repository.
The right pane shows one of three views: Status view, Schedule view, or Log view.
Frequently seen Status
Code  Status
1     Finished
2     Finished (see log)
9     Has been reset
11    Validated OK
12    Validated (see log)
21    Has been reset
99    Compiled
0     Running
3     Aborted
8     Failed validation
13    Failed validation
96    Aborted
97    Stopped
98    Not Compiled
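When a script needs to interpret a numeric job status, a simple lookup table may be enough. The sketch below copies the codes from the list above. The names `JOB_STATUS` and `describe_status` are hypothetical and not part of any Data Stage API.

```python
# Status codes copied from the list above; this mapping is illustrative only.
JOB_STATUS = {
    0: "Running",
    1: "Finished",
    2: "Finished (see log)",
    3: "Aborted",
    8: "Failed validation",
    9: "Has been reset",
    11: "Validated OK",
    12: "Validated (see log)",
    13: "Failed validation",
    21: "Has been reset",
    96: "Aborted",
    97: "Stopped",
    98: "Not Compiled",
    99: "Compiled",
}


def describe_status(code):
    """Return the status text for a code, or a fallback for unlisted codes."""
    return JOB_STATUS.get(code, f"Unknown status ({code})")


print(describe_status(2))
print(describe_status(42))
```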
Data Stage: Getting Started
Set up a project – Before you can create any Data Stage jobs, you must set up your project by entering information about your data.
Create a job – When a Data Stage project is installed, it is empty and you must create the jobs you need in Data Stage Designer.
Define Table Definitions
Develop the job – Jobs are designed and developed using the Designer. Each data source, the data warehouse, and each processing step is represented by a stage in the job design. The stages are linked together to show the flow of data.
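The idea of stages linked together can be pictured as a chain of functions, each one passing rows to the next. This is only a conceptual sketch in plain Python, not how Data Stage jobs are built. The three stage functions and the sample rows are hypothetical.

```python
def source_stage():
    """Hypothetical source stage: yields rows from an in-memory 'file'."""
    yield {"name": " ann ", "amount": "10"}
    yield {"name": "bob", "amount": "25"}


def transform_stage(rows):
    """Processing step: clean the name and convert the amount to an integer."""
    for row in rows:
        yield {"name": row["name"].strip().title(), "amount": int(row["amount"])}


def target_stage(rows):
    """Hypothetical target stage: collects rows as a stand-in for the warehouse."""
    return list(rows)


# Linking the stages: the output of each stage is the input of the next.
warehouse = target_stage(transform_stage(source_stage()))
print(warehouse)
```

Here the source and the target are in-memory stand-ins. In a real job they would be files, databases, or other stages chosen in the Designer.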
Data Stage Designer: Transformer Stage
The Transformer stage performs any data conversion required before the data is output to another stage in the job design.
After you are done, compile and run the job.
Call us: +91 8099776681, Email: sales@kerneltraining.com, http://kerneltraining.com/ibm-data-stage/
Questions?