Talend ETL Tool Online Training Tutorial for Beginners

For more details please contact us:
US: +1 718 819 9361
INDIA: +91 8099776681
Email Us: [email protected]

Welcome to TALEND OPEN STUDIO - Demonstration

Upload: kernel-training

Post on 15-Apr-2017


Category:

Education



TRANSCRIPT


2 http://kerneltraining.com/talend/

ETL: Extraction, transformation, and loading.

ETL refers to the methods involved in accessing and manipulating source data and loading it into a target database.

The first step in the ETL process is mapping the data between the source systems and the target database (data warehouse or data mart).

The second step is cleansing the source data in a staging area.

The third step is transforming the cleansed source data and then loading it into the target system.

Note that ETT (extraction, transformation, transportation) and ETM (extraction, transformation, move) are sometimes used instead of ETL.
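
The three steps described above can be illustrated with a minimal Python sketch. It is not how Talend implements a job; the source rows, the field names in `MAPPING`, and the `sales` target table are all assumptions made up for illustration, and an in-memory SQLite database stands in for the target system.

```python
import sqlite3

# Hypothetical source rows, standing in for data read from a source system.
SOURCE_ROWS = [
    {"cust_name": "  Alice ", "amt": "10.50"},
    {"cust_name": "Bob", "amt": "n/a"},  # anomaly: amount is not a number
    {"cust_name": "carol", "amt": "7"},
]

# Step 1: mapping between source fields and target columns (assumed names).
MAPPING = {"cust_name": "customer", "amt": "amount"}


def cleanse(rows):
    """Step 2: fill a staging list with rows whose amount parses as a number."""
    staging = []
    for row in rows:
        try:
            float(row["amt"])
        except ValueError:
            continue  # drop the anomalous row; a real job might log or reject it
        staging.append(row)
    return staging


def transform(rows):
    """Step 3a: rename fields according to MAPPING and normalise the values."""
    out = []
    for row in rows:
        mapped = {MAPPING[key]: value for key, value in row.items()}
        mapped["customer"] = mapped["customer"].strip().title()
        mapped["amount"] = float(mapped["amount"])
        out.append(mapped)
    return out


def load(rows, conn):
    """Step 3b: load the transformed rows into the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales (customer, amount) VALUES (:customer, :amount)", rows
    )
    conn.commit()


def run_etl(conn):
    """Run cleanse, transform and load, then return the loaded rows."""
    load(transform(cleanse(SOURCE_ROWS)), conn)
    return conn.execute(
        "SELECT customer, amount FROM sales ORDER BY customer"
    ).fetchall()


if __name__ == "__main__":
    print(run_etl(sqlite3.connect(":memory:")))
```

Keeping each step in its own function mirrors the way the steps are listed above, so a step can be swapped or tested without touching the others.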


Glossary of ETL

Source System: A database, application, file, or other storage facility from which the data in a data warehouse is derived.

Mapping: The definition of the relationship and data flow between source and target objects.

Metadata: Data that describes data and other structures, such as objects, business rules, and processes. For example, the schema design of a data warehouse is typically stored in a repository as metadata, which is used to generate scripts used to build and populate the data warehouse. A repository contains metadata.

Staging Area: A place where data is processed before entering the warehouse.
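
As a rough illustration of the Metadata entry, the sketch below generates a table-creation script from schema metadata. The `SCHEMA_METADATA` dictionary, its keys, and the `customer_dim` table are hypothetical; real repositories store metadata in their own formats.

```python
import sqlite3

# Hypothetical schema metadata, as it might be held in a repository.
SCHEMA_METADATA = {
    "table": "customer_dim",
    "columns": [
        {"name": "customer_id", "type": "INTEGER", "nullable": False},
        {"name": "customer_name", "type": "TEXT", "nullable": False},
        {"name": "country", "type": "TEXT", "nullable": True},
    ],
}


def build_create_table(meta):
    """Generate a CREATE TABLE script from the schema metadata."""
    parts = []
    for col in meta["columns"]:
        null_clause = "" if col["nullable"] else " NOT NULL"
        parts.append(f'{col["name"]} {col["type"]}{null_clause}')
    return f'CREATE TABLE {meta["table"]} ({", ".join(parts)})'


if __name__ == "__main__":
    ddl = build_create_table(SCHEMA_METADATA)
    print(ddl)
    # Build the (empty) warehouse table in an in-memory database.
    sqlite3.connect(":memory:").execute(ddl)
```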


Glossary of ETL (Cont.)

Cleansing: The process of resolving inconsistencies and fixing the anomalies in source data, typically as part of the ETL process.

Transformation: The process of manipulating data. Any manipulation beyond copying is a transformation. Examples include cleansing, aggregating, and integrating data from multiple sources.

Transportation: The process of moving copied or transformed data from a source to a data warehouse.

Target System: A database, application, file, or other storage facility to which the "transformed source data" is loaded in a data warehouse.
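
Two of the transformation examples named in the glossary, integrating data from multiple sources and aggregating, can be sketched in a few lines of Python. The two order lists and the function names are assumptions for illustration only.

```python
from collections import defaultdict

# Hypothetical extracts from two source systems: (customer, amount) pairs.
WEB_ORDERS = [("alice", 20.0), ("bob", 5.0)]
STORE_ORDERS = [("Alice ", 10.0), ("Carol", 8.0)]


def integrate(*sources):
    """Integrate: combine rows from several sources under a normalised key."""
    for source in sources:
        for customer, amount in source:
            yield customer.strip().lower(), amount


def aggregate(rows):
    """Aggregate: sum the amounts per customer."""
    totals = defaultdict(float)
    for customer, amount in rows:
        totals[customer] += amount
    return dict(totals)


if __name__ == "__main__":
    print(aggregate(integrate(WEB_ORDERS, STORE_ORDERS)))
```

Normalising the customer key before aggregating is itself a small cleansing step: without it, "alice" and "Alice " would be counted as different customers.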


ETL tools

List of the most popular ETL tools:

Informatica - PowerCenter

IBM - WebSphere DataStage (formerly known as Ascential DataStage)

SAP - BusinessObjects Data Integrator

IBM - Cognos Data Manager (formerly known as Cognos DecisionStream)

Microsoft - SQL Server Integration Services

SAS - Data Integration Studio

Ab Initio

Pentaho - Pentaho Data Integration

Talend - Talend Open Studio


What is Talend?

Talend is the first provider of open source data integration software.

Its main product is Talend Open Studio.

After three years of intense research and development investment, the first version of the software was released in 2006.

It is an open source project for data integration based on Eclipse RCP (Rich Client Platform) that primarily supports ETL-oriented implementations and is provided for on-premises deployment as well as in a software-as-a-service (SaaS) delivery model.

Talend Open Studio is mainly used for integration between operational systems, as well as for ETL (Extract, Transform, Load) for Business Intelligence and Data Warehousing.

Talend Open Studio is the most open, innovative and powerful data integration solution on the market today.


Talend Open Studio

Powerful product for data integration.

Open source.

Has an easy-to-use graphical development environment.

Business modeling

Metadata-driven design and execution

Real-time debugging

Has prebuilt connectors for a wide range of source and target systems.

Quality support is available through a worldwide user community.


Talend Open Studio extensions

Talend Integration Suite - The first open source enterprise data integration solution, Talend Integration Suite supports the tough requirements of enterprise development, and scales to the highest levels of data volumes and process complexity.

Talend On Demand - The industry's first data integration Software as a Service (SaaS), Talend On Demand consolidates Talend Open Studio metadata and project information in an online, shared repository hosted by Talend.

Talend Data Quality - The first open source data quality solution with enterprise-grade features and technical support, Talend Data Quality is a graphical data quality management environment that processes data, such as addresses, phone numbers, spellings, synonyms and abbreviations. Talend Data Quality includes both data profiling and data cleansing capabilities.


Talend Open Studio extensions (Cont.)

Talend ESB - Talend enterprise service bus (ESB) solutions integrate multiple complementary open source technologies into single packages that are pre-configured for fast and easy deployment across a range of environments.

Talend Big Data Integration - Talend Open Studio for Big Data greatly simplifies the process of working with Hadoop, Apache's open source MapReduce implementation that has rapidly become the leading framework for computational processing of massive data sets. With Talend's open source big data integration software, you can move data into HDFS or Hive, perform operations on it, and extract it without having to do any coding.



Training Objectives

Upon completing this training, you will be able to:

Describe the role of Talend in Data Integration

Explain the functionalities of Talend

List the various Components in Talend

Recall the ETL perspective

Configure and install Talend

Learn about all the different concepts of Talend

Build jobs based on various real-time requirements


Few Screenshots:



Call us: +91 8099776681
Email: [email protected]
http://kerneltraining.com/talend/

Questions?
