DMDW 7. Student Presentation - Pentaho Data Integration (Kettle)
7. ETL Project by Himanshu Joshi, Sandhya Narayan and Greta Alvarez
ETL
Himanshu Joshi
Greta Alvarez
Sandhya Narayan
What is ETL?
Extracting data from outside sources
Transforming it to fit operational needs
Loading it into the end target (DB)
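The three steps can be sketched in a few lines of Python. This is only an illustration, not the project's implementation: the sample CSV text, the `person` table, and the function names are all hypothetical, and an in-memory SQLite database stands in for the end target.

```python
import csv
import io
import sqlite3

# Hypothetical source data standing in for an external file.
SOURCE = "name,city\nAlice,Berlin\nBob,Madrid\n"


def extract(text):
    """Extract: read rows from the outside source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: reshape rows for the target (here, upper-case the city)."""
    return [(row["name"], row["city"].upper()) for row in rows]


def load(rows, conn):
    """Load: write the transformed rows into the target database."""
    conn.execute("CREATE TABLE IF NOT EXISTS person (name TEXT, city TEXT)")
    conn.executemany("INSERT INTO person VALUES (?, ?)", rows)
    conn.commit()


def run_etl(text, conn):
    """Chain the three steps in order."""
    load(transform(extract(text)), conn)


conn = sqlite3.connect(":memory:")  # stand-in for the end target (DB)
run_etl(SOURCE, conn)
loaded = conn.execute("SELECT name, city FROM person ORDER BY name").fetchall()
```

A tool such as Pentaho DI lets the same chain be drawn as connected steps instead of written by hand.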
Extracting the Excel file
Errors:
Null entries
Spelling errors
Space at the end of entries
Start at the end of entries
Duplicate entries
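A minimal Python sketch of how some of these errors could be cleaned up, covering null entries, spelling errors, trailing spaces and duplicates. The sample rows, the `SPELLING_FIXES` table and the `clean` function are hypothetical, and dropping a row with a null entry is just one possible policy.

```python
# Hypothetical rows as they might come out of the spreadsheet.
RAW = [
    {"name": "Alice ", "city": "Berlin"},  # space at the end of the entry
    {"name": "Alice", "city": "Berlin"},   # duplicate once trimmed
    {"name": "Bob", "city": "Madird"},     # spelling error
    {"name": None, "city": "Paris"},       # null entry
]

# Hypothetical correction table for known misspellings.
SPELLING_FIXES = {"Madird": "Madrid"}


def clean(rows):
    """Return rows with nulls dropped, spaces trimmed, spelling fixed, duplicates removed."""
    seen = set()
    result = []
    for row in rows:
        # Drop rows that contain a null or empty entry.
        if any(value is None or not value.strip() for value in row.values()):
            continue
        # Trim stray whitespace, then apply known spelling corrections.
        fixed = {}
        for key, value in row.items():
            trimmed = value.strip()
            fixed[key] = SPELLING_FIXES.get(trimmed, trimmed)
        # Skip rows that duplicate one already kept.
        identity = tuple(sorted(fixed.items()))
        if identity in seen:
            continue
        seen.add(identity)
        result.append(fixed)
    return result


cleaned = clean(RAW)
```

Trimming comes before the duplicate check on purpose: "Alice " and "Alice" only count as the same entry once the trailing space is gone.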
Normalize/Transform??
GOAL:
Decompose relations with anomalies in order to produce smaller, well-structured relations. Involves dividing large tables into smaller (and less redundant) tables and defining relationships between them.
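As an illustration of this goal, the Python sketch below splits one redundant table into two smaller ones linked by a key. The order data, the column names and the `decompose` function are hypothetical, and the sketch assumes each customer name maps to exactly one city.

```python
# Hypothetical flat table: customer details are repeated on every order row.
FLAT = [
    {"order_id": 1, "customer": "Alice", "city": "Berlin", "item": "Pen"},
    {"order_id": 2, "customer": "Alice", "city": "Berlin", "item": "Ink"},
    {"order_id": 3, "customer": "Bob", "city": "Madrid", "item": "Pad"},
]


def decompose(rows):
    """Split the flat table into a customers table and an orders table."""
    customers = {}  # customer name -> customer record
    orders = []
    for row in rows:
        name = row["customer"]
        # Store each customer once and give it a surrogate key.
        if name not in customers:
            customers[name] = {
                "customer_id": len(customers) + 1,
                "customer": name,
                "city": row["city"],
            }
        # Orders keep only a reference to the customer, not a copy of its details.
        orders.append({
            "order_id": row["order_id"],
            "customer_id": customers[name]["customer_id"],
            "item": row["item"],
        })
    return list(customers.values()), orders


customers, orders = decompose(FLAT)
```

After the split a customer's city is stored in one place only, so changing it cannot leave inconsistent copies behind in the orders.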
Tools tried & USED
MS SQL Server
MySQL
CloverETL Designer
Advanced ETL Processor
Pentaho DI
Pentaho Data Integration
Powerful Extraction, Transformation and Loading (ETL) capabilities using an innovative, metadata-driven approach, with an intuitive, graphical, drag-and-drop design environment and a proven, scalable, standards-based architecture.
http://kettle.pentaho.com/
Why Pentaho Data Integration?
Open Source
ETL supported
User Friendly
Easy to use
DEMO
Don't try to reinvent the wheel!!
Just Use it ;)