architecture of integration services

Designing and developing the ETL System with SSIS

Upload: slava-kokaev

Post on 31-Oct-2014


DESCRIPTION

Firestarter SSIS01 architecture of integration services

TRANSCRIPT

Page 1: Architecture of integration services

Designing and developing the ETL System with SSIS

Page 2: Architecture of integration services

Overview

What is SQL Server Integration Services?

Architecture of Integration Services

Architecture of an Integration Services Package

Page 3: Architecture of integration services

What is SQL Server Integration Services?

Microsoft SQL Server Integration Services is a platform for building enterprise-level data integration and data transformation solutions. You use Integration Services to solve complex business problems by:

processing data into a data mart or data warehouse

moving data from legacy systems into new systems

migrating applications

integrating data from multiple systems by passing data back and forth

extracting or importing data to send to vendors or partners

cleaning and mining data

managing SQL Server objects

Page 4: Architecture of integration services

Architecture of Integration Services

[Diagram: architecture of Integration Services. Labels: native and managed object model; Integration Services runtime; control flow engine; package with tasks, a task container, and a Data Flow task; custom applications; command-line utilities; SSIS Designer; SSIS wizards; tasks; custom tasks; log providers; data sources; enumerators; connection managers; event handlers; Integration Services service; .dtsx file; msdb database; Integration Services data flow; data flow engine; Data Flow task object model; sources, transformations, and destinations; data flow components; custom data flow components.]

Control Flow

The control flow is the workflow engine that contains control flow tasks, containers, and precedence constraints, which manage when tasks and containers execute.

Runtime engine

The Integration Services runtime saves the layout of packages, runs packages, and provides support for logging, breakpoints, configuration, connections, and transactions.

Integration Services Service

The Integration Services service lets you use SQL Server Management Studio to monitor running Integration Services packages and to manage the storage of packages.

Integration Services Package

A package is an organized collection of connections, control flow elements, data flow elements, event handlers, variables, and configurations that you either assemble with the graphical design tools that SQL Server Integration Services provides or build programmatically.

Integration Services Task

Tasks are control flow elements that define the units of work performed in a package control flow. A SQL Server Integration Services package is made up of one or more tasks. If the package contains more than one task, the tasks are connected and sequenced in the control flow by precedence constraints.

Integration Services Containers

Containers are objects in SQL Server Integration Services that provide structure to packages and services to tasks. They support repeating control flows in packages, and they group tasks and containers into meaningful units of work. Containers can include other containers in addition to tasks.

Integration Services Designer

SSIS Designer is a graphical tool that you can use to create and maintain Integration Services packages. SSIS Designer is available in Business Intelligence Development Studio as part of an Integration Services project. SSIS includes additional tools, wizards, and command prompt utilities for running and managing Integration Services packages.

API or Object Model

The SSIS object model includes managed application programming interfaces (APIs) for creating custom components for use in packages, or custom applications that create, load, run, and manage packages. Developers can write custom applications, custom tasks, or custom transformations in any common language runtime (CLR) compliant language.

Data Flow Engine

The Data Flow task encapsulates the data flow engine. The engine provides the in-memory buffers that move data from source to destination, and it calls the sources that extract data from files and databases. It also manages the transformations that modify data and the destinations that load data or make data available to other processes.
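The buffer idea can be illustrated with a minimal Python analogy. This is not the SSIS object model: the function names, the row format, and the buffer size are all assumptions made for illustration. A source yields rows in fixed-size in-memory buffers, a transformation modifies each buffer, and a destination collects the rows.

```python
from typing import Callable, Iterable, Iterator

Row = dict
Buffer = list  # a batch of rows held in memory (illustrative stand-in)


def source(rows: Iterable[Row], buffer_size: int = 2) -> Iterator[Buffer]:
    """Yield rows in fixed-size buffers, as a stand-in for a source component."""
    buffer: Buffer = []
    for row in rows:
        buffer.append(row)
        if len(buffer) == buffer_size:
            yield buffer
            buffer = []
    if buffer:  # flush the last, partially filled buffer
        yield buffer


def transformation(buffers: Iterable[Buffer], fn: Callable[[Row], Row]) -> Iterator[Buffer]:
    """Apply a row-level function to every row of every buffer."""
    for buffer in buffers:
        yield [fn(row) for row in buffer]


def destination(buffers: Iterable[Buffer]) -> list:
    """Collect all rows, as a stand-in for loading them into a table."""
    loaded = []
    for buffer in buffers:
        loaded.extend(buffer)
    return loaded


def run_data_flow(rows, fn, buffer_size=2):
    """Chain source -> transformation -> destination."""
    return destination(transformation(source(rows, buffer_size), fn))


if __name__ == "__main__":
    sample = [{"name": "ann"}, {"name": "bob"}, {"name": "cy"}]
    print(run_data_flow(sample, lambda r: {"name": r["name"].upper()}))
```

Because each stage is a generator, only one buffer per stage is in flight at a time, which loosely mirrors how a buffered pipeline avoids materializing the whole data set between components.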

Data Flow Components

Integration Services data flow components are the sources, transformations, and destinations that Integration Services includes. You can also include custom components in a data flow.

Page 5: Architecture of integration services

What is an SSIS Package?

A package is the core object within SQL Server Integration Services (SSIS) that contains the business logic to handle workflow and data processing. You use SSIS packages to extract data from sources and load it to destinations, and to handle the timing precedence of when data is processed.

An SSIS package contains:

Control flow elements

Data flow elements

Event handlers

Data source connections

Page 6: Architecture of integration services

Architecture of Integration Services Package

[Diagram: architecture of an Integration Services package. Labels: package; tasks, a task container, and a Data Flow task; variables; log providers; data sources; precedence constraints; connection managers; event handlers; data flow with sources, transformations, and destinations; control flow components; data flow components.]

Control Flow Elements

The control flow elements, tasks and containers, are the building blocks of the control flow in a package. Control flow elements prepare or copy data, interact with other processes, or implement repeating workflow.

Variables

Variables can be used in expressions to dynamically update column values and property expressions, to control the execution of repeating control flows, and to define the conditions that precedence constraints apply.

Precedence Constraints

Precedence constraints sequence the control flow elements into an ordered control flow and specify the conditions for executing tasks or containers.
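As a rough illustration of how precedence constraints gate execution, the following Python sketch runs a list of tasks and checks each task's constraints against the outcomes of its predecessors. It is an analogy, not SSIS code: the `Task` class, `run_control_flow`, and the task names are hypothetical, and the sketch assumes the tasks are already listed so that predecessors come first. The three outcomes mirror the success, failure, and completion values that a precedence constraint can require.

```python
from dataclasses import dataclass, field
from typing import Callable

SUCCESS, FAILURE, COMPLETION = "success", "failure", "completion"


@dataclass
class Task:
    name: str
    work: Callable[[], None]
    # Each constraint is a (predecessor name, required outcome) pair.
    constraints: list = field(default_factory=list)


def constraint_met(outcome, required):
    """A completion constraint accepts success or failure; others need an exact match."""
    if required == COMPLETION:
        return outcome in (SUCCESS, FAILURE)
    return outcome == required


def run_control_flow(tasks):
    """Run tasks in list order; skip any task whose constraints are not all met."""
    outcomes = {}  # task name -> "success" | "failure" | "skipped"
    for task in tasks:
        if not all(constraint_met(outcomes.get(pred), req) for pred, req in task.constraints):
            outcomes[task.name] = "skipped"
            continue
        try:
            task.work()
            outcomes[task.name] = SUCCESS
        except Exception:
            outcomes[task.name] = FAILURE
    return outcomes


def failing_load():
    raise RuntimeError("load failed")


if __name__ == "__main__":
    flow = [
        Task("extract", lambda: None),
        Task("load", failing_load, [("extract", SUCCESS)]),
        Task("notify_failure", lambda: None, [("load", FAILURE)]),
        Task("process_cube", lambda: None, [("load", SUCCESS)]),
        Task("cleanup", lambda: None, [("load", COMPLETION)]),
    ]
    for name, outcome in run_control_flow(flow).items():
        print(name, outcome)
```

In this run the failing load lets the failure branch and the completion branch execute, while the task that requires a successful load is skipped.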

Log Providers

Log providers support logging of package run-time information, such as the start time and the stop time of the package and its tasks and containers.

Connection Managers

Connection managers connect to different types of data sources to extract and load data.

Data Sources

Data sources define connections to the different types of data stores from which packages extract data and into which they load it.

Event Handlers

Event handlers run in response to the run-time events that packages, tasks, and containers raise.
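The event-handler idea can be sketched in plain Python as a registry of callbacks keyed by event name. This is only an analogy: the `Package` class and its methods below are hypothetical and are not the SSIS object model. The event names `OnPreExecute`, `OnError`, and `OnPostExecute` are borrowed from SSIS, but how this toy class raises them is an assumption.

```python
from collections import defaultdict


class Package:
    """Toy package that raises events while running its tasks."""

    def __init__(self):
        self.handlers = defaultdict(list)  # event name -> list of handler functions
        self.tasks = []  # (task name, callable) pairs

    def on(self, event, handler):
        """Register a handler for an event name."""
        self.handlers[event].append(handler)

    def raise_event(self, event, source, detail=""):
        """Call every handler registered for the event."""
        for handler in self.handlers[event]:
            handler(source, detail)

    def execute(self):
        """Run each task, raising events before, after, and on error."""
        for name, work in self.tasks:
            self.raise_event("OnPreExecute", name)
            try:
                work()
            except Exception as exc:
                self.raise_event("OnError", name, str(exc))
            self.raise_event("OnPostExecute", name)


def bad_load():
    raise ValueError("bad row")


if __name__ == "__main__":
    pkg = Package()
    pkg.on("OnError", lambda source, detail: print(f"error in {source}: {detail}"))
    pkg.on("OnPostExecute", lambda source, detail: print(f"finished {source}"))
    pkg.tasks = [("extract", lambda: None), ("load", bad_load)]
    pkg.execute()
```

Keeping the handlers outside the tasks means that error reporting or cleanup logic can be attached without changing the tasks themselves.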

Data Flow Components

The data flow components (sources, transformations, and destinations) build the data flows in a package that extract, transform, and load data. Paths sequence the data flow components into an ordered data flow.

Page 7: Architecture of integration services

Extract, Transform, Load (ETL)

ETL is a process in Business Intelligence that:

Extracts data from the source systems

Transforms the data to convert it to a desired state

Loads the data into the data warehouse
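The three ETL steps can be sketched in a few lines of Python using in-memory SQLite databases as stand-ins for the source system and the data warehouse. Everything here is an illustrative assumption: the table names `orders` and `dw_orders`, the columns, and the sample rows are made up, and a real SSIS package would express these steps as data flow components instead.

```python
import sqlite3


def extract(conn):
    """Extract raw order rows from the source (OLTP-like) table."""
    return conn.execute("SELECT id, customer, amount FROM orders").fetchall()


def transform(rows):
    """Convert rows to the desired state: trimmed upper-case names, numeric amounts."""
    return [(oid, customer.strip().upper(), float(amount)) for oid, customer, amount in rows]


def load(conn, rows):
    """Load the transformed rows into the warehouse table."""
    conn.executemany("INSERT INTO dw_orders VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl():
    """Build a sample source and warehouse, run the ETL, and return the loaded rows."""
    source = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE orders (id INTEGER, customer TEXT, amount TEXT)")
    source.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, " ann ", "10.50"), (2, "bob", "3")],
    )
    warehouse = sqlite3.connect(":memory:")
    warehouse.execute("CREATE TABLE dw_orders (id INTEGER, customer TEXT, amount REAL)")
    load(warehouse, transform(extract(source)))
    return warehouse.execute("SELECT id, customer, amount FROM dw_orders ORDER BY id").fetchall()


if __name__ == "__main__":
    print(run_etl())
```

Keeping extract, transform, and load as separate functions makes each step testable on its own, which is the same separation the ETL acronym describes.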

ETL System

[Diagram: ETL system. Labels: OLTP, SSIS, DW, OLAP.]

Page 8: Architecture of integration services

ETL Framework and Logical Architecture

[Diagram: ETL framework and logical architecture. ETL package steps: Extract Data, Load Staging, Extract from Staging, Transform Data, Load Dimensions, Load Facts, Check System State, Process Cube, Log ETL Process, Send Notification. Other labels: File System, Database, Cube, ETL Packages, OLTP, ETL Schema, STAGING Schema, DWH Schema.]
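One way to read the step labels on this slide is as an ordered run in which every step is logged and a notification is sent at the end. The Python sketch below illustrates that reading only. The step order is taken from the order in which the labels appear on the slide, which is an assumption about the diagram, and `run_framework`, the log format, and the `notify` callback are hypothetical.

```python
from datetime import datetime, timezone

# Step names as they appear on the slide; the order is assumed, not confirmed.
STEP_NAMES = [
    "Extract Data", "Load Staging", "Extract from Staging", "Transform Data",
    "Load Dimensions", "Load Facts", "Check System State", "Process Cube",
]


def run_framework(steps, notify):
    """Run (name, callable) steps in order, log each one, and stop at the first failure."""
    log = []
    status = "succeeded"
    for name, work in steps:
        started = datetime.now(timezone.utc)
        try:
            work()
            log.append({"step": name, "started": started, "status": "succeeded"})
        except Exception as exc:
            log.append({"step": name, "started": started, "status": "failed", "error": str(exc)})
            status = "failed"
            break
    # The notification is sent whether the run succeeded or failed.
    notify(f"ETL run {status} after {len(log)} step(s)")
    return log


def staging_unavailable():
    raise RuntimeError("staging unavailable")


if __name__ == "__main__":
    steps = [(name, lambda: None) for name in STEP_NAMES]
    steps[1] = ("Load Staging", staging_unavailable)
    for entry in run_framework(steps, print):
        print(entry["step"], entry["status"])
```

Stopping at the first failed step is a design choice in this sketch; a framework could equally continue with independent steps and report all failures together.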

Page 9: Architecture of integration services

Resources

Architecture of Integration Services - http://msdn.microsoft.com/en-us/library/bb522498.aspx

Architecture of an Integration Services Package - http://msdn.microsoft.com/en-us/library/cc645924.aspx