ssis for the disinterested
DESCRIPTION
Brian Garraty @ NULLgarity. SSIS for the Disinterested. Who I am. Brian D. Garraty SQL Server DBA, Va Beach Public Schools HRSSUG Leadership Team Background in C++, VB, ASP, C# @ NULLgarity NULLgarity.wordpress.com >10 years experience with SSIS & DTS. Why I’ve Come to You Today. - PowerPoint PPT PresentationTRANSCRIPT
SSIS for the Disinterested
Brian Garraty@NULLgarity
Who I am
Brian D. Garraty SQL Server DBA, Va Beach Public
Schools HRSSUG Leadership Team Background in C++, VB, ASP, C# @NULLgarity NULLgarity.wordpress.com >10 years experience with SSIS &
DTS
Why I’ve Come to You Today SSIS is often
Misunderstood Disliked Considered difficult to learn Subject to ridicule
Why I’ve Come to You Today SSIS is actually
Fairly simple (mostly) Pretty good at a lot of things Fun The right set of tools for the job
(sometimes)
Success Requires
A desire to learn new things The time to learn new things Willingness to work through
frustrations Willingness to start over The right project
Who I Come to You As TodayBrian D. Garraty, SSIS Defender
Brian D. Garraty, SSIS Apologist
Who You Are
Type of Work You Do Experience with SSIS Rank your love of SSIS on Scale of 1
to 10 10 = God, Family, Country, SSIS 5 = Like it like a friend 1 = I want to fight it, really
Where we are headed…
Starting from scratch – 100 Level Control Flow Data Flow Configurations Dynamic SSIS
What you can do with SSIS What I have done with SSIS
Where we are headed…
Defense Will Rest If I knew then... Resources Questions
Out of Scope
SSIS in SQL Server Denali How to do this Stuff Demos
SSIS 101SQL Server Integration Services
History
Bundled Extract, Transform, Load (ETL) Platform
Rewrite of Data Transformation Services (DTS)
Released with SQL Server 2005
Dev Environment
Developed in Visual Studio (err…”Business Intelligence Development Studio”)
Solutions, Projects, Packages
Run!!!!!!
Solution Explorer
Package Level Connections
Variables
Runtime Environment
Packages are Deployed as files or to msdb Executed via dtexec Utility (or wrapper
to it)
SSIS Service Runs on the server Doesn’t do much
A New Paradigm
Control Flow Ordered workflow Isolated tasks Precedence Constraints Data flow tasks
Data Flow Move data from Point A to Point B Manipulate data along the way
Control FlowWhere You Build Your Workflow
Control Flow Picture
Control Flow Toolbox
Sequence Containers
Group related tasks Can be enabled/disabled Can serve as source for Precedence
Constraints, even when empty
Loop Containers
Precedence Constraints
Execute SQL Tasks
Script Task
Other Control Flow Tasks
Execute Process Task Send Mail Task Execute Package Task
Data Flow Task…
Control Flow Picture
Data FlowWhere You Manipulate Data
Data Flow Picture
Data Flow Sources
Data Flow Transformations
Data Destinations
Data Flow Picture - Big
Dynamic SSISWhat You See Isn’t Always What You Get
What can be dynamic?
Object property values Variable values
Expressions
Determined Dynamic Values Expressions evaluate at runtime Example: Use current date to build
unique log file name
Configurations
Instructed Dynamic Values XML file Example: Connection Strings
Dynamic SSIS
What you see may not be what you get…
But it is what you asked for!
What You Can Dowith SQL Server Integration Services
Disparate Data Sources
Join data from Oracle with data from SQL Server? No problem. No linked server. No OPENROWSET No xp_cmdshell No bcp
Discover Workloads
Have two SQL Servers with 1500 databases, give or take, and need to pull data from all? No problem. No linked server. No OPENROWSET No xp_cmdshell No bcp
Parallelize Workloads
Multiple lines == Multiple threads Often equals, at least
Offload Workloads
Pull data off OLTP systems Crunch it to your heart’s content Location agnostic
OLTP = Online Transactional Processing
Exploit Your Talents
Seemless integration with your comfort zone .NET Stored Procedures XML
The Random
Can you have it email me? Can you only email when X and Y but
not Z? This program needs to run when the
extract finishes but only if…
What I’ve Donewith SQL Server Integration Services
Hosted Data Warehouse Pull Extract data from numerous sources Stage to database Apply transforms and business logic Write final data to files Zip files and ftp to host Track each run
(Oh no!) System Integration
How will the data get from these 100 databases on these two servers to this one?
Can you not send any dups? Can you email us the dups? Can you track the dups that have
been fixed?
Active Directory Extract
Extract users, groups, & membership Load data into SQL Server Ensure it always works
The Defense RestsThe Dark Side of SSIS
GUI-Intensive
If you aren’t comfortable in a GUI-intensive IDE, SSIS will be a challenge
!WYSIWYG
But you do get what you asked for! WYSIWYAF
Expression Syntax
Inconsistent Difficult to Master Flying solo (No Intellisense)
Wound Tight
SSIS binds itself to your meta data and holds on tight
SSIS continually checks that things have not changed
Single column change in data source must trickle all the way down your data source
What’s the Difference?
Source Control? In my experience you can check things in, label them, and do “gets”.
Trying to do a Diff might induce panic
If needed, seek a third party
Tradeoff
The more dynamic you get, the less parallel you tend to get
Read only? If only!
Just viewing details of SSIS components often leads SSIS to want to make changes to the file
Code review often leads to check out the entire project
But It Works on My Machine! The server is not your machine Early on, budget some time to work
through issues upon deployment Drivers, rights, O/S, platform, etc.
may cause problems Be patient!
Misc Rants
No Undo (yet) Variable Scope is read-only Variable Scope is context-default
Audience?
If I Knew Then…Best Practices and Words of Caution
Multiple Package Projects Benefits:
Encapsulation Readability Collaboration
Deploy as Files
Packages in development must be files
You will have multiple file projects To minimize changes on deployment,
deploy as files More info: bit.ly/lSniJS
Root Folder Variable
String variable to store location of package
Use to build relative file paths Input files (xsd, raw, etc) Output files (xml, raw, etc) Connection strings (packages, log files,
etc) More info: bit.ly/iuLctD
Indirect Configurations
Store location of configuration file in environment variable
More info: bit.ly/ksAQOG
Recreate over Edit
In response to certain changes, delete
Take advantage of ease of creating a new item rather than suffer through unpleasant editing experience
Particularly true for file connections
Resources
The New ETL Paradigm, Jamie Thomson bit.ly/e17DUR
Jamie Thomson’s Blogs sqlblog.com/blogs/jamie_thomson consultingblogs.emc.com/jamiethomson
Resources (con’t)
SSIS Community Tasks and Components ssisctc.codeplex.com
Locally grown SSIS Training andyleonard.net
Questions?