SQL Server Days 2014 - How to (Not) Torment Your Fellow SSIS Developer
DESCRIPTION
Session at SQL Server Days 2014 about SSIS best practices and performance improvements.
TRANSCRIPT
HOW TO (NOT) TORMENT YOUR FELLOW SSIS DEVELOPER
KOEN VERBEECK
SQL SERVER DAYS 2014
#sqlserverdays
OUTLINE
INTRODUCTION
LAYOUT
WHAT IS GOING ON?
MORE “BEST PRACTICES”
PERFORMANCE CONSIDERATIONS
CONCLUSION
INTRODUCTION
• I am
• an ETL / DWH developer/consultant
• a Business Intelligence developer/consultant
• an analyst
• an architect
• a BI (project) manager
• someone else…
• I have worked with SSIS
• … never at all
• for less than a year
• for 1 to 5 years
• for 5+ years
• ever worked on someone else’s SSIS package/project?
LAYOUT
• your options?
• Auto Layout (not the brightest kid in the class, but it gets you started)
• layout toolbar
Demo: show the layout tools
WHAT IS GOING ON?
• let other people know what the package does (or is supposed to do)
• SSIS packages are just like any other regular code
• annotations already go a long way
• an entire novel is not necessary
• especially use them if you did something unusual
Demo:
• column with filename
• data load with clustered index
• give meaningful names to tasks / transformations
• try out a naming convention
• Jamie Thomson’s list
• document embedded code as well
• T-SQL
• tip: add the name of the SSIS task in the first line of the code, so it can easily be spotted in Profiler
• .NET in script tasks / components
• there are 3rd party doc tools
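The Profiler tip above can be sketched as a small helper that stamps each embedded statement with the name of the task that issues it. This is only an illustration: the function name `tag_sql`, the task name and the table are hypothetical, and in a real package you would simply type the comment into the task's SQL statement.

```python
def tag_sql(task_name: str, sql: str) -> str:
    """Prepend a one-line T-SQL comment naming the SSIS task that issues the statement."""
    # Flatten line breaks so the task name cannot end the "--" comment early.
    safe_name = task_name.replace("\r", " ").replace("\n", " ").strip()
    return f"-- SSIS task: {safe_name}\n{sql.lstrip()}"


# Hypothetical task and table names, for illustration only.
tagged = tag_sql(
    "SQL Update DimCustomer",
    "UPDATE dbo.DimCustomer SET IsCurrent = 0 WHERE CustomerKey = ?;",
)
print(tagged)
```

Because the comment is the first line of the batch text, it is the first thing you see when the statement shows up in a trace.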
MORE “BEST PRACTICES”
• use source control
• I mean like, right now
• especially important in SSIS 2012+
• check out packages you are working on
• not the entire project…
• try to use only one version of BIDS/SSDT
• supply a commit message when you check packages in
• this makes it easier to revert back to an earlier (working) version
• allow for easy troubleshooting
• enable logging
• logs in SQL Server are easy to query
• in SSIS 2012+, the SSISDB takes care of business
• select appropriate logging level
• use audit columns
• PackageID, InsertDate, UpdateDate …
• develop package templates
• ensures consistency across projects
• revise them from time to time
• useful for common “patterns”
• generate with BIML for extra awesomeness
• log to a central database
• useful for SSIS 2005-08R2, less for SSIS 2012+
• tie packages from different projects together
• e.g. all packages from the same ETL load
• makes it easier to analyze durations
Demo: “upsert” load pattern
• aim for restartability
• KISS
• don’t create huge single packages
• rather create several smaller modular packages
• packages should be idempotent
• you should be able to execute them over and over again without issues
• in an ETL run, keep track of where you are
• especially when executing in parallel
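As a rough illustration of what "idempotent" means for a load, here is a minimal sketch of an upsert with audit columns. It uses Python's built-in `sqlite3` module as a stand-in for SQL Server, and the `customer` table, the `stg` staging table and the `upsert_customers` function are made-up names, not something from the session. Rows that did not change are left alone, so running the same load twice gives the same table.

```python
import sqlite3


def upsert_customers(conn, rows, load_date):
    """Load (customer_id, name) rows into customer; safe to run repeatedly."""
    cur = conn.cursor()
    # Stage the incoming rows first, then apply them with set-based statements.
    cur.execute("CREATE TEMP TABLE IF NOT EXISTS stg (customer_id INTEGER PRIMARY KEY, name TEXT)")
    cur.execute("DELETE FROM stg")
    cur.executemany("INSERT INTO stg (customer_id, name) VALUES (?, ?)", rows)
    # Update only rows that already exist and whose name actually changed.
    cur.execute(
        """
        UPDATE customer
        SET name = (SELECT s.name FROM stg s WHERE s.customer_id = customer.customer_id),
            update_date = ?
        WHERE EXISTS (SELECT 1 FROM stg s
                      WHERE s.customer_id = customer.customer_id
                        AND s.name <> customer.name)
        """,
        (load_date,),
    )
    # Insert rows that are not in the target yet.
    cur.execute(
        """
        INSERT INTO customer (customer_id, name, insert_date, update_date)
        SELECT s.customer_id, s.name, ?, ?
        FROM stg s
        WHERE NOT EXISTS (SELECT 1 FROM customer c WHERE c.customer_id = s.customer_id)
        """,
        (load_date, load_date),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT, "
    "insert_date TEXT, update_date TEXT)"
)
query = "SELECT customer_id, name, insert_date, update_date FROM customer ORDER BY customer_id"

rows = [(1, "Alice"), (2, "Bob")]
upsert_customers(conn, rows, "2014-09-30")
first_run = conn.execute(query).fetchall()
upsert_customers(conn, rows, "2014-10-01")  # re-run of the same data
second_run = conn.execute(query).fetchall()

# A later load with one changed row and one new row.
upsert_customers(conn, [(2, "Bobby"), (3, "Carol")], "2014-10-02")
third_run = conn.execute(query).fetchall()
```

Here `first_run` and `second_run` are identical, which is the property you want when an ETL run has to be restarted halfway.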
PERFORMANCE CONSIDERATIONS
• blocking, semi-blocking & non-blocking
• avoid certain components
• sort, aggregate, merge and merge join
• use T-SQL instead
• synchronous vs asynchronous
• use Union All transformation only when needed
• avoid row-by-row transformations
• OLE DB command = EVIL
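To make the row-by-row point concrete, the sketch below contrasts one UPDATE per row (which is how the OLE DB Command behaves) with landing the changes in a staging table and issuing a single set-based UPDATE. It runs against an in-memory `sqlite3` database rather than SQL Server, and the `product` and `product_stg` tables are invented for the example. Both approaches end with the same data; the difference is the number of statements the database has to process.

```python
import sqlite3


def make_db():
    """Create a small in-memory database with a target and a staging table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE product (product_id INTEGER PRIMARY KEY, price REAL)")
    conn.execute("CREATE TABLE product_stg (product_id INTEGER PRIMARY KEY, price REAL)")
    conn.executemany("INSERT INTO product VALUES (?, ?)", [(i, 10.0) for i in range(1, 6)])
    return conn


def update_row_by_row(conn, changes):
    """One UPDATE statement per incoming row."""
    for product_id, price in changes:
        conn.execute("UPDATE product SET price = ? WHERE product_id = ?", (price, product_id))
    conn.commit()


def update_set_based(conn, changes):
    """Stage the changes, then apply them with a single UPDATE."""
    conn.execute("DELETE FROM product_stg")
    # Stands in for the fast bulk insert a data flow destination would do.
    conn.executemany("INSERT INTO product_stg VALUES (?, ?)", changes)
    conn.execute(
        """
        UPDATE product
        SET price = (SELECT s.price FROM product_stg s WHERE s.product_id = product.product_id)
        WHERE product_id IN (SELECT product_id FROM product_stg)
        """
    )
    conn.commit()


changes = [(2, 12.5), (4, 8.0)]
slow, fast = make_db(), make_db()
update_row_by_row(slow, changes)
update_set_based(fast, changes)

query = "SELECT product_id, price FROM product ORDER BY product_id"
same_result = slow.execute(query).fetchall() == fast.execute(query).fetchall()
```

In SSIS terms, the second variant would be a data flow that writes to a staging table followed by an Execute SQL Task.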
Demo: alternative design pattern for Union All
• beware the buffer
• the data flow is a pipeline
• adjust buffer size to avoid back pressure
• find the bottleneck
• a bigger buffer is not always better
• be careful with data types
Demo: data types and nvp
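Why data types matter for the buffer can be shown with a back-of-the-envelope estimate. The sketch below assumes the SSIS defaults of 10 MB for DefaultBufferSize and 10,000 for DefaultBufferMaxRows, and it assumes string columns reserve their full declared length in the buffer (1 byte per character for DT_STR, 2 for DT_WSTR). It is a simplification: the real engine applies additional rules, and the helper names `column_bytes` and `rows_per_buffer` are made up for this example.

```python
DEFAULT_BUFFER_SIZE = 10 * 1024 * 1024  # bytes, default DefaultBufferSize
DEFAULT_BUFFER_MAX_ROWS = 10_000        # default DefaultBufferMaxRows


def column_bytes(type_name, length=0):
    """Approximate width of one column in a data flow buffer."""
    fixed_width = {"DT_I4": 4, "DT_I8": 8}
    if type_name in fixed_width:
        return fixed_width[type_name]
    if type_name == "DT_STR":    # non-Unicode string: 1 byte per character
        return length
    if type_name == "DT_WSTR":   # Unicode string: 2 bytes per character
        return 2 * length
    raise ValueError(f"type not covered by this sketch: {type_name}")


def rows_per_buffer(columns, buffer_size=DEFAULT_BUFFER_SIZE, max_rows=DEFAULT_BUFFER_MAX_ROWS):
    """Estimate how many rows fit in one buffer for the given column list."""
    row_bytes = sum(column_bytes(*column) for column in columns)
    return min(max_rows, buffer_size // row_bytes)


# Hypothetical column lists: an int key plus a text column, declared two ways.
wide = [("DT_I4",), ("DT_WSTR", 4000)]   # e.g. NVARCHAR(4000) "just to be safe"
narrow = [("DT_I4",), ("DT_STR", 50)]    # e.g. VARCHAR(50) sized to the data

print(rows_per_buffer(wide))    # 1310
print(rows_per_buffer(narrow))  # 10000
```

With the oversized Unicode column, each buffer carries roughly an eighth of the rows, so the same data needs many more buffers to move through the pipeline.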
CONCLUSION
• if you care for your fellow SSIS dev
• pay attention to
• layout
• names of tasks/components
• documentation
• logging
• use source control
• keep packages short, simple and idempotent
• remember performance
• avoid asynchronous/blocking transformations
• check buffer size
RESOURCES
– Suggested Best Practises and naming conventions – Jamie Thomson
  http://sqlblog.com/blogs/jamie_thomson/archive/2012/01/29/suggested-best-practises-and-naming-conventions.aspx
– SQL Server 2012 Integration Services Design Patterns – various authors
  http://www.amazon.com/Server-Integration-Services-Design-Patterns-ebook/dp/B00992OBHS
– MS SQL Server 2008 SSIS: Problem, Design, Solution – various authors
  http://www.amazon.com/Microsoft-Server-2008-Integration-Services/dp/0470525762
– Improve SSIS data flow buffer performance – Koen Verbeeck
  http://www.mssqltips.com/sqlservertip/3217/improve-ssis-data-flow-buffer-performance/
– Top 10 SQL Server Integration Services Best Practices – SQL CAT
  http://blogs.msdn.com/b/sqlcat/archive/2013/09/16/top-10-sql-server-integration-services-best-practices.aspx
– Data Flow Performance Features
  http://technet.microsoft.com/en-us/library/ms141031.aspx
– Semi-blocking Transformations in SSIS – Koen Verbeeck
  http://www.mssqltips.com/sqlservertip/3242/semiblocking-transformations-in-sql-server-integration-services-ssis/
– Non-blocking, Semi-blocking and Fully-blocking components – Jorg Klein
  http://sqlblog.com/blogs/jorg_klein/archive/2008/02/12/ssis-lookup-transformation-is-case-sensitive.aspx
– Understanding Synchronous and Asynchronous Transformations – MSDN
  http://technet.microsoft.com/en-us/library/aa337074.aspx
Q & A
SQL Server Days would like to thank all of our sponsors!
THANKS FOR LISTENING
@Ko_Ver
http://www.linkedin.com/in/kverbeeck