ssis field notes darren green konesans ltd. ssis field notes after years of careful observation and...

23
SSIS Field Notes Darren Green Konesans Ltd

Upload: pierce-fitzgerald

Post on 16-Dec-2015

221 views

Category:

Documents


1 download

TRANSCRIPT

SSIS Field Notes

Darren GreenKonesans Ltd

My Checklist

• Standards• Logging• Configuration

Common Princiapls

• Common design decisions or patterns• Logging• Frameworks• Custom vs Stock

Basic Standards

• Solution and Project structure• ETL vs ELT• Staging• Custom components

Source Control

• Products– Team Foundation Server– Subversion (Visual SVN)– Others…

• Issues– Cannot merge or use standard conflict resolution

• BIDS Helper- Smart Diff– TFS, SourceSafe,File

• BI Smart Diff– Subversion

Naming Conventions

• Prefix notation, e.g. DFT for Data Flow Task– http://consultingblogs.emc.com/jamiethomson/

• Expand Name property– SQL Create Year Staging Table

• Expand Description property– Create the year named staging table. Any existing

table will be dropped first.• Documentation tools are not clever

Logging

• Performance monitoring– Real-time monitoring– Trending to justify upgrades• Re-write problem packages• Upgrade hardware / network / environment

• Problem solving– Why did the job fail last night?

Logging Options

• Built in SSIS logging– Good for standard stuff, including errors

• Maintenance routine to prune records• Delete Info daily, Warnings weekly, Errors monthly

• Custom SSIS logging– Log process specific metrics - row counts– Event Handler or Control Flow

• Windows Logging– Event Log– Performance Monitor

Prune – Delete in ChunksSET @Count = 1000WHILE @Count IS NOT NULL AND @Count > 0BEGIN DELETE TOP (@Count) FROM dbo.sysdtslog90 WHERE StartTime < @MinStartTime SET @Count = @@ROWCOUNT

-- Pause for 0.2 seconds WAITFOR DELAY '000:00:00.200'END

Frameworks

• Consistent logging approach• Process or package state– Passing state between package processes– Saving state for the next run• Last extract date• Complete run only once per day

• Dynamic execution workflow– Managed in tables not Control Flow• Easier to manage logical units and reuse

Frameworks

• Standard approach reduces costs– Lower support costs– Higher quality through reuse

• Cost of applying and maintaining framework– Maintainable frameworks are key!– Custom components encapsulate code– Use the API to bulk apply changes

Custom Components

• Task• Pipeline Component– Source– Destination– Transformation

• Log Provider• Connection Manager• For Each Enumerators

Using Custom Components?

• Easy to manage and update– Good for re-use– Can add/edit functionality easily – One file per machine for all packages– Good for frameworks or complex operations– Good debugging and testing support

• Require .NET development skills• External dependency– Additional step during the initial deployment, but also

single update step thereafter

Stock Components & Scripting

• Faster to develop• Familiar and easy to understand– Don’t write your own data flow engine in a script

component!– Acknowledge the need for reuse when it exists and

create a shared external assembly • Self contained - No external dependency• Copy and paste package maintenance– Not good for frameworks or common patterns

Recovery & Restarts

• Checkpoints – Native• CheckpointFileName, SaveCheckpoints,

FailPackageOnFailure

– Task level restart only– Partition your Data Flow• Raw files

– Variable values are persisted– Configurations not refreshed– Event handlers within checkpoint scope

Recovery & Restarts

• Auto-Recovery – Roll your own– Check with IF EXISTS…– Framework workflow• Table of packages with status• Package or task level restart

– Variables and precedence constrains expressions

– Delete and re-load• No change tracking or updates

Wake up – a quick demo…

Anti-Pattern #1

• Send Mail Task for Errors– Control Flow or Event Handlers– Exceptions cause multiple events– Use external host to handle exceptions• SQL Agent

– Use a report

Anti-Pattern # 2

• Multiple Configuration Types– Use one configuration type• Use indirect or ordered pointer configuration

– Easier for support and deployment

Sorting

• Why sort?– Merge Join

– Ordered Analysis• Running totals• Ranking• Script Component• Custom Component • Rank Transform

Sorting

• How to sort?– Sort Transformation– Ordinal Nsort– Ivolva ExtraSort (2005)– Sort in Source – ORDER BY

Sorting

Thank you

• Questions?

• Darren Green– [email protected]– www.konesans.com