(c) 2011 Microsoft. All rights reserved.
DENALI SSIS AND DATA QUALITY ENHANCEMENTS
Dr Greg Low
Principal Mentor and CEO
SolidQ Australia
SESSION CODE: DAT307
Who is Greg?
► CEO & Principal Mentor – SolidQ Australia
► Host of SQL Down Under Podcast
► Microsoft RD and SQL Server MVP, MCM
► PASS Regional Mentor
► Author
Agenda
► Designer Enhancements
► Project System and Parameters
► SSIS Server
► Data Quality Services
Visual Studio Integration
► New WPF based designer
► CTP3 -> Visual Studio 2010 shell
► Also in Visual Studio 2010 shell – Business Intelligence Semantic Model Designer (BISM)
– Analysis Services -> BISM import template
– PowerPivot -> BISM import template
General Usability
► UI redesign and improved visuals
► Undo and redo
– Selectable levels of undo/redo
► Getting started pane
► Zoom control
► Pan control (not till CTP3)
► Precedence constraint label improvements
Toolbox Enhancements
► Component groupings
– Task
– Transform
► New icons
► Auto-detection of new toolbox items
Reliability/Development Enhancements
► Many controls are now more reliable, with a substantial reduction in memory usage
– Merge
– Merge Join
► Particularly applies to situations with
– Multiple inputs
– Different data arrival rates on each input
► New properties/methods added to allow this same functionality in custom components
Data Flow Enhancements
► Component grouping in Data Flow
► Assistants
– Source Assistant
– Destination Assistant
► Data Correction Transformation
► Easier options for adding/removing data viewers
► Edit a component’s properties with input disconnected
– Easier options for correcting input mapping issues
Package and File Format Changes
► Old package format was not well-formed XML
– Still supported for import
– Differencing was difficult
► New package file is much simplified
– Appropriate XML structure
– Attributes are now XML attributes, not separate nodes
– Much more human-readable
– Items within groups alphabetical
– References are now full paths rather than reference IDs
– Readable but discardable layout section
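The point about attributes versus separate nodes can be illustrated with a small Python sketch. The element and attribute names used here (`Executable`, `Property`, `ObjectName`, `Description`) are hypothetical and are not the actual SSIS package schema; the sketch only shows that the same properties are shorter and easier to read and diff when stored as XML attributes.

```python
import xml.etree.ElementTree as ET

# Hypothetical "node per property" layout (illustration only, not real DTSX).
NODE_STYLE = """
<Executable>
  <Property Name="ObjectName">Load Customers</Property>
  <Property Name="Description">Nightly load</Property>
</Executable>
"""

# Hypothetical "attribute" layout carrying the same information on one line.
ATTRIBUTE_STYLE = '<Executable ObjectName="Load Customers" Description="Nightly load" />'


def properties_from_nodes(xml_text):
    """Collect properties stored as separate <Property> child nodes."""
    root = ET.fromstring(xml_text.strip())
    return {node.get("Name"): node.text for node in root.findall("Property")}


def properties_from_attributes(xml_text):
    """Collect properties stored directly as attributes of the element."""
    return dict(ET.fromstring(xml_text.strip()).attrib)


node_props = properties_from_nodes(NODE_STYLE)
attribute_props = properties_from_attributes(ATTRIBUTE_STYLE)
```

Both calls yield the same dictionary, but the attribute layout needs a single line per item, which is what makes text-based differencing of package files more practical.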
Project System Changes
► Top level object is now a Project
– Contains one or more packages
– Is deployable along with packages
► Default build now creates an ISPAC file
– Contains packages, project, and configurations
► Two project modes
– Project mode (new ISPAC behaviour)
– Legacy mode (2005/8 behaviour)
– Change via right-click option on project in Solution Explorer
► Opening a 2008 project starts upgrade wizard
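To see what a build output contains, a sketch like the following could list the entries of an ISPAC file. It assumes the file is a ZIP-compatible archive, which holds for released versions of SQL Server 2012 but is not stated on the slide and may differ in a CTP build. The entry names in the demo archive are made up for illustration.

```python
import io
import zipfile


def list_ispac_entries(ispac_file):
    """Return the sorted entry names inside a ZIP-compatible .ispac file.

    `ispac_file` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(ispac_file) as archive:
        return sorted(archive.namelist())


def build_demo_ispac():
    """Build an in-memory stand-in archive; the entry names are hypothetical."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("Project.manifest", "<Project />")
        archive.writestr("LoadCustomers.dtsx", "<Executable />")
        archive.writestr("Project.params", "<Parameters />")
    buffer.seek(0)
    return buffer


entries = list_ispac_entries(build_demo_ispac())
```

Pointing `list_ispac_entries` at a real build output path instead of the demo buffer would show the packages and project-level files bundled for deployment, provided the ZIP assumption holds.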
Project Data Sources and DSVs
► Were not deployed in 2005/8
– Design time objects only
► No longer part of a project
– Except in legacy mode
Parameters
► Projects now support parameters
– Available to all packages within the project
– Name and data types
– Visible in expression builder and property windows
– Like read-only variables in a specific namespace
► Always have default values at design time
– Optional – use design time default if not specified
– Required – must be supplied when deployed as no default exists
► Environments
– Named groups of values assigned to groups of packages
– Production, Staging, Test, Development, etc.
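The optional/required rule above can be modelled in a few lines of Python. This is only a conceptual sketch, not the SSIS object model: the `Parameter` class, the `resolve_parameters` function, and the sample names (`BatchSize`, `ServerName`, `PRODSQL01`) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class Parameter:
    """Hypothetical model of a project parameter (not an SSIS type)."""
    name: str
    data_type: type
    design_default: Any      # every parameter has a design-time default
    required: bool = False   # required: a value must be supplied at deployment


def resolve_parameters(parameters: List[Parameter],
                       supplied: Dict[str, Any]) -> Dict[str, Any]:
    """Pick the value each parameter would take once deployed."""
    resolved = {}
    for param in parameters:
        if param.name in supplied:
            value = supplied[param.name]
        elif param.required:
            # The design-time default is not carried over for required parameters.
            raise ValueError(f"Required parameter '{param.name}' was not supplied")
        else:
            value = param.design_default
        if not isinstance(value, param.data_type):
            raise TypeError(f"Parameter '{param.name}' expects {param.data_type.__name__}")
        resolved[param.name] = value
    return resolved


parameters = [
    Parameter("BatchSize", int, 1000),                        # optional
    Parameter("ServerName", str, "localhost", required=True)  # required
]
resolved = resolve_parameters(parameters, {"ServerName": "PRODSQL01"})
```

Here `BatchSize` falls back to its design-time default, while omitting `ServerName` would raise an error, which mirrors the "can (or must) be configured" distinction.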
Deployment
► Right-click deployment supported from VS
► ISDeploymentWizard.exe for non-VS deployments
► Parameters can (or must) be configured at deployment
SSIS Server
► No right/wrong decision today on where to deploy to
– Filesystem
– SQL Server (msdb)
► Integration Services node in SSMS
► SSIS now has its own DB (SSISDB)
► Packages are now deployed to the SSISDB
► Management enhancements
– Remote execution options
– Execution dashboard
– Improved logging options (can be set at the project level)
SSIS Server Programmability
► SSIS node in SSMS
– Catalog not auto-created at setup time
– Right-click “Create Catalog” option on SSIS node
– SQL CLR integration must be enabled to create catalog
► Public stored procedures (Called SFC-based API)
– Deployment
– Configuration
– Management
– Execution
► Example -> packages can be run via
– SSMS options
– Calling SSISDB.Catalog.Start_Package system stored procedure
Environments
► Hierarchy
– Catalog -> Folder -> Project -> Environments
► Environments
– Are collections of server variables
– Can be mapped to projects, then server variables are mapped to project or package parameters
– Can be selected at deployment or run time
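The mapping from environments to parameters can be pictured with a short Python sketch. It is a conceptual model only and does not use any SSIS API; the environment names, server variable names (`DbServer`, `Retries`), and parameter names (`ServerName`, `MaxRetries`) are hypothetical.

```python
from typing import Dict

# Hypothetical environments: named collections of server variables.
ENVIRONMENTS: Dict[str, Dict[str, object]] = {
    "Test": {"DbServer": "TESTSQL01", "Retries": 1},
    "Production": {"DbServer": "PRODSQL01", "Retries": 5},
}

# Hypothetical mapping: project or package parameter -> server variable name.
PARAMETER_MAP: Dict[str, str] = {"ServerName": "DbServer", "MaxRetries": "Retries"}


def bind_parameters(environment_name: str,
                    environments: Dict[str, Dict[str, object]] = ENVIRONMENTS,
                    parameter_map: Dict[str, str] = PARAMETER_MAP) -> Dict[str, object]:
    """Resolve each mapped parameter from the selected environment's variables."""
    variables = environments[environment_name]
    return {param: variables[var] for param, var in parameter_map.items()}


test_values = bind_parameters("Test")
production_values = bind_parameters("Production")
```

The same project, with the same mapping, picks up different values simply by selecting a different environment at deployment or run time.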
Is Data Suitable For Business Use?
Data Quality Issue / Sample Data Problem
► Standard: Are data elements consistently defined and understood?
– Gender code = M, F, U in one system and Gender code = 0, 1, 2 in another system
► Complete: Is all necessary data present?
– 20% of customers’ last names are blank, 50% of postcodes are 9999
► Accurate: Does the data accurately represent reality or a verifiable source?
– A Supplier is listed as ‘Active’ but went out of business six years ago
► Valid: Do data values fall within acceptable ranges?
– Salary values should be between 60,000-120,000
► Unique: Does the same data appear several times?
– Both John Ryan and Jack Ryan appear in the system – are they the same person?
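Several of these checks are easy to express as simple profiling rules. The following Python sketch covers the Standard, Complete, and Valid examples using made-up rows and hypothetical field names; it is not how DQS works internally. The Unique check is left out because it needs fuzzy matching rather than a simple rule.

```python
def profile_customers(rows):
    """Return simple data quality metrics for a list of customer dicts."""
    total = len(rows)
    if total == 0:
        raise ValueError("no rows to profile")
    # Complete: blank last names and placeholder postcodes.
    blank_last_names = sum(1 for r in rows if not (r.get("last_name") or "").strip())
    placeholder_postcodes = sum(1 for r in rows if r.get("postcode") == "9999")
    # Valid: salary must fall within the acceptable range.
    invalid_salary_ids = [r["id"] for r in rows if not 60_000 <= r["salary"] <= 120_000]
    # Standard: gender codes must use the agreed M/F/U coding.
    nonstandard_gender_ids = [r["id"] for r in rows if r.get("gender") not in {"M", "F", "U"}]
    return {
        "blank_last_name_pct": 100.0 * blank_last_names / total,
        "placeholder_postcode_pct": 100.0 * placeholder_postcodes / total,
        "invalid_salary_ids": invalid_salary_ids,
        "nonstandard_gender_ids": nonstandard_gender_ids,
    }


# Made-up sample rows for illustration only.
rows = [
    {"id": 1, "last_name": "Ryan", "postcode": "3000", "salary": 85000, "gender": "M"},
    {"id": 2, "last_name": "", "postcode": "9999", "salary": 45000, "gender": "1"},
    {"id": 3, "last_name": "Ryan", "postcode": "9999", "salary": 110000, "gender": "F"},
    {"id": 4, "last_name": "Low", "postcode": "4000", "salary": 130000, "gender": "U"},
]
report = profile_customers(rows)
```

Rules like these catch only the mechanical problems; accuracy and uniqueness generally need reference data or matching, which is where dedicated tooling comes in.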
Data Quality Tooling
► Cleansing
► Matching
► Profiling
► Monitoring
► Large value market exists for data quality related tooling
Data Quality Services
• Knowledge-Driven: Based on a Data Quality Knowledge Base (DQKB)
• Semantics: Data Domains capture the semantics of your data
• Knowledge Discovery: Acquires additional knowledge the more you use it
• Open and Extendible: Supports use of user-generated knowledge and IP by 3rd party reference data providers
• Easy to use: Compelling user experience designed for increased productivity
SSIS Integration with DQS
[Diagram: SSIS integration with DQS]
► DQS Server
– Knowledge Base: Values/Rules, Reference Data Definition
– Reference Data Services
► SSIS Package
– SSIS Data Flow: Source + Mapping -> Data Correction Component -> Destination
► Record categories shown in the diagram
– New Records
– Corrections
– Suggestions
– Correct
– Invalid
Session Summary
► Strong Enhancements in SSIS for Denali
► DQS enhances business value
► Start trying it today with CTP3!
Thanks!
► [email protected]
► www.sqldownunder.com
► http://sqlblog.com/blogs/greg_low
► www.solidq.com.au