(c) 2011 Microsoft. All rights reserved.

DENALI SSIS AND DATA QUALITY ENHANCEMENTS

Dr Greg Low
Principal Mentor and CEO
SolidQ Australia

SESSION CODE: DAT307

Who is Greg?

► CEO & Principal Mentor – SolidQ Australia
► Host of SQL Down Under Podcast
► Microsoft RD and SQL Server MVP, MCM
► PASS Regional Mentor
► Author

Agenda

► Designer Enhancements
► Project System and Parameters
► SSIS Server
► Data Quality Services

DESIGNER ENHANCEMENTS

Visual Studio Integration

► New WPF-based designer
► CTP3 -> Visual Studio 2010 shell
► Also in Visual Studio 2010 shell: Business Intelligence Semantic Model Designer (BISM)
    – Analysis Services -> BISM import template
    – PowerPivot -> BISM import template

General Usability

► UI redesign and improved visuals
► Undo and redo
    – Selectable levels of undo/redo
► Getting started pane
► Zoom control
► Pan control (not until CTP3)
► Precedence constraint label improvements

Toolbox Enhancements

► Component groupings
    – Task
    – Transform
► New icons
► Auto-detection of new toolbox items

Reliability/Development Enhancements

► Many components are now more reliable, with a substantial reduction in memory usage
    – Merge
    – Merge Join
► Particularly applies to situations with
    – Multiple inputs
    – Different data arrival rates on each input
► New properties/methods added to allow this same functionality in custom components
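
To see why different arrival rates matter for a merge of sorted inputs, the following Python sketch may help. It is only an illustration of the general idea and does not describe how SSIS implements Merge or Merge Join: a merge that pulls one row at a time from each input never has to buffer the rows of the faster input. The names `CountingSource` and `merge_pull` are hypothetical.

```python
import heapq
from typing import Iterable, Iterator, List


class CountingSource:
    """Wraps a sorted list and counts how many rows have been read from it."""

    def __init__(self, rows: List[int]) -> None:
        self._rows = rows
        self.rows_read = 0

    def __iter__(self) -> Iterator[int]:
        for row in self._rows:
            self.rows_read += 1
            yield row


def merge_pull(left: Iterable[int], right: Iterable[int]) -> Iterator[int]:
    """Merge two sorted inputs lazily, holding one pending row per input."""
    # heapq.merge asks an input for its next row only when that row is needed,
    # so the input with more data available is never read far ahead.
    return heapq.merge(left, right)


fast = CountingSource(list(range(0, 1000, 2)))  # 500 rows available at once
slow = CountingSource([1, 3, 5])                # only a few rows available

merged = merge_pull(fast, slow)
first_four = [next(merged) for _ in range(4)]

print(first_four)
print(fast.rows_read, "of 500 rows read from the fast input so far")
```

After four merged rows, only a handful of rows have been read from the fast input, which is the behaviour that keeps memory usage flat.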

Data Flow Enhancements

► Component grouping in Data Flow
► Assistants
    – Source Assistant
    – Destination Assistant
► Data Correction Transformation
► Easier options for adding/removing data viewers
► Edit a component’s properties with input disconnected
    – Easier options for correcting input mapping issues

DEMONSTRATION: Designer Enhancements

PROJECT SYSTEM AND PARAMETERS

Package and File Format Changes

► Old package format was not well-formed XML
    – Still supported for import
    – Differencing was difficult
► New package file is much simplified
    – Appropriate XML structure
    – Attributes are now XML attributes, not separate nodes
    – Much more human-readable
    – Items within groups alphabetical
    – References are now full paths rather than reference IDs
    – Readable but discardable layout section

Project System Changes

► Top level object is now a Project
    – Contains one or more packages
    – Is deployable along with packages
► Default build now creates an ISPAC file
    – Contains packages, project, and configurations
► Two project modes
    – Project mode (new ISPAC behaviour)
    – Legacy mode (2005/8 behaviour)
    – Change via right-click option on project in Solution Explorer
► Opening a 2008 project starts upgrade wizard
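
As a rough way to inspect what a build puts into an ISPAC file, the Python sketch below assumes the file is an ordinary ZIP archive (true of released SQL Server 2012 builds, not checked against every CTP). It builds a stand-in archive in memory so it can run anywhere; the entry names `@Project.manifest`, `Project.params`, `LoadCustomers.dtsx` and `LoadOrders.dtsx` are assumptions for illustration, as are the function names.

```python
import io
import zipfile
from typing import List


def list_packages(ispac_bytes: bytes) -> List[str]:
    """Return the names of the .dtsx packages stored inside an ISPAC archive."""
    with zipfile.ZipFile(io.BytesIO(ispac_bytes)) as archive:
        return sorted(
            name for name in archive.namelist() if name.lower().endswith(".dtsx")
        )


def build_sample_ispac() -> bytes:
    """Build a stand-in archive; entry names are assumed, not from a real build."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("@Project.manifest", "<Project/>")
        archive.writestr("Project.params", "<Parameters/>")
        archive.writestr("LoadCustomers.dtsx", "<Executable/>")
        archive.writestr("LoadOrders.dtsx", "<Executable/>")
    return buffer.getvalue()


print(list_packages(build_sample_ispac()))
```

For a real build output, read the file's bytes with `open(path, "rb").read()` and pass them to `list_packages`.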

Project Data Sources and DSVs

► Were not deployed in 2005/8
    – Design time objects only
► No longer part of a project
    – Except in legacy mode

Parameters

► Projects now support parameters
    – Available to all packages within the project
    – Name and data types
    – Visible in expression builder and property windows
    – Like read-only variables in a specific namespace
► Always have default values at design time
    – Optional: the design-time default is used if no value is specified
    – Required: a value must be supplied at deployment, as no default exists there
► Environments
    – Named groups of values assigned to groups of packages
    – Production, Staging, Test, Development, etc.
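
The optional/required rule above can be modelled in a few lines of Python. This is a conceptual sketch only, not the SSIS API: `Parameter`, `resolve`, and the sample parameter names and values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class Parameter:
    name: str
    data_type: type
    design_default: Any      # every parameter has a design-time default
    required: bool = False   # required parameters ignore it once deployed


def resolve(parameters: List[Parameter], supplied: Dict[str, Any]) -> Dict[str, Any]:
    """Work out the value each parameter would take when the project runs."""
    resolved: Dict[str, Any] = {}
    for parameter in parameters:
        if parameter.name in supplied:
            value = supplied[parameter.name]
        elif parameter.required:
            # Required: no default applies after deployment, so fail loudly.
            raise ValueError(f"Required parameter '{parameter.name}' has no value")
        else:
            # Optional: fall back to the design-time default.
            value = parameter.design_default
        if not isinstance(value, parameter.data_type):
            raise TypeError(f"Parameter '{parameter.name}' has the wrong data type")
        resolved[parameter.name] = value
    return resolved


PARAMETERS = [
    Parameter("ServerName", str, "localhost", required=True),
    Parameter("BatchSize", int, 1000),
]

print(resolve(PARAMETERS, {"ServerName": "PRODSQL01"}))
```

Calling `resolve(PARAMETERS, {})` raises `ValueError`, mirroring a deployment that leaves a required parameter unset.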

Deployment

► Right-click deployment supported from VS
► ISDeploymentWizard.exe for non-VS deployments
► Parameters can (or must) be configured at deployment

DEMONSTRATION: Project System and Parameters

SSIS SERVER

SSIS Server

► No right/wrong decision today on where to deploy to
    – Filesystem
    – SQL Server (msdb)
► Integration Services node in SSMS
► SSIS now has its own DB (SSISDB)
► Packages are now deployed to the SSISDB
► Management enhancements
    – Remote execution options
    – Execution dashboard
    – Improved logging options (can be set at the project level)

SSIS Server Programmability

► SSIS node in SSMS
    – Catalog not auto-created at setup time
    – Right-click “Create Catalog” option on SSIS node
    – SQL CLR integration must be enabled to create catalog
► Public stored procedures (Called SFC-based API)
    – Deployment
    – Configuration
    – Management
    – Execution
► Example -> packages can be run via
    – SSMS options
    – Calling SSISDB.Catalog.Start_Package system stored procedure

Environments

► Hierarchy
    – Catalog -> Folder -> Project -> Environments
► Environments
    – Are collections of server variables
    – Can be mapped to projects, then server variables are mapped to project or package parameters
    – Can be selected at deployment or run time
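
The mapping from an environment's server variables to parameters can be pictured with a small Python model. It is a sketch of the idea rather than the SSISDB catalog API; the environment names, variable names, values, and the `bind_parameters` function are hypothetical.

```python
from typing import Any, Dict

# Environments: named collections of server variables (values are made up).
ENVIRONMENTS: Dict[str, Dict[str, Any]] = {
    "Test": {"DbServer": "TESTSQL01", "Rows": 100},
    "Production": {"DbServer": "PRODSQL01", "Rows": 50000},
}

# Project or package parameter name -> server variable that feeds it.
PARAMETER_MAP: Dict[str, str] = {"ServerName": "DbServer", "BatchSize": "Rows"}


def bind_parameters(
    environment_name: str,
    environments: Dict[str, Dict[str, Any]],
    parameter_map: Dict[str, str],
) -> Dict[str, Any]:
    """Return parameter values for the environment chosen at run time."""
    variables = environments[environment_name]  # KeyError for an unknown name
    missing = sorted(v for v in parameter_map.values() if v not in variables)
    if missing:
        raise KeyError(f"Environment '{environment_name}' lacks variables: {missing}")
    return {param: variables[var] for param, var in parameter_map.items()}


print(bind_parameters("Test", ENVIRONMENTS, PARAMETER_MAP))
print(bind_parameters("Production", ENVIRONMENTS, PARAMETER_MAP))
```

The same package and the same mapping yield different values purely by selecting a different environment, which is the point of keeping the values outside the project.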

DEMONSTRATION: SSIS Server

DATA QUALITY SERVICES

It’s all about quality

Is Data Suitable For Business Use?

Data Quality Issue / Sample Data Problem

► Standard: Are data elements consistently defined and understood?
    – Gender code = M, F, U in one system and Gender code = 0, 1, 2 in another system
► Complete: Is all necessary data present?
    – 20% of customers’ last names are blank, 50% of postcodes are 9999
► Accurate: Does the data accurately represent reality or a verifiable source?
    – A Supplier is listed as ‘Active’ but went out of business six years ago
► Valid: Do data values fall within acceptable ranges?
    – Salary values should be between 60,000 and 120,000
► Unique: Does data appear several times?
    – Both John Ryan and Jack Ryan appear in the system – are they the same person?
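
Several of these checks are easy to express as simple profiling rules. The Python sketch below measures the share of rows that fail the Complete, Standard and Valid checks from the examples above; the sample rows, column names, and the `share` and `profile` functions are invented for illustration and have nothing to do with the DQS API.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]

# Made-up sample data exhibiting the problems described above.
CUSTOMERS: List[Row] = [
    {"last_name": "Ryan", "postcode": "3000", "gender": "M", "salary": 85000},
    {"last_name": "", "postcode": "9999", "gender": "1", "salary": 72000},
    {"last_name": "Ng", "postcode": "9999", "gender": "F", "salary": 250000},
    {"last_name": "Low", "postcode": "4000", "gender": "U", "salary": 60000},
]

STANDARD_GENDER_CODES = {"M", "F", "U"}


def share(rows: List[Row], failed: Callable[[Row], bool]) -> float:
    """Fraction of rows for which the check fails (0.0 for an empty input)."""
    if not rows:
        return 0.0
    return sum(1 for row in rows if failed(row)) / len(rows)


def profile(rows: List[Row]) -> Dict[str, float]:
    return {
        # Complete: is necessary data present?
        "blank_last_name": share(rows, lambda r: not r["last_name"].strip()),
        "placeholder_postcode": share(rows, lambda r: r["postcode"] == "9999"),
        # Standard: are codes drawn from one agreed set?
        "nonstandard_gender": share(rows, lambda r: r["gender"] not in STANDARD_GENDER_CODES),
        # Valid: do values fall within the acceptable range?
        "salary_out_of_range": share(rows, lambda r: not 60000 <= r["salary"] <= 120000),
    }


for check, failed_share in profile(CUSTOMERS).items():
    print(f"{check}: {failed_share:.0%}")
```

Accuracy and uniqueness are harder: they need a reference source or record matching rather than a per-row rule.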

Data Quality Tooling

► Cleansing
► Matching
► Profiling
► Monitoring

► A high-value market exists for data quality tooling
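
Matching is the least obvious of these tooling areas, so here is a minimal Python sketch of the idea: score every pair of names for similarity and flag pairs above a threshold for review. It uses the standard library's `difflib.SequenceMatcher`, which is a stand-in technique and not the matching algorithm used by DQS; the 0.6 threshold and the function names are arbitrary assumptions.

```python
from difflib import SequenceMatcher
from itertools import combinations
from typing import List, Tuple


def similarity(a: str, b: str) -> float:
    """Similarity between two names, from 0.0 (different) to 1.0 (identical)."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def candidate_matches(names: List[str], threshold: float = 0.6) -> List[Tuple[str, str, float]]:
    """Return pairs of names similar enough to deserve a human look."""
    pairs = []
    for a, b in combinations(names, 2):
        score = similarity(a, b)
        if score >= threshold:
            pairs.append((a, b, round(score, 2)))
    return pairs


names = ["John Ryan", "Jack Ryan", "Mary Smith"]
for a, b, score in candidate_matches(names):
    print(f"Possible duplicate: {a} / {b} (similarity {score})")
```

With these three names only the John Ryan / Jack Ryan pair is flagged; whether they are the same person still needs other attributes or a person to decide.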

Data Quality Services

• Knowledge-Driven: Based on a Data Quality Knowledge Base (DQKB)
• Semantics: Data Domains capture the semantics of your data
• Knowledge Discovery: Acquires additional knowledge the more you use it
• Open and Extendible: Supports use of user-generated knowledge and IP by 3rd party reference data providers
• Easy to use: Compelling user experience designed for increased productivity

SSIS Integration with DQS

[Diagram: SSIS integration with DQS]

► SSIS Package, SSIS Data Flow: Source + Mapping -> Data Correction Component -> Destination
► DQS Server: Knowledge Base (Reference Data Definition, Values/Rules) and Reference Data Services
► Output record categories: New Records, Corrections, Suggestions, Correct, Invalid
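
The five output categories (New Records, Corrections, Suggestions, Correct, Invalid) can be illustrated with a toy cleansing function in Python. This is a sketch of how a knowledge base might drive the categorisation, not the DQS component or its API: the city domain, the correction list, the validity rule, the 0.8 similarity cutoff, and all names are assumptions.

```python
from difflib import get_close_matches
from typing import Optional, Tuple

# A tiny stand-in knowledge base for a "City" domain (contents are made up).
KNOWN_VALUES = {"Sydney", "Melbourne", "Brisbane"}
CORRECTIONS = {"Syd": "Sydney", "Melb": "Melbourne"}  # known error -> correct value


def is_valid(value: str) -> bool:
    """Domain rule: non-empty, letters and spaces only."""
    return bool(value.strip()) and all(ch.isalpha() or ch.isspace() for ch in value)


def cleanse(value: str) -> Tuple[str, Optional[str]]:
    """Return (category, output value) for one incoming value."""
    if value in KNOWN_VALUES:
        return ("Correct", value)
    if value in CORRECTIONS:
        return ("Correction", CORRECTIONS[value])
    if not is_valid(value):
        return ("Invalid", None)
    # Not known, but close to a known value: offer it as a suggestion only.
    close = get_close_matches(value, sorted(KNOWN_VALUES), n=1, cutoff=0.8)
    if close:
        return ("Suggestion", close[0])
    # Valid but unknown: a new value for a data steward to review.
    return ("New", value)


for city in ["Sydney", "Syd", "Sydny", "12345", "Perth"]:
    print(city, "->", cleanse(city))
```

Corrections are applied automatically because the knowledge base already knows them, while suggestions and new values are left for review, which is how the knowledge base can grow with use.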

DEMONSTRATION: Data Quality Services

Session Summary

► Strong enhancements in SSIS for Denali
► DQS enhances business value
► Start trying it today with CTP3!

Thanks!

► [email protected]
► www.sqldownunder.com
► http://sqlblog.com/blogs/greg_low
► www.solidq.com.au
