Transcript
Page 1: How well do you know your DATA?

Copyright 2007, Information Builders. Slide 1

How well do you know your DATA?

Glenn Wiebe

May 15, 2012

Page 2: How well do you know your DATA?

Is Data a Liability?

$$$ for Data Storage
$$$ for Data Backups
$$$ for Data Archiving
$$$ for Data Replication
$$$ for Data Synchronization
$$$ for Disaster Recovery Planning

Page 3: How well do you know your DATA?

Is Data an Asset?

Helps in making decisions
Provides a 360-degree view across the enterprise
Helps to understand the customer
Helps in building effective Marketing Campaigns
Predictive Analysis
Statistical Analysis
Sentiment Analysis

Page 4: How well do you know your DATA?

Data Governance Program

People: Organizations need executive sponsorship

Process: Documented, repeatable processes and procedures

Technology: Data Integration, Data Quality, Data Synchronization, and Data Management

[Diagram: Data Governance, with People, Process, and Technology]

Page 5: How well do you know your DATA?

iWay Data Integration Enablement

SFA/CRM: Amdocs/Clarify, BMC/Remedy, MS Dynamics, Oracle/Siebel, Salesforce.com, SAP

Data Warehouse: DB2, ETL, Oracle/Essbase, MS SSAS/OLAP, Netezza, SAP BW, Teradata

B2B: Internet EDI, Legacy EDI, MFT, Online B2B, XML

ERP/Financials: Ariba, I2, JD Edwards, Lawson, Manugistics, Microsoft, Oracle, SAP

Industry: HIPAA, CIDX, HL7, RNIF, SWIFT, 1Sync

Legacy Systems: CICS, IMS, VSAM, .NET, Java, TUXEDO, etc.

300+ Adapters

Page 6: How well do you know your DATA?

Data Profiling

Statistical Analysis: An overview of summary values, such as extremes, distribution and frequency analysis.

Domain Analysis: A configurable analysis of data types.

Mask and Group Analysis: An overview of value formats, groups and dimensions.

Business Rules: An analysis of the results of user-defined business rules.

Foreign Key and Dependency Analyses: An inside look into complex connections in the data.

Drill Through: The option to display individual records that correspond to aggregated results.

Data Mart: Reporting and analysis across multiple data set analyses; Web and/or hardcopy report viewing and distribution.
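The statistical and mask analyses listed on this slide can be illustrated with a small sketch. This is not the iWay profiler; it is a minimal Python example under stated assumptions: the function names `value_mask` and `profile_column` are hypothetical, the mask convention (letters become `A`, digits become `9`) is one common choice, and the sample values are illustrative.

```python
from collections import Counter


def value_mask(value: str) -> str:
    """Replace each letter with 'A' and each digit with '9'; keep other characters."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("A")
        else:
            out.append(ch)
    return "".join(out)


def profile_column(values):
    """Return simple summary statistics, value frequencies, and mask frequencies."""
    present = [v for v in values if v not in (None, "")]
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
        "frequencies": Counter(present),
        "masks": Counter(value_mask(v) for v in present),
    }


# Illustrative identifier values in several formats (hypothetical sample).
sample = ["095242434", "095-242-434", "SIN095242434", "095252433", "", "095252433"]
report = profile_column(sample)
```

In this sketch the mask counts show at a glance that one column mixes three formats (`999999999`, `999-999-999`, `AAA999999999`), which is the kind of finding a mask analysis is meant to surface.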

Page 7: How well do you know your DATA?

Data Quality Management Cycle

[Cycle diagram. Phases: Data understanding, Data cleansing, Data enhancement, Monitoring and reporting. Steps shown: Parsing; Association (householding); Format correction; Issue cause identification; Content evaluation; Metadata understanding; Automatic correction; Profiling; Context-based cleansing; Deviance identification; Standardization; Ongoing monitoring; Enrichment; KPI definition; Unification; Deduplication/identification]

Page 8: How well do you know your DATA?

iWay Data Quality Center

Parsing: Decomposition of fields into component parts.

Cleansing: Modification of data values to meet domain restrictions, integrity constraints or other business rules that define sufficient data quality for the organization.

Standardization: Formatting of values into consistent layouts based on industry standards, local standards, user-defined business rules and knowledge bases of values and patterns.

Validation: Formatting of values into consistent layouts based on industry standards, local standards, user-defined business rules and knowledge bases of values and patterns.

Enrichment: Enhancing the value of internally held data by appending related attributes from external sources.

Matching: Identification, linking or merging related entries within or across sets of data.

Page 9: How well do you know your DATA?

Mastering Master Data

What is Master Data?
Data describing your main business entities
Data duplicated in multiple systems
Data reused by multiple business processes

Examples
Customer/Citizen/Patient
Company/Partner/Agency
Products/Items/Equipment
Vendors/Suppliers
Cost Centers/Employees
Etc, etc, …

Page 10: How well do you know your DATA?

Master Data – Match & Merge

Unification: identification of the set of records connected to one person, address, vehicle, contact, etc.

Deduplication: golden record creation (the best representation of the identified subject)

Identification: for new data entries, identifying the subject (person, address, etc.) to which the new record is connected (matched)

Complex business rules using sophisticated algorithms and functions, including:
Levenshtein distance
Hamming distance
Edit distance
Data quality score values
Date stamps of last modification
Source system originating data
etc.
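Two of the measures named on this slide can be sketched in a few lines of Python. This is a textbook illustration, not the product's implementation; the function names are hypothetical. Hamming distance counts differing positions in equal-length strings, and Levenshtein distance (the most common form of edit distance) counts the single-character insertions, deletions and substitutions needed to turn one string into another.

```python
def hamming_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def levenshtein_distance(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and substitutions."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


print(levenshtein_distance("Smith", "Smtih"))       # 2
print(hamming_distance("095242434", "095242433"))   # 1
```

Plain Levenshtein distance counts a swap of two adjacent letters ("Smith" vs "Smtih") as two edits; a matching rule would typically compare such a distance against a threshold chosen for the field.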

Page 11: How well do you know your DATA?

Data Quality Portal - Complex Exception Handling

[Diagram. Components shown: Exception DB, Resolution Queue, DQ plan, KPI / DQI calculation, Portal, Invalid data extraction, Reports, Workflow, Exception management]

Page 12: How well do you know your DATA?

Human Mind vs. Computer Systems

Hahaha raed tihs! i cdnuolt blveiee taht I cluod aulaclty uesdnatnrd waht I was rdanieg. The phaonemnel pweor of the hmuan mnid, aoccdrnig to a rscheearch at Cmabrigde Uinervtisy, it dseno't mtaetr in waht oerdr the ltteres in a wrod are, the olny iproamtnt tihng is taht the frsit and lsat ltteer be in the rghit pclae. The rset can be a taotl mses and you can sitll raed it whotuit a pboerlm. Tihs is bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the wrod as a wlohe. Azanmig huh?

Page 13: How well do you know your DATA?

Original data – before cleansing

Source data

Name | G | SIN | Birth Date | Address
Dr. John Smith | M | 000000000 | 12/16/1978 | 14618 110 Ave Surrey V3R 2A9
Smtih W. John | M | 095-242-434 | 16.12.1978 | Surrey 14618 110 Ave
Jhon William Simth | | SIN095242434 | 781612 | 25 Linden Str Toronto M4X 1V5
Dr. J.W. Smith | M | 095242433 | 11/16/78 |
John Smith | | 095252433 | 16.11.1978 | 8500 Leslie L3T 7M8 Toronto
Smith Jhon | | | 16.11.1978 | 8500 Leslie street Marham
John Smiht | | 095252433 | 16.11.1978 |

Page 14: How well do you know your DATA?

Prepared data (after cleansing)

Cleansed data

First | Last | G | SIN | Birth Date | Address
John | Smith | M | | 1978-12-16 | V3R 2A9;BC;Surrey;14618 110 Avenue
John | Smtih | M | 095242434 | 1978-12-16 | V3R 2A9;BC;Surrey;14618 110 Avenue
Jhon | Simth | M | 095242434 | | M4X 1V5;ON;Toronto;25 Linden Street
| Smith | M | | 1978-11-16 |
John | Smith | M | 095252433 | 1978-11-16 | L3T 7M8;ON;Markham;8500 Leslie Str.
Jhon | Smith | M | | 1978-11-16 | L3T 7M8;ON;Markham;8500 Leslie Str.
John | Smiht | | 095252433 | 1978-11-16 |
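The step from the source values to the cleansed values can be sketched for two of the columns. This is a minimal illustration, not the product's rule set: the function names are hypothetical, the list of accepted date formats is an assumption inferred from the sample rows, and rejecting an all-zero SIN as a placeholder is also an assumption. The checksum is the standard Luhn algorithm, which Canadian SINs use.

```python
import re
from datetime import datetime

# Assumed input formats, inferred from the sample rows: US-style with a
# four-digit year, European dotted, and US-style with a two-digit year.
DATE_FORMATS = ("%m/%d/%Y", "%d.%m.%Y", "%m/%d/%y")


def standardize_date(raw: str):
    """Return the date as YYYY-MM-DD, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def luhn_valid(digits: str) -> bool:
    """Standard Luhn checksum over a string of digits."""
    total = 0
    for index, ch in enumerate(reversed(digits)):
        d = int(ch)
        if index % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def standardize_sin(raw: str):
    """Strip non-digits; keep the value only if it is a plausible 9-digit SIN."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) != 9 or digits == "000000000" or not luhn_valid(digits):
        return None
    return digits
```

Under these assumptions, `12/16/1978`, `16.12.1978` and `11/16/78` are normalized to ISO dates, `781612` is left empty because it matches no accepted format, and `095242433` is dropped because it fails the checksum.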

Page 15: How well do you know your DATA?

Match

Cleansed data

First | Last | G | SIN | Birth Date | Address
John | Smith | M | | 1978-12-16 | V3R 2A9;BC;Surrey;14618 110 Avenue
John | Smtih | M | 095242434 | 1978-12-16 | V3R 2A9;BC;Surrey;14618 110 Avenue
Jhon | Smith | M | 095242434 | | M4X 1V5;ON;Toronto;25 Linden Street
| Smith | M | | 1978-11-16 |
John | Smith | M | 095252433 | 1978-11-16 | L3T 7M8;ON;Markham;8500 Leslie Str.
Jhon | Smith | M | | 1978-11-16 | L3T 7M8;ON;Markham;8500 Leslie Str.
John | Smiht | | 095252433 | 1978-11-16 |
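One way to picture the match step is as grouping records that are linked by a rule. The sketch below is only an illustration: the rule (same SIN, or same birth date and same address) is a hypothetical simplification, far simpler than the complex business rules the slides refer to, and the record ids 1 to 7 are added here to label the rows of the cleansed table in order.

```python
from itertools import combinations

SURREY = "V3R 2A9;BC;Surrey;14618 110 Avenue"
TORONTO = "M4X 1V5;ON;Toronto;25 Linden Street"
MARKHAM = "L3T 7M8;ON;Markham;8500 Leslie Str."

# Rows of the cleansed table, reduced to the fields the rule uses.
records = [
    {"id": 1, "sin": None, "birth": "1978-12-16", "address": SURREY},
    {"id": 2, "sin": "095242434", "birth": "1978-12-16", "address": SURREY},
    {"id": 3, "sin": "095242434", "birth": None, "address": TORONTO},
    {"id": 4, "sin": None, "birth": "1978-11-16", "address": None},
    {"id": 5, "sin": "095252433", "birth": "1978-11-16", "address": MARKHAM},
    {"id": 6, "sin": None, "birth": "1978-11-16", "address": MARKHAM},
    {"id": 7, "sin": "095252433", "birth": "1978-11-16", "address": None},
]


def is_match(a, b) -> bool:
    """Hypothetical rule: same SIN, or same birth date and same address."""
    if a["sin"] and a["sin"] == b["sin"]:
        return True
    return bool(a["birth"] and a["address"]
                and a["birth"] == b["birth"] and a["address"] == b["address"])


def cluster(rows):
    """Group record ids into clusters of directly or indirectly matching records."""
    parent = {r["id"]: r["id"] for r in rows}

    def find(x):
        # Follow parent links to the cluster representative (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(rows, 2):
        if is_match(a, b):
            parent[find(b["id"])] = find(a["id"])

    groups = {}
    for r in rows:
        groups.setdefault(find(r["id"]), []).append(r["id"])
    return sorted(groups.values())


clusters = cluster(records)
print(clusters)  # [[1, 2, 3], [4], [5, 6, 7]]
```

Because matches are transitive in this sketch, record 1 joins record 3 through record 2 even though the two share neither a SIN nor an address. Record 4 has too little data to match anything under this rule.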

Page 16: How well do you know your DATA?

Merge

Cleansed data

First | Last | G | SIN | Birth Date | Address
John | Smith | M | | 1978-12-16 | V3R 2A9;BC;Surrey;14618 110 Avenue
John | Smtih | M | 095242434 | 1978-12-16 | V3R 2A9;BC;Surrey;14618 110 Avenue
Jhon | Smith | M | 095242434 | | M4X 1V5;ON;Toronto;25 Linden Street

Golden record

First | Last | G | SIN | Birth Date | Address
John | Smith | M | 095242434 | 1978-12-16 | M4X 1V5;ON;Toronto;25 Linden Street

Address candidates for the golden record:
The newest permanent address: M4X 1V5;ON;Toronto;25 Linden Street
The most frequent address: V3R 2A9;BC;Surrey;14618 110 Avenue
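The "most frequent" survivorship rule named on this slide can be sketched as follows. This is a minimal illustration with hypothetical function names, not the product's merge logic. The alternative "newest permanent address" rule would need a last-modified timestamp per record, which the slide data does not show, so it is not implemented here.

```python
from collections import Counter

FIELDS = ("first", "last", "gender", "sin", "birth", "address")

# The three matched records from the slide, as dictionaries.
matched = [
    {"first": "John", "last": "Smith", "gender": "M", "sin": None,
     "birth": "1978-12-16", "address": "V3R 2A9;BC;Surrey;14618 110 Avenue"},
    {"first": "John", "last": "Smtih", "gender": "M", "sin": "095242434",
     "birth": "1978-12-16", "address": "V3R 2A9;BC;Surrey;14618 110 Avenue"},
    {"first": "Jhon", "last": "Smith", "gender": "M", "sin": "095242434",
     "birth": None, "address": "M4X 1V5;ON;Toronto;25 Linden Street"},
]


def most_frequent(values):
    """Return the most common non-empty value, or None if every value is empty."""
    present = [v for v in values if v]
    if not present:
        return None
    return Counter(present).most_common(1)[0][0]


def build_golden_record(rows, fields=FIELDS):
    """Pick the most frequent non-empty value for each field independently."""
    return {field: most_frequent(r.get(field) for r in rows) for field in fields}


golden = build_golden_record(matched)
```

With this rule the golden record takes the Surrey address, because it appears in two of the three records. Choosing the Toronto address instead, as the slide's golden record does, corresponds to the "newest permanent address" rule.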

Page 17: How well do you know your DATA?

Merged records – before update

Source data

First | Last | G | SIN | Birth Date | Address
John | Smith | M | | 1978-12-16 | V3R 2A9;BC;Surrey;14618 110 Avenue
John | Smith | M | 095242434 | 1978-12-16 | V3R 2A9;BC;Surrey;14618 110 Avenue
John | Smith | M | 095242434 | | M4X 1V5;ON;Toronto;25 Linden Street
John | Smith | M | 095252433 | 1978-11-16 | L3T 7M8;ON;Markham;8500 Leslie Str.
John | Smith | M | | 1978-11-16 | L3T 7M8;ON;Markham;8500 Leslie Str.
John | Smiht | | 095252433 | 1978-11-16 |

Golden record

First | Last | G | SIN | Birth Date | Address
John | Smith | M | 095242434 | 1978-12-16 | M4X 1V5;ON;Toronto;25 Linden Street
John | Smith | M | 095252433 | 1978-11-16 | L3T 7M8;ON;Markham;8500 Leslie Str.

Page 18: How well do you know your DATA?

Merged records – after update

Source data

First | Last | G | SIN | Birth Date | Address
John | Smith | M | | 1978-12-16 | V3R 2A9;BC;Surrey;14618 110 Avenue
John | Smith | M | 095242434 | 1978-12-16 | V3R 2A9;BC;Surrey;14618 110 Avenue
John | Smith | M | 095252433 | | M4X 1V5;ON;Toronto;25 Linden Street
John | Smith | M | 095252433 | 1978-11-16 | L3T 7M8;ON;Markham;8500 Leslie Str.
John | Smith | M | | 1978-11-16 | L3T 7M8;ON;Markham;8500 Leslie Str.
John | Smiht | | 095252433 | 1978-11-16 |

Golden record

First | Last | G | SIN | Birth Date | Address
John | Smith | M | 095242434 | 1978-12-16 | V3R 2A9;BC;Surrey;14618 110 Avenue
John | Smith | M | 095252433 | 1978-11-16 | M4X 1V5;ON;Toronto;25 Linden Street

One updated source record may cause modification in several records in MDC

Page 19: How well do you know your DATA?

Real World Use Case

The Goal
Major hospital group is building a Master Patient Index
Need to bring in acquired systems
Cleanse, Standardize, Deduplicate

The Challenge
Previously manually processed by hiring temporary staff
Current phase projected to take temporary staff of 20 over 18 months

The Strategy
Automate the cleansing, matching and merging business rules
Data Stewardship provides human oversight to automated process

The Benefits
Identifies the duplicate records according to very complex business rules
Reusable rules for future phases
Significantly reduced project time, from 18 down to 4 months
Over 400% ROI projected

Page 20: How well do you know your DATA?

Real World Use Case

The Goal
Performance Management
Business Intelligence
Change Management Process

The Challenge
100 Locations
14 Systems with out-of-sync master data

The Strategy
Cleanse, Standardize, Match
Master Data Management – Directorate, Borough, Site, Service Type, Service Point, Team, Staff, Patient
Master Data Governance Workflow

The Benefits
Dynamic organizational change to support strategic initiatives
Complete visibility into performance of organization vs goals

Page 21: How well do you know your DATA?

Real World Use Case

The Goal
Services organization supporting the airline industry sells decision support information to the industry members.

The Challenge
Data Quality was adversely affecting the customer base satisfaction
Data Quality was impacting new revenue generation opportunities

The Strategy
Profile analysis according to specific business validation rules
Monitor rolling 13 month window comparison of monthly data profiles
Accumulate and report analysis to data providers

The Benefits
Improves customer satisfaction and confidence in the information
Increases reliability of the information as new data sources are added
Documents and audits quality-control processes for customer review
Reduces the dependency on human resources to detect and correct data quality issues
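The rolling 13 month comparison in the strategy above could be approached along the following lines. The slide does not say how the comparison was done, so this is a sketch under assumptions: each month's profile is reduced to a single number (for example, the share of records failing a validation rule), the latest month is compared with the mean of the earlier months in the window, and the function name and `tolerance` parameter are hypothetical.

```python
from statistics import mean


def compare_to_window(monthly_values, window=13, tolerance=0.05):
    """Compare the latest monthly metric with the mean of the rest of the window.

    monthly_values: metric values in chronological order, one per month.
    window: number of most recent months to consider, including the latest.
    tolerance: absolute deviation above which the latest month is flagged.
    """
    recent = list(monthly_values)[-window:]
    if len(recent) < 2:
        raise ValueError("need at least two months to compare")
    *baseline, latest = recent
    baseline_mean = mean(baseline)
    deviation = latest - baseline_mean
    return {
        "latest": latest,
        "baseline_mean": baseline_mean,
        "deviation": deviation,
        "flagged": abs(deviation) > tolerance,
    }


# Hypothetical failure rates: twelve stable months followed by a jump.
result = compare_to_window([0.02] * 12 + [0.10])
```

In this sketch a flagged month is a candidate for the report sent back to the data provider; a real rule set would likely track several profile measures per data source rather than one number.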

Page 22: How well do you know your DATA?

Summary of considerations

Access to variety of data sources
Ability to influence data improvement anywhere in the process
Useable in batch and/or (real) real-time processing mode
Extensible by customized business rules
Access to third party data and services
Historical and distributable analysis
Reusability across multiple phases and projects
Integrated data stewardship
Platform flexibility for deployment and licensing
Vendor partnership and support


[Diagram labels: Information Access, Data Quality, Master Data Management, Data Governance]

Page 23: How well do you know your DATA?

iWay Software Benefits

Integrate All Information
Any Data
Any System
Any Protocol
Any Platform

Any Process Latency
Scheduled
Process Driven
Event Driven
User Driven
Real-time, Online, and Batch

Data Integration
Application Integration
Business Integration
Service Oriented Architecture

Single Solution Platform
Single Engine
Fast and Scalable
Secure and Reliable
Fully Extensible

Page 24: How well do you know your DATA?

Questions?

