2 michael mc morrow - print version
TRANSCRIPT
-
8/6/2019 2 Michael Mc Morrow - Print Version
1/18
1
Enriching Data Quality in your Organisation:
Minimising Duplication & Ensuring Data is Reused by Different Parts of the Business.
Michael Mc Morrow,
Head of Data Management Services,
Information Management, AIB Bank.
-
Multiple Perspectives
DQ Data Lifecycle: Points of Focus
DQ Governance: Top-Down, Bottom-Up
DQ Governance: Front to Back
DQ Information Infrastructure: Data Warehouse
DQ Inventory: Central Log
DQ Identification: Organisation Culture
DQ Understanding: Metadata
DQ Stakeholders: Varied Expectations
DQ Levels: Perfect / Indicative
DQ Assessment: Hard / Soft
DQ Prioritisation: Target Your Critical Data
-
DQ Data Lifecycle: Points Of Focus
-
DQ Data Lifecycle: Points Of Focus: Gather
Do the people capturing data
really know what they should enter
(e.g. are categorisations ambiguous?)
and the data quality levels required by all subsequent users / usages
(i.e. not just by this data capture application)?
-
DQ Data Lifecycle: Points Of Focus: Manipulate
Are all processes which transfer data
reliable,
and rules which transform data
accurate and consistent?
-
DQ Data Lifecycle: Points Of Focus: Deliver
Are target outputs
correctly understood
and data mapping to those targets correctly performed
(e.g. external regulatory reports)?
-
DQ Governance: Top-Down, Bottom-Up
Top-Down Governance Strategy
Hierarchy of Governance Forums from C-Suite down
Align to the reality of Organisation Structure
Bottom-Up
Each data item governed by a named (Business) Data Steward
Range of Practical Responsibilities e.g.
DQ Assessment
DQ Remediation Co-ordination
Metadata Provision
-
DQ Governance: Top-Down, Bottom-Up
Ref: The Structuring of Organizations: A Synthesis of the Research,
Henry Mintzberg
Simple Structure
Machine Bureaucracy
Professional Bureaucracy
Divisionalised Form
Adhocracy
-
DQ Governance: Front to Back
Data Steward Responsibility Scope
Cradle to Grave?
DQ of assigned data items from point of capture to all uses
Staged?
DQ of assigned data within a specific application / system layer
-
DQ Information Infrastructure: Data Warehouse
Concept of Single-Version-Of-The-Truth, Information Environment
Certified Quality
Consistent Reporting
Right Time Data
Consistent Performance
Peer-Beating Analytics
Limitless Scalability
Rich Data Content
Resilient Availability
User-Friendly Access
Rich Data History
Operationally Aligned
Cost Effective
Highly Secure
Optimised Reporting
Cross-Functional
-
DQ Inventory: Central Log
Single Inventory of all known DQ Issues
Expose scale of DQ issues
Opportunity to log issues which people have just grown used to
Facilitate risk/value-based prioritisation
Identify opportunities to group DQ initiatives
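A central log can start as a plain list of records. The sketch below is illustrative only: the field names, the 1 to 5 scales and the "risk times value" score are assumptions, not something stated in the talk.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DQIssue:
    """One entry in the central DQ issue log (all fields are hypothetical)."""
    issue_id: int
    system: str
    data_item: str
    description: str
    risk: int   # assumed scale: 1 (low) to 5 (high)
    value: int  # assumed scale: 1 (low) to 5 (high)


def prioritise(issues):
    """Order issues by an assumed risk x value score, highest first."""
    return sorted(issues, key=lambda i: i.risk * i.value, reverse=True)


def group_by_system(issues):
    """Group issues by system so related fixes can share one initiative."""
    groups = defaultdict(list)
    for issue in issues:
        groups[issue.system].append(issue)
    return dict(groups)


log = [
    DQIssue(1, "CRM", "customer_name", "'deceased' typed into name", 4, 3),
    DQIssue(2, "Lending", "date_of_birth", "default date 01/01/2001", 5, 4),
    DQIssue(3, "CRM", "employee_count", "stale values", 2, 2),
]

ranked = prioritise(log)
grouped = group_by_system(log)
```

Grouping by system is only one possible cut; grouping by data item or by Data Steward would follow the same pattern.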
-
DQ Identification: Organisation Culture
Responsibility of Everyone
Opportunities Everywhere
E.g. Physical Data Model as DQ Tool
Compare Data Model (Expectation) with Data (Reality)
Anomalies: either Wrong Data Model or Wrong Data
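Comparing the model with the data can be sketched in a few lines. This is a minimal illustration under assumptions: the model is reduced to a hypothetical dictionary of column name to (maximum length, nullable), and the column names and rows are invented.

```python
# Hypothetical physical data model: column -> (max length, nullable).
MODEL = {
    "customer_id": (8, False),  # e.g. Character(8), Not Null
    "branch_code": (4, True),
}


def find_anomalies(rows, model=MODEL):
    """Return (row index, column, reason) for each value that breaks the model."""
    anomalies = []
    for idx, row in enumerate(rows):
        for column, (max_len, nullable) in model.items():
            value = row.get(column)
            if value is None:
                if not nullable:
                    anomalies.append((idx, column, "null in Not Null column"))
            elif len(value) > max_len:
                anomalies.append((idx, column, f"longer than {max_len}"))
    return anomalies


rows = [
    {"customer_id": "C0000001", "branch_code": "0042"},
    {"customer_id": None, "branch_code": "0042"},
    {"customer_id": "C0000002", "branch_code": "00421"},
]
anomalies = find_anomalies(rows)
```

Each anomaly still needs a human decision: it may mean the data is wrong, or that the model no longer describes the business.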
-
DQ Understanding: Metadata
Support Safe Reuse
Technical Static Definition Metadata
Definition of table/column within RDBMS
e.g. Character(8), Not Null
Business Static Definition Metadata
Additional Internal/Industry definitions about the table/column
e.g. Data Steward Id, Business Text Description
Business Static Quality-Status Metadata
Documentation of data quality level once-off or general quality nuances
e.g. DQ Issue Log
Business Dynamic Quality-Status Metadata
Data quality metrics over time
e.g. DQ Scorecard Results
Technical Life-Cycle Metadata
Data flows and transformations en route from source to target
e.g. ETL graphs
Technical Relational Metadata
How one item of data relates to other items of data
e.g. Physical Data Model
Link & Publish
-
DQ Stakeholders: Varied Expectations
Death Indicator Option within a Data Capture system
What if some staff add "deceased" to customer name instead?
DQ essential to business function owning that system?
DQ essential to some other Regulatory Reporting system?
Fit for Immediate Purpose
Narrow needs of the Data Capture application
Fit for Enterprise Purpose
Wide reuse needs of other stakeholders (e.g. BI/Reporting, Predictive Analytics)
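The "deceased in the name field" case is easy to detect once someone looks for it. The following is a rough sketch, assuming hypothetical record fields (id, name, death_indicator) and a simple word match; a real check would need to cope with abbreviations and other languages.

```python
import re

# Assumed pattern: the word "deceased" anywhere in the name, any case.
DECEASED_IN_NAME = re.compile(r"\bdeceased\b", re.IGNORECASE)


def misplaced_death_markers(customers):
    """Return ids where the name says 'deceased' but the indicator is not set."""
    return [
        c["id"]
        for c in customers
        if DECEASED_IN_NAME.search(c["name"]) and not c["death_indicator"]
    ]


customers = [
    {"id": 1, "name": "Mary Byrne", "death_indicator": False},
    {"id": 2, "name": "John Murphy (DECEASED)", "death_indicator": False},
    {"id": 3, "name": "Ann Kelly deceased", "death_indicator": True},
]
flagged = misplaced_death_markers(customers)
```

Record 2 is fit for the immediate purpose (staff reading the screen see the note) but not for enterprise purpose, because any downstream system relying on the indicator will miss it.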
-
DQ Levels: Perfect / Indicative
Impractical / Impossible for all data to be perfect
Financial Balances should be perfect
Number of Cattle will only ever be indicative
Define & Assess Appropriate DQ Level per data item
Consider inheriting DQ from external certified sources
Number of Employees of a Client Company
Ask client, store internally, re-ask periodically?
Access from some external certified source such as the Companies Registration Office?
-
DQ Assessment: Hard / Soft
Hard Data Scorecards (Right/Wrong):
Possible to accurately measure breaches of technical rules
Data Format, Data Optionality, Data Relationships
Possible to accurately measure breaches of business rules
Mortgage Holder (fact) who is one year old (wrong)
Soft Data Profiles (Suspicious):
Individual versus Set
Valid for an individual to be born on 01/01/2001
Implausible for half of your customers to be born on 01/01/2001
Trend
Non-intuitive pattern over time
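The difference between a hard rule and a soft profile can be sketched as two small checks. Everything specific here is an assumption for illustration: the minimum age of 18, the 50% threshold, the record layout and the reference date.

```python
from collections import Counter
from datetime import date


def age_on(born, today):
    """Whole years between a date of birth and a reference date."""
    return today.year - born.year - ((today.month, today.day) < (born.month, born.day))


def hard_breaches(holders, today, min_age=18):
    """Hard rule (right/wrong): mortgage holders below an assumed minimum age."""
    return [h["id"] for h in holders if age_on(h["dob"], today) < min_age]


def soft_suspects(holders, max_share=0.5):
    """Soft profile (suspicious): birth dates shared by at least max_share of the set."""
    counts = Counter(h["dob"] for h in holders)
    return [dob for dob, n in counts.items() if n / len(holders) >= max_share]


today = date(2019, 6, 8)  # arbitrary reference date for the example
holders = [
    {"id": 1, "dob": date(2001, 1, 1)},
    {"id": 2, "dob": date(2001, 1, 1)},
    {"id": 3, "dob": date(2018, 3, 15)},
    {"id": 4, "dob": date(1975, 9, 30)},
]
breaches = hard_breaches(holders, today)
suspects = soft_suspects(holders)
```

Holder 3 is simply wrong and belongs on a scorecard. Holders 1 and 2 each pass the hard rule, yet together they make 01/01/2001 suspicious, which is a finding about the set rather than about any one record.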
-
DQ Prioritisation: Target Your Critical Data
Priority Data: Identify Top 100 data items
Apply most complex data scorecarding / profiling effort
Maintain richest metadata
C-Suite visibility of Data Quality issues
All Data: Apply Data Quality Tax to all Change Programs
If opening System X, and there are any logged DQ deficiencies
within System X, then add remediation to program scope
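The Data Quality Tax rule can be expressed as a simple scope check against the issue log. This is a sketch under assumptions: the log layout, the "open"/"closed" status values and the function name are hypothetical.

```python
def apply_dq_tax(program_scope, systems_opened, dq_log):
    """Return the program scope extended with remediation items for every
    open logged DQ issue in a system the program opens."""
    extra = [
        f"Remediate DQ issue {issue['id']} in {issue['system']}"
        for issue in dq_log
        if issue["system"] in systems_opened and issue["status"] == "open"
    ]
    return list(program_scope) + extra


dq_log = [
    {"id": 101, "system": "System X", "status": "open"},
    {"id": 102, "system": "System Y", "status": "open"},
    {"id": 103, "system": "System X", "status": "closed"},
]
scope = apply_dq_tax(["Upgrade System X UI"], {"System X"}, dq_log)
```

Only issue 101 is added: issue 102 sits in a system the program does not open, and issue 103 is already closed.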
-
Summary
Make DQ part of your Organisational DNA