2 michael mc morrow - print version

Upload: rohanbinshams

Post on 07-Apr-2018

  • 8/6/2019 2 Michael Mc Morrow - Print Version

    Enriching Data Quality in your Organisation:

    Minimising Duplication & Ensuring Data is Reused by Different Parts of the Business.

    Michael Mc Morrow,

    Head of Data Management Services,

    Information Management, AIB Bank.

    [email protected]

    Multiple Perspectives

    DQ Data Lifecycle: Points of Focus

    DQ Governance: Top-Down, Bottom-Up

    DQ Governance: Front to Back

    DQ Information Infrastructure: Data Warehouse

    DQ Inventory: Central Log

    DQ Identification: Organisation Culture

    DQ Understanding: Metadata

    DQ Stakeholders: Varied Expectations

    DQ Levels: Perfect / Indicative

    DQ Assessment: Hard / Soft

    DQ Prioritisation: Target Your Critical Data

    DQ Data Lifecycle: Points Of Focus

    DQ Data Lifecycle: Points Of Focus: Gather

    Do the people capturing data

    really know what they should enter

    (e.g. are categorisations ambiguous?)

    and the data quality levels required by all subsequent users / usages

    (i.e. not just by this data capture application)?

    DQ Data Lifecycle: Points Of Focus: Manipulate

    Are all processes which transfer data

    reliable,

    and rules which transform data

    accurate and consistent?
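One way to test whether a transfer process is reliable is to reconcile the target against the source after each run. The following is a minimal sketch, assuming rows are available as dictionaries and that a monetary field can serve as a control total; the function name `reconcile_transfer` and the field name `balance` are hypothetical and not taken from the presentation.

```python
from decimal import Decimal


def reconcile_transfer(source_rows, target_rows, amount_field="balance"):
    """Compare row counts and a control total between source and target."""
    # Decimal avoids floating-point drift when summing monetary amounts.
    source_total = sum(Decimal(str(r[amount_field])) for r in source_rows)
    target_total = sum(Decimal(str(r[amount_field])) for r in target_rows)
    return {
        "row_count_matches": len(source_rows) == len(target_rows),
        "control_total_matches": source_total == target_total,
        "missing_rows": len(source_rows) - len(target_rows),
    }


# Illustrative data only: one row was lost in transfer.
source = [{"id": 1, "balance": "100.10"}, {"id": 2, "balance": "250.00"}]
target = [{"id": 1, "balance": "100.10"}]
result = reconcile_transfer(source, target)
```

A failed reconciliation only shows that something went wrong in transit; finding which rule or job caused it still needs the life-cycle metadata for that data flow.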

    DQ Data Lifecycle: Points Of Focus: Deliver

    Are target outputs

    correctly understood

    and data mapping to those targets correctly performed

    (e.g. external regulatory reports)?

    DQ Governance: Top-Down, Bottom-Up

    Top-Down Governance Strategy

    Hierarchy of Governance Forums from C-Suite down

    Align to the reality of Organisation Structure

    Bottom-Up

    Each data item governed by a named (Business) Data Steward

    Range of Practical Responsibilities e.g.

    DQ Assessment

    DQ Remediation Co-ordination

    Metadata Provision

    DQ Governance: Top-Down, Bottom-Up

    Ref: The Structuring of Organizations: A Synthesis of the Research,

    Henry Mintzberg

    Simple Structure

    Machine Bureaucracy

    Professional Bureaucracy

    Divisionalised Form

    Adhocracy

    DQ Governance: Front to Back

    Data Steward Responsibility Scope

    Cradle to Grave?

    DQ of assigned data items from point of capture to all uses

    Staged?

    DQ of assigned data within a specific application / system layer

    DQ Information Infrastructure: Data Warehouse

    Concept of Single-Version-Of-The-Truth, Information Environment

    [Diagram: attributes of the information environment: Certified Quality, Consistent Reporting, Right Time Data, Consistent Performance, Peer-Beating Analytics, Limitless Scalability, Rich Data Content, Resilient Availability, User-Friendly Access, Rich Data History, Operationally Aligned, Cost Effective, Highly Secure, Optimised Reporting, Cross-Functional]

    DQ Inventory: Central Log

    Single Inventory of all known DQ Issues

    Expose scale of DQ issues

    Opportunity to log issues which people have just grown used to

    Facilitate risk/value-based prioritisation

    Identify opportunities to group DQ initiatives
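The central log described above can be pictured as a simple list of issue records. The sketch below is illustrative only: the `DQIssue` fields, the 1-to-5 risk and value scales, and the risk-times-value score are all assumptions made for the example, not a scheme given in the presentation.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class DQIssue:
    issue_id: str
    system: str
    data_item: str
    description: str
    risk: int   # assumed scale: 1 (low) to 5 (high)
    value: int  # assumed scale: 1 (low) to 5 (high)


def prioritise(issues):
    """Order issues by a simple risk x value score, highest first."""
    return sorted(issues, key=lambda i: i.risk * i.value, reverse=True)


def group_by_system(issues):
    """Group issue ids by system to spot candidates for a joint initiative."""
    groups = defaultdict(list)
    for issue in issues:
        groups[issue.system].append(issue.issue_id)
    return dict(groups)


# Illustrative entries only.
log = [
    DQIssue("DQ-1", "Lending", "date_of_birth", "Default dates", risk=3, value=2),
    DQIssue("DQ-2", "Lending", "customer_name", "Free-text markers", risk=5, value=4),
    DQIssue("DQ-3", "Cards", "postcode", "Missing values", risk=2, value=2),
]
ranked_ids = [issue.issue_id for issue in prioritise(log)]
groups = group_by_system(log)
```

Here `ranked_ids` puts the highest-scoring issue first, and `groups` shows that two issues sit in the same system and might be remediated together.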

    DQ Identification: Organisation Culture

    Responsibility of Everyone

    Opportunities Everywhere

    E.g. Physical Data Model as DQ Tool

    Compare Data Model (Expectation) with Data (Reality)

    Anomalies: either Wrong Data Model or Wrong Data
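Comparing the data model with the data can be sketched as checking each row against the column definitions. This is a minimal illustration, assuming the model is reduced to a maximum length and a nullability flag per column; `ColumnSpec` and `find_anomalies` are hypothetical names, and a real physical model would carry many more constraints.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnSpec:
    name: str
    max_length: int
    nullable: bool


def find_anomalies(rows, specs):
    """Return (row number, column, reason) wherever data contradicts the model."""
    anomalies = []
    for row_number, row in enumerate(rows, start=1):
        for spec in specs:
            value = row.get(spec.name)
            if value is None:
                if not spec.nullable:
                    anomalies.append((row_number, spec.name, "null in NOT NULL column"))
            elif len(value) > spec.max_length:
                anomalies.append((row_number, spec.name, "value longer than model allows"))
    return anomalies


# Expectation: a Character(8), Not Null column. Reality: three sample rows.
model = [ColumnSpec("customer_id", max_length=8, nullable=False)]
rows = [
    {"customer_id": "C0000001"},
    {"customer_id": None},
    {"customer_id": "C000000012"},
]
anomalies = find_anomalies(rows, model)
```

Each anomaly still needs a human decision: the data may be wrong, or the model may no longer describe what the business actually stores.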

    DQ Understanding: Metadata

    Support Safe Reuse

    Technical Static Definition Metadata

    Definition of table/column within RDBMS

    e.g. Character(8), Not Null

    Business Static Definition Metadata

    Additional Internal/Industry definitions about the table/column

    e.g. Data Steward Id, Business Text Description

    Business Static Quality-Status Metadata

    Documentation of data quality level, once-off or general quality nuances

    e.g. DQ Issue Log

    Business Dynamic Quality-Status Metadata

    Data quality metrics over time

    e.g. DQ Scorecard Results

    Technical Life-Cycle Metadata

    Data flows and transformations en route from source to target

    e.g. ETL graphs

    Technical Relational Metadata

    How one item of data relates to other items of data

    e.g. Physical Data Model

    Link & Publish

    DQ Stakeholders: Varied Expectations

    Death Indicator Option within a Data Capture system

    What if some staff add "deceased" to customer name instead?

    DQ essential to business function owning that system?

    DQ essential to some other Regulatory Reporting system?

    Fit for Immediate Purpose

    Narrow needs of the Data Capture application

    Fit for Enterprise Purpose

    Wide reuse needs of other stakeholders (e.g. BI/Reporting, Predictive Analytics)
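The death indicator example can be turned into a simple check that looks for free-text markers in the name field where the indicator itself is not set. This is a sketch under stated assumptions: the record layout, the field names, and the list of markers in `DECEASED_PATTERN` are invented for illustration, and real data would need a wider list.

```python
import re

# Assumed markers that staff might type into the name field.
DECEASED_PATTERN = re.compile(r"\b(?:deceased|decd|dec'd)\b", re.IGNORECASE)


def find_death_indicator_mismatches(customers):
    """Return ids where the name says 'deceased' but the indicator is not set."""
    return [
        c["customer_id"]
        for c in customers
        if DECEASED_PATTERN.search(c["name"]) and not c["death_indicator"]
    ]


# Illustrative records only.
customers = [
    {"customer_id": "A1", "name": "John Murphy (Deceased)", "death_indicator": False},
    {"customer_id": "A2", "name": "Mary Byrne", "death_indicator": False},
    {"customer_id": "A3", "name": "Sean Kelly DECEASED", "death_indicator": True},
]
mismatches = find_death_indicator_mismatches(customers)
```

The capture application may work perfectly well with such records, which is the point of the slide: the gap only hurts downstream users who rely on the indicator.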

    DQ Levels: Perfect / Indicative

    Impractical / Impossible for all data to be perfect

    Financial Balances should be perfect

    Number of Cattle will only ever be indicative

    Define & Assess Appropriate DQ Level per data item

    Consider inheriting DQ from external certified sources

    Number of Employees of a Client Company

    Ask client, store internally, re-ask periodically?

    Access from some external certified source such as the Companies Registration Office?

    DQ Assessment: Hard / Soft

    Hard Data Scorecards (Right/Wrong):

    Possible to accurately measure breaches of technical rules

    Data Format, Data Optionality, Data Relationships

    Possible to accurately measure breaches of business rules

    Mortgage Holder (fact) who is one year old (wrong)

    Soft Data Profiles (Suspicious):

    Individual versus Set

    Valid for an individual to be born on 01/01/2001

    Implausible for half of your customers to be born on 01/01/2001

    Trend

    Non-intuitive pattern over time
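The difference between a hard scorecard rule and a soft profile can be sketched in a few lines. The example below assumes a minimum mortgage-holder age of 18 and a share threshold for flagging over-represented birth dates; both values, and all field and function names, are assumptions made for illustration.

```python
from collections import Counter
from datetime import date


def age_in_years(date_of_birth, as_of):
    """Whole years between date_of_birth and as_of."""
    had_birthday = (as_of.month, as_of.day) >= (date_of_birth.month, date_of_birth.day)
    return as_of.year - date_of_birth.year - (0 if had_birthday else 1)


def hard_rule_breaches(customers, as_of, minimum_age=18):
    """Hard rule (right/wrong): a mortgage holder must be at least minimum_age."""
    return [
        c["customer_id"]
        for c in customers
        if c["has_mortgage"] and age_in_years(c["date_of_birth"], as_of) < minimum_age
    ]


def suspicious_birth_dates(customers, max_share=0.05):
    """Soft profile (suspicious): birth dates shared by too large a share of the set."""
    counts = Counter(c["date_of_birth"] for c in customers)
    total = len(customers)
    return {dob: n / total for dob, n in counts.items() if n / total > max_share}


# Illustrative records only.
customers = [
    {"customer_id": "C1", "date_of_birth": date(2001, 1, 1), "has_mortgage": True},
    {"customer_id": "C2", "date_of_birth": date(2001, 1, 1), "has_mortgage": False},
    {"customer_id": "C3", "date_of_birth": date(2023, 3, 15), "has_mortgage": True},
    {"customer_id": "C4", "date_of_birth": date(1975, 9, 2), "has_mortgage": True},
]
breaches = hard_rule_breaches(customers, as_of=date(2024, 6, 30))
suspects = suspicious_birth_dates(customers, max_share=0.4)
```

The hard rule returns records that are definitely wrong. The soft profile returns values that are individually valid but implausible across the set, so its output is a prompt for investigation, not a list of errors.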

    DQ Prioritisation: Target Your Critical Data

    Priority Data: Identify Top 100 data items

    Apply most complex data scorecarding / profiling effort

    Maintain richest metadata

    C-Suite visibility of Data Quality issues

    All Data: Apply Data Quality Tax to all Change Programs

    If opening System X, and there are any logged DQ deficiencies within System X, then add remediation to program scope
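The Data Quality Tax rule above amounts to a lookup against the central DQ log whenever a change program names the systems it will open. A minimal sketch, assuming the log is a list of dictionaries with hypothetical `issue_id`, `system` and `status` fields:

```python
def remediation_scope(program_systems, dq_log):
    """Return ids of open DQ issues in systems that a change program will open."""
    return [
        issue["issue_id"]
        for issue in dq_log
        if issue["system"] in program_systems and issue["status"] == "open"
    ]


# Illustrative log entries only.
dq_log = [
    {"issue_id": "DQ-1", "system": "System X", "status": "open"},
    {"issue_id": "DQ-2", "system": "System X", "status": "closed"},
    {"issue_id": "DQ-3", "system": "System Y", "status": "open"},
]
extra_scope = remediation_scope({"System X"}, dq_log)
```

In this example only the open issue in System X is added to the program scope; the closed issue and the issue in an untouched system are left alone.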

    Summary

    Make DQ part of your Organisational DNA