business semantics for data governance and stewardship

Business Semantics For Data Governance & Stewardship Dr. Pieter De Leenheer Sloan Hall Stanford University Feb 4 - 2015

Upload: pieter-de-leenheer

Post on 17-Jul-2015


TRANSCRIPT

Page 1: Business Semantics for Data Governance and Stewardship

Business Semantics for Data Governance & Stewardship

Dr. Pieter De Leenheer

Sloan Hall, Stanford University

Feb 4 - 2015

Page 2: Business Semantics for Data Governance and Stewardship

Overview

• ICT: from Truth to Trust

• The Spectrum of Business Semantics

• Situation Map

• Business Semantics Governance & Stewardship

– Principles

– Operating Framework

• Reflection and Questions

Page 3: Business Semantics for Data Governance and Stewardship

La Trahison des Images (Magritte, 1929)

Page 4: Business Semantics for Data Governance and Stewardship

La Trahison des Images (2)

https://deleenheer.wordpress.com/2009/12/15/magrittes-flirting-with-semantics/

Page 5: Business Semantics for Data Governance and Stewardship

What we talk about when we talk about no Data Governance

The Problem

• “Who approved this?”

• “I wish these guys spoke our language.”

• “I can’t understand this report!”

• “I’ve never seen this code! Who introduced this?”

• “This doesn’t seem right. Are we sure this data is correct?”

• “This rule is different in our country!”

• “This is an exception to the rule!”

Page 6: Business Semantics for Data Governance and Stewardship

Glossary Search

• How frequently do you look up a word for your business?

• To what purpose?

– Clarification

– Differentiation

• What are your main sources?

• Hierarchy-based navigation or keyword-based search?

• Authoritative truth or trust?

Page 7: Business Semantics for Data Governance and Stewardship

From Truth to Trust: Behind the Curtains

https://www.research.ibm.com/visual/projects/history_flow/results.htm

Page 8: Business Semantics for Data Governance and Stewardship

Overview

• ICT: from Truth to Trust

• The Spectrum of Business Semantics

• Situation Map

• Business Semantics Governance & Stewardship

– Principles

– Operating Framework

• Reflection and Questions

Page 9: Business Semantics for Data Governance and Stewardship

Spectrum of Business Semantics

Welty, C., Lehmann, F., Gruninger, G., and Uschold, M. (1999). Ontology: Expert systems all over again? In Invited panel at AAAI-99: The National Conference on Artificial Intelligence, Austin, Texas, USA.

Page 10: Business Semantics for Data Governance and Stewardship

The Big ‘Metadata’ Bang

Catalogue and text files

• The start of an organization’s data management

• Represented by shared folders with lists of things such as product, customer, templates

• First ‘clouds’ of metadata

– Naturally emerge as by-product

– For human consumption

– Locally understood

• From this point exponential expansion:

– in volume

– in consumers (receiver)

– in producers (sender)

– in entropy

Page 11: Business Semantics for Data Governance and Stewardship

Glossary

• List of terms and definitions

e.g., http://web.stanford.edu/dept/pres-provost/cgi-bin/dg/wordpress/data-governance-and-stewardship-materials/

Page 12: Business Semantics for Data Governance and Stewardship

Thesaurus

• Add homo-, syno-, mero-, hyper- and hyponymous relations
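As a minimal sketch of what "adding relations" to a glossary could look like, the snippet below stores typed links between terms. The class name `Thesaurus`, its methods, and the example terms are all illustrative assumptions; only the relation kinds come from the slide.

```python
from collections import defaultdict

# Relation kinds named on the slide; the terms used below are made-up examples.
RELATIONS = {"synonym", "homonym", "meronym", "hypernym", "hyponym"}


class Thesaurus:
    """A glossary whose terms are linked by typed relations (hypothetical API)."""

    def __init__(self):
        self._links = defaultdict(set)  # (term, relation) -> related terms

    def relate(self, term, relation, other):
        """Record that `other` is a <relation> of `term`."""
        if relation not in RELATIONS:
            raise ValueError(f"unknown relation: {relation}")
        self._links[(term, relation)].add(other)
        # Synonymy and homonymy are symmetric; hypernymy inverts to hyponymy.
        if relation in ("synonym", "homonym"):
            self._links[(other, relation)].add(term)
        elif relation == "hypernym":
            self._links[(other, "hyponym")].add(term)
        elif relation == "hyponym":
            self._links[(other, "hypernym")].add(term)

    def related(self, term, relation):
        """Return the sorted terms linked to `term` by `relation`."""
        return sorted(self._links.get((term, relation), set()))


thesaurus = Thesaurus()
thesaurus.relate("client", "synonym", "customer")
thesaurus.relate("customer", "hypernym", "party")       # party is broader than customer
thesaurus.relate("address", "meronym", "postal code")   # postal code is part of address
```

Storing the inverse link at insert time keeps lookups symmetric, so asking for the hyponyms of "party" finds "customer" without a second declaration.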

Page 13: Business Semantics for Data Governance and Stewardship

Taxonomy

• Formalized representation of a “thesaurus”

• Generalize and specialize properties and relations

– generalize Vendor and Customer with similar properties into Party

– specialize Location into Home Address and Office Address because of different properties

• Classifying a thing as a Term, Data Element or System

– E.g., “customer” vs. “CUST_TBL” vs. “CRM” to determine ownership

• Inheritance-based reasoning such as syllogisms

– Premise: “John Doe” is a lead

– Premise: All leads receive a mortgage offering

– Conclusion: “John Doe” receives a mortgage offering
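The syllogism above can be sketched as a walk up a type hierarchy. This is an illustration under assumptions, not the speaker's tooling: the dictionaries, the function names, and the placement of Lead under Party are hypothetical, while the Vendor/Customer/Party and Location examples follow the slide.

```python
# Hypothetical taxonomy: each type points to its more general parent.
PARENT = {
    "Vendor": "Party",
    "Customer": "Party",
    "Lead": "Party",  # assumption: the slide does not say where Lead sits
    "Home Address": "Location",
    "Office Address": "Location",
}

# Rules attached to a type apply to all of its instances and subtypes.
RULES = {"Lead": ["receives a mortgage offering"]}

# Facts about individuals: instance -> type.
INSTANCE_OF = {"John Doe": "Lead"}


def ancestors(type_name):
    """Yield the type itself, then each more general type up to the root."""
    while type_name is not None:
        yield type_name
        type_name = PARENT.get(type_name)


def conclusions(individual):
    """Collect every rule inherited through the individual's type chain."""
    found = []
    for type_name in ancestors(INSTANCE_OF[individual]):
        found.extend(RULES.get(type_name, []))
    return found


print(conclusions("John Doe"))
```

A rule placed on Party instead of Lead would reach John Doe through the same walk, which is the practical payoff of generalizing Vendor and Customer into Party.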

Page 14: Business Semantics for Data Governance and Stewardship

Frames

Page 15: Business Semantics for Data Governance and Stewardship

Logical constraints

• Modal Logic:

– context determines meaning, truthfulness, validity

– plausibility vs. necessity

• Modalities determine:

– who owns a term per region, process, function

– where and how to enforce terms

– what the definition of a term is
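One way to picture context determining ownership and definition is a lookup that prefers the most specific matching context. The sketch below is an assumption throughout: the `Context` fields mirror the "region, process, function" dimensions on the slide, but the entries, owners, and the fallback order are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Context:
    region: Optional[str] = None
    process: Optional[str] = None
    function: Optional[str] = None


# Hypothetical entries: (term, context) -> (owner, definition).
ENTRIES = {
    ("address", Context()): (
        "Global Data Office", "Location where a party can be reached."),
    ("address", Context(region="BE")): (
        "Belgian Operations", "Street, number, box, postal code, municipality."),
    ("address", Context(region="BE", process="billing")): (
        "Belgian Finance", "Address printed on the invoice."),
}


def resolve(term, ctx):
    """Return the most specific entry compatible with ctx.

    Assumed fallback order: drop function first, then process, then region.
    """
    candidates = [
        ctx,
        Context(region=ctx.region, process=ctx.process),
        Context(region=ctx.region),
        Context(),
    ]
    for candidate in candidates:
        if (term, candidate) in ENTRIES:
            return ENTRIES[(term, candidate)]
    raise KeyError(term)


owner, definition = resolve(
    "address", Context(region="BE", process="billing", function="sales"))
```

A request from a region with no local entry falls through to the global definition, so every term keeps exactly one accountable owner per context.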

Page 16: Business Semantics for Data Governance and Stewardship

Hierarchical Context in ACORD

Page 17: Business Semantics for Data Governance and Stewardship

Multidimensional Context

Page 18: Business Semantics for Data Governance and Stewardship

Overview

• ICT: from Truth to Trust

• The Spectrum of Business Semantics

• Situation Map

• Business Semantics Governance & Stewardship

– Principles

– Operating Framework

• Reflection and Questions

Page 19: Business Semantics for Data Governance and Stewardship

Situating an organization’s level of glossary need

Columns: size; characterizing events; business needs; technology support status

Size: 1 to 50

• Characterizing events: first term-and-condition templates, first products, customers

• Business needs: a catalogue of items like customers, products and offerings

• Technology support status: spreadsheet database

Size: 51 to 100

• Characterizing events:

– first customer segmentation

– lead engine setup

– business functions defined

• Business needs: as the catalogues grow in size, transform loose descriptions and definitions in text files into a glossary of terms

• Technology support status: shared file folders (for lead, prospect, customer, product, offering)

Size: 101 to 500

• Characterizing events:

– business functions populated

– inter-functional business processes develop

– product and customer data volumes grow

• Business needs:

– the need for a thesaurus for comparing glossaries, differentiation of customer types, pricing models, reporting templates

– local data analytics and storage

• Technology support status: spreadsheet, mediawiki, functional processes like salesforce, SDLC, servicenow; forecasting tools, reporting tools, databases

Size: 501 to 1000

• Characterizing events:

– invested growth

– mergers and acquisitions take place

– first signs of corrupt data reports on the board table

• Business needs:

– the need to transform thesauri into taxonomies and data models and architecture frames

– ISO/ACORD/BCBS standardization

• Technology support status: mediawikis go viral without proper alignment between them; first metadata tools in IT to align certain functions, business limited to spreadsheets

Size: 1001 plus

• Characterizing events:

– global operations

– one or more red flags: legal (regulatory compliance breached), organizational (CxO fired), bad reputation (fraud), financial loss (penalties, debt)

• Business needs:

– Reporting standards transformed into corporate data policies and rules and data quality

– Modalities as to who is to define them and how and where to enforce them have been set

– The need for the CDO function is mentioned, but there is resistance from the CIO/CTO

– Big Data opportunities loom beyond the data nebula (screen with universe)

• Technology support status:

– platform with several data management systems (infa, ibm, oracle) scarred by M&A

– Lineage fragmented, not properly validated by business

– data governance organization theorized (or failed before), so no one takes accountability; lack of functional descriptions or enterprise-wide championship

– Glossaries’ usefulness implodes as their numbers increase

– The enterprise data model is common ground for IT but useless to the business. Validation is urgent.
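The map reads as a lookup from organization size to the semantic artifact that becomes necessary. A minimal sketch follows, assuming "size" is a simple count (the slide does not give its unit) and paraphrasing the business-needs column; the names `SITUATION_MAP` and `artifact_needed` are hypothetical.

```python
# Upper bound of each size band and the artifact that band calls for,
# paraphrased from the situation map. The unit of "size" is an assumption.
SITUATION_MAP = [
    (50, "catalogue"),
    (100, "glossary"),
    (500, "thesaurus"),
    (1000, "taxonomy, data models and architecture frames"),
    (float("inf"), "corporate data policies, rules and modalities"),
]


def artifact_needed(size):
    """Return the artifact for the first band whose upper bound covers size."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for upper_bound, artifact in SITUATION_MAP:
        if size <= upper_bound:
            return artifact


print(artifact_needed(75))
```

Keeping the bands in an ordered list rather than in nested conditionals makes it easy to adjust the thresholds, which the slide presents as indicative rather than exact.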

Page 20: Business Semantics for Data Governance and Stewardship

Overview

• ICT: from Truth to Trust

• The Spectrum of Business Semantics

• Situation Map

• Business Semantics Governance & Stewardship

– Principles

– Operating Framework

• Reflection and Questions

Page 21: Business Semantics for Data Governance and Stewardship

Principles of Business Semantics

• Democracy

• Emergence

• Perspective rendering

• Perspective unification

• Validation

http://www.academia.edu/874733/Business_semantics_management_A_case_study_for_competency-centric_HRM

Page 22: Business Semantics for Data Governance and Stewardship

Principles at work in the Situation Map

• Emergence is a continuous principle at work

• Unification and rendering are continuously in flux, but at two different frequencies (business vs. IT)

• Validation is limited to technical lineage

• Democracy and Business Validation (socio-technical) are lacking

• Reactive rather than pro-active governance (defining) and stewardship (enforcing)

• Lack of tools

Page 23: Business Semantics for Data Governance and Stewardship

Overview

• Communication: from Truth to Trust

• The Spectrum of Business Semantics

• Situation Map

• Business Semantics Governance & Stewardship

– Principles

– Operating Framework

• Reflection and Questions

Page 24: Business Semantics for Data Governance and Stewardship

Gradually Build Trust based on Stewardship and Validation

• What?

– Qualitative metadata: e.g., definition for address, codes, mappings, classifications, etc.

• Who?

– Roles and responsibilities for people

• How?

– Collaborative workflows to orchestrate people in achieving high-quality metadata

– Start Simple, Buy-in, Council

– Measure Maturity and Trust

– Separate stewardship from integration

Data Governance Council: Governance Operating Model

[Diagram. Labels that survive in this transcript:

– Building blocks: Roles & Responsibilities; Processes & Workflow; Asset Types & Traceability

– Groups: Data Governance Organization; Data Stewardship Activities; IT / Operational Data Management Activities

– Relationships: Establishes & drives; Aligns & Coordinates; Reports & Escalates; Monitors & Remediates

– Products: Collibra Business Semantics Glossary (BSG); Collibra Reference Data Accelerator (RDA); Collibra Data Stewardship Manager (DSM); Collibra Platform; Other Data Management Vendor products

– Activities: Hierarchy Management; Business & Data Definitions; Business Traceability; Semantic Modeling; Mapping Specifications; Policy Management; Business Rules; Data Quality Rules; Data Quality Reporting; Issue Management; Reference Data Crosswalks; Master Data Stewardship; Data Quality Profiling; DQ Defect Resolution; Data Quality; Development; Data Modeling; Metadata Lineage; Metadata Scanning; Reference Data Authoring; Data Integration]

Page 25: Business Semantics for Data Governance and Stewardship

Example in Health Insurance

http://prezi.com/ve1ws8jmpqcn/workflow/

Page 26: Business Semantics for Data Governance and Stewardship

Global Data Governance

• Objective

– n Enterprise service buses => 1 Global Information Market Place

• Challenges

– Data Service = data sharing agreement across organization silos, policies, regulations, semantic assumptions. E.g., Address

– No clear balance between data ownership and control:

• responsibilities are not set

• for each data point: increasing exposure to risk regarding quality and policy compliance

• Service is more about trust because truth is relative

Page 27: Business Semantics for Data Governance and Stewardship

Solution

Page 28: Business Semantics for Data Governance and Stewardship

Solution

One Global Information Hub

Page 29: Business Semantics for Data Governance and Stewardship

Solution Phase 1 : Jun-Sept

One Global Information Hub

Page 30: Business Semantics for Data Governance and Stewardship

Solution Phase 2 : Oct-Nov

One Global Information Hub

Page 31: Business Semantics for Data Governance and Stewardship

Solution Phase 2 : Oct-Nov

One Global Information Hub

Page 32: Business Semantics for Data Governance and Stewardship

Solution Phase 3 : Dec -

One Global Information Hub

Page 33: Business Semantics for Data Governance and Stewardship

Solution

One Global Information Hub

Page 34: Business Semantics for Data Governance and Stewardship

What is to be governed?

Data Governance Questions

• What does the term “address” mean?

• How is the term “address” represented?

• In what system are data elements on “address” recorded?

• What views does a data sharing agreement include?

• With which policy does my data sharing agreement comply?

• Under which country is my term “address” classified?

• …

Collibra Traceability Paths

Term has attributes definition, description, etc.

Term is represented by Data Element

Data Element has system of record System

Data sharing Agreement groups Data View

Business Term ≠ Data Element

https://compass.collibra.com/display/COOK/Asset+Types+and+Traceability+Requirements
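The traceability paths can be read as typed edges in a small graph, and each governance question as a walk along those edges. The sketch below is a plain-Python illustration, not Collibra's API: the relation names follow the slide, while the concrete assets (`CUST_ADDR`, `CRM`, the agreement and view names) and the `follow` helper are hypothetical.

```python
# Triples (subject, relation, object). Relation names follow the slide;
# the concrete assets are hypothetical examples.
TRIPLES = [
    ("address", "is represented by", "CUST_ADDR"),
    ("CUST_ADDR", "has system of record", "CRM"),
    ("Claims Sharing Agreement", "groups", "Address View"),
]


def follow(start, *relations):
    """Walk a traceability path from start along the given relations."""
    current = {start}
    for relation in relations:
        current = {
            obj
            for subj, rel, obj in TRIPLES
            if subj in current and rel == relation
        }
    return sorted(current)


# "In what system are data elements on 'address' recorded?"
systems = follow("address", "is represented by", "has system of record")

# "What views does a data sharing agreement include?"
views = follow("Claims Sharing Agreement", "groups")
```

Because the business term and the data element are separate nodes, the question about meaning and the question about storage are answered by different hops, which is the point of "Business Term ≠ Data Element".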

Page 35: Business Semantics for Data Governance and Stewardship

Operating Model

Page 36: Business Semantics for Data Governance and Stewardship

Traceability Diagram

Page 37: Business Semantics for Data Governance and Stewardship

Who? RACI

Page 38: Business Semantics for Data Governance and Stewardship

How is it to be governed?

• Status Types and Workflows

– For Domains, Terms, Users, and later for Issues and Data Sharing Agreements

BUSINESS SEMANTICS GLOSSARY

[Diagram: term lifecycle. Entry point: term requested on the domain page. Status types: Candidate, In Progress, Under Review, Accepted, In Revision, Rejected, Deprecated. Transitions are labeled with the workflow numbers 1 to 5 listed below.]

Workflows

1 Propose Business Term

2 Edit Business Term

3 Onboarding Business Term

4 Deprecate Business Term

5 Reactivate Business Term

https://compass.collibra.com/display/COOK/Lifecycle%3A+Workflows+and+Status+Types
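A status lifecycle like this is commonly enforced as a small state machine. The sketch below uses the status names from the slide, but the transition table is an assumption: the arrows of the original diagram are not recoverable from this transcript, so `ALLOWED` and `move` show only the mechanism, not the actual Collibra lifecycle.

```python
# Status names come from the slide. Which status may follow which is an
# assumption made for illustration; the diagram's arrows did not survive.
ALLOWED = {
    "Candidate": {"In Progress", "Rejected"},
    "In Progress": {"Under Review"},
    "Under Review": {"Accepted", "Rejected", "In Progress"},
    "Accepted": {"In Revision", "Deprecated"},
    "In Revision": {"Under Review"},
    "Rejected": set(),
    "Deprecated": {"Candidate"},
}


def move(current, target):
    """Return target if the transition is allowed, else raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move a term from {current} to {target}")
    return target


# Walk one hypothetical term through the assumed lifecycle.
status = "Candidate"
for step in ("In Progress", "Under Review", "Accepted", "Deprecated"):
    status = move(status, step)
```

Rejecting illegal transitions in one place means each workflow (propose, edit, onboard, deprecate, reactivate) only has to name its target status.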

Page 39: Business Semantics for Data Governance and Stewardship

How is it to be governed? Propose Workflow

Page 40: Business Semantics for Data Governance and Stewardship

How is it to be governed? Onboarding Workflow

Page 41: Business Semantics for Data Governance and Stewardship

How is it to be governed? Approval Workflow

Page 42: Business Semantics for Data Governance and Stewardship

Questions for the Audience

We presume the starting point is a glossary.

• Is it possible to establish data governance without a glossary?

• What factors would make it impossible?

• Do you know of cases where it has been achieved without one?