metadata strategies


Upload: dataversity

Post on 13-Jan-2017


TRANSCRIPT

Page 1: Metadata Strategies

Peter Aiken, Ph.D.

Metadata Management Strategies

10124 W. Broad Street, Suite C Glen Allen, Virginia 23060

804.521.4056

Metadata Management Strategies

Copyright 2016 by Data Blueprint

Good systems development often depends on multiple data management disciplines to provide a solid foundation. One of these is metadata. Much of the discussion focuses on understanding metadata itself along with its associated technologies, but this is a typical tool-and-technology focus, which has not achieved significant results to date. A more relevant question when considering pockets of metadata is whether to include them in the scope of organizational metadata practices. By understanding what it means to include items in the scope of your metadata practices, you can begin to build systems that advance data management and the business initiatives it supports with a demonstrable ROI. After some practice in this manner, you can position your organization to better exploit metadata technologies in support of business strategy.

Date: May 10, 2016

Time: 2:00 PM ET/11:00 AM PT Presenter: Peter Aiken, Ph.D.

[Figure: Data at the center, surrounded by Who, What, When, Where, Why, and How]

Page 2: Metadata Strategies

Shannon Kempe

Executive Editor at DATAVERSITY.net

Commonly Asked Questions

1) Will I get copies of the slides after the event?

2) Is this being recorded?

Page 3: Metadata Strategies

Get Social With Us!

• Like us on Facebook: www.facebook.com/datablueprint. Post questions and comments. Find industry news, insightful content and event updates.
• Join the group "Data Management & Business Intelligence". Ask questions, gain insights and collaborate with fellow data management professionals.
• Live Twitter feed: join the conversation! Follow us: @datablueprint, @paiken. Ask questions and submit your comments: #dataed

Peter Aiken, Ph.D.

• 30+ years in data management
• Repeated international recognition
• Founder, Data Blueprint (datablueprint.com)
• Associate Professor of IS (vcu.edu)
• DAMA International (dama.org)
• 9 books and dozens of articles
• Experienced w/ 500+ data management practices
• Multi-year immersions:
  – US DoD (DISA/Army/Marines/DLA)
  – Nokia
  – Deutsche Bank
  – Wells Fargo
  – Walmart
  – …
• DAMA International President 2009-2013
• DAMA International Achievement Award 2001 (with Dr. E. F. "Ted" Codd)
• DAMA International Community Award 2005

Books:
• Monetizing Data Management: Unlocking the Value in Your Organization’s Most Important Asset, by Peter Aiken with Juanita Billings, foreword by John Bottega
• The Case for the Chief Data Officer: Recasting the C-Suite to Leverage Your Most Valuable Asset, by Peter Aiken and Michael Gorman

Page 4: Metadata Strategies

James Q. Wilson

• We are to make the best and sanest use of (our resource) - we must first adopt a sober view of man ('s capabilities) … that permits reasonable things to be accomplished, foolish things abandoned and utopian things forgotten

IBM's AD/Cycle Information Model

• Business Goals Model: Defines the mission of the enterprise, its long-range goals, and the business policies and assumptions that affect its operations.
• Business Rules Model: Records rules that govern the operation of the business and the Business Events that trigger execution of Business Processes.
• Enterprise Structure Model: Defines the scope of the enterprise to be modeled. Assigns a name to the model that serves to qualify each component of the model.
• Extension Support Model: Provides for tactical Information Model extensions to support special tool needs.
• Info Usage Model: Specifies which of the Entity-Relationship Model component instances are used by other Information Model components.
• Global Text Model: Supports recording of extended descriptive text for many of the Information Model components.
• DB2 Model: Refines the definition of a Relational Database design to a DB2-specific design.
• IMS Structures Model: Defines the component structures and elements and the application program views of an IMS Database.
• Flow Model: Specifies which of the Entity-Relationship Model component instances are passed between Process Model components.
• Applications Structure Model: Defines the overall scope of an automated Business Application, the components of the application and how they fit together.
• Data Structures Model: Defines the data structures and their elements used in an automated Business Application.
• Application Build Model: Defines the tools, parameters and environment required to build an automated Business Application.
• Derivations/Constraints Model: Records the rules for deriving legal values for instances of Entity-Relationship Model components, and for controlling the use or existence of E-R instances.
• Entity-Relationship Model: Defines the Business Entities, their properties (attributes) and the relationships they have with other Business Entities.
• Organization/Location Model: Records the organization structure and location definitions for use in describing the enterprise.
• Process Model: Defines Business Processes, their sub processes and components.
• Relational Database Model: Describes the components of a Relational Database design in terms common to all SAA relational DBMSs.
• Test Model: Identifies the various files (test procedures, test cases, etc.) affiliated with an automated Business Application for use in testing that application.
• Library Model: Records the existence of non-repository files and the role they play in defining and building an automated Business Application.
• Panel/Screen Model: Identifies the Panels and Screens and the fields they contain as elements used in an automated Business Application.
• Program Elements Model: Identifies the various pieces and elements of application program source that serve as input to the application build process.
• Value Domain Model: Defines the data characteristics and allowed values for information items.
• Strategy Model: Records business strategies to resolve problems, address goals, and take advantage of business opportunities. It also records the actions and steps to be taken.
• Resource/Problem Model: Identifies the problems and needs of the enterprise, the projects designed to address those needs, and the resources required.

[Figure: diagram of the models listed above]

Page 5: Metadata Strategies

Whither the "data dictionary"?

• The classic "data dictionary" pretty much died by the early '90s ... they ever-so-kindly renamed "data dictionary" to "metadata repository" and then promptly went belly up. Ask pretty much any IBMer today if they've ever heard of AD/Cycle or RepositoryManager... guaranteed response will be a blank stare.
• My calculation says 5% survival rate from 1973 to 2003
• Metadata has morphed into a meaningless buzzword … yet organizations are suffering from unprecedented amounts of new forms of seriously unmanaged metadata.
• I've essentially given up on trying to grok what metadata is other than "required buzzword" (Dave Eddy)

Metadata Management Strategies


1. Data Management Overview

2. What is metadata and why is it important?

3. Major metadata types & subject areas

4. Metadata benefits, application & sources

5. Metadata strategies & implementation

6. Metadata building blocks

7. Guiding Principles

8. Specific teachable example

9. Take Aways, References and Q&A

Tweeting now: #dataed

Page 6: Metadata Strategies

What is data management?

Understanding the current and future data needs of an enterprise and making that data effective and efficient in supporting business activities.
(Aiken, P., Allen, M. D., Parker, B., Mattia, A., "Measuring Data Management's Maturity: A Community's Self-Assessment," IEEE Computer, research feature, April 2007)

Data management practices connect data sources and uses in an organized and efficient manner:
• Engineering
• Storage
• Delivery
• Governance

When executed, engineering, storage, and delivery implement governance.

[Figure: Sources, Data Engineering, Data Storage, Data Delivery, and Uses, with Specialized Team Skills and Data Governance. Note: does not depict data reuse well]

[Figure, second version: the same elements plus Resources (optimized for reuse), Analytic Insight, and Reuses]

Page 7: Metadata Strategies

DMM℠ components and process areas:

• Data Management Strategy: Data Management Goals, Corporate Culture, Data Management Funding, Data Requirements Lifecycle
• Data Governance: Governance Management, Business Glossary, Metadata Management
• Data Quality: Data Quality Framework, Data Quality Assurance
• Data Operations: Standards and Procedures, Data Sourcing
• Platform & Architecture: Architectural Framework, Platforms & Integration
• Supporting Processes: Measurement & Analysis, Process Management, Process Quality Assurance, Risk Management, Configuration Management

DMM℠ Structure of 5 Integrated DM Practice Areas

[Figure: practice areas Data Management Strategy, Data Governance, Data Quality, Data Operations, Platform & Architecture, and Supporting Processes, annotated with: Manage data coherently; Manage data assets professionally; Maintain fit-for-purpose data, efficiently and effectively; Data life cycle management; Data architecture implementation; Organizational support]

Data Management Practices Hierarchy

• Advanced Data Practices: MDM, Mining, Big Data, Analytics, Warehousing, SOA
• Foundational Data Management Practices: Data Platform/Architecture, Data Governance, Data Quality, Data Operations, Data Management Strategy

[Figure: hierarchy of the two practice levels, labeled Technologies and Capabilities]

You can accomplish Advanced Data Practices without becoming proficient in the Foundational Data Management Practices; however, this will:
• Take longer
• Cost more
• Deliver less
• Present greater risk
(with thanks to Tom DeMarco)

Page 8: Metadata Strategies

Data Management Body of Knowledge

Data Management Functions

Metadata Management

from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International

Page 9: Metadata Strategies

2. What is metadata and why is it important?

What is a Strategy?

• Current use derived from military
• "A pattern in a stream of decisions" [Henry Mintzberg]

Page 10: Metadata Strategies

Meta data, Meta-data, or metadata

• In the history of language, whenever two words are pasted together to form a combined concept, a hyphen initially links them
• With the passage of time, the hyphen is lost. The argument can be made that that time has passed
• There is a trademark on the term "metadata," but it has not been enforced
• So, the term is "metadata"

The prefix meta-

1. Situated behind: metacarpus.
2. a. Later in time: metestrus. b. At a later stage of development: metanephros.
3. a. Change; transformation: metachromatism. b. Alternation: metagenesis.
4. a. Beyond; transcending; more comprehensive: metalinguistics. b. At a higher state of development: metazoan.
5. Having undergone metamorphosis: metasomatic.
6. a. Derivative or related chemical substance: metaprotein. b. Of or relating to one of three possible isomers of a benzene ring with two attached chemical groups, in which the carbon atoms with attached groups are separated by one unsubstituted carbon atom: meta-dibromobenzene.

Definition of the prefix meta- (emphasis added; source: American Heritage English Dictionary © 1993 Houghton Mifflin).

Page 11: Metadata Strategies

Definitions

• Metadata is
  – Everywhere in every data management activity and integral to all IT systems and applications.
  – To data what data is to real life. Data reflects real life transactions, events, objects, relationships, etc. Metadata reflects data transactions, events, objects, relations, etc.
  – The data that describe the structure and workings of an organization’s use of information, and which describe the systems it uses to manage that information. [quote from David Hay's book, page 4]
• Data describing various facets of a data asset, for the purpose of improving its usability throughout its life cycle [Gartner 2010]
• Metadata unlocks the value of data, and therefore requires management attention [Gartner 2011]
• Metadata Management is
  – The set of processes that ensure proper creation, storage, integration, and control to support associated use of metadata

Analogy: a library card catalog

• Identifies
  – What books are in the library, and
  – Where they are located
• Search by
  – Subject area
  – Author, or
  – Title
• Catalog shows
  – Author
  – Subject tags
  – Publication date and
  – Revision history
• Determine which books will meet the reader’s requirements
• Without the catalog, finding things is difficult, time consuming and frustrating

from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International

Page 12: Metadata Strategies

Definition (continued)


• Metadata is the card catalog in a managed data environment

• Abstractly, Metadata is the descriptive tags or context on the data (the content) in a managed data environment

• Metadata shows business and technical users where to find information in data repositories

• Metadata provides details on where the data came from, how it got there, any transformations, and its level of quality

• Metadata provides assistance with what the data really means and how to interpret it

from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International

Defining Metadata

Metadata is any combination of any circle and the data in the center that unlocks the value of the data!

[Figure: Data at the center, surrounded by circles labeled Who, What, When, Where, Why, and How. Adapted from Brad Melton]

Page 13: Metadata Strategies

Library Metadata Example

Libraries can operate efficiently through careful use of metadata (Card Catalog)

• Who: Author
• What: Title
• Where: Shelf Location
• When: Publication Date

A small amount of metadata (Card Catalog) unlocks the value of a large amount of data (the Library)

[Figure: Library Book at the center, surrounded by Who, What, When, Where, Why, and How]
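The card catalog idea can be sketched in a few lines of Python. This is only an illustration: the class name `CatalogCard`, the helper functions, and the sample entries are all invented here and do not come from the slides. It shows a small set of Who/What/Where/When fields answering lookup questions without opening any book.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogCard:
    """One card: metadata about a book, not the book itself."""
    author: str            # Who
    title: str             # What
    shelf_location: str    # Where
    publication_year: int  # When


# Hypothetical catalog entries, invented for illustration.
CATALOG = [
    CatalogCard("Ada Author", "Data Basics", "Aisle 3, Shelf B", 2009),
    CatalogCard("Ben Writer", "Modeling Notes", "Aisle 7, Shelf A", 2014),
    CatalogCard("Ada Author", "Catalog Design", "Aisle 3, Shelf C", 2016),
]


def find_by_author(catalog, author):
    """Return the shelf location of every book by the given author."""
    return [card.shelf_location for card in catalog if card.author == author]


def find_by_title(catalog, title):
    """Return the card whose title matches exactly, or None."""
    for card in catalog:
        if card.title == title:
            return card
    return None
```

Calling `find_by_author(CATALOG, "Ada Author")` returns two shelf locations using only the cards, which is the point of the slide: the catalog is small, the library is large.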

Outlook Example

"Outlook" metadata is used to navigate/manage email

• Who: "To" & "From"
• What: "Subject"
• How: "Priority"
• Where: "USERID/Inbox", "USERID/Personal"
• Why: "Body"
• When: "Sent" & "Received"

• Find the important stuff/weed out junk
• Organize for future access/Outlook rules
• Imagine how managing e-mail (already non-trivial) would change if Outlook did not make use of metadata
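As a rough illustration of the same mapping outside Outlook, the sketch below reads the headers of a plain-text message with Python's standard `email` package. The message, the addresses, and the function name `header_metadata` are made up for this example, and `X-Priority` is a common but non-standard header assumed here to stand in for "Priority".

```python
from email import message_from_string

# A made-up message; addresses and text are illustrative only.
RAW_MESSAGE = """\
From: alice@example.com
To: bob@example.com
Subject: Quarterly metadata review
Date: Tue, 10 May 2016 14:00:00 -0400
X-Priority: 1

Body text goes here.
"""


def header_metadata(raw):
    """Map the slide's questions onto message headers."""
    msg = message_from_string(raw)
    return {
        "who": (msg["From"], msg["To"]),
        "what": msg["Subject"],
        "when": msg["Date"],
        "how": msg["X-Priority"],  # assumption: priority carried in X-Priority
    }


metadata = header_metadata(RAW_MESSAGE)
```

Nothing in `header_metadata` reads the body, so sorting, filtering, and filing can all be driven by the headers alone.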

Page 14: Metadata Strategies

Metadata Defined …

• Metadata …
• isn't
• is not a noun
• is more of a verb
• Describes a use of data - not a type of data
• Describes the use of some attributes of data to understand or manage the data from a different (usually higher) level of abstraction

Why Metadata Matters


• They know you rang a phone sex service at 2:24 am and spoke for 18 minutes. But they don't know what you talked about.

• They know you called the suicide prevention hotline from the Golden Gate Bridge. But the topic of the call remains a secret.

• They know you spoke with an HIV testing service, then your doctor, then your health insurance company in the same hour. But they don't know what was discussed.

• They know you received a call from the local NRA office while it was having a campaign against gun legislation, and then called your senators and congressional representatives immediately after. But the content of those calls remains safe from government intrusion.

• They know you called a gynecologist, spoke for a half hour, and then called the local Planned Parenthood's number later that day. But nobody knows what you spoke about.

Source: https://www.eff.org/deeplinks/2013/06/why-metadata-matters

Page 15: Metadata Strategies

A sample data entity and associated metadata

Entity: BED
Data Asset Type: Principal Data Entity
Purpose: This is a substructure within the Room substructure of the Facility Location. It contains information about beds within rooms.
Source: Maintenance Manual for File and Table Data (Software Version 3.0, Release 3.1)
Attributes: Bed.Description, Bed.Status, Bed.Sex.To.Be.Assigned, Bed.Reserve.Reason
Associations: >0-+ Room
Status: Validated

The entity's metadata includes:
• A purpose statement describing why the organization is maintaining information about this business concept;
• Sources of information about it;
• A partial list of the attributes or characteristics of the entity; and
• Associations with other data items; this one is read as "One room contains zero or many beds."
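One way to hold a record like the BED example in code is a small dataclass. This is a sketch under assumptions: the class `EntityMetadata`, its field names, the "Draft" default, and the `unvalidated` helper are hypothetical and chosen only to mirror the labels on the slide.

```python
from dataclasses import dataclass, field


@dataclass
class EntityMetadata:
    """A metadata record describing one data entity."""
    name: str
    asset_type: str
    purpose: str
    source: str
    attributes: list = field(default_factory=list)
    associations: list = field(default_factory=list)
    status: str = "Draft"  # hypothetical default, not from the slide


bed = EntityMetadata(
    name="BED",
    asset_type="Principal Data Entity",
    purpose=("A substructure within the Room substructure of the Facility "
             "Location. It contains information about beds within rooms."),
    source=("Maintenance Manual for File and Table Data "
            "(Software Version 3.0, Release 3.1)"),
    attributes=["Bed.Description", "Bed.Status",
                "Bed.Sex.To.Be.Assigned", "Bed.Reserve.Reason"],
    associations=["One room contains zero or many beds"],
    status="Validated",
)


def unvalidated(entities):
    """Names of entities whose metadata has not been validated."""
    return [e.name for e in entities if e.status != "Validated"]
```

Keeping the status alongside the description makes it easy to list entities that still need review, for example with `unvalidated([bed])`.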


Page 16: Metadata Strategies

3. Major metadata types & subject areas

Types of Metadata: Process Metadata

• Process Metadata is...
  – Data that defines and describes the characteristics of other system elements, e.g. processes, business rules, programs, jobs, tools, etc.
• Examples of Process metadata:
  – Data stores and data involved
  – Government/regulatory bodies
  – Organization owners and stakeholders
  – Process dependencies and decomposition
  – Process feedback loop and documentation
  – Process name

from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International

Page 17: Metadata Strategies

Business Process Metadata

• Who: Created the documentation?
• What: Are the important dependencies among the processes?
• How: Do the business processes interact with each other?

[Figure: "Email Message" at the center, surrounded by Who, What, When, Where, Why, and How]

Types of Metadata: Business Metadata


• Business Metadata describe to the end user what data are available, what they mean and how to retrieve them.

• Included are:

– Business names and definitions of subject and concept areas, entities, attributes

– Attribute data types and other attribute properties

– Range descriptions, calculations, algorithms and business rules

– Valid domain values and their definitions

from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
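A minimal sketch of business metadata in use, assuming a single coded attribute: the dictionary `BED_STATUS`, its definition text, its code values, and the function `explain` are all hypothetical and not taken from the DAMA guide or the slides. It shows valid domain values and their definitions turning a stored code into something an end user can read.

```python
# Hypothetical business metadata for one attribute.
BED_STATUS = {
    "business_name": "Bed Status",
    "definition": "Whether a bed can currently be assigned.",
    "data_type": "code",
    "valid_values": {
        "A": "Available",
        "O": "Occupied",
        "R": "Reserved",
    },
}


def explain(value, attribute):
    """Translate a stored code into its business meaning, if it is valid."""
    meaning = attribute["valid_values"].get(value)
    if meaning is None:
        return f"{value!r} is not a valid {attribute['business_name']}"
    return f"{attribute['business_name']}: {meaning}"
```

The same dictionary serves two purposes here: it documents the attribute for people and it drives validation for programs.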

Page 18: Metadata Strategies

Types of Metadata: Technical & Operational Metadata

• Technical and operational metadata provides developers and technical users with information about their systems
• Technical metadata includes…
  – Physical database table and column names, column properties, other database object properties and database storage
• Operational metadata is targeted at IT operations users’ needs, including…
  – Information about data movement, source and target systems, batch programs, job frequency, schedule anomalies, recovery and backup information, archive rules and usage
• Examples of Technical & Operational metadata:
  – Audit controls and balancing information
  – Data archiving and retention rules
  – Encoding/reference table conversions
  – History of extracts and results

from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
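Technical metadata such as table and column names can usually be read from the database's own catalog. The sketch below uses SQLite through Python's standard `sqlite3` module purely as a convenient stand-in; the `bed` table and the helper names are assumptions made for this example, and other database systems expose their catalogs differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table, loosely modeled on a bed/room example.
conn.execute(
    "CREATE TABLE bed ("
    " bed_id INTEGER PRIMARY KEY,"
    " description TEXT NOT NULL,"
    " status TEXT)"
)


def table_names(connection):
    """List the tables recorded in SQLite's own catalog."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


def column_metadata(connection, table):
    """Return (column name, declared type, not-null flag) per column."""
    # PRAGMA table_info yields: cid, name, type, notnull, dflt_value, pk.
    # The table name is interpolated directly, so pass trusted names only.
    rows = connection.execute(f"PRAGMA table_info({table})")
    return [(name, col_type, bool(notnull))
            for _cid, name, col_type, notnull, _default, _pk in rows]


tables = table_names(conn)
columns = column_metadata(conn, "bed")
```

Both functions query metadata only; no row of business data is read.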

Types of Metadata: Data Stewardship

• Data stewardship metadata is about...
  – Data stewards, stewardship processes, and responsibility assignments
• Data stewards…
  – Assure that data and Metadata are accurate, with high quality across the enterprise.
  – Establish and monitor data sharing.
• Examples of Data stewardship metadata:
  – Business drivers/goals
  – Data CRUD rules
  – Data definitions – business and technical
  – Data owners
  – Data sharing rules and agreements/contracts
  – Data stewards, roles and responsibilities

from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International

Page 19: Metadata Strategies

Types of Metadata: Provenance

• Provenance:
  – "the history of ownership of a valued object or work of art or literature" [Merriam Webster]
  – For each datum, this is the description of:
    • Its source (system or person or department),
    • Any derivation used, and
    • The date it was created.
  – Examples of Data Provenance:
    • The programs or processes by which it was created
    • Its owner
    • The steward responsible for its quality
    • Other roles and responsibilities
    • Rules for sharing it

from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
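The three items listed for each datum (source, derivation, creation date) can be attached to a value as a small record. The sketch below is illustrative only: the classes `Provenance` and `Datum`, the `describe` helper, and the sample payroll figures are invented for this example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Provenance:
    """Provenance for a single datum, following the slide's three items."""
    source: str       # system, person, or department
    derivation: str   # any derivation used ("" if none)
    created_on: date  # the date it was created


@dataclass(frozen=True)
class Datum:
    name: str
    value: float
    provenance: Provenance


# Hypothetical values for illustration only.
net_pay = Datum(
    name="net_pay",
    value=2150.0,
    provenance=Provenance(
        source="Payroll system",
        derivation="gross_pay - deductions",
        created_on=date(2016, 5, 10),
    ),
)


def describe(datum):
    """Render a one-line provenance statement for a datum."""
    p = datum.provenance
    how = f" via '{p.derivation}'" if p.derivation else ""
    return f"{datum.name}: from {p.source}{how}, created {p.created_on.isoformat()}"
```

Owner, steward, and sharing rules from the examples list could be added as further fields in the same way.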

Metadata Subject Areas

Subject areas and their components:

1) Business Analytics: Data definitions, reports, users, usage, performance
2) Business Architecture: Roles and organizations, goals and objectives
3) Business Definitions: Business terms and explanations for a particular concept, fact, or other item found in an organization
4) Business Rules: Standard calculations and derivation methods
5) Data Governance: Policies, standards, procedures, programs, roles, organizations, stewardship assignments
6) Data Integration: Sources, targets, transformations, lineage, ETL workflows, EAI, EII, migration/conversion
7) Data Quality: Defects, metrics, ratings
8) Document Content Management: Unstructured data, documents, taxonomies, ontologies, name sets, legal discovery, search engine indexes

from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International

Page 20: Metadata Strategies

Metadata Subject Areas, continued

Subject areas and their components:

9) Information Technology Infrastructure: Platforms, networks, configurations, licenses
10) Conceptual Data Models: Entities, attributes, relationships and rules, business names and definitions
11) Logical Data Models: Files, tables, columns, views, business definitions, indexes, usage, performance, change management
12) Process Models: Functions, activities, roles, inputs/outputs, workflow, timing, stores
13) Systems Portfolio and IT Governance: Databases, applications, projects, and programs, integration roadmap, change management
14) Service-oriented Architecture (SOA) information: Components, services, messages, master data
15) System Design and Development: Requirements, designs and test plans, impact
16) Systems Management: Data security, licenses, configuration, reliability, service levels

from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International

4. Metadata benefits, application & sources

Page 21: Metadata Strategies

7 Metadata Benefits

1. Increase the value of strategic information (e.g. data warehousing, CRM, SCM, etc.) by providing context for the data, thus aiding analysts in making more effective decisions.

2. Reduce training costs and lower the impact of staff turnover through thorough documentation of data context, history, and origin.

3. Reduce data-oriented research time by assisting business analysts in finding the information they need in a timely manner.

4. Improve communication by bridging the gap between business users and IT professionals, leveraging work done by other teams and increasing confidence in IT system data.

5. Increase the speed of system development’s time-to-market by reducing system development life-cycle time.

6. Reduce risk of project failure through better impact analysis at various levels during change management.

7. Identify and reduce redundant data and processes, thereby reducing rework and use of redundant, out-of-date, or incorrect data.

from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International

Metadata for Unstructured Data

• Unstructured data
  – Any data that is not in a database or data file, including documents or other media data
• Metadata describes both structured and unstructured data
• Metadata for unstructured data exists in many formats, responding to a variety of different requirements
• Examples of Metadata repositories describing unstructured data:
  – Content management applications
  – University websites
  – Company intranet sites
  – Data archives
  – Electronic journals collections
  – Community resource lists
• A common method for classifying Metadata in unstructured sources is to describe them as descriptive metadata, structural metadata, or administrative metadata

from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International

Page 22: Metadata Strategies

Metadata for Unstructured Data: Examples

• Examples of descriptive metadata:
  – Catalog information
  – Thesauri keyword terms
• Examples of structural metadata
  – Dublin Core
  – Field structures
  – Format (audio/visual, booklet)
  – Thesauri keyword labels
  – XML schemas
• Examples of administrative metadata
  – Source(s)
  – Integration/update schedule
  – Access rights
  – Page relationships (e.g. site navigational design)
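To make the three classes concrete, the sketch below groups the metadata of a single intranet page by class. The page, its field names, its values, and the `class_of` helper are hypothetical; only the three class names come from the slides.

```python
# Hypothetical metadata for one intranet page, grouped by the three classes.
page_metadata = {
    "descriptive": {
        "title": "Travel Policy",
        "keywords": ["travel", "expenses"],
    },
    "structural": {
        "format": "text/html",
        "field_structure": ["heading", "body", "footer"],
    },
    "administrative": {
        "source": "HR department",
        "update_schedule": "quarterly",
        "access_rights": "employees",
    },
}


def class_of(field_name, metadata):
    """Return which metadata class a field belongs to, or None if absent."""
    for metadata_class, fields in metadata.items():
        if field_name in fields:
            return metadata_class
    return None
```

For example, `class_of("access_rights", page_metadata)` returns `"administrative"`.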

Specific Example

• Four metadata sources:
  1. Existing reference models (e.g., ADRM)
  2. Conceptual model created two years ago
  3. Existing systems (to be reverse engineered)
  4. Enterprise data model

from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International

Page 23: Metadata Strategies

The Real Value of Metadata


5. Metadata strategies & implementation

Page 24: Metadata Strategies

Metadata Strategy Implementation Phases

Investing in Metadata?

• How can IT staff convince managers to plan, budget, and apply resources for metadata management?
• What is metadata and why is it important?
• What technologies are involved?
• Internet and intranet technologies are part of the answer and will get the immediate attention of management.

Page 25: Metadata Strategies

Metadata Strategy

• Metadata Strategy is
  – A statement of direction in Metadata management by the enterprise
  – A statement of intent that acts as a reference framework for the development teams
  – Driven by business objectives and prioritized by the business value they bring to the organization
• Build a Metadata strategy from a set of defined components
• Primary focus of Metadata strategy
  – Gain an understanding of and consensus on the organization’s key business drivers, issues, and information requirements for the enterprise Metadata program
• Need to understand how well the current environment meets these requirements now and in the future
• Metadata strategy objectives define the organization’s future enterprise metadata architecture and recommend logical progression of phased implementation steps
• Only 1 in 10 organizations has a documented, board approved data strategy

Keep the proper focus

• Wrong question:
  – Is this metadata?
• Right question:
  – Should we include this data item within the scope of our metadata practices?

Page 26: Metadata Strategies

Metadata Practices will be inextricably intertwined with Data Quality, Master Data, and Knowledge Management (among other functions)

[Figure labels: Extraction Sources; Operational Data; Routine Data Scans; Data Quality Engineering; Suspected/Identified Data Quality Problems; Improved Quality Data; Master Data Management Practices; Data that might benefit from Master Management; Master Data Catalogs; Knowledge Management Practices; Organized Knowledge 'Data'; Data Organization Practices]

Metadata Strategy - Reduce the Number of Systems

[Figure: six existing systems, System 1 through System 6.]

Page 27: Metadata Strategies

Metadata Strategy - Reduce the Number of Systems

[Figure: existing Systems 1 through 6 and a new system. Labels: Existing; New; Transformations Data Store; Generated Programs; System-to-System Program Transformation Knowledge; Transformations.]

FTI Metadata Model


Page 28: Metadata Strategies

Build Your Own Metadata Repository


Sample Low-tech Repository

Page 29: Metadata Strategies

Metadata Engineering Analyses Steps Illustration

[Figure: Metadata Engineering Solution. Labels: New System; Legacy System #1: Payroll; Legacy System #2: Personnel; Data mapping, quality, & transformation (shown for each legacy system).]

1. Determine what type of metadata to derive for specific system implementation requirements.

– Implementation needs indicate a requirement to understand the workflow metadata by obtaining a list of all combinations of stepname instances along with each associated component, business process, and home page.

Page 30: Metadata Strategies


2. Formulate a query designed to extract specific metadata.

– Using SQL, extract from the system all unique combinations of homepages, processes, components, and step names, resulting in a 13,044-line report.
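
The extraction query in step 2 can be sketched as follows. This is a minimal illustration, not the query used in the project: the table name `workflow_steps`, its four columns, and the sample rows are all assumptions, since the slides do not show the source schema. It uses Python's built-in `sqlite3` module so that it runs without a database server.

```python
import sqlite3

# Hypothetical stand-in for the system's workflow tables; the real schema
# is not given in the slides.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE workflow_steps "
    "(homepage TEXT, process TEXT, component TEXT, stepname TEXT)"
)
conn.executemany(
    "INSERT INTO workflow_steps VALUES (?, ?, ?, ?)",
    [
        ("Administer Workforce", "Hire Employee", "Personal Data", "Enter Name"),
        ("Administer Workforce", "Hire Employee", "Personal Data", "Enter Name"),  # duplicate
        ("Administer Workforce", "Hire Employee", "Job Data", "Assign Position"),
        ("Compensate Employees", "Run Payroll", "Pay Sheets", "Create Paysheet"),
    ],
)

# SELECT DISTINCT collapses repeated rows, leaving one row per unique combination.
EXTRACT_QUERY = """
    SELECT DISTINCT homepage, process, component, stepname
    FROM workflow_steps
    ORDER BY homepage, process, component, stepname
"""
unique_combinations = conn.execute(EXTRACT_QUERY).fetchall()

for row in unique_combinations:
    print(" | ".join(row))
print(len(unique_combinations), "unique combinations")
```

Against the real system the same `SELECT DISTINCT` shape would produce the full report; only the table and column names would differ.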


3. Export the extracted metadata to a spreadsheet for subsequent complexity analysis and validation.

– Saving the query results in .xls format permits further statistical analysis of the metadata to determine its relationships, complexity, and occurrence frequency, and so to confirm that the extracted metadata is correct.
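
The export and analysis in step 3 might look like the sketch below. It assumes the extract is a list of (homepage, process, component, stepname) tuples with made-up values, and it writes CSV text rather than .xls because CSV needs only the standard library and opens in any spreadsheet tool.

```python
import csv
import io
from collections import Counter

# Hypothetical extract rows: (homepage, process, component, stepname).
rows = [
    ("Administer Workforce", "Hire Employee", "Job Data", "Assign Position"),
    ("Administer Workforce", "Hire Employee", "Personal Data", "Enter Name"),
    ("Administer Workforce", "Terminate Employee", "Job Data", "Enter End Date"),
    ("Compensate Employees", "Run Payroll", "Pay Sheets", "Create Paysheet"),
]

# Write CSV text with a header row; a real export would write to a file.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["homepage", "process", "component", "stepname"])
writer.writerows(rows)
csv_text = buffer.getvalue()

# Occurrence frequency: how many step combinations fall under each homepage.
steps_per_homepage = Counter(homepage for homepage, _, _, _ in rows)

# Relationship check: a component used by several processes is a shared one.
processes_per_component = {}
for _, process, component, _ in rows:
    processes_per_component.setdefault(component, set()).add(process)
shared_components = sorted(
    name for name, procs in processes_per_component.items() if len(procs) > 1
)

print(steps_per_homepage.most_common())
print(shared_components)
```

Counts that look implausible (a homepage with no steps, a component tied to every process) are the kind of signal that would prompt a second look at the extraction query.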

Page 31: Metadata Strategies

Example Query Outputs


4. Import the validated metadata into the Repository

– Once validated, the process metadata is moved into the MS-Access database as a new stand-alone table.

Page 32: Metadata Strategies


5. Integrate the new metadata with the existing metadata.

– The new table is formally associated with the existing metadata, linking processes to menugroups via homepages and stepnames to menubars and panels via menuitems.
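
Steps 4 and 5 can be sketched as a stand-alone table joined to existing repository tables. Everything named here is an assumption for illustration: the tables `menugroups`, `menuitems`, and `process_steps`, their columns, the sample values, and in particular the join keys (homepage name, and stepname matching menuitem name). SQLite stands in for the MS-Access repository mentioned in the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Existing repository metadata (hypothetical columns).
    CREATE TABLE menugroups (menugroup TEXT, homepage TEXT);
    CREATE TABLE menuitems  (menuitem TEXT, menubar TEXT, panel TEXT);
    -- Newly imported stand-alone table of process metadata.
    CREATE TABLE process_steps (homepage TEXT, process TEXT, stepname TEXT);

    INSERT INTO menugroups VALUES ('Administer Workforce Menu', 'Administer Workforce');
    INSERT INTO menuitems  VALUES ('Enter Name', 'Use', 'PERSONAL_DATA1');
    INSERT INTO process_steps VALUES ('Administer Workforce', 'Hire Employee', 'Enter Name');
""")

# Assumption: homepage names match exactly, and each stepname equals a menuitem name.
linked = conn.execute("""
    SELECT s.process, g.menugroup, s.stepname, i.menubar, i.panel
    FROM process_steps AS s
    JOIN menugroups AS g ON g.homepage = s.homepage
    JOIN menuitems  AS i ON i.menuitem = s.stepname
""").fetchall()

print(linked)
```

Once the joins hold, a question such as "which panels does this business process touch?" becomes a single query against the repository.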


6. Provide the resulting, richer metadata to the requesting user, verifying the metadata and its utility addressing the implementation need.

– Enhance Repository reporting capabilities, publish the next version, and work directly with the user group requesting the metadata in order to ensure that the metadata is accurate and that it meets their needs.

Page 33: Metadata Strategies


• Last step usually resulted in additional requests for metadata

• Cycle begins again

• Extraction was repeated with many analysis variations
– changing the source of the analysis inputs to include different
  • system objects
  • system documentation
  • metadata

• Report on associations among any set of system objects

• Define and document metadata and relate these to other metadata

Metadata Uses: Requirements

• Systematically determine the requirements that the PeopleSoft enterprise software could meet

• Document discrepancies between system capabilities and organizational needs

• Panels presented to users in JAD-like sessions that were organized using system structure metadata

• Functional users determined and certified the overall system functionality

• Associating requirements with components

• Discrepancies were noted for subsequent investigation and resolution

Page 34: Metadata Strategies

Metadata Uses: System Changes ...

• Evaluated proposed system changes, modifications, and enhancements

• Metadata types used to assess the magnitude of proposed changes

• For example: what is the number of panels requiring modification if a given field length is doubled?

• Analyze the costs of changing the system versus changing the organizational processes
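
The field-length question above is a lookup over panel-to-field metadata. The sketch below assumes a simple mapping from panel name to the fields it displays; the panel and field names are invented for illustration and are not taken from the project.

```python
# Hypothetical panel-to-field metadata: panel name -> fields displayed on it.
panel_fields = {
    "PERSONAL_DATA1": ["EMPLID", "NAME", "ADDRESS1"],
    "JOB_DATA1": ["EMPLID", "POSITION_NBR"],
    "PAYSHEET": ["EMPLID", "NAME", "PAY_RATE"],
    "LOCATION_TBL": ["LOCATION", "ADDRESS1"],
}


def panels_affected_by(field_name, panel_fields):
    """Return the sorted names of panels that display the given field."""
    return sorted(
        panel for panel, fields in panel_fields.items() if field_name in fields
    )


affected = panels_affected_by("NAME", panel_fields)
print(f"{len(affected)} of {len(panel_fields)} panels display NAME: {affected}")
```

The count of affected panels gives a first-order estimate of the size of a proposed change, which can then be weighed against the cost of changing the organizational process instead.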


Metadata Uses: Practice Analysis

• Identify gaps between the DP&T/DOA business requirements and PeopleSoft

• Process components were mapped to user activities and workgroup practices

• Users focused their attention on relevant portions of the system

• For example, the payroll clerks accessed the metadata to determine which panels 'belonged' to them.


Page 35: Metadata Strategies

Metadata Uses: Realignment

• Realignment addressed gaps between functionality and existing work practices

• Once users understood the system’s functionality and could navigate through process component steps, they compared the system’s inputs and outputs with their own information needs

• If gaps existed, metadata was used to assess the relative magnitude of proposed changes

• Forecast system customization costs

• Evidence for changing the business practice instead of the system

Metadata Uses: Training

• Training specialists used mappings to determine relevant combinations of panels, menuitems, and menubars

• Display panels in the sequence expected by the system users

• Users were able to swiftly become familiar with their 'areas'

• Screen session recording and playback capabilities


Page 36: Metadata Strategies

Metadata Uses: Additional Metadata

• Metadata describing LS1 & LS2

• Metadata supporting data conversion
– initial motivation for the metadata development
– each decision to convert a data item was recorded, permitting the tracking of the number of data items that had been mapped, converted, and to what they had been converted

• Associations with system batch reporting programs called SQRs

• User and user type metadata
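
Recording each conversion decision, as described above, amounts to keeping one record per legacy data item. The sketch below is one possible shape for such a record; the class name, field names, and sample items are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConversionDecision:
    source_system: str          # e.g. "LS1" or "LS2"
    source_item: str            # legacy data item name (hypothetical)
    target_item: Optional[str]  # new-system item, or None if not yet mapped
    converted: bool = False


decisions = [
    ConversionDecision("LS1", "EMP-NO", "EMPLID", converted=True),
    ConversionDecision("LS1", "PAY-GRADE", "GRADE"),
    ConversionDecision("LS2", "HIRE-DT", None),
]

# Tally how many items are mapped, how many converted, and to what.
mapped = [d for d in decisions if d.target_item is not None]
converted = [d for d in mapped if d.converted]
targets = {d.source_item: d.target_item for d in converted}

print(f"{len(mapped)} mapped, {len(converted)} converted of {len(decisions)} items")
print(targets)
```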


Metadata Uses: Database Design

• CASE tool integrated to extract the database design information directly from the physical database

• Integrated into TheMAT

• Decomposition of the physical database into logical user views

• Document how user requirements were implemented by the system

• Planning security access levels and privileges

Page 37: Metadata Strategies

Metadata Uses: Statistical Analysis

• Guiding metadata-based data integration from the two legacy systems

• For example, the ERD information was used to map the legacy system data into PeopleSoft data structures

• Statistical summaries described the new system to users

Metadata Uses

[Figure: statistical summary bar charts, axis 0 to 2500. One group of labels, shown with the label Administer Workforce: Recruit Workforce (62%); Manage Competencies (20%); Plan Successions (~5%); Administer Training (~5%); Plan Careers (~5%); Manage Positions (2%). Second group of labels: Develop Workforce (29.9%); Administer Workforce (28.8%); Compensate Employees (23.7%); Monitor Workplace (8.1%); Define Business (4.4%); Target System (3.9%); EDI Manager (.9%); Target System Tools (.3%).]

Page 38: Metadata Strategies

Comparing Mapping Approaches

[Figure: Legacy System mapped to New System, two approaches. Many to many: analysis is unfocused. One to many and many to one: analysis is structured, with the form of the problem dictating the form of the solution.]

Marco & Jennings's Complete Meta Data Model

Source: http://dmreview.com/article_sub.cfm?articleID=1000941 used with permission

Page 39: Metadata Strategies

Metadata Management Strategies


1. Data Management Overview

2. What is metadata and why is it important?

3. Major metadata types & subject areas

4. Metadata benefits, application & sources

5. Metadata strategies & implementation

6. Metadata building blocks

7. Guiding Principles

8. Specific teachable example

9. Take Aways, References and Q&A

Tweeting now: #dataed

Goals and Principles


from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International

• Provide organizational understanding of terms and usage

• Integrate Metadata from diverse sources

• Provide easy, integrated access to metadata

• Ensure Metadata quality and security

Page 40: Metadata Strategies

Activities


• Understand Metadata requirements

• Define the Metadata architecture

• Develop and maintain Metadata standards

• Implement a managed Metadata environment

• Create and maintain metadata

• Integrate metadata

• Manage Metadata repositories

• Distribute and deliver metadata

• Query, report and analyze metadata

from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International

Activities: Metadata Standards Types

• Two major types:
– Industry or consensus standards
– International standards

• High level framework can show
– How standards are related
– How they rely on each other for context and usage

from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International

Page 41: Metadata Strategies

Activities: Noteworthy Metadata Standards Types

• Common Warehouse Metamodel (CWM):

• Specifies the interchange of Metadata among data warehousing, BI, KM, and portal technologies.

• Based on UML and depends on it to represent object-oriented data constructs.

[Figure: The CWM Metamodel. Layers: Management (Warehouse Process, Warehouse Operation); Analysis (Transformation, OLAP, Data Mining, Information Visualization, Business Nomenclature); Resource (Object Model, Relational, Record, Multidimensional, XML); Foundation (Business Information, Data Types, Expression, Keys and Indexes, Type Mapping, Software Deployment); Object Model.]

Information Management Metamodel (IMM)

• Object Management Group Project to replace CWM

• Concerned with:
– Business Modeling
  • Entity/relationship metamodel
– Technology modeling
  • Relational Databases
  • XML
  • LDAP
– Model Management
  • Traceability
– Compatibility with related models
  • Semantics of business vocabulary and business rules
  • Ontology Definition Metamodel

• Based on Core model

• Used to translate from one model to another

Page 42: Metadata Strategies

Primary Deliverables


• Metadata repositories

• Quality metadata

• Metadata analysis

• Data lineage

• Change impact analysis

• Metadata control procedures

• Metadata models and architecture

• Metadata management operational analysis

from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International

Roles and Responsibilities


• Suppliers:

– Data Stewards

– Data Architects

– Data Modelers

– Database Administrators

– Other Data Professionals

– Data Brokers

– Government and Industry Regulators

• Participants:

– Metadata Specialists

– Data Integration Architects

– Data Stewards

– Data Architects and Modelers

– Database Administrators

– Other DM Professionals

– Other IT Professionals

– DM Executives

– Business Users

• Consumers:

– Data Stewards

– Data Professionals

– Other IT Professionals

– Knowledge Workers

– Managers and Executives

– Customers and Collaborators

– Business Users

from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International

Page 43: Metadata Strategies

Technology

• Metadata repositories
• Data modeling tools
• Database management systems
• Data integration tools
• Business intelligence tools
• System management tools
• Object modeling tools
• Process modeling tools
• Report generating tools
• Data quality tools
• Data development and administration tools
• Reference and master data management tools

from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International


Page 44: Metadata Strategies

15 Guiding Principles


1. Establish and maintain a Metadata strategy and appropriate policies, especially clear goals and objectives for Metadata management and usage

2. Secure sustained commitment, funding, and vocal support from senior management concerning Metadata management for the enterprise

3. Take an enterprise perspective to ensure future extensibility, but implement through iterative and incremental delivery

4. Develop a Metadata strategy before evaluating, purchasing, and installing Metadata management products

5. Create or adopt Metadata standards to ensure interoperability of Metadata across the enterprise

6. Ensure effective Metadata acquisition for internal and external metadata

7. Maximize user access since a solution that is not accessed or is under-accessed will not show business value

from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International

15 Guiding Principles, continued

8. Understand and communicate the necessity of Metadata and the purpose of each type of metadata; socialization of the value of Metadata will encourage business usage

9. Measure content and usage

10. Leverage XML, messaging and web services

11. Establish and maintain enterprise-wide business involvement in data stewardship, assigning accountability for metadata

12. Define and monitor procedures and processes to ensure correct policy implementation

13. Include a focus on roles, staffing, standards, procedures, training, & metrics

14. Provide dedicated Metadata experts to the project and beyond

15. Certify Metadata quality

from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International

Page 45: Metadata Strategies

Who is Joan Smith?


Select an Attribute to get a list of values

Double-click a value to see rows with that value

Page 46: Metadata Strategies


Example: iTunes Metadata

• Insert a recently purchased CD

• iTunes can:
– Count the number of tracks (25)
– Determine the length of each track

Page 47: Metadata Strategies

Example: iTunes Metadata

• When connected to the Internet, iTunes connects to the Gracenote(.com) Media Database and retrieves:
– CD Name
– Artist
– Track Names
– Genre
– Artwork

• Sure would be a pain to type in all this information

Example: iTunes Metadata

• To organize iTunes
– I create a "New Smart Playlist" for Artists containing "Miles Davis"

Page 48: Metadata Strategies

Example: iTunes Metadata

• Notice I didn't get the desired results

• I already had another Miles Davis recording in iTunes

• Must fine-tune the request to get the desired results
– Album contains "The Complete Birth of the Cool"

• Now I can move the playlist "Miles Davis" to a folder
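
The smart playlist behaves like a filter over track metadata, and refining it means adding a second rule. The sketch below is not how iTunes is implemented; the `smart_playlist` function and the small track list are assumptions made to show the idea of "field contains text" rules combined with AND.

```python
# Hypothetical track metadata, as a media library might store it.
tracks = [
    {"name": "Move", "artist": "Miles Davis", "album": "The Complete Birth of the Cool"},
    {"name": "Jeru", "artist": "Miles Davis", "album": "The Complete Birth of the Cool"},
    {"name": "So What", "artist": "Miles Davis", "album": "Kind of Blue"},
    {"name": "Giant Steps", "artist": "John Coltrane", "album": "Giant Steps"},
]


def smart_playlist(tracks, **contains):
    """Keep tracks where every named field contains its text, ignoring case."""
    return [
        t for t in tracks
        if all(text.lower() in t[field].lower() for field, text in contains.items())
    ]


# First attempt: the artist rule alone also matches the other recording.
by_artist = smart_playlist(tracks, artist="Miles Davis")

# Refined: adding an album rule narrows the result to the new CD.
refined = smart_playlist(
    tracks, artist="Miles Davis", album="The complete birth of the cool"
)

print([t["name"] for t in by_artist])
print([t["name"] for t in refined])
```

The playlist is defined by rules over metadata rather than by a fixed list of tracks, so it stays correct as the library grows.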


Example: iTunes Metadata

• The same:
– Interface
– Processing
– Data Structures

• are applied to
– Podcasts
– Movies
– Books
– .pdf files

• Economies of scale are enormous

Page 49: Metadata Strategies


Metadata Take Aways

• Metadata unlocks the value of data, and therefore requires management attention [Gartner 2011]

• Metadata is the language of data governance

• Metadata defines the essence of integration challenges

[Figure: Metadata Practices. Labels: Sources; Uses; Metadata Governance; Metadata Engineering; Metadata Delivery; Metadata Storage; Specialized Team Skills.]

Page 50: Metadata Strategies

Metadata Management Summary

[Figure: Data Management Functions, Data Management Body of Knowledge.]

from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International

Page 51: Metadata Strategies

IBM's AD/Cycle Information Model

• Business Goals Model: Defines the mission of the enterprise, its long-range goals, and the business policies and assumptions that affect its operations.
• Business Rules Model: Records rules that govern the operation of the business and the Business Events that trigger execution of Business Processes.
• Enterprise Structure Model: Defines the scope of the enterprise to be modeled. Assigns a name to the model that serves to qualify each component of the model.
• Extension Support Model: Provides for tactical Information Model extensions to support special tool needs.
• Info Usage Model: Specifies which of the Entity-Relationship Model component instances are used by other Information Model components.
• Global Text Model: Supports recording of extended descriptive text for many of the Information Model components.
• DB2 Model: Refines the definition of a Relational Database design to a DB2-specific design.
• IMS Structures Model: Defines the component structures and elements and the application program views of an IMS Database.
• Flow Model: Specifies which of the Entity-Relationship Model component instances are passed between Process Model components.
• Applications Structure Model: Defines the overall scope of an automated Business Application, the components of the application and how they fit together.
• Data Structures Model: Defines the data structures and their elements used in an automated Business Application.
• Application Build Model: Defines the tools, parameters and environment required to build an automated Business Application.
• Derivations/Constraints Model: Records the rules for deriving legal values for instances of Entity-Relationship Model components, and for controlling the use or existence of E-R instances.
• Entity-Relationship Model: Defines the Business Entities, their properties (attributes) and the relationships they have with other Business Entities.
• Organization/Location Model: Records the organization structure and location definitions for use in describing the enterprise.
• Process Model: Defines Business Processes, their sub processes and components.
• Relational Database Model: Describes the components of a Relational Database design in terms common to all SAA relational DBMSs.
• Test Model: Identifies the various files (test procedures, test cases, etc.) affiliated with an automated Business Application for use in testing that application.
• Library Model: Records the existence of non-repository files and the role they play in defining and building an automated Business Application.
• Panel/Screen Model: Identifies the Panels and Screens and the fields they contain as elements used in an automated Business Application.
• Program Elements Model: Identifies the various pieces and elements of application program source that serve as input to the application build process.
• Value Domain Model: Defines the data characteristics and allowed values for information items.
• Strategy Model: Records business strategies to resolve problems, address goals, and take advantage of business opportunities. It also records the actions and steps to be taken.
• Resource/Problem Model: Identifies the problems and needs of the enterprise, the projects designed to address those needs, and the resources required.

[Figure: diagram of the AD/Cycle Information Model showing the models listed above.]

References & Recommended Reading


from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International

Page 52: Metadata Strategies


Page 53: Metadata Strategies


Questions?

It’s your turn! Use the chat feature or Twitter (#dataed) to submit

your questions to Peter now.


Page 54: Metadata Strategies

Upcoming Events

Governing the Business Vocabulary – aligning the requirements of the business and IT to achieve a shared understanding of data across an organization
June 27, 2016 @ 8:30 AM ET
San Diego, CA
http://www.debtechint.com

Data Modeling Fundamentals June 14, 2016 @ 2:00 PM ET/11:00 AM PT

Sign up here: www.datablueprint.com/webinar-schedule or www.dataversity.net