Tracking the Enterprise Data Landscape Todd Sicard DAMA-MN May 19, 2010 © 2010 Blue Cross and Blue Shield of Minnesota

Page 1:

Tracking the Enterprise Data Landscape

Todd Sicard

DAMA-MN

May 19, 2010

© 2010 Blue Cross and Blue Shield of Minnesota

Page 2:

Today’s Purpose

> Keeping track of hundreds of databases is difficult. Without a map, you face constant rediscovery at best and mistakes at worst.

> Today I’ll outline the metamodel for a rich topographic map of an enterprise-level data landscape, one that keeps track of hundreds of databases.

Page 3:

Who Am I?

> Todd Sicard – [email protected]

> Started at Blue Cross in 1993

> Enterprise Data Architect 2004-2009

> Enterprise Architect 2010+

> CDMP (Certified Data Management Professional)

Page 4:

Goal

Create an overall model of
– what data is stored where,
– whose data it is,
– when it arrives,
– where it came from, and
– which technology it uses.

> Collect, don’t forget

> Keep it one-person simple

> Useful, Usable, Used

Diagram labels: Datastore, Information, Line of Business, State, Lineage, Technology

Page 5:

This isn’t column level or table level…

This is database-level metadata.

Breadth before depth

Accuracy before precision

It had to be one-person-able.

Page 6:

By doing this, you will be able to…

1) Understand

2) Manage

3) Leverage

> You can’t leverage what you don’t manage, and you can’t manage what you don’t understand!

Page 7:

Datastore

> A datastore is any electronic (?) repository of structured (?) information.

(Not all structured data is in a database)
(Not all important data is always electronic)
(Not all important data is structured)

> The datastore list records the logical name of every datastore, using the most common and accurate vernacular.

> Data System: A collection of datastores.
– Composition: Essential to the definition.
– Aggregation: Non-essential to the definition, usually a collection of independent datastores.
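The datastore and data system definitions above can be sketched as a couple of small record types. This is only an illustration, not the repository's actual structure: the class names, the `Membership` enum, and the "Claims Extract File" datastore are hypothetical, and treating ARC-DB and As-Paid DB as composition members is an assumption.

```python
from dataclasses import dataclass, field
from enum import Enum


class Membership(Enum):
    COMPOSITION = "composition"  # datastore is essential to the system's definition
    AGGREGATION = "aggregation"  # datastore is independent of the system


@dataclass(frozen=True)
class Datastore:
    name: str              # logical name in the common vernacular
    description: str = ""


@dataclass
class DataSystem:
    name: str
    # datastore name -> (Datastore, Membership)
    members: dict = field(default_factory=dict)

    def add(self, store, membership):
        self.members[store.name] = (store, membership)

    def essential(self):
        """Names of the datastores that are part of the system's definition."""
        return sorted(
            name
            for name, (_, membership) in self.members.items()
            if membership is Membership.COMPOSITION
        )


claims = DataSystem("Claims")
# Assumption: both claim databases are composition members.
claims.add(Datastore("ARC-DB", "As-Received Claim Database"), Membership.COMPOSITION)
claims.add(Datastore("As-Paid DB"), Membership.COMPOSITION)
# Hypothetical independent datastore, included only to show aggregation.
claims.add(Datastore("Claims Extract File"), Membership.AGGREGATION)

print(claims.essential())
```

Keeping the membership kind on the link rather than on the datastore lets the same datastore be essential to one system and merely aggregated into another.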

Page 8:

Datastores

ARC-DB: “As-Received Claim Database”. Started in 1995 as an MS Access DB, then converted to an RDBMS. Contains 24 rolling months of claims data.

Business Owner: Warren Buffet

Business SME: Blarfengaar B.

Technical Owner: Bill Gates

Technical SME: Steve Hoberman

Diagram (class Claims): «DataSystem» Claims, with «DataStore» ARC-DB::ARC-DB and «DataStore» As-Paid DB::As-Paid DB

Page 9:

Information Models

> “What” data does it contain?

Information Models
– Domain - DDM
– Subject - SAM

Data Models
– Concept - CDM
– Entity - LDM
– Table - PDM

Model        Purpose     Type        Description               Links
Information  Semantic    Domain      10-14 for the enterprise  Associations
Information  Semantic    Subject     8-12 per domain           Associations
Data         Structural  Conceptual  Key entities              Crow's foot
Data         Structural  Logical     Technology independent    Crow's foot
Data         Structural  Physical    Specific implementation   Crow's foot

Page 10:

Information Models

Data Domain Model (class Your Enterprise Data Domain Model):
– «Domain» Claims
– «Domain» Doctors
– «Domain» Subscribers
– «Domain» Customers
– «Domain» Health Benefits

The Claim Subject Area Model (class Claims SAM):
– «Subject» Claims::Patient
– «Subject» Claims::Rendered Services
– «Subject» Claims::Subscriber
– «Subject» Claims::Provider
– «Subject» Claims::Adjudication

Page 11:

Information Models: “Scope + 1”

The Claim Subject Area Model “Plus One” (class Claims SAM)

Claims subjects:
– «Subject» Claims::Patient
– «Subject» Claims::Rendered Services
– «Subject» Claims::Subscriber
– «Subject» Claims::Provider
– «Subject» Claims::Adjudication

Subjects from outside the Claims domain:
– «Subject» Customers::Experience
– «Subject» Customers::Covered Customer
– «Subject» Subscribers::Subscriber
– «Subject» Subscribers::Covered Person
– «Subject» Enrollment

Page 12:

A Datastore’s Subject Area Model (SAM)

Diagram (class As-Paid DB): «DataStore» As-Paid DB, with the connector label “Paid Medical Claims” and stereotype «SOR», linked to:
– «Subject» Information::Claims::Patient
– «Subject» Information::Claims::Rendered Services
– «Subject» Information::Claims::Subscriber
– «Subject» Information::Claims::Provider
– «Subject» Information::Claims::Adjudication

Shown alongside for reference: class Claims SAM (Patient, Rendered Services, Subscriber, Provider, Adjudication)

Page 13:

“Line of Business”

>“Whose” data is it?

A poor name for a mix of stuff:
– Industry Subtypes
– Corporate Legal Entity Structure
– Product Lines
– Market Segments
– External Data Actors

Diagram labels: Regulations Industry, Affiliates and Partners, Core Corporations, Product Lines, External Data Actors, Market Segments, Etc.

Page 14:

LOBs…

Diagram (uc Doctors): Medical Provider, Professional, Institution

Diagram (uc Product Lines): Health Plans, Medicare, Individual, Commercial, Large Group, Public, Small Group

Page 15:

Datastore LOB

Diagram (class As-Paid DB): «DataStore» As-Paid DB, with «trace» links to:
– Commercial (from Product Lines)
– Medical Provider (from Doctors)

Shown alongside for reference: uc Doctors and uc Product Lines

Page 16:

Lineage - “Database, Database, Flow.”

>“Where” does the data come from?

No matter:
– How it moves,
– How it’s transformed,
– How it’s rolled up,
– How big it is,
– How mangled it becomes…

…It’s just a data flow

Diagram labels: A, B, Process, Retrieve, Load, MQ, Service. Caption: “I don’t care!”

Page 17:

Information moves from A to B… that’s all that matters!

Diagram labels: A, Miracle, B

Lineage = “Database, Database, Flow.”

Diagram (class As-Paid DB): «DataStore» As-Paid DB and «DataStore» ARC-DB::ARC-DB, joined by «flow» F0123 Medical Claims

Diagram (class ARC-DB): «DataStore» ARC-DB and Professional (from Doctors), joined by «flow» F0321 Medical Claim
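“Database, Database, Flow” maps naturally onto a list of (source, target) records. The sketch below is a minimal illustration under assumptions: the `Flow` type and `upstream` helper are hypothetical names, and the flow directions (Professional to ARC-DB, ARC-DB to As-Paid DB) are inferred, since the arrowheads did not survive in the transcript.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    flow_id: str
    name: str
    source: str  # datastore or external actor the data comes from
    target: str  # datastore the data arrives in


# Directions are an assumption; the slide only shows which nodes each flow joins.
FLOWS = [
    Flow("F0321", "Medical Claim", "Professional", "ARC-DB"),
    Flow("F0123", "Medical Claims", "ARC-DB", "As-Paid DB"),
]


def upstream(store, flows):
    """Every source that feeds `store`, directly or through other datastores."""
    found = set()
    frontier = [store]
    while frontier:
        current = frontier.pop()
        for flow in flows:
            if flow.target == current and flow.source not in found:
                found.add(flow.source)
                frontier.append(flow.source)
    return found


print(upstream("As-Paid DB", FLOWS))
```

Nothing about transport or transformation is recorded, which is the point of the slide: the flow only says that information moves from A to B.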

Page 18:

State

> “When” does the data arrive?

> The relevant lifecycle of an important piece of data that undergoes a lot of processing.

> Generic lifecycle:
1. Creation,
2. Formation,
3. Maturity,
4. Destruction

Diagram (stm State): Received, Processed, Paid, Rejected, Denied, plus Initial, Final, AdjudicationOutcome, and EditOutcome

Page 19:

Datastore State

Diagram (class ARC-DB): «DataStore» ARC-DB, with a «trace» link to Received (from Claim)

Diagram (class As-Paid DB): «DataStore» As-Paid DB, with «trace» links to Paid (from Claim) and Denied (from Claim)

Shown alongside for reference: stm State
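The state model and the datastore-to-state traces can be sketched as two lookup tables. The datastore mapping comes from the slide; the transitions are an assumption (Received splits at EditOutcome into Rejected or Processed, and Processed splits at AdjudicationOutcome into Paid or Denied), because only the state names survived in the transcript. The function names are hypothetical.

```python
# Assumed transitions between claim states; the slide's arrows are not recoverable.
TRANSITIONS = {
    "Received": {"Rejected", "Processed"},  # assumed EditOutcome branch
    "Processed": {"Paid", "Denied"},        # assumed AdjudicationOutcome branch
    "Rejected": set(),
    "Paid": set(),
    "Denied": set(),
}

# Datastore -> claim states it holds (the «trace» links on the slide).
DATASTORE_STATES = {
    "ARC-DB": {"Received"},
    "As-Paid DB": {"Paid", "Denied"},
}


def stores_holding(state):
    """Datastores that contain data in the given state."""
    return sorted(
        store for store, states in DATASTORE_STATES.items() if state in states
    )


def reachable(state):
    """States a claim can still reach from `state` under the assumed transitions."""
    seen = set()
    stack = [state]
    while stack:
        for nxt in TRANSITIONS[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


print(stores_holding("Paid"))
print(reachable("Received"))
```

A lookup like `stores_holding` answers the “when does the data arrive?” question at database level: a request for paid claims points at As-Paid DB, not ARC-DB.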

Page 20:

Technology

Diagram (class Technology), «Technology» elements:
– Structured Data
– Structured Flat File
– XML
– Delimited
– Fixed Width
– Database Management System
– Hierarchical
– Relational
– Object
– Oracle
– DB2
– SQL Server
– Unknown

Page 21:

Datastore Technology

Diagram (class As-Paid DB): «DataStore» As-Paid DB, with a «use» link to «Technology» Technology::Oracle

Shown alongside for reference: class Technology
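One way to make the technology model queryable is a child-to-parent table plus a datastore-to-technology link. The element names are from the slide, but the nesting shown here is an assumption (the diagram's generalization lines are lost), and `ancestry` and `stores_using` are hypothetical helper names.

```python
# Child -> parent. The nesting is an assumption based on the element names.
PARENT = {
    "Structured Data": None,
    "Unknown": None,
    "Structured Flat File": "Structured Data",
    "Database Management System": "Structured Data",
    "XML": "Structured Flat File",
    "Delimited": "Structured Flat File",
    "Fixed Width": "Structured Flat File",
    "Hierarchical": "Database Management System",
    "Relational": "Database Management System",
    "Object": "Database Management System",
    "Oracle": "Relational",
    "DB2": "Relational",
    "SQL Server": "Relational",
}

# Datastore -> technology (the «use» link on the slide).
DATASTORE_TECH = {"As-Paid DB": "Oracle"}


def ancestry(tech):
    """The technology followed by each of its broader categories."""
    path = []
    while tech is not None:
        path.append(tech)
        tech = PARENT[tech]
    return path


def stores_using(category):
    """Datastores whose technology is `category` or something narrower."""
    return sorted(
        store for store, tech in DATASTORE_TECH.items() if category in ancestry(tech)
    )


print(ancestry("Oracle"))
print(stores_using("Relational"))
```

Linking each datastore to the most specific technology known, and falling back to a broader category or Unknown otherwise, fits the “accuracy before precision” rule.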

Page 22:

Deployment: Servers, Instances, etc.

> I didn’t go there

> Why not?
– “One-person-able”
– Breadth before depth.
– Accuracy before precision.
– Understand, Manage, then Leverage
– Manage information at the Enterprise-level

> But it sure would be nice… maybe later

Page 23:

The Metamodel (UML Model)

Diagram (class Enterprise Data Landscape Repository)

Elements:
– «DataStore» Data Store
– «Technology» Data Tech
– «Subject» Info Subject
– «Domain» Info Domain
– «DataSystem» Data System
– LOB
– LOB - External Data
– State

Connector labels:
– State «trace»
– Lineage «flow»
– LOB (Line of Business) «trace»
– Tech «use»
– SAM (Subject Area Model) «SOR»
– Composition (twice)
– Aggregation

Model labels: LOB Model, Data Domain Model, Domain Subject Area Model, Technology Model, Data State Model

Page 24:

The Metamodel (ER Model)

Entities and attributes:
– Data System: Data System Name
– Data Store: Data Store Name, Data System Name (FK), Tech (FK)
– Internal: Data Store Name (FK)
– External: Data Store Name (FK)
– Data Store Lineage: Data Store Name (FK)
– Technology: Tech
– Tech Model
– Line of Business: LOB
– LOB Model: LOB (FK)
– Data Store LOB: Data Store Name (FK), LOB (FK)
– Data Domain: Domain
– Domain Model: Domain (FK)
– Subject Area: Domain (FK), Subject
– Domain SAM: Domain (FK), Subject (FK)
– Data Store SAM: Data Store Name (FK), Domain (FK), Subject (FK)

Relationship labels: From, To
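Part of the ER model can be tried out in an in-memory SQLite database. This is a sketch under assumptions, not the author's implementation (the talk's repository lives in a modeling tool): the snake_case table and column names are mine, only a subset of the entities is covered, and the `from_store`, `to_store`, and `flow_name` columns on the lineage table are a guess at how the From and To labels apply.

```python
import sqlite3

SCHEMA = """
CREATE TABLE data_system (data_system_name TEXT PRIMARY KEY);
CREATE TABLE technology (tech TEXT PRIMARY KEY);
CREATE TABLE data_store (
    data_store_name  TEXT PRIMARY KEY,
    data_system_name TEXT REFERENCES data_system(data_system_name),
    tech             TEXT REFERENCES technology(tech)
);
CREATE TABLE line_of_business (lob TEXT PRIMARY KEY);
CREATE TABLE data_store_lob (
    data_store_name TEXT REFERENCES data_store(data_store_name),
    lob             TEXT REFERENCES line_of_business(lob),
    PRIMARY KEY (data_store_name, lob)
);
-- Assumed shape: one row per flow between two datastores.
CREATE TABLE data_store_lineage (
    from_store TEXT REFERENCES data_store(data_store_name),
    to_store   TEXT REFERENCES data_store(data_store_name),
    flow_name  TEXT,
    PRIMARY KEY (from_store, to_store, flow_name)
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript(SCHEMA)

conn.execute("INSERT INTO data_system VALUES ('Claims')")
conn.execute("INSERT INTO technology VALUES ('Oracle')")
conn.executemany(
    "INSERT INTO data_store VALUES (?, ?, ?)",
    [("ARC-DB", "Claims", None), ("As-Paid DB", "Claims", "Oracle")],
)
conn.execute("INSERT INTO line_of_business VALUES ('Commercial')")
conn.execute("INSERT INTO data_store_lob VALUES ('As-Paid DB', 'Commercial')")
conn.execute(
    "INSERT INTO data_store_lineage VALUES ('ARC-DB', 'As-Paid DB', 'F0123 Medical Claims')"
)
conn.commit()

# One row per datastore and LOB: which technology, whose data.
rows = conn.execute(
    """
    SELECT s.data_store_name, s.tech, l.lob
    FROM data_store AS s
    LEFT JOIN data_store_lob AS l ON l.data_store_name = s.data_store_name
    ORDER BY s.data_store_name
    """
).fetchall()
print(rows)
```

ARC-DB's technology is left NULL because the slides say only that it was converted to an RDBMS, not which one.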

Page 25:

Drawing the Pictures

> Datastore-centric: LOB, SAM, Lineage, Tech, State, Composition

> Reference (Process POV): Claims Data Flow

> Project POV: In-scope Datastores (Scope + 1)

Diagram (class Claims):
– «DataStore» ARC-DB::ARC-DB
– «DataStore» As-Paid DB::As-Paid DB
– Professional (from Doctors)
– Medical Provider (from Doctors)
– «flow» F0321 Medical Claim
– «flow» F0123 Medical Claims
– «flow» F0314 Claim Payment Info
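The project view, “Scope + 1”, can be derived from the lineage records: take the in-scope datastores and add everything one flow away. A minimal sketch follows; the function name is hypothetical, and the endpoints and directions of the three flows are assumptions read off the diagram labels.

```python
# (source, target, flow label). Endpoints and directions are assumptions.
FLOWS = [
    ("Professional", "ARC-DB", "F0321 Medical Claim"),
    ("ARC-DB", "As-Paid DB", "F0123 Medical Claims"),
    ("As-Paid DB", "Medical Provider", "F0314 Claim Payment Info"),
]


def scope_plus_one(in_scope, flows):
    """Return the in-scope nodes plus every node exactly one flow away."""
    in_scope = set(in_scope)
    neighbours = set()
    for source, target, _label in flows:
        if source in in_scope:
            neighbours.add(target)
        if target in in_scope:
            neighbours.add(source)
    return in_scope | neighbours


print(sorted(scope_plus_one({"ARC-DB"}, FLOWS)))
```

Following flows in both directions matters here: a project that changes ARC-DB needs to see both who feeds it and who reads from it.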

Page 26:

Potential Users

>Warehouse architects

>Data modelers

>Data stewards

>DBAs

>Data leadership

>Enterprise architects

>Security architects

>Business continuity planners

>Disaster recovery planners

>Testers

>Internal audit

>Corporate attorneys

Page 27:

Enough talking… let’s see it.

Page 28:

The Tool

But only from a vendor-neutral perspective…

> Sparx Enterprise Architect
– Corporate Edition, Standard License
– www.sparxsystems.com.au

Page 29:

Thank you!

[email protected]