Corporate Data Architecture in a Federated World Presented by Deborah Henderson, INERGI LP to IRMAC Business Intelligence & Data Warehouse SIG May 23 2002

Post on 18-Dec-2015


Corporate Data Architecture in a Federated World

Presented by Deborah Henderson, INERGI LP to IRMAC Business Intelligence & Data Warehouse SIG

May 23 2002

© 2002 Inergi LP - All rights reserved 24/2/2002

You Know You Have a Problem When...

We know that the data must be in the warehouse somewhere, but we can’t find it.

You Have a 'Dark Matter Schema'

The 'Dark Matter Schema'

Two subsets, according to differing theories:

The WIMPs schema: we are just too overwhelmed by user requests to track down where in the data a particular element resides.

The MACHOs schema: we are too busy being the all-knowing DW experts to track the elements down.

Agenda

• Inergi and Architecture
• Corporate Data Architecture
• What supports this: the IT Business Model and Architectural Compliance
• Data Architecture Process and Procedure
• Models & modelling
• On the Horizon

INERGI LP

• Subsidiary of Cap Gemini Ernst & Young Canada Inc., along with New Horizons Solutions, another energy sector affiliate reporting into CGE&Y
• Created March 1 2002
• Multi-year deal for sustainment of Hydro One IT systems
• Supply, Finance, Pay, IT, Call Centre, Customer billing for Hydro One
• We are IT and business process outsourcing specialists for the energy sector with many years' experience
• We are open for business!

Corporate Data Architecture

Enterprise IT Architecture

Why is it important? It reduces the cost of operations through reuse of standard pieces of technology, application, data and network.

Example: Financials

– disk farm participant (technology)
– official one source for ledger data (data)
– Java reporting environment (application)
– using network standards

Corporate Architecture

[Diagram. Corporate Architecture: Data, Application, Network, Technology. Under Data: Physical Data Architecture, Meta-data Architecture, Data Architecture.]

Data Architecture Idea

Data Architecture: Set of principles that defines ‘organization-wide data resource of well-described, properly structured, high-quality data that are properly documented’ (Brackett, 1994).

Metadata Architecture: Set of principles that defines and describes the data resources in an organization.

Physical DBMS Architecture: Architecture component that defines physical data components.

Data Architecture Constructs

[Diagram: Data Partitioning, Data Placement and Data Usage, across the Physical, Metadata and Data Architectures.]

Data Architecture Should be Principles Based

• Try to leverage the DW project: data, metadata, technical, ETL (extract, transform and load), EUT (end use tools) and physical architectures

• Develop architecture changes and additions to the overall Enterprise Architecture

• High re-use of data and processes across the enterprise for next initiatives

A Data Architecture is Principles Based

EXAMPLES:

1. System of record will be established for all data

2. Corporate definitions of data will be resolved and maintained

3. Data will reside on database servers, not on application servers or mail servers
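
The first principle (a system of record for all data) lends itself to a simple automated check. What follows is a minimal sketch, not the presenter's method: it assumes a hypothetical in-memory registry that maps each data element to its system of record, and the element and system names are illustrative only.

```python
from typing import Dict, Iterable, List

# Hypothetical registry: data element -> system of record.
# Names are illustrative assumptions, not taken from a real catalogue.
SYSTEM_OF_RECORD: Dict[str, str] = {
    "ledger_balance": "Financials",
    "customer_account": "Customer Billing",
}


def elements_without_system_of_record(
    elements: Iterable[str], registry: Dict[str, str]
) -> List[str]:
    """Return the elements that have no registered system of record."""
    return sorted(e for e in elements if e not in registry)


# Elements used by a (hypothetical) project are checked against the registry.
missing = elements_without_system_of_record(
    ["ledger_balance", "work_order", "customer_account"], SYSTEM_OF_RECORD
)
print(missing)  # ['work_order']
```

A project review could run a check like this before sign-off, so that any element without an official source is raised with the data steward rather than discovered later.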

Enterprise Data Warehouse as Driver for Data Architecture

• First data architecture effort often constrained by the EDW project

• Think enterprise scalable: Hardware, Software, Processes, Centre of Excellence in Data, Corporate Data Architecture compliance, stewardship & vitality process

Data Architecture Implemented: Tools & Expertise

• Modelling & Metadata storage
• Data store
• Data mining
• Statistics
• Business Reporting
• Environment modelling

• Keeping the number of tools to a minimum reduces lifetime ownership costs to the Company
• Establish the role of Product Specialist for all tools

Sample DW Data Architecture

[Diagram: Datamart (local model); OLAP & Analysis; Historical & Benchmark purchased data; Operational Report server. Layers: Local Data; OLAP & Details; External Data & History; ODS source. Labels: 10%, 20%, 30%, 40%.]

Sample DW Data Architecture

[Diagram: Local model; OLAP & Analysis; External data, history; ODS]

There are models for each component

Sometimes!! the models are linked/related

Metadata Architecture in Most Companies Today

[Diagram: Modelling CASE; RDBMS; ETL; BI End Use Tool; OLTP Applications. Actors: modeler, end user. Legend: symbol = repository.]

Interfaces are usually proprietary

Metadata Arch

The Object Management Group (OMG) was established in 1989 and is the world's largest software consortium with a membership of over 700 vendors, developers, and end users.

In June 2000, OMG released an XML-based metadata standard.

OMG showcased XML metadata interchange in March 2001 at the DAMAI conference in Anaheim

Physical DB Architecture

• ORACLE and DB2 outlook
• Impact of DW features
– partitions, instances and machines
• Referential integrity
– placement and enforcement
• Multi-dimensional cubes
• Metadata ‘repository’ through hooks or??

Business Intelligence: The Delivery Maturity Model

[Diagram, maturity stages: Infrastructure Development; Document-centric portal; DW/BI support portal; E-business portal]

Processes and Procedures

IT Compliance Business Model

For every IT project:

• Technical architecture
• Data architecture
• Map to Business model

Prerequisite is a Policy on data sharing

Data Architecture Compliance Process

Modelling principles and procedures

Naming Conventions

Database design and implementation guidelines

Policies

Metadata principles

Data architecture principles

Data Architecture Compliance Process

Document processes that drive procedures, standards*

• Data Models are a necessary deliverable of a project and should be noted in the Charter as a deliverable

• Architectural compliance and operational readiness gate

• Vitality process - keeping current

*Additional Documentation to Support the Corporate Data Architecture

• Metadata repository (or facsimile!)
• Business definitions
• Standard naming conventions
• Standard abbreviating procedures
• Standard domain structuring
• Standard translation schemes
• Conformed dimensions and facts - EDW
• Stewardship patterns
• CDA roles and responsibilities
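
Standard naming conventions and abbreviations are easier to enforce when they can be checked mechanically. The sketch below is illustrative only: it assumes a hypothetical list of approved class-word suffixes, modelled on the suffixes visible in the sample physical model later in the deck (`_CD`, `_DSC`, `_NM` and so on), and flags column names that end in none of them. The actual corporate standard is not given in the presentation.

```python
from typing import Iterable, List

# Assumed class-word suffixes; illustrative, not the official standard.
APPROVED_SUFFIXES = (
    "_CD", "_DSC", "_NM", "_ID", "_AMT", "_DT", "_IND", "_NUM", "_QTY",
)


def nonconforming_columns(columns: Iterable[str]) -> List[str]:
    """Return column names that do not end in an approved class-word suffix."""
    return [c for c in columns if not c.upper().endswith(APPROVED_SUFFIXES)]


# Column names taken from the sample physical model in this deck.
sample = ["Site_ID", "Site_NM", "Start_Date", "Disposed_Waste_QTY", "Testing_Status"]
flagged = nonconforming_columns(sample)
print(flagged)  # ['Start_Date', 'Testing_Status']
```

Under this assumed suffix list, `Start_Date` would be flagged in favour of a `_DT` form, which is the kind of inconsistency a model review is meant to catch.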

Governance and Using the Architecture

Governance framework and repository maintenance

Using the Architecture in a project

Data Stewardship: the Business-side Responsibility

• Data will be controlled and managed throughout its life cycle as a resource, in the same manner as any other asset (capital, material, and people).

• Access to data will be facilitated, and/or controlled and limited, as required to provide the best performance at the least cost for all users while meeting functional, technological, regulatory and legal requirements.

Data Stewardship: the Business-side Responsibility (cont’d)

• Data will be shared except where exempted by Corporate Security Policy.

• Data will be standardized to avoid duplication and facilitate integration.

Building Models

1. Assess subject areas involved in a project and publish for reuse:

• Conceptual model bubbles

• Subject area models where complete

• Documentation standards for models

Building Models

2. Build on these models and submit for review and approval

3. Develop conformed data objects where required (as you go)

4. Add new models to the Model Repository

REFER TO PROCESS DOCUMENTATION!

Related Data Models ARE Your Quality Control

• Conceptual and Enterprise Data Models
– maintained by IT Architecture
• Logical Models
– assist in understanding, official definitions (OLTP physical = logical)
• Enterprise Data Warehouse Model
– a dimensional model that gets implemented in Oracle
– high-performance ‘read-only’ model
• Cube Designs - Problem centric Dimensional models
– implemented in OLAP Cubes
• Source Data Models (OLTP)
– informational for BI, source for ODS

Models

More Physics of Schemas :-)

'Black Hole Schema': Systems where the query never returns

'Pulsar Schema': Only returns results every few queries or so

'Milky Way Schema': A central warehouse with many dozens of offspring that no one can keep track of

'SuperStrings Schema': Many measures, all built on top of each other, relating to each other and that give the same result

Conceptual Data Model

High-level model

Depiction of major Functional Areas in the Company

Each Functional Area defined

Enterprise Data Model

Limited number of high-level data sets (Subject Areas)

Global relationship cardinalities are shown

Data Sets fully defined

Definition is formalized at the Corporate level as official

Data Stewardship is established for each Subject Area

Logical Data Models

• Logical Data Model developed on a per-project basis
• LDM fully synchronized using the Corporate Data Architecture Principles
• Objects fully defined and attributed
• Re-usable domains implemented
• Re-usable rules identified, documented and implemented consistently

[Diagram, entities: Municipality_Type, Ministry_of_Environment_Distri, Work_Order_Type, Chemical, Disposal_Contractor, Hazard_Class, LAR_Fact_Remediation, LAR_Factless_Fact, Work_Order, LAR_Fact_Budget_Actual, Business_Unit, Site]

DW Physical Data Models

• Implemented
• Limited use of RI (load only) to keep the data integrity
• Business rules implemented through ETL procedures
• Model in sync with the database
• CASE tool used for model/database synchronization
• Colors extensively used for better readability

[Diagram: physical data model. Entities and their columns, in the order shown on the slide:]

Municipality_Type: Municipality_CD, Municipality_Description

Ministry_of_Environment_Distri: MOE_Region_CD, MOE_District_CD, MOE_Region_DSC, MOE_District_CD_DSC

Work_Order_Type: Work_Order_Type_CD, Work_Order_Type_CD_DSC

Chemical: Chemical_CD, Chemical_NM, Chemical_Measurement_CD, Chemical_Measurement_DSC, Chemical_Measurement_Limit, Chemical_Suite_CD, Chemical_Suit_DSC, Chemical_Suite_Measurement_CD, Measurement_DSC, Chemical_Suite_Limit, Chemical_Suite_Measurement_AMT

Disposal_Contractor: Disposal_Contractor_ID, Disposal_Contractor_NM, Hauling_Company_NM, Disposal_Facility_NM, Certificate_of_Approval

Hazard_Class: Hazard_Class_CD, Hazard_Class_Code_DSC, Hazard_Clas_CD_Measurement_AMT, Hazard_Class_CD_Measurement_CD, Hazard_Class_CD_Measure_CD_DSC

LAR_Fact_Remediation: Work_Order_NUM (FK), Disposal_Contractor_ID (FK), Start_Date, Hazard_Class_CD (FK), Site_ID (FK), Remediation_DSC, End_Date, Disposed_Waste_Measure_DSC, Disposed_Waste_QTY

LAR_Factless_Fact: Work_Order_NUM (FK), Chemical_CD (FK), Testing_Begin_DT, Work_Order_Type_CD (FK), Site_ID (FK), Business_Unit (FK), Testing_End_DT, Chemical_Concentration_AMT, Above_Guideline_IND, Testing_Status

Work_Order: Work_Order_NUM, Work_Order_DSC, Begin_DT, End_DT, Resource_CD, Resource_Code_DSC, Hand_Off_DT, Work_Program_Number, Work_Program_Start_DT, Work_Program_End_DT, Work_Program_Type, Work_Type_DSC, Project_CD, Project_DSC

LAR_Fact_Budget_Actual: Work_Order_NUM (FK), Municipality_CD (FK), MOE_Region_CD (FK), MOE_District_CD (FK), Work_Order_Type_CD (FK), Business_Unit (FK), Site_ID (FK), Budget_Credit_AMT, Budget_Debit_AMT, Actual_Total_Credit_AMT, Actual_Total_Debit_AMT

Business_Unit: Business_Unit

Site: Site_ID, Site_NM, GPS_North_Reading, GPS_West_Reading, GPS_Day, GPS_Time, CRA_Tier_Ranking_IND, CRA_Tier_Ranking_IND_DSC, CRA_Ranking_IND, CRA_Ranking_IND_DSC, Hydro_One_Ranking_IND, Hydro_One_Ranking_IND_DSC
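
To make the "RI at load" idea concrete, here is a minimal sketch under stated assumptions. It uses an in-memory SQLite database as a stand-in for the Oracle implementation mentioned in the deck, and it models only a small subset of the tables and columns from the slide (`Site`, `Hazard_Class`, and a cut-down `LAR_Fact_Remediation`). The column types and sample values are assumptions. With foreign keys enforced, a fact row that points at a missing dimension row is rejected during the load.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Subset of the slide's model; column types are assumed for illustration.
conn.executescript("""
CREATE TABLE Site (
    Site_ID INTEGER PRIMARY KEY,
    Site_NM TEXT
);
CREATE TABLE Hazard_Class (
    Hazard_Class_CD TEXT PRIMARY KEY,
    Hazard_Class_Code_DSC TEXT
);
CREATE TABLE LAR_Fact_Remediation (
    Site_ID INTEGER NOT NULL REFERENCES Site (Site_ID),
    Hazard_Class_CD TEXT NOT NULL REFERENCES Hazard_Class (Hazard_Class_CD),
    Disposed_Waste_QTY REAL
);
""")

# Load dimensions first, then a fact row that references them.
conn.execute("INSERT INTO Site VALUES (1, 'Example site')")
conn.execute("INSERT INTO Hazard_Class VALUES ('H1', 'Example hazard class')")
conn.execute("INSERT INTO LAR_Fact_Remediation VALUES (1, 'H1', 12.5)")

# A fact row for an unknown Site_ID violates referential integrity.
try:
    conn.execute("INSERT INTO LAR_Fact_Remediation VALUES (99, 'H1', 3.0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

fact_rows = conn.execute("SELECT COUNT(*) FROM LAR_Fact_Remediation").fetchone()[0]
print(orphan_rejected, fact_rows)
```

Because the warehouse model is read-only after loading, checking integrity once at load time keeps orphan facts out without adding constraint overhead to queries.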

Time for... More Physics of Schemas :-)

'Binary System Schema': Two datamarts that do the same thing and try to suck each other into themselves

'Chance Theory Schema': The results are always uncertain and questionable, as they change every time you run the report

'Event Horizon Schema': Has many dimensions, but if you ask for more than a certain number at a time, it converts to a Black Hole Schema

'Big Bang Schema': A data warehouse that we miraculously brought into existence, and the user does not know why, how, or how it’s useful

'Tachyon Schema': Miraculously fast and becomes faster as you add data. But cannot be implemented as it is theoretical. Most demos fall into this space.

[Diagram: OLTP Enterprise Data Model; Logical Data models (1, 2, 3 ... n); subject area EXTRACTS; OLAP Enterprise Data Warehouse Model; DATA MARTS (1, 2)]

[Diagram: LDM 1, LDM 2, LDM 3 ... LDM n; Enterprise Data Model; Extract, Extract; Enterprise Data Warehouse Model]

For DW Conformed Dimensions & Facts

Supports iterative/parallel build and aligns with Kimball’s bus structure

BUT
• Can get it wrong
• Can lose control
• Exponential complexity
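
One way to avoid losing control is to check conformance between marts automatically. The sketch below is a simplified illustration, not Kimball's or the presenter's procedure: it assumes each data mart is described by a hypothetical mapping from dimension name to its set of attribute names, and it reports dimensions shared by two marts whose attributes have drifted apart. The mart names and the `Region_NM` attribute are invented for the example.

```python
from typing import Dict, List, Set

# A mart is described here as: dimension name -> set of attribute names.
Mart = Dict[str, Set[str]]


def nonconformed_dimensions(mart_a: Mart, mart_b: Mart) -> List[str]:
    """Return shared dimensions whose attribute sets differ between two marts."""
    shared = mart_a.keys() & mart_b.keys()
    return sorted(d for d in shared if mart_a[d] != mart_b[d])


remediation_mart: Mart = {
    "Site": {"Site_ID", "Site_NM"},
    "Hazard_Class": {"Hazard_Class_CD", "Hazard_Class_Code_DSC"},
}
budget_mart: Mart = {
    # Region_NM is a hypothetical attribute added by one team only.
    "Site": {"Site_ID", "Site_NM", "Region_NM"},
    "Business_Unit": {"Business_Unit"},
}

drifted = nonconformed_dimensions(remediation_mart, budget_mart)
print(drifted)  # ['Site']
```

A real check would also compare keys, domains and definitions held in the metadata repository. Comparing attribute names is only the simplest signal that a dimension is no longer conformed.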

On the Horizon

• Impact of XML
• Impact of Taxonomies in Business
• Corporate Reporting Strategies
• Overarching Mobile Data Strategies
• ODS for all ERP
• Information Architecture
• BAM - Business Activity Monitoring*
• Network Appliances
• Synergies with Application Architecture

*Gartner April 2002

CDA Together with Application Architecture

[Diagram: DATA and APPL’N]

• Business processes should be implemented in the application, not the database
• Logical workflow and data flow must align
• Applications must have owners, just like data
• We must be able to identify the official source (aka system of record)

Corporate Data Architecture That is Process Driven

• Policies
• Architecture & Standards
• Addresses OLTP and OLAP “Federated” world
• Compliance
• Vitality
• Stewardship
• Can be applied to all new challenges on the horizon

With all the pieces... this could work!!

‘Enron Schema’: Shows positive numbers where we practically expect negative values, and Andersen can prove that it is correct.