Data Warehouse on a Budget How to really do more with less Discussion October 11, 2009


Data Warehouse on a Budget

How to really do more with less

Discussion

October 11, 2009


Confidential and Proprietary

Agenda

• Essentials (foundation products)
• Key components (design build products)
• Inexpensive, high quality substitutes
• A word about accelerators (how to deliver value quickly)
• Other options (alternatives to mainstream thought)


About me…

• Led the design and build of seven (7) large-scale (over 5 TB) data warehouses deployed in the Financial Services, Transportation, Supply Chain, Retail, Utility, and Professional Services industries
• Designed, built, and deployed over twenty-five (25) data marts (special-purpose subject areas) across a wide variety of industries
• Prepared and delivered eight (8) data warehouse executive assessments for management review and action, plus five (5) detailed business cases to support investment in the analytic environment
• Developed five (5) commercial off-the-shelf products, marketed worldwide to the Software Engineering and Healthcare industries


Essentials

Foundation components


Essentials – Fundamental Pattern


Essentials – Full Deployment


Essentials – Federated Environments

[Diagram] Components shown:

• SAP Financials, Claims Processing, CRM, 3rd Party
• Common Staging Area
• Federated Marketing Data Warehouse
• Federated Financial Data Warehouse
• Actuarial Data Warehouse
• Federated Claims Processing Data Marts
• Subset Data Marts
• Real Time ODS
• Real Time Data Mining and Analytics
• Real Time Segmentation, Classification, Qualification, Offerings, etc.
• Analytical Applications
• E-Commerce
• Warranty Analysis
• AWARE
• Federated Meta Data Repositories


Essentials – Common Information Model


Essentials – EII


Design Build Components


A closer look…


Key Processes


Test Automation Strategy

• Test Automation Strategy
  – Recorded Test
  – Scripted Test
  – Data Driven Test
  – Test Automation Framework
• Test Fixture Strategy
  – Fresh Fixture
  – Standard Fixture
  – Shared Fixture
  – Immutable
  – Pre-built
• Fresh Fixture Patterns
  – Inline Setup, Delegated Setup, Implicit Setup
  – Creation Method, Object Method
• Shared Fixture Patterns
  – Construction: Lazy Fixture Setup, Suite Fixture Setup, Decorated Setup, Chained Test
  – Access: Finder Method, Fixture Registry
• Result Verification Patterns
  – State Verification, Behavior Verification
  – Expected Object
  – Guard Assertion, Custom Assertion, Verification Method, Delta Assertion
• Fixture Tear Down Patterns
  – Inline Tear Down, Implicit Tear Down
  – Garbage Collected Tear Down, Automated Tear Down
• Unit Basics
  – Test Automation Framework: Test Runner, Test Case Object
  – Test Execution: Test Discovery, Test Enumeration, Test Selection
  – Test Definition: Test Case Object, Test Method, Assertion Method, Test Case Class, Four Phase Test, Assertion Message
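Several of the patterns named on this slide can be seen together in one small test. The sketch below uses Python's standard unittest module to show a fresh fixture created by implicit setup, a four-phase test body, state verification, and implicit tear down. The `load_rows` function and the `staging` table layout are hypothetical, invented only for illustration; they are not taken from the presentation.

```python
import sqlite3
import unittest


def load_rows(conn, rows):
    """Hypothetical ETL step: insert rows into staging, skipping null keys."""
    clean = [(key, amount) for key, amount in rows if key is not None]
    conn.executemany(
        "INSERT INTO staging (cust_key, amount) VALUES (?, ?)", clean
    )
    return len(clean)


class LoadRowsTest(unittest.TestCase):
    def setUp(self):
        # Implicit setup: a fresh in-memory fixture is built for every test.
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE staging (cust_key INTEGER, amount REAL)"
        )

    def tearDown(self):
        # Implicit tear down: the fixture is discarded after every test.
        self.conn.close()

    def test_null_keys_are_skipped(self):
        # Phase 1, setup: input specific to this test.
        rows = [(1, 10.0), (None, 5.0), (2, 7.5)]
        # Phase 2, exercise the code under test.
        loaded = load_rows(self.conn, rows)
        # Phase 3, verify: state verification against the table contents.
        count = self.conn.execute("SELECT COUNT(*) FROM staging").fetchone()[0]
        self.assertEqual(loaded, 2)
        self.assertEqual(count, 2, "null-key row should not be loaded")
        # Phase 4, tear down, happens in tearDown().


def run_suite():
    """Load and run the test case, returning the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(LoadRowsTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_suite()` runs the single test. Because the fixture is rebuilt for each test, tests cannot interfere with one another, at the cost of repeating the setup work.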


Test Automation Strategy - Realized


Data Quality

Data Quality Process

• Data Profiling
  – Measure: Quantify the number and types of defects
  – Analyze: Assess the nature and cause of the defects
• Data Cleansing
  – Parse: Isolate and identify data elements in data structures
  – Standardize: Normalize data values and formats according to business rules and third-party references
  – Correct: Verify, scrub, and append data based upon algorithms and business rules provided from a secondary source
• Data Enhancement
  – Enhance: Append additional data, enhancing the information value
• Match and Consolidate
  – Match: Identify duplicate records within multiple tables and databases
  – Consolidate: Combine unique data elements from matched records into a single source
• Management Reporting and Oversight
  – Report: Provide reporting within the data quality process
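A few of the steps in this process (Measure, Standardize, Match, Consolidate) can be sketched in plain Python. This is a minimal illustration only: the field names, the choice of email as the match key, and the normalization rules are all assumptions, not rules from the presentation. A real deployment would take them from the business rules and third-party references the slide mentions.

```python
from collections import defaultdict


def measure(records, required):
    """Measure: count missing values for each required field."""
    defects = {field: 0 for field in required}
    for rec in records:
        for field in required:
            if not (rec.get(field) or "").strip():
                defects[field] += 1
    return defects


def standardize(rec):
    """Standardize: normalize case, whitespace, and phone digits
    (assumed rules, for illustration)."""
    return {
        "name": " ".join((rec.get("name") or "").split()).title(),
        "email": (rec.get("email") or "").strip().lower(),
        "phone": "".join(ch for ch in (rec.get("phone") or "") if ch.isdigit()),
    }


def match_and_consolidate(records, key="email"):
    """Match records on a key, then merge each group into one survivor."""
    groups = defaultdict(list)
    unmatched = []
    for rec in records:
        if rec[key]:
            groups[rec[key]].append(rec)
        else:
            unmatched.append(rec)  # no key value, so nothing to match on
    merged = []
    for dupes in groups.values():
        survivor = dict.fromkeys(dupes[0], "")
        for rec in dupes:
            for field, value in rec.items():
                # Keep the first non-empty value seen for each field.
                if value and not survivor[field]:
                    survivor[field] = value
        merged.append(survivor)
    return merged + unmatched
```

A typical run would call `measure` on the raw records to quantify defects, map `standardize` over them, and pass the result to `match_and_consolidate`. Records with an empty match key are passed through unmerged instead of being lumped together.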


Data Quality – Why it is needed

SQL Server 2008 Data Profiling Task in Integration Services


Choices

Enabling Technologies


The choices

• ORACLE, SAP, IBM, Informatica
  – Powerful
  – Expensive
  – Demands high skill levels to deploy successfully
• Microsoft
  – Good, well rounded general purpose platform
  – Missing key management and meta-data elements
• Open Source (Pentaho, Jaspersoft, and Infobright)
  – Validated the market for open source BI reporting and ETL tools
  – Good, special purpose tools in the right hands (Talend)
• Alternatives
  – Wherescape RED
  – Special Purpose Tools (SeeWhy, Pervasive)


Total Cost of Ownership

• Labor intensive
• Subject to Vendor Driven Architecture (VDA)
• Expensive (maintenance, hidden support costs)?
• Missing critical management components
• Customization and development costs
• Meet organizational capability and align with objectives
  – Expensive and time consuming if not
  – JAVA or .NET
  – UNIX or Microsoft
• Technical debt
  – Quick and dirty is expensive
  – Should invest more heavily in design


How to save 10 million dollars while staring into the abyss…

• Replace …
  – AIX with Linux
  – Websphere with JBOSS
  – Domino with Alfresco or Drupal (ECM)
  – Cognos with Pentaho
  – Tivoli Monitoring with Hyperic
  – Tivoli Netview with Zenoss
  – Tivoli (Netcool) with OpenNMS
  – Tivoli Configuration Manager with Puppet
  – Tivoli Provisioning Manager with OpenQRM

Source: John Willis, IT Management and Cloud Blog, http://www.johnmwillis.com/other/how-to-save-10-million-dollars-while-staring-into-the-abyss/


Seriously…

• Most of our costs are in our people (4-5x)
  – Development
  – Support
  – Maintenance
• Need for consistent, repeatable process controls
  – Enable cost efficiency
  – Deliver information products faster and less expensively
  – Reduced complexity
  – Component reuse
  – Improved communication
• Leverage standardization benefits
  – Less variance in work products
  – Solve problems once
  – Improved quality (defects caught earlier in cycle)
  – Adopt standardized reference models and templates
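The people-cost point can be made concrete with a little arithmetic. The sketch below assumes that "4-5x" means people costs run four to five times the software license cost; that reading is an interpretation, and the dollar figure in the usage note is invented for illustration.

```python
def total_cost(license_cost, people_multiplier):
    """License cost plus people cost, with people cost = multiplier x license.

    The multiplier model is an assumed reading of the slide's "4-5x".
    """
    return license_cost + license_cost * people_multiplier


def max_saving_from_free_license(people_multiplier):
    """Share of total cost removed if the license cost dropped to zero
    while people costs stayed the same."""
    return 1 / (1 + people_multiplier)
```

With a hypothetical 100,000 license and a multiplier of 4, `total_cost` gives 500,000, and `max_saving_from_free_license(4)` shows that a free license removes at most 20% of that total. Under these assumptions the larger savings come from the process and standardization items listed above, since they act on the people cost.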


Seriously…

• Open Source may not be so “Open”
  – Align with internal skills and core competencies
    • UNIX vs. Windows
    • Java vs. .NET
    • Perl vs. Powershell or WSH
    • PHP vs. ASP
• Windows DW Stack may not be complete
  – Management
  – Metadata
  – Flexibility
• Do not try to build a system whose complexity exceeds the organization's capabilities to deliver


What is the best solution on a budget?

• Probably something in between
  – Platform (don’t forget virtualization in development)
  – Database and Storage Architecture
  – Middleware
  – Data Profiling and Quality Tools
  – Configuration Management and ALM
  – Test Automation and Continuous Integration
    • Cruise Control
    • NANT
    • MAVEN
  – Reporting and Information Delivery
    • Reporting Services
    • Excel (Server based – zero footprint)


Inexpensive, high quality substitutes

Alternatives to mainstream thought


Zenoss Core - monitoring and systems management


Puppet – Automated Systems Administration


Subversion – Version Control


Maven and Eclipse – Build and Manage Projects


Pentaho (BI-Suite)


Jaspersoft


Talend


INFOBright


Protégé and the Essential Architecture Project


DB Designer 4


A word about accelerators


Wherescape RED


Wherescape RED


Methodology

Along the way…


MIKE2.0 (Methodology)


Comprehensive Process Models


Self documenting


Questions and reference links

• Wherescape RED: http://www.wherescape.com/home/home.aspx
• Talend: http://www.talend.com/index.php
• Essential Project: http://www.enterprise-architecture.org/
• Mike 2.0: http://mike2.openmethodology.org/
• Pentaho BI Enterprise Suite: http://www.pentaho.com/
• InfoBright: http://www.infobright.com/


Questions and reference links

• JasperSoft: http://www.jaspersoft.com/
• John Willis, IT Management and Cloud Blog: http://www.johnmwillis.com/other/how-to-save-10-million-dollars-while-staring-into-the-abyss/
• Cruise Control: http://cruisecontrol.sourceforge.net/
• Maven: http://maven.apache.org/
• NANT: http://nant.sourceforge.net/
• Subversion: http://subversion.tigris.org/
• Puppet: http://reductivelabs.com/trac/puppet/


Data Warehouse on a Budget

How to really do more with less

Thank You…


Mr. Parnitzke is a hands-on technology executive, trusted partner, advisor, software publisher, and widely recognized database management and enterprise architecture thought leader. Over his career he has served in executive, technical, publisher (commercial software), and practice management roles across a wide range of industries. Now a highly sought-after technology management advisor and hands-on practitioner, he counts many of the Fortune 500 as well as emerging businesses among his customers. He is known for taking on complex challenges and solving them across all levels of the customer’s organization, delivering distinctive value and lasting relationships.

Contact: [email protected]

Blogs:

• Applied Enterprise Architecture (pragmaticarchitect.wordpress.com)
• The Corner Office (cornerofficeguy.wordpress.com)
• Data management professional (jparnitzke.wordpress.com)
• Essential Analytics (essentialanalytics.wordpress.com)
• The program office (theprogramoffice.wordpress.com)
