Is it sensible to use Data Vault at all? Conclusions from a project.

Mainz, 15th March 2016, 11. Oracle DWH Community Treffen
Alexander Mendle (Insights & Data, Capgemini)

Our customer’s business model. Project setup.

Greenfield DW with 3 source systems: ERP, CMS, and a transaction system. Data Vault was prescribed by our customer’s group, which also provided a Data Vault architect.

Capgemini supported the project in implementation and testing.

With over 11,000 professionals across 40+ countries, and being part of a multi-faceted group …

• 180,000 employees (1) in more than 40 countries
• Revenues (2): €10.573 billion
• Operating margin: €486 million
• Operating profit: €447 million
• Net cash and cash equivalents: €1,464 million
• 6 strategic alliances: EMC2, HP, IBM, Microsoft, Oracle, SAP
• 7 values shared since the company’s creation in 1967: honesty, boldness, trust, freedom, team spirit, modesty, fun
• A promise that expresses our brand philosophy
• A wide range of cutting-edge expertise for all our clients
• Five strategic sectors, with expertise in: Automotive, Banking, Consumer Products & Retail, Energy and Utilities, Insurance
• A unique way

(1) Headcount including IGATE. (2) For the FY15-16.

Capgemini Insights & Data Services Model

Copyright © Capgemini 2013. All Rights Reserved

5 DATAVAULT IST DER EINSATZ VON DATA VAULT SINNVOLL-V10.PPTX

Industry verticalization: Automotive; Consumer Products & Retail; Public Sector; Financial Services; Telco; Energy & Utilities; Life Sciences; Media & Entertainment

Core capabilities and offers: Data & Info Management; Master Data Management; Big Data (Hadoop/NoSQL); Optimized data warehouse; EPM (Enterprise Performance Management); BI & Data Visualization; Predictive + data science; Real-time analytics

Delivery models: BI Service Center; Cloud; Application management; Agile; as-a-service & BPO; IP Solutions; Rapid prototyping/POC

Business engagement: Strategic customer partnership; Digital transformation; Specific alliance initiatives; Performance management & strategy; Risk sharing

Governance, architecture and strategy: Data and information governance; Architecture; Privacy & security; Information strategy; Business Data Lake

Agenda

What is Data Vault? – just a quick glimpse

Impact on Architecture

Impact on Implementation

Impact on Project

Summary

Data Vault is applicable in a multi-layer DWH architecture

Classic architecture: Stage → Core (mostly 3NF) → Data Mart(s)

Source: Linstedt: “Super Charge Your Data Warehouse”, o. V., o. O., ISBN: 978-0-9866757-1-3, 2010

Data Vault is mostly associated with its distinctive data modelling approach. However, as of its newest version it is a comprehensive set of data modelling, project methodology and system architecture.

Architecture with Data Vault: Stage → Data Vault → Data Mart(s)

Hub: list of business keys.

Link: N:M relations between Hubs.

Satellites: details for Hubs & Links (historized).
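
The three table types can be pictured with a few lines of Python. This is only a minimal sketch: the class names, the example keys and the rule "append a new satellite version only when the attributes changed" are assumptions made for illustration and do not come from the project.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass(frozen=True)
class Hub:
    """One entry per business key, e.g. a customer number."""
    business_key: str


@dataclass(frozen=True)
class Link:
    """One entry per combination of hub keys (an N:M relation)."""
    hub_keys: tuple


@dataclass
class Satellite:
    """Historized details for one hub or link; versions are only appended."""
    parent: object
    history: list = field(default_factory=list)

    def record(self, load_ts: datetime, attributes: dict) -> None:
        # Assumed rule: store a new version only if the attributes changed.
        if not self.history or self.history[-1][1] != attributes:
            self.history.append((load_ts, attributes))

    def current(self) -> dict:
        return self.history[-1][1]


# Hypothetical example data.
customer = Hub("C-1001")
product = Hub("P-42")
purchase = Link((customer.business_key, product.business_key))

customer_details = Satellite(parent=customer)
customer_details.record(datetime(2016, 1, 1), {"city": "Mainz"})
customer_details.record(datetime(2016, 2, 1), {"city": "Mainz"})    # unchanged, skipped
customer_details.record(datetime(2016, 3, 1), {"city": "München"})  # new version
```

The hub and the link carry keys only; everything descriptive, including its history, lives in the satellite.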

The Data Vault proposition: agile, quick and cheap

These are the proposed enhancements:

Source: Linstedt, Olschimke: “Experiences from a Data Vault 2.0 pilot”, 13th European TDWI Conference, Munich, 2013

Reduction in Total Cost of Ownership; more agility

• Supports cross-functional areas of business

• Near zero change impacts to existing system

• Reduction in data acquisition costs

• Reduction in maintenance costs over the life of the EDW

• Reduction in implementation complexity

• Compliance with full audits (“all the data, all of the time”)

• Rapid turn-around for new requirements

• Parallel teams – all agile

• Scalable teams – with limited ramp-up necessary

• Automated ETL generation based on patterns

Architecture easily deals with changes

Existing structures are not changed, no matter whether it is just a new attribute, a whole new source system, or a change in the business or in semantics. New things are simply added without side effects.

Source: Linstedt: “Super Charge Your Data Warehouse”, o. V., o. O., ISBN: 978-0-9866757-1-3, 2010
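
A minimal sketch of what "added without side effects" can look like, using Python's built-in sqlite3 module. All table and column names are hypothetical; the CMS satellite stands in for "a new source system delivers a new attribute".

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE hub_customer (customer_key TEXT PRIMARY KEY);
    CREATE TABLE sat_customer_erp (
        customer_key TEXT REFERENCES hub_customer,
        load_ts      TEXT,
        name         TEXT,
        PRIMARY KEY (customer_key, load_ts)
    );
""")


def schema(table):
    # Column names of a table, as reported by SQLite.
    return [row[1] for row in con.execute(f"PRAGMA table_info({table})")]


existing = ("hub_customer", "sat_customer_erp")
before = {t: schema(t) for t in existing}

# A second source system (hypothetical CMS) brings a new attribute.
# Nothing is ALTERed: a new satellite is created next to the existing one.
con.execute("""
    CREATE TABLE sat_customer_cms (
        customer_key      TEXT REFERENCES hub_customer,
        load_ts           TEXT,
        newsletter_opt_in INTEGER,
        PRIMARY KEY (customer_key, load_ts)
    )
""")

after = {t: schema(t) for t in existing}
unchanged = before == after
```

Existing loads and queries keep working because the tables they touch are identical before and after the change.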

Unique construction of tables enables industrialization

Hub tables: unique “list” of business keys

Link tables: unique list of business key combinations

Hub & Link satellite tables: attributes belonging to the hub / link entity

And the scheduling: first the hubs, then hub satellites and links, then link satellites. No more dependencies. With Data Vault 2 this became even more flexible.

[Chart axes: number of manual ETLs; number of tables]
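
The scheduling rule (hubs first, then hub satellites and links, then link satellites) can be written down as a small lookup. The following sketch uses hypothetical table names; it only illustrates the ordering and is not the scheduler used in the project.

```python
from collections import defaultdict

# Load stage per table type, following the order named above.
LOAD_STAGE = {"hub": 1, "hub_satellite": 2, "link": 2, "link_satellite": 3}

# Hypothetical tables of a small model: (table name, table type).
tables = [
    ("sat_purchase", "link_satellite"),
    ("hub_customer", "hub"),
    ("link_purchase", "link"),
    ("sat_customer", "hub_satellite"),
    ("hub_product", "hub"),
]


def load_plan(tables):
    """Group tables into stages; tables within one stage do not depend on each other."""
    stages = defaultdict(list)
    for name, table_type in tables:
        stages[LOAD_STAGE[table_type]].append(name)
    return [sorted(stages[stage]) for stage in sorted(stages)]


plan = load_plan(tables)
for number, names in enumerate(plan, start=1):
    print(f"stage {number}: {', '.join(names)}")
```

Because the stage depends only on the table type, the plan never has to be maintained by hand when tables are added.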

Effects of Data Vault on implementation

Quick implementation of Stage and Core using generators – a must, as there is a vast number of objects.

Data Vault requires a huge number of objects, but allows a highly industrialized implementation.

[Figure: two Stage | CORE | Marts diagrams. Layers: Stage Layer, Data Vault Layer, Data Marts]

Considerations from an implementation viewpoint: does your staff support that?

What will happen when implementing:

This is business as usual.

You will need a strong architect who masters Data Vault and who also defends it to developers not familiar with it. Be prepared for arguments.

Developers will use scripts, generators and configurations, and will not build ETLs manually.

Most time will be spent programming generators, a build toolchain and test automation.

Here you might 1) integrate data into context, 2) homogenize data and 3) realize the data mart requirements. Probably all at once.

Take care not to build things over and over again. Make sure the team has a good understanding of the semantics of your business model. Use Business Vault or other helpers.

Developers will implement lots of joins in their queries.

A generator-based DW requires a more software-development-oriented team – that is probably a slightly different story than a “BI consultant” team. Do you have the team that bridges the gap?
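
To give a feeling for the "lots of joins", here is a small sketch with Python's built-in sqlite3 module. The model (two hubs, one link, two satellites) and all names and values are made up for illustration; real satellites would normally carry more technical columns.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE hub_customer  (customer_no TEXT PRIMARY KEY);
    CREATE TABLE hub_product   (product_no  TEXT PRIMARY KEY);
    CREATE TABLE link_purchase (customer_no TEXT, product_no TEXT);
    CREATE TABLE sat_customer  (customer_no TEXT, load_ts TEXT, city  TEXT);
    CREATE TABLE sat_product   (product_no  TEXT, load_ts TEXT, title TEXT);

    INSERT INTO hub_customer  VALUES ('C1');
    INSERT INTO hub_product   VALUES ('P1');
    INSERT INTO link_purchase VALUES ('C1', 'P1');
    INSERT INTO sat_customer  VALUES ('C1', '2016-01-01', 'Mainz');
    INSERT INTO sat_customer  VALUES ('C1', '2016-03-01', 'München');
    INSERT INTO sat_product   VALUES ('P1', '2016-01-01', 'Handbook');
""")

# One flat "who bought what" row needs both hubs, the link and the latest
# version from each satellite: four joins plus two correlated subqueries.
rows = con.execute("""
    SELECT sc.city, sp.title
    FROM link_purchase l
    JOIN hub_customer hc ON hc.customer_no = l.customer_no
    JOIN hub_product  hp ON hp.product_no  = l.product_no
    JOIN sat_customer sc ON sc.customer_no = hc.customer_no
     AND sc.load_ts = (SELECT MAX(load_ts) FROM sat_customer
                       WHERE customer_no = hc.customer_no)
    JOIN sat_product  sp ON sp.product_no = hp.product_no
     AND sp.load_ts = (SELECT MAX(load_ts) FROM sat_product
                       WHERE product_no = hp.product_no)
""").fetchall()
```

A 3NF core or a star schema would typically answer the same question with fewer joins, which is one reason why retrieval from a Data Vault takes some experience.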

The process for building Data Vault is straightforward, and the CORE can be built quickly...

Analysis of the business as usual...

...but not much design effort is needed – the Data Vault modelling rules apply.

ETLs can be generated from only four different templates (Hubs, Links, Hub Satellites and Link Satellites).

Data Vault can help you pick up pace with the CORE (raw Data Vault).

Process steps: Analysis → Design → ETL generation → Loading
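
Template-based generation can be sketched in a few lines. The example below is an illustration under assumptions, not the project's generator: the SQL templates, the metadata format and all table names are hypothetical, and only the hub and link templates are spelled out (the two satellite templates would follow the same pattern). The generated statements are executed against an in-memory SQLite database to show that they run and that a repeated load adds no duplicates.

```python
import sqlite3
from string import Template

# Hypothetical load templates, one per table type.
TEMPLATES = {
    "hub": Template(
        "INSERT INTO $target ($key) "
        "SELECT DISTINCT $key FROM $source "
        "WHERE $key NOT IN (SELECT $key FROM $target)"
    ),
    "link": Template(
        "INSERT INTO $target ($keys) "
        "SELECT DISTINCT $keys FROM $source "
        "EXCEPT SELECT $keys FROM $target"
    ),
}

# Hypothetical model metadata: this is all a developer would maintain by hand.
MODEL = [
    {"type": "hub", "target": "hub_customer", "source": "stg_orders", "key": "customer_no"},
    {"type": "hub", "target": "hub_product", "source": "stg_orders", "key": "product_no"},
    {"type": "link", "target": "link_purchase", "source": "stg_orders",
     "keys": "customer_no, product_no"},
]


def generate(model):
    """Render one load statement per model entry from the matching template."""
    return [TEMPLATES[entry["type"]].substitute(entry) for entry in model]


con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE stg_orders (customer_no TEXT, product_no TEXT);
    INSERT INTO stg_orders VALUES ('C1', 'P1');
    INSERT INTO stg_orders VALUES ('C1', 'P2');
    INSERT INTO stg_orders VALUES ('C2', 'P1');
    CREATE TABLE hub_customer (customer_no TEXT);
    CREATE TABLE hub_product (product_no TEXT);
    CREATE TABLE link_purchase (customer_no TEXT, product_no TEXT);
""")

statements = generate(MODEL)
for _ in range(2):  # loading twice must not create duplicates
    for sql in statements:
        con.execute(sql)

counts = {
    table: con.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    for table in ("hub_customer", "hub_product", "link_purchase")
}
```

Adding another hub or link to the warehouse then means adding one entry to `MODEL`, not writing another ETL by hand.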

The point is not to store lots of things easily; it is how to retrieve information from them.

Data Vault is a paradigm change in many ways.

Quick and large store. “All the data – all of the time”

Needs experience for retrieval

High effort for systematic Semantics built-in

...but may thwart you building the marts

Data Vault may help you pick up pace – up to the CORE layer (i.e. the “Data Vault layer”).

Analysis of the business as usual, but no separate “CORE” modelling is needed because the Data Vault rules apply.

Quick setup of the CORE model.

Quick implementation of STAGE and CORE: generated, with automated build, test and rollout.

“CORE” is done, but still lacking:

• data homogenisation?

• “business entities” (n:m)?

• probably the proper analysis of all that?

On top of that, you need to (technically) design your data mart.

Implementing all these complex rules in ETL.

[Scale on the slide: faster ↔ slower]

That was our plan, looking at it in November

Timeline: Aug, Sep, Oct, Nov, Dec, Jan, Feb, Mar

Phase 1 deliverables: mappings and workflows for STAGE and RAW DATA VAULT

Phase 2 deliverables: mappings and workflows for MASTER and MASTER CHECK, BUSINESS VAULT and MART

Staging and Raw Data Vault went quite fast. Business rules AND requirements are implemented within the Data Mart, whereas the CORE can be generated!

Make use of helpers such as Business Vault.

Not the DW – the toolchain software is the deliverable

Do you need to write or request a special type of offer? Consider buying or offering a generator toolchain instead of tables and ETL programs.

Customer:

• Consider writing your next RfP not as an RfP for a DW, but as an RfP for generator software.

• Obtain control over the build tools in your project – the result is reproducible.

Service Provider:

• Consider a bid offering the generator software. Your offer might look astonishingly compelling.

• Give your customer good arguments for choosing a generated solution, and refrain from it if you think there is no fit.

Layers

Do I have a strong need to enable agile methodologies?

Do I have the right people to support that?

Is there a strong need for any special Data Vault characteristic?

Summary: putting these aspects together with a decision focus

What will happen when implementing.

Data Vault enables you to integrate new source systems quickly and realize new requirements with minimal dependencies.

Data Vault requires somewhat different skills than a classic BI project

Data Vault differs from a classic BI environment in analysis and operation.

Are you able to bring a better understanding of your business into the (BI) team?

Do you really need to be highly agile at the CORE level?

Do you really need high traceability and / or auditability?

Which other options do you have to realize your requirements?

What is my budget and time situation?

Do you have a small, probably volatile budget for your BI initiative that is hard to foresee for future phases?

Do you need to quickly obtain a consolidated and flexible data layer (CORE DW vs. Data Vault)?

Operational: Data Vault allows for automation, places analysis at two points of the architecture, and allows agility.

Tactical: Data Vault can help you get as much budget through the door as there is.

Strategic: Service providers can build new business models with Data Vault for the CORE layer.

Contact information

Alexander Mendle, Consultant Insights & Data

[email protected]

Capgemini Deutschland GmbH Olof-Palme-Str. 14

81829 München

www.capgemini.com

About Capgemini

With more than 120,000 people in 40 countries, Capgemini is one of the world's foremost providers of consulting, technology and outsourcing services. The Group reported 2011 global revenues of EUR 9.7 billion.

Together with its clients, Capgemini creates and delivers business and technology solutions that fit their needs and drive the results they want. A deeply multicultural organization, Capgemini has developed its own way of working, the Collaborative Business Experience™, and draws on Rightshore®, its worldwide delivery model.

Rightshore® is a trademark belonging to Capgemini.

The information contained in this presentation is proprietary.

Copyright © 2013 Capgemini. All rights reserved.

Just a few Data Vault Tools

• Example: quipu – http://www.datawarehousemanagement.org/

• An engine to play around with: https://sourceforge.net/projects/pdidatavaultfw/ (Linux, MySQL, Kettle, configured via Excel)

Pictures

• https://commons.wikimedia.org/wiki/File:Gao-report-on-interchange.gif, 13.3.16, Public domain

• https://commons.wikimedia.org/wiki/File:KUKA_robot_for_flat_glas_handling.jpg, 9.3.16, Public domain

• https://pixabay.com/de/b%C3%BCro-ordner-regal-fenster-firma-638247/, 9.3.16, Public domain

• https://www.flickr.com/photos/nationalsecurityzone/8552562622/in/photostream/, 9.3.16, https://creativecommons.org/licenses/by/2.0/, By: MedillNSZ (https://www.flickr.com/photos/nationalsecurityzone/)

• https://commons.wikimedia.org/wiki/File:2010.07.21.152950_Abf%C3%BCllanlage_Gerolstein.jpg, 9.3.16, By Hermann Luyken (Own work) [Public domain], via Wikimedia Commons

• https://commons.wikimedia.org/wiki/File:Euplectes_progne_male_South_Africa_cropped.jpg, 9.3.16, Public domain

Refer to these Websites for more information.