Is it sensible to use Data Vault at all? Conclusions from a project.
Mainz, 15 March 2016, 11th Oracle DWH Community Treffen, Alexander Mendle (Insights & Data, Capgemini)
Our customers’ business model. Project setup.
Greenfield DW with three source systems: ERP, CMS, and a transaction system. Data Vault was mandated by our customer's group, which also provided a Data Vault architect.
Capgemini supported the project in implementation and testing.
… and being part of a multi-faceted group …
180,000 employees (1) in more than 40 countries
A promise that expresses our brand philosophy
Revenues (2): €10.573 billion
Operating margin: €486 million
Operating profit: €447 million
Net cash and cash equivalents: €1,464 million
6 strategic alliances: EMC2, HP, IBM, Microsoft, Oracle, SAP
7 values shared since the company’s creation in 1967: honesty, boldness, trust, freedom, team spirit, modesty, fun
A wide range of cutting-edge expertise for all our clients
Five strategic sectors
Expertise in Automotive, Banking, Consumer Products & Retail, Energy and Utilities, Insurance
A unique way
(1) Headcount including IGATE. (2) For the FY15-16.
Capgemini Insights & Data Services Model
Copyright © Capgemini 2013. All Rights Reserved
5 DATAVAULT IST DER EINSATZ VON DATA VAULT SINNVOLL-V10.PPTX
Industry verticalization: Automotive, Consumer Products & Retail, Public Sector, Financial Services, Telco, Energy & Utilities, Life Sciences, Media & Entertainment
Core capabilities and offers: Data & Info Management, Master Data Management, Big Data (Hadoop/NoSQL), Optimized data warehouse, EPM (Enterprise Performance Management), BI & Data Visualization, Predictive + data science, Real-time analytics
Delivery models: BI Service Center, Cloud, Application management, Agile, as-a-service & BPO, IP Solutions, Rapid prototyping/POC
Business engagement: Strategic customer partnership, Digital transformation, Specific alliance initiatives, Performance management & strategy, Risk sharing
Governance, architecture and strategy: Data and information governance, Architecture, Privacy & security, Information strategy, Business Data Lake
Agenda
What is Data Vault? – just a quick glimpse
Impact on Architecture
Impact on Implementation
Impact on Project
Summary
Data Vault is applicable in a multi-layer DWH architecture
Classic architecture: Stage → Core (mostly 3NF) → Data Mart(s)
Source: Linstedt: “Super Charge Your Data Warehouse”, no publisher, no place, ISBN: 978-0-9866757-1-3, 2010
Data Vault is mostly associated with its distinctive data modelling approach. As of its newest version, however, it is a comprehensive package of data modelling, project methodology and system architecture.
Data Vault architecture: Stage → Data Vault → Data Mart(s)
Hub: list of business keys.
Link: N:M relations between Hubs.
Satellites: details for Hubs & Links (historized).
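To make the three table types concrete, here is a minimal Python sketch. It is an illustration only: the class and attribute names are hypothetical and not taken from the project, and real Data Vault tables also carry load timestamps and record sources on hubs and links, which are left out here for brevity.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Hub:
    """A hub is a unique list of business keys."""
    name: str
    keys: set = field(default_factory=set)

    def load(self, business_key: str) -> None:
        self.keys.add(business_key)  # re-loading a known key changes nothing


@dataclass
class Link:
    """A link stores N:M relations between hubs as key combinations."""
    name: str
    hubs: tuple
    rows: set = field(default_factory=set)

    def load(self, *business_keys: str) -> None:
        if len(business_keys) != len(self.hubs):
            raise ValueError("expected one business key per participating hub")
        self.rows.add(business_keys)


@dataclass
class Satellite:
    """A satellite holds the historized details of one hub or link."""
    name: str
    parent: str
    history: list = field(default_factory=list)

    def load(self, key, attributes: dict, load_ts: datetime) -> None:
        # Insert-only: a new version is appended only if the attributes changed.
        if self.current(key) != attributes:
            self.history.append((key, load_ts, dict(attributes)))

    def current(self, key):
        versions = [row for row in self.history if row[0] == key]
        return max(versions, key=lambda row: row[1])[2] if versions else None
```

Loading the same business key twice leaves the hub unchanged, and loading unchanged attributes adds no new satellite version, so history grows only when something actually changes.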
The Data Vault proposition: agile, quick and cheap
These are the proposed enhancements...
Source: Linstedt, Olschimke: “Experiences from a Data Vault 2.0 pilot”, 13th European TDWI Conference, Munich, 2013
Reduction in Total Cost of Ownership | More agility
• Supports cross-functional areas of business
• Near zero change impacts to existing system
• Reduction in data acquisition costs
• Reduction in maintenance costs over the life of the EDW
• Reduction in implementation complexity
• Compliance with full audits (“all the data, all of the time”)
• Rapid turn-around for new requirements
• Parallel teams – all agile
• Scalable teams – with limited ramp-up necessary
• Automated ETL generation based on patterns.
Architecture easily deals with changes
Existing structures are not changed, whether the change is just a new attribute, a whole new source system, or a change in the business or in semantics. New things are simply added, without side effects.
Source: Linstedt: “Super Charge Your Data Warehouse”, no publisher, no place, ISBN: 978-0-9866757-1-3, 2010
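The "add, don't change" principle can be sketched in a few lines of Python. This is a hedged illustration, not the project's tooling: the model is a plain dictionary, and all table and column names (hub_customer, sat_customer_cms, newsletter_opt_in) are hypothetical.

```python
import copy

# Hypothetical model: table name -> column list. Names are illustrative only.
model = {
    "hub_customer": ["customer_bk"],
    "sat_customer_erp": ["customer_bk", "load_ts", "name", "city"],
}


def add_satellite(model: dict, name: str, hub: str, attributes: list) -> dict:
    """Return a model with one more satellite; existing tables stay as they are."""
    if hub not in model:
        raise KeyError(f"unknown hub: {hub}")
    if name in model:
        raise ValueError(f"{name} already exists; add a new satellite instead")
    extended = copy.deepcopy(model)
    hub_key = model[hub][0]
    extended[name] = [hub_key, "load_ts", *attributes]
    return extended


# A new source system (here: a CMS) brings new customer attributes.
extended = add_satellite(
    model, "sat_customer_cms", "hub_customer", ["newsletter_opt_in"]
)
# Every table that existed before is still defined exactly as before.
unchanged = all(extended[table] == columns for table, columns in model.items())
```

The new attributes land in a new satellite next to the existing one, so loads and queries built on the existing tables are not affected.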
Unique construction of tables enables industrialization
Hub tables: unique “list” of business keys
Link tables: unique list of business key combinations
Hub & Link satellite tables: attributes belonging to a hub / link entity
And the scheduling: first the hubs, then hub satellites and links, then link satellites. No more dependencies. With Data Vault 2 this became even more flexible.
[Chart: number of manual ETLs vs. number of tables]
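The scheduling rule (hubs first, then hub satellites and links, then link satellites) can be derived from the table dependencies. The following is a minimal sketch using Python's standard-library graphlib (Python 3.9+); the table names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical tables: name -> tables that must be loaded first.
dependencies = {
    "hub_customer": set(),
    "hub_order": set(),
    "sat_customer": {"hub_customer"},
    "link_customer_order": {"hub_customer", "hub_order"},
    "sat_customer_order": {"link_customer_order"},
}


def load_waves(dependencies: dict) -> list:
    """Group tables into waves; all tables within one wave can load in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = load_waves(dependencies)
```

For this model the result is three waves: both hubs, then the link together with the hub satellite, then the link satellite.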
Effects of Data Vault on implementation
Quick implementation of Stage and Core using generators, which is a must given the vast number of objects
Data Vault requires a huge number of objects, but allows highly industrialized implementation
Stage | CORE | Marts
Layers: Stage Layer, Data Vault Layer, Data Marts
Considerations from an implementation viewpoint: does your staff support that?
What will happen when implementing.
This is business as usual.
You will need a strong architect who masters Data Vault and who can also defend it to developers who are not familiar with it. Be prepared for arguments.
Developers will use scripts, generators and configurations, and will not build manual ETLs.
Most time will be spent programming generators, a build toolchain and test automation.
Here you might 1) integrate data into context, 2) homogenize data and 3) realize the data mart requirements, probably all at once.
Watch out that you do not build things over and over again. So be sure to have a good understanding of the semantics of your business model in the team. Use Business Vault or other helpers.
Developers will implement lots of joins in their queries.
A generator-based DW requires a more software-development oriented team – that’s probably a slightly different story than a “BI consultant” team. Do you have the team that bridges the gap?
The process for building a Data Vault is straightforward, and the CORE can be built quickly...
Analysis of the business as usual ...
... but not much design effort is needed, because the Data Vault modelling rules apply.
ETLs can be generated from only four different templates (Hubs, Links, Hub Satellites and Link Satellites)
Data Vault can help you pick up pace with the CORE (raw Data Vault)
Analysis → Design → ETL generation → Loading
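Generating ETL from four templates can be sketched as follows. This is a simplified illustration, not the generator used in the project: the SQL, the metadata layout and all table and column names are assumptions. In this sketch the hub and link templates share one pattern (insert unknown keys), as do the two satellite templates (insert a new version when the latest one differs); NULL-safe attribute comparison is deliberately omitted.

```python
KEY_INSERT = (
    "INSERT INTO {target} ({keys}) "
    "SELECT DISTINCT {src_keys} FROM {source} s "
    "WHERE NOT EXISTS (SELECT 1 FROM {target} t WHERE {key_match})"
)
SAT_INSERT = (
    "INSERT INTO {target} ({keys}, load_ts, {attrs}) "
    "SELECT DISTINCT {src_keys}, CURRENT_TIMESTAMP, {src_attrs} FROM {source} s "
    "WHERE NOT EXISTS (SELECT 1 FROM {target} t WHERE {key_match} AND {attr_match} "
    "AND t.load_ts = (SELECT MAX(x.load_ts) FROM {target} x WHERE {latest_match}))"
)

# One template per Data Vault table type (hypothetical, simplified SQL).
TEMPLATES = {
    "hub": KEY_INSERT,
    "link": KEY_INSERT,
    "hub_satellite": SAT_INSERT,
    "link_satellite": SAT_INSERT,
}


def render(kind, target, source, keys, attrs=()):
    """Fill the template for one table from its metadata."""
    if kind.endswith("satellite") and not attrs:
        raise ValueError("a satellite needs at least one attribute")

    def cols(alias, names):
        return ", ".join(f"{alias}.{n}" for n in names)

    def match(left, right, names):
        return " AND ".join(f"{left}.{n} = {right}.{n}" for n in names)

    return TEMPLATES[kind].format(
        target=target, source=source,
        keys=", ".join(keys), attrs=", ".join(attrs),
        src_keys=cols("s", keys), src_attrs=cols("s", attrs),
        key_match=match("t", "s", keys),
        attr_match=match("t", "s", attrs),
        latest_match=match("x", "s", keys),
    )


# Hypothetical metadata; in a real toolchain this would come from configuration.
MODEL = [
    {"kind": "hub", "target": "hub_customer", "source": "stg_erp_customer",
     "keys": ["customer_bk"]},
    {"kind": "hub_satellite", "target": "sat_customer", "source": "stg_erp_customer",
     "keys": ["customer_bk"], "attrs": ["name", "city"]},
    {"kind": "link", "target": "link_customer_order", "source": "stg_erp_order",
     "keys": ["customer_bk", "order_bk"]},
    {"kind": "link_satellite", "target": "sat_customer_order", "source": "stg_erp_order",
     "keys": ["customer_bk", "order_bk"], "attrs": ["amount"]},
]

statements = [render(**table) for table in MODEL]
```

Adding a table then means adding one metadata entry rather than writing another ETL by hand, which is where the industrialization effect comes from.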
The point is not how easily you can store lots of things; it is how you retrieve information from them.
Data Vault is a paradigm change in many ways.
Quick and large store. “All the data, all of the time”
Needs experience for retrieval
High effort for systematic Semantics built-in
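Why retrieval needs experience can be seen with a small example: because satellites are historized, even a simple lookup has to pick the version that was valid at a given point in time. The sketch below uses hypothetical in-memory rows instead of database tables.

```python
from datetime import datetime

# Hypothetical satellite rows: (business key, load timestamp, attributes).
sat_customer = [
    ("C-1", datetime(2015, 9, 1), {"city": "Mainz"}),
    ("C-1", datetime(2016, 1, 15), {"city": "München"}),
    ("C-2", datetime(2015, 10, 1), {"city": "Berlin"}),
]


def as_of(rows, key, point_in_time):
    """Return the attributes that were valid for `key` at `point_in_time`."""
    candidates = [r for r in rows if r[0] == key and r[1] <= point_in_time]
    if not candidates:
        return None  # the key was not yet known at that time
    return max(candidates, key=lambda r: r[1])[2]
```

In SQL the same logic has to be repeated for every satellite that is joined, which is one reason why queries on a Data Vault contain many joins.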
...but may slow you down when building the marts
Data Vault may help you pick up pace – up to the CORE layer (i. e. “DataVault Layer”)
Analysis of the business as usual, but ...
... no separate “CORE” modelling, because the Data Vault rules apply
“CORE” is done, but what is still lacking?
• Data homogenisation?
• “Business entities” (n:m)?
• Probably the proper analysis of all that?
On top of that, you need to (technically) design your data mart
Implementing all these complex rules in ETL
Quick setup of the CORE model
Quick implementation of STAGE and CORE: generated and automated build, test, rollout
[Diagram labels: faster | slower]
That was our plan, looking at it in November
Timeline: Aug | Sep | Oct | Nov | Dec | Jan | Feb | Mar
Phase 1 deliverables: mappings and workflows for STAGE, RAW DATA VAULT
Phase 2 deliverables: mappings and workflows for MASTER and MASTER CHECK, BUSINESS VAULT, MART
Staging and Raw DataVault went quite fast. Business rules AND requirements are implemented within Data Mart, whereas CORE can be generated!
Make use of helpers such as Business Vault.
Not the DW – the toolchain software is the deliverable
Do you have a need to write / request a special type of offer? Consider buying / offering a generator tool chain instead of tables and ETL programs.
Customer:
• Consider writing your next RfP not as an RfP for a DW but as an RfP for generator software.
• Obtain control over the build tools in your project; the result is reproducible.
Service Provider:
• Consider a bid offering the generator software. Your offer might look astonishingly compelling.
• Give your customer a good argument for going with a generated solution, and refrain from it if you think there is no fit.
Do I have a strong need to enable agile methodologies?
Do I have the right people to support that?
Is there a strong need for any special Data Vault characteristic?
Summary: putting these aspects together with a decision focus
Data Vault enables you to integrate new source systems quickly and realize new requirements with minimal dependencies.
Data Vault requires somewhat different skills than a classic BI project.
Data Vault differs from a classic BI environment in analysis and operation.
Are you able to bring a better understanding of your business into the (BI) team?
Do you really need to be highly agile at the CORE level?
Do you really need to have high traceability and / or auditability?
Which other possibilities do you have to realize your requirements?
What is my budget and time situation?
Do you have a small, possibly volatile budget for your BI initiative that cannot be foreseen for future phases?
Do you need to quickly obtain a consolidated and flexible data layer (CORE DW vs. Data Vault)?
Operational: Data Vault allows for automation, places analysis at two points in the architecture, and allows agility.
Tactical: Data Vault can help you get as much budget through the door as there is.
Strategic: Service providers can build new business models with Data Vault for the CORE layer.
Contact information
Alexander Mendle, Consultant, Insights & Data
Capgemini Deutschland GmbH, Olof-Palme-Str. 14, 81829 München
www.capgemini.com
About Capgemini
With more than 120,000 people in 40 countries, Capgemini is one of the world's foremost providers of consulting, technology and outsourcing services. The Group reported 2011 global revenues of EUR 9.7 billion.
Together with its clients, Capgemini creates and delivers business and technology solutions that fit their needs and drive the results they want. A deeply multicultural organization, Capgemini has developed its own way of working, the Collaborative Business Experience™, and draws on Rightshore®, its worldwide delivery model.
Rightshore® is a trademark belonging to Capgemini.
The information contained in this presentation is proprietary.
Copyright © 2013 Capgemini. All rights reserved.
Just a few Data Vault Tools
• Example: quipu – http://www.datawarehousemanagement.org/
• An engine to play around with: https://sourceforge.net/projects/pdidatavaultfw/ (Linux, MySQL, Kettle, configured via Excel)
Pictures
• https://commons.wikimedia.org/wiki/File:Gao-report-on-interchange.gif, 13.3.16, Public domain
• https://commons.wikimedia.org/wiki/File:KUKA_robot_for_flat_glas_handling.jpg, 9.3.16, Public domain
• https://pixabay.com/de/b%C3%BCro-ordner-regal-fenster-firma-638247/, 9.3.16, Public domain
• https://www.flickr.com/photos/nationalsecurityzone/8552562622/in/photostream/, 9.3.16, https://creativecommons.org/licenses/by/2.0/, By: MedillNSZ (https://www.flickr.com/photos/nationalsecurityzone/)
• https://commons.wikimedia.org/wiki/File:2010.07.21.152950_Abf%C3%BCllanlage_Gerolstein.jpg, 9.3.16, By Hermann Luyken (Own work) [Public domain], via Wikimedia Commons
• https://commons.wikimedia.org/wiki/File:Euplectes_progne_male_South_Africa_cropped.jpg, 9.3.16, Public domain
Refer to these Websites for more information.