big data modeling

Post on 14-Apr-2017

Category:

Data & Analytics

BIG DATA MODELING

Hans Hultgren

RMDC Fall 2016

Welcome

1) Big Data

2) Data Modeling

3) Big Data Modeling

AGENDA

Session Objectives

• Big Data Fundamentals
– Components of Big Data
– Structure & Schemas
– Tools & Architecture

• Data Modeling
– Integration & History
– Data Warehousing & BI
– Conceptual to Physical

• Big Data Modeling
– Focus on Meaning

• Ensemble Modeling
– The Blended Architecture

BIG DATA

Big Data

“Huge” Data Volumes

n-Structured & Very Complex

Streaming & Shape-Shifting

[Figure: Typical Data compared with Big Data]

Big Data

• Volume: Huge Volumes of Data

• Velocity: Drinking from a Fire Hose

• Variety: n-Structured Data

• Veracity: Quality, Accuracy, Reliability, Trustworthiness

• Value: Business Value and Value Potential

Big Data Architecture

• To deal with the features of Big Data, supporting architectural components are based on:

– Data distribution, and

– Late Binding of Schemas

KVP

Modeling and Understanding

• Schema on Write

• Schema on Read

• Dismantled Schema on Write

• Schema on Focus

• Schema on Leverage
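The contrast between the first two items, Schema on Write and Schema on Read, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `CUSTOMER_SCHEMA` dictionary, the function names and the sample records are all hypothetical and do not come from the slides.

```python
import json

# Hypothetical schema, used only for this illustration.
CUSTOMER_SCHEMA = {"customer_id": int, "name": str}


def write_with_schema(store, record):
    """Schema on Write: validate and shape the record before it is stored."""
    shaped = {}
    for field, field_type in CUSTOMER_SCHEMA.items():
        if field not in record:
            raise ValueError(f"missing field: {field}")
        shaped[field] = field_type(record[field])
    store.append(shaped)


def write_raw(store, payload):
    """Schema on Read, load step: keep the payload exactly as it arrived."""
    store.append(payload)


def read_with_schema(store, schema):
    """Schema on Read, read step: bind a schema late, when the data is queried."""
    rows = []
    for payload in store:
        doc = json.loads(payload)
        rows.append({f: (t(doc[f]) if f in doc else None) for f, t in schema.items()})
    return rows


curated = []
write_with_schema(curated, {"customer_id": "42", "name": "Ada"})

raw = []
write_raw(raw, '{"customer_id": "42", "name": "Ada", "segment": "gold"}')
write_raw(raw, '{"customer_id": "43"}')

# The schema applied here was not known (or needed) at load time.
late_bound = read_with_schema(raw, {"customer_id": int, "segment": str})
```

With Schema on Write the incomplete second record would be rejected at load time; with Schema on Read it is stored and the missing attribute only surfaces as `None` when a schema is applied.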

[Figure: process steps LOAD, MODEL, APPLY, EXPLORE]

Modeling and Understanding

• Big Data Possibilities

[Figure: process steps LOAD, MODEL, APPLY, EXPLORE]

Inconvenient Truth about BIG DATA

http://community.embarcadero.com/blogs/entry/the-hidden-elephant-in-big-data-modeling

DATA MODELING

Data Modeling

Man's Search for Meaning…

• Conceptual Modeling

• Logical Modeling

• Information Modeling

• Physical Data Modeling

Ensemble Modeling™

All the parts of a thing taken together, so that each part is considered only in relation to the whole.

• The constellation of component parts acts as a whole.

• With Ensemble Modeling the Core Business Concepts that we define and model are represented as a whole – an ensemble – including all of the component parts. An Ensemble is typically based on all things defining a Core Business Concept that can be uniquely and specifically said for one instance of that Concept.

EMF

Forms of Modeling & Ensemble

[Figure: forms of modeling. Ensemble forms: Anchor, Focal Point, Data Vault, DV2.0, 2G, Hyper Agility, Temporal, 6NF, Matter, etc. Other forms: 3NF, Dimensional. Architecture labels: ERP, Acctg, Sales, EDW, Data Mart (three instances).]

The Data Vault Ensemble

• The Data Vault Ensemble conforms to a single key – embodied in the Hub construct.

• The component parts for the Data Vault Ensemble include:

– Hub: The Natural Business Key

– Link: The Natural Business Relationships

– Satellite: All Context, Descriptive Data and History
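The three component parts listed above can be sketched as plain Python data structures. This is a minimal illustration, not a Data Vault implementation: the class fields, the `current_context` helper and the sample keys are assumptions introduced here, and real Data Vault tables carry additional metadata that this sketch leaves out.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Hub:
    concept: str        # Core Business Concept, e.g. "Customer"
    business_key: str   # the natural business key


@dataclass(frozen=True)
class Link:
    name: str
    hubs: tuple         # the hub instances taking part in the relationship


@dataclass
class Satellite:
    parent: Hub
    load_date: date     # when this slice of context was recorded
    attributes: dict    # descriptive data valid from load_date onward


def current_context(hub, satellites):
    """Assemble the latest descriptive context for one hub instance."""
    context = {}
    own = (s for s in satellites if s.parent == hub)
    for sat in sorted(own, key=lambda s: s.load_date):
        context.update(sat.attributes)  # later rows override earlier values
    return context


customer = Hub("Customer", "C-1001")
store = Hub("Store", "S-07")
customer_store = Link("CustomerStore", (customer, store))

satellites = [
    Satellite(customer, date(2016, 1, 1), {"name": "Ada", "city": "Denver"}),
    Satellite(customer, date(2016, 9, 1), {"city": "Boulder"}),
    Satellite(store, date(2016, 1, 1), {"city": "Golden"}),
]

latest_customer = current_context(customer, satellites)
```

The hub carries only the key, the link only the relationship, and all context lives in satellite rows. History is kept because older satellite rows are never overwritten; the helper merely reads them in load-date order.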

Ensemble means thinking differently

• The minimal construct then for an “entity” such as “Customer” is now (in data vault) a Hub with a set of Satellites

Applying data vault modeling pattern

Data Vault Ensemble Modeling Process

1) Identify and Model the Core Business Concepts

• Business interviews are at the heart of this step

– What do you do? What are the main things you work with?

• Find the best/target Natural Business Key

2) Identify and Model the Natural Business Relationships

• Specific Unique Relationships

• Consider the Unit of Work and Grain

3) Analyze and Design the Context Satellites

• Consider Rate of Change, Type of Data and also the Sources
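One possible way to act on this step is to group a hub's attributes so that each satellite holds attributes from a single source with a similar rate of change. The sketch below is a simple heuristic written for illustration, not the author's method: the attribute profiles, the source names, the "slow"/"fast" labels and the satellite naming pattern are all assumptions, and it ignores the type-of-data criterion.

```python
from collections import defaultdict

# Hypothetical attribute profiles for a Customer hub; names, sources and
# rate-of-change labels are assumptions made for this illustration.
ATTRIBUTE_PROFILES = {
    "name": {"source": "crm", "rate_of_change": "slow"},
    "birth_date": {"source": "crm", "rate_of_change": "slow"},
    "loyalty_points": {"source": "pos", "rate_of_change": "fast"},
    "last_visit": {"source": "pos", "rate_of_change": "fast"},
    "credit_limit": {"source": "erp", "rate_of_change": "slow"},
}


def design_satellites(hub_name, profiles):
    """Group attributes so each satellite has one source and one rate of change."""
    groups = defaultdict(list)
    for attribute, profile in profiles.items():
        groups[(profile["source"], profile["rate_of_change"])].append(attribute)
    return {
        f"sat_{hub_name}_{source}_{rate}": sorted(attributes)
        for (source, rate), attributes in groups.items()
    }


satellites = design_satellites("customer", ATTRIBUTE_PROFILES)
```

Keeping fast-changing attributes apart from slow-changing ones means a change to `loyalty_points` does not force a new history row for `name` and `birth_date`.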

BIG DATA MODELING

Logical business model

• Leveraged for all logical model needs including the data warehouse, big data lake, master data management (MDM) and operational integration initiatives

• Closely aligned to DV physical model

Ensemble Logical Form (ELF)

[Figure: Ensemble Logical Form model with the concepts Customer, Region, Store, Sale, Sale LI, Product, Vendor and Employee]

Ensemble Logical Form

ELF Modeling maintained in:

* Metadata

* Logical Data Model

* Data Modeling Tools

* Virtual Schemas

* Other Tools or Artifacts

Map to Context Data stored in:

* JSON Docs

* XML (w/ XSD or Not)

* Blobs (Free Form Text)

* Big Data Platforms

* Hadoop

* In the Cloud

Three Paths for Modeling

Structured / Known

• CBC

• NBR

• Attribution

• Columns

Results in a backbone model with attributes in defined columns

N-Structured / NVP

• CBC

• NBR

• Attribution

Results in a backbone model with known/expected attribute names/tags

N-Structured / KVP

• CBC

• NBR

Results in a backbone model with capacity to capture unknown attribution either named/tagged or not
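The three paths can be contrasted with a small Python sketch in which the same backbone key (a Core Business Concept instance) receives context in three different ways. Everything here is an assumption made for illustration: the function names, the `CUSTOMER_KEY` value, the expected tag names and the choice of JSON as the tagged format are not taken from the slides.

```python
import json

CUSTOMER_KEY = "C-1001"  # hypothetical natural business key on the backbone


def structured_context(record, columns):
    """Structured / Known: attributes land in defined columns."""
    return {column: record.get(column) for column in columns}


def nvp_context(pairs, expected_names):
    """N-Structured / NVP: name-value pairs limited to known/expected names."""
    return [(name, value) for name, value in pairs if name in expected_names]


def kvp_context(payload):
    """N-Structured / KVP: keep whatever arrives, tagged or not."""
    try:
        parsed = json.loads(payload)
    except json.JSONDecodeError:
        return {"blob": payload}  # untagged free-form text stays a blob
    return parsed if isinstance(parsed, dict) else {"blob": payload}


backbone = {
    CUSTOMER_KEY: {
        "structured": structured_context(
            {"name": "Ada", "city": "Denver"}, ["name", "city", "phone"]
        ),
        "nvp": nvp_context(
            [("loyalty_tier", "gold"), ("shoe_size", "38")],
            {"loyalty_tier", "preferred_store"},
        ),
        "kvp": [
            kvp_context('{"mood": "happy"}'),
            kvp_context("called about a late delivery"),
        ],
    }
}
```

In all three cases the backbone (key and relationships) is modeled up front; what varies is how much of the attribution is known before the data arrives.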

APPLYING THE ENSEMBLE

Integration across Platforms

Expanded Applications

Summary

Ensemble in the Big Data World

• Conceptual Modeling

• Logical Modeling

• Information Modeling

• Physical Data Modeling

• Integration Platform

Links and Information

CDVDM Training & Certification

www.GeneseeAcademy.com

gohansgo

Hans@GeneseeAcademy.com

HansHultgren.WordPress.com

HansHultgren

Online, On-Demand Video Lessons

DataVaultAcademy.com

DataVaultAcademy

e-Book: Modeling the Agile Data Warehouse with Data Vault

Book: Modeling the Agile Data Warehouse with Data Vault
