
Upload: denodo

Post on 16-Apr-2017


© 2016 Autodesk | Enterprise Information Services

Designing an Agile Fast Data Architecture for Big Data Ecosystem

using Logical Data Warehouse and Data Virtualization

Kurt Jackson

Autodesk Enterprise Information Services


Introduction


Some Definitions

Agile

“The division of tasks into short phases of work and frequent reassessment and adaptation of plans.”

Data Architecture

“The models, policies, rules or standards that govern which data is collected, and how it is stored, arranged, integrated.”

Logical Data Warehouse

“A logical abstraction layer which sits on top of a variety of enterprise data sources. The logical layer provides durable data views without needing to move or transform data from the sources.”

Data Virtualization

“Data management that allows an application to retrieve and manipulate data without knowing specific details about the data, such as how it is formatted or where it is physically located.”
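The data virtualization definition above can be illustrated with a minimal sketch. It is not taken from the deck: the `VirtualCatalog` class, the view names, and the two toy sources (an in-memory SQLite table and a CSV string) are assumptions chosen only to show a consumer reading logical views without knowing how or where the data is stored.

```python
import csv
import io
import sqlite3

# Hypothetical CSV source; in practice this might be a file in a data lake.
CSV_ORDERS = "customer_id,amount\n1,30\n2,12\n1,8\n"


def make_sqlite_source():
    """Hypothetical relational source holding customer records."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
    conn.executemany("INSERT INTO customers VALUES (?, ?)",
                     [(1, "Ada"), (2, "Grace")])
    return conn


class VirtualCatalog:
    """Maps logical view names to fetch functions; callers never see the source."""

    def __init__(self):
        self._views = {}

    def register(self, name, fetch):
        self._views[name] = fetch

    def query(self, name):
        # Every view returns a list of dicts, whatever the backing format is.
        return self._views[name]()


def build_catalog():
    conn = make_sqlite_source()
    catalog = VirtualCatalog()
    catalog.register(
        "customers",
        lambda: [{"id": r[0], "name": r[1]}
                 for r in conn.execute("SELECT id, name FROM customers ORDER BY id")],
    )
    catalog.register(
        "orders",
        lambda: [{"customer_id": int(r["customer_id"]), "amount": int(r["amount"])}
                 for r in csv.DictReader(io.StringIO(CSV_ORDERS))],
    )
    return catalog


def total_by_customer(catalog):
    """Consumer code: joins two logical views with no knowledge of SQLite or CSV."""
    names = {c["id"]: c["name"] for c in catalog.query("customers")}
    totals = {}
    for order in catalog.query("orders"):
        key = names[order["customer_id"]]
        totals[key] = totals.get(key, 0) + order["amount"]
    return totals
```

A real data virtualization product would also push queries down to the sources and optimize joins; this sketch only shows the access abstraction.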


Agile Data Architecture Lifecycle

[Diagram: Agile, Data Architecture, Logical Data Warehouse, Data Virtualization]


Business Problem


Autodesk’s Business Challenge

Multi-year transition from Perpetual to Subscription and Rental


Most of us are in the same boat


The Autodesk Agile Data Architecture


Philosophy

Access and refine data near the source

Published logical data interfaces

Truly agile data environment


Autodesk Data Architecture


Why Build the Logical Data Warehouse

Data virtualization can be used throughout your data pipeline!


Big Data Ecosystem


One More Definition

Data Governance

“The management of the availability, usability, integrity, and security of the data employed in an enterprise.”


Logical Data Warehouses are an essential part of your Data Governance Strategy for your Big Data Ecosystem

Availability
Channeling end user access through a single governance point simplifies administration

Usability
The LDW provides a single repository for schema definitions
Simplifies end-user access for visualization and interpretation

Integrity
Only published views in the LDW are publicly available
Coupled with ownership, guarantees the quality of the data set

Security
The LDW can provide a single point for authentication, authorization and audit trail for end user access
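The idea of a single point for authentication, authorization and audit can be sketched as follows. This is a toy illustration, not Denodo's API or Autodesk's implementation: `GovernedGateway`, its fields, and the in-memory dictionaries standing in for a corporate directory and a grants table are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class GovernedGateway:
    """Hypothetical single governance point in front of published views."""
    directory: dict   # user -> set of roles (stand-in for a corporate directory)
    grants: dict      # view name -> set of roles allowed to read it
    published: dict   # view name -> callable returning rows (published views only)
    audit_log: list = field(default_factory=list)

    def read(self, user, view):
        roles = self.directory.get(user)
        if roles is None:                       # authentication
            self._deny(user, view, "unknown user")
        if view not in self.published:          # integrity: unpublished views are invisible
            self._deny(user, view, "view not published")
        if not roles & self.grants.get(view, set()):   # authorization
            self._deny(user, view, "not authorized")
        self.audit_log.append((user, view, "allowed"))  # audit trail
        return self.published[view]()

    def _deny(self, user, view, reason):
        # Denials are audited too, then surfaced to the caller.
        self.audit_log.append((user, view, "denied: " + reason))
        raise PermissionError(f"{user} cannot read {view}: {reason}")
```

Because every read passes through `read`, the audit log covers both successful and refused access, which is the property the slide attributes to the LDW.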


The Logical Data Warehouse implements the philosophy

Access and refine data near the source
No painful ETL pipelines for data derivation
Leverage power of Spark for fast access

Published logical data interfaces
Single access point for all external data sets
Enterprise-class governance across the big data ecosystem

Truly agile data environment
Facilitates rapid change/evolution in your big data ecosystem
Rip and replace becomes almost transparent: replace the system that delivers those views and you’re done
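A minimal sketch of the "rip and replace" point, under stated assumptions: `PublishedView`, its column contract, and the two provider functions are hypothetical names, and the providers return hard-coded rows rather than querying a real warehouse or Spark cluster. The sketch shows a published view keeping its name and columns while the system behind it is swapped.

```python
class PublishedView:
    """Hypothetical published view: a stable name and column contract
    in front of a replaceable backing system."""

    def __init__(self, name, columns, provider):
        self.name = name
        self.columns = set(columns)
        self._provider = provider

    def _check(self, rows):
        # Enforce the published contract: every row exposes exactly these columns.
        for row in rows:
            if set(row) != self.columns:
                raise ValueError(f"{self.name}: row {row!r} breaks the published contract")

    def replace_provider(self, provider):
        # Validate the new backend before switching; on failure the old one stays.
        self._check(provider())
        self._provider = provider

    def rows(self):
        rows = self._provider()
        self._check(rows)
        return rows


def legacy_warehouse_provider():
    # Stand-in for a query against a legacy data warehouse.
    return [{"customer": "acme", "revenue": 15.0}]


def new_engine_provider():
    # Stand-in for the same result served by a replacement engine.
    return [{"customer": "acme", "revenue": 15.0}]
```

Consumers keep calling `rows()` on the same view object; only `replace_provider` is invoked during the migration.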


Building the Agile Data Architecture at Autodesk


Implementation Approach

Identify enterprise data sources
Harder than you think

All new custom streaming, highly-available ingestion mechanism
Self-service or nearly so
Kafka/Flume

Leverage best-of-breed for individual components
Spark for ETL and fast access
HCatalog/Oozie for metadata and job orchestration
Denodo for LDW

Leverage highly-redundant cloud storage for the data lake
S3

Develop canonical representations for your data sets
Freakin’ hard!

Virtualize Spark fast access, data warehouses and marts with a next-generation Logical DW
New implementations leverage the LDW
Legacy migrates opportunistically to Spark fast access
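The step "develop canonical representations for your data sets" might look like the sketch below. Everything in it is an assumption made for illustration: the `CanonicalSubscription` record, the two source record shapes (a billing feed and a CRM export), and their field names do not come from the deck.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CanonicalSubscription:
    """Hypothetical canonical shape shared by every downstream view."""
    customer_id: str
    product: str
    start: date


def from_billing(rec):
    # Hypothetical billing feed: numeric customer number, ISO dates, lower-case SKU.
    return CanonicalSubscription(
        customer_id=str(rec["cust_no"]),
        product=rec["sku"].upper(),
        start=date.fromisoformat(rec["start_dt"]),
    )


def from_crm(rec):
    # Hypothetical CRM export: string account id, US-style MM/DD/YYYY dates.
    month, day, year = (int(part) for part in rec["Start"].split("/"))
    return CanonicalSubscription(
        customer_id=rec["AccountId"],
        product=rec["Product"].upper(),
        start=date(year, month, day),
    )
```

Once both sources map to the same frozen dataclass, records describing the same subscription compare equal, which makes deduplication and cross-source joins straightforward. The hard part the slide alludes to is agreeing on the canonical fields, not writing the mappers.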


Architecting the Data Virtualization Layer

[Diagram: Data Consumers; Corporate LDAP; Data Virt Instance 1 through Data Virt Instance n; Logging Infrastructure; CI/CD; Source Repository; Legacy Data Sources; flows labeled Data, Code and Audit]


Build an Information Architecture

Base views to abstract data sources

Layered derived views to reflect successively refined derivations

Create the notion of publication for curated, externally visible views

Expose services on top of views to make views more accessible

Separate namespaces (schemas) by project or subject area

Build the notion of commonality for views shared across schemas

Naming conventions for all objects

Data portal for one-stop shopping for data consumers
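The layering described above (base views, derived views, published views, naming conventions) can be sketched with plain SQL views. The example uses Python's built-in `sqlite3` module rather than a data virtualization product, and the `bv_`, `dv_` and `pub_` prefixes, the `sales` subject area, and the table contents are assumptions invented for illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE raw_orders (order_id INTEGER, cust TEXT, amt_cents INTEGER, status TEXT);
INSERT INTO raw_orders VALUES
    (1, 'acme', 1500, 'PAID'),
    (2, 'acme', 700, 'VOID'),
    (3, 'globex', 900, 'PAID');

-- Base view: abstracts the source and normalizes column names.
CREATE VIEW bv_sales_orders AS
    SELECT order_id, cust AS customer, amt_cents, status FROM raw_orders;

-- Derived view: a refinement built only on the base view.
CREATE VIEW dv_sales_paid_orders AS
    SELECT order_id, customer, amt_cents / 100.0 AS amount
    FROM bv_sales_orders WHERE status = 'PAID';

-- Published view: the curated, externally visible interface.
CREATE VIEW pub_sales_revenue_by_customer AS
    SELECT customer, SUM(amount) AS revenue
    FROM dv_sales_paid_orders GROUP BY customer;
"""


def build(conn):
    conn.executescript(SCHEMA)


def published_views(conn):
    """List what a data portal would show: only views with the pub_ prefix."""
    names = [row[0] for row in
             conn.execute("SELECT name FROM sqlite_master WHERE type = 'view' ORDER BY name")]
    return [name for name in names if name.startswith("pub_")]


def revenue(conn):
    return conn.execute(
        "SELECT customer, revenue FROM pub_sales_revenue_by_customer ORDER BY customer"
    ).fetchall()
```

The naming convention does double duty here: it documents each view's layer and lets a portal or access policy select published views by prefix. SQLite has no separate schemas, so the subject area is encoded in the name instead.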


Building an LDW makes your Big Data Ecosystem Enterprise-Ready

Autodesk is a registered trademark of Autodesk, Inc., and/or its subsidiaries and/or affiliates in the USA and/or other countries. All other brand names, product names, or trademarks belong to their respective holders. Autodesk reserves the right to alter product and services offerings, and specifications and pricing at any time without notice, and is not responsible for typographical or graphical errors that may appear in this document.

© 2016 Autodesk | Enterprise Information Services. All rights reserved