Has Traditional MDM Finally Met Its Match?

Upload: inside-analysis

Post on 12-Jul-2015

Category: Technology

TRANSCRIPT

Page 1: Has Traditional MDM Finally Met its Match?

Grab some coffee and enjoy the pre-show banter before the top of the hour!

Page 2: Has Traditional MDM Finally Met its Match?

The Briefing Room

Has Traditional MDM Finally Met Its Match?

Page 3: Has Traditional MDM Finally Met its Match?

Twitter Tag: #briefr

The Briefing Room

Welcome

Host: Eric Kavanagh

@eric_kavanagh

Page 4: Has Traditional MDM Finally Met its Match?

Mission

•  Reveal the essential characteristics of enterprise software, good and bad
•  Provide a forum for detailed analysis of today’s innovative technologies
•  Give vendors a chance to explain their product to savvy analysts
•  Allow audience members to pose serious questions... and get answers!

Page 5: Has Traditional MDM Finally Met its Match?

Topics

2014 Editorial Calendar at www.insideanalysis.com/webcasts/the-briefing-room

This Month: INTEGRATION & DATA FLOW

October: ANALYTIC PLATFORMS

November: DISCOVERY & VISUALIZATION

Page 6: Has Traditional MDM Finally Met its Match?

There’s a New Sheriff in Town!

Executive Summary

•  Speed and power trump the old way
•  Traditional MDM is officially archaic
•  YARN is the new fabric of MDM

Page 7: Has Traditional MDM Finally Met its Match?

Analyst: Robin Bloor

Robin Bloor is Chief Analyst at The Bloor Group

@robinbloor

Page 8: Has Traditional MDM Finally Met its Match?

RedPoint Global

•  RedPoint Global is a data management and integrated marketing technology company
•  RedPoint Data Management offers solutions designed for master data management (MDM), collaboration and architecture integration
•  RedPoint Data Management for Hadoop is YARN-compliant and enables analysts to access and manipulate data directly within the Hadoop cluster

Page 9: Has Traditional MDM Finally Met its Match?

Guest: George Corugedo

George Corugedo is Chief Technology Officer & Co-Founder at RedPoint Global Inc. A mathematician and seasoned technology executive, George has over 20 years of business and technical expertise. As co-founder and CTO of RedPoint Global, George is responsible for leading the development of the RedPoint Convergent Marketing Platform™. A former math professor, George left academia to co-found Accenture’s Customer Insight Practice, which specialized in strategic data utilization, analytics and customer strategy. Previous positions include director of client delivery at ClarityBlue, Inc., a provider of hosted customer intelligence solutions to enterprise commercial entities, and COO/CIO of Riscuity, a receivables management company specializing in the utilization of analytics to drive collections.

Page 10: Has Traditional MDM Finally Met its Match?

MDM for the Modern Data Architecture

September 2014

Page 11: Has Traditional MDM Finally Met its Match?

© RedPoint Global Inc. 2014 Confidential

Purpose of MDM

Create correct and consistent data across the enterprise that earns trust in information and acceleration of growth.

Page 12: Has Traditional MDM Finally Met its Match?

Vicious Cycle of Unmanaged Data

1. Master data issues remain unaddressed or unresolved
2. Garbage in/garbage out creates process confusion
3. Lack of process trust slows business momentum
4. Data conflicts reinforce siloed operations

Page 13: Has Traditional MDM Finally Met its Match?

© Hortonworks Inc. 2014

A Data Architecture Under Pressure

Page 14: Has Traditional MDM Finally Met its Match?

Broad Spectrum of Benefits Across Industries

Page 15: Has Traditional MDM Finally Met its Match?

Gartner’s Nexus of Forces Making Things Worse

Page 16: Has Traditional MDM Finally Met its Match?

Business Benefits of MDM

Page 17: Has Traditional MDM Finally Met its Match?

Types of Data in a Typical Organization

Challenges to Data Lake Approach

•  Severe shortage of MapReduce-skilled resources
•  Inconsistent skills lead to inconsistent results from code-based solutions
•  Nascent technologies require multiple point solutions
•  Technologies are not enterprise grade
•  Some functionality may not be possible within these frameworks

Benefits of a Hadoop Data Lake

•  Data is ingested in its raw state regardless of format, structure or lack of structure
•  Raw data can be used and reused for differing purposes across the enterprise
•  Beyond inexpensive storage, Hadoop is an extremely powerful, scalable and segmentable computational platform
•  Master data can be fed across the enterprise, and deep analytics on clean data is immediately enabled

Page 18: Has Traditional MDM Finally Met its Match?

Big Data Can Become Big Information

•  Ingestion of all data available from any source, format, cadence, structure or non-structure
•  ELT and data transformation, refinement, cleansing, completion, validation and standardization
•  Geospatial processing and geocoding
•  Data profiling, lineage and metadata management
•  Identity resolution, persistent keying and entity profile management
•  Attribute source and consumer mapping
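To make the data profiling item above more concrete, here is a minimal Java sketch that counts filled values and distinct values per column. It is only an illustration under stated assumptions, not RedPoint's implementation: the class name `ProfilingSketch`, the `ColumnProfile` type and the in-memory row format are all hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class ProfilingSketch {

    // Hypothetical per-column summary: how many rows had a value, and which values were seen.
    public static final class ColumnProfile {
        public int filled;
        public final Set<String> distinct = new HashSet<>();
    }

    // Profiles rows held in memory; a real profiler would stream from the cluster instead.
    public static Map<String, ColumnProfile> profile(List<String> header, List<List<String>> rows) {
        Map<String, ColumnProfile> profiles = new LinkedHashMap<>();
        for (String column : header) {
            profiles.put(column, new ColumnProfile());
        }
        for (List<String> row : rows) {
            for (int i = 0; i < header.size(); i++) {
                String value = i < row.size() ? row.get(i) : null;
                if (value == null || value.trim().isEmpty()) {
                    continue; // blank or missing cells count as unfilled
                }
                ColumnProfile p = profiles.get(header.get(i));
                p.filled++;
                // Case-insensitive distinct count (an assumption made for this example).
                p.distinct.add(value.trim().toUpperCase(Locale.ROOT));
            }
        }
        return profiles;
    }

    public static void main(String[] args) {
        Map<String, ColumnProfile> result = profile(
                Arrays.asList("name", "state"),
                Arrays.asList(
                        Arrays.asList("Ann", "NY"),
                        Arrays.asList("Bob", ""),
                        Arrays.asList("ann", "ny")));
        for (Map.Entry<String, ColumnProfile> e : result.entrySet()) {
            System.out.println(e.getKey() + ": filled=" + e.getValue().filled
                    + ", distinct=" + e.getValue().distinct.size());
        }
    }
}
```

Fill counts and distinct counts are usually the first things a profiling pass reports, because they expose empty and low-cardinality columns before any cleansing rules are written.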

Page 19: Has Traditional MDM Finally Met its Match?

Data Lake Architecture for MDM

Data Sources

•  CRM
•  ERP
•  Billing
•  Subscriber
•  Product
•  Network
•  Weather
•  Compete
•  Manuf.
•  Clickstream
•  Online Chat
•  Sensor Data
•  Social Media
•  Call Detail Records
•  Fabrication Logs
•  Sales Feedback
•  Field Feedback

Page 20: Has Traditional MDM Finally Met its Match?

Key Functions for Master Data Management

ETL & ELT
•  Profiling, reads/writes, transformations
•  Single project for all jobs

Data Quality
•  Cleanse data
•  Parsing, correction
•  Geo-spatial analysis

Integration & Matching
•  Grouping
•  Fuzzy match

Master Key Management
•  Create keys
•  Track changes
•  Maintain matches over time

Web Services Integration
•  Consume and publish
•  HTTP/HTTPS protocols
•  XML/JSON/SOAP formats

Process Automation & Operations
•  Job scheduling, monitoring, notifications
•  Central point of control
•  Meta Data Management
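The matching and master key functions listed on this slide can be illustrated with a small Java sketch: names are normalized, compared by edit distance, and given a persistent key that later near-matches reuse. This is an assumption-laden toy and not RedPoint's matching engine. The class `MasterKeySketch`, the distance threshold and the choice of Levenshtein distance as the fuzzy measure are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class MasterKeySketch {

    // Hypothetical master entry: a persistent key plus the normalized name that created it.
    private static final class MasterEntry {
        final int key;
        final String name;
        MasterEntry(int key, String name) { this.key = key; this.name = name; }
    }

    private final List<MasterEntry> masters = new ArrayList<>();
    private final int maxDistance;

    public MasterKeySketch(int maxDistance) {
        this.maxDistance = maxDistance;
    }

    // Cleansing step: lower-case, strip punctuation, collapse whitespace.
    public static String normalize(String raw) {
        return raw.toLowerCase(Locale.ROOT)
                .replaceAll("[^a-z0-9 ]", "")
                .replaceAll("\\s+", " ")
                .trim();
    }

    // Classic two-row dynamic-programming Levenshtein edit distance.
    public static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Returns the existing key of the first close-enough master, else creates a new key.
    public int resolve(String rawName) {
        String name = normalize(rawName);
        for (MasterEntry m : masters) {
            if (levenshtein(m.name, name) <= maxDistance) {
                return m.key;
            }
        }
        MasterEntry created = new MasterEntry(masters.size() + 1, name);
        masters.add(created);
        return created.key;
    }

    public static void main(String[] args) {
        MasterKeySketch resolver = new MasterKeySketch(2);
        for (String n : new String[] {"Acme Corp.", "ACME  Corp", "Acme Corpp", "Globex Inc"}) {
            System.out.println(n + " -> key " + resolver.resolve(n));
        }
    }
}
```

The linear scan over all masters is only workable for a toy data set; production matching typically groups (blocks) candidates first so that only records in the same group are compared, which corresponds to the "Grouping" bullet on the slide.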

Page 21: Has Traditional MDM Finally Met its Match?

So How to Proceed?

Page 22: Has Traditional MDM Finally Met its Match?

Overview - What is Hadoop/Hadoop 2.0

Hadoop 1.0

•  All operations based on MapReduce
•  Intrinsic inconsistency of code-based solutions
•  Highly skilled and expensive resources needed
•  3rd party applications constrained by the need to generate code

Hadoop 2.0

•  Introduction of YARN: “a general-purpose, distributed, application management framework that supersedes the classic Apache Hadoop MapReduce framework for processing data in Hadoop clusters.”
•  Mature applications can now operate directly on Hadoop
•  Reduced skill requirements and increased consistency

Page 23: Has Traditional MDM Finally Met its Match?

RedPoint Data Management on Hadoop

Diagram labels:

•  Partitioning AM / Tasks
•  Execution AM / Tasks
•  Data I/O
•  Key / Split
•  Analysis
•  Parallel Section (UI)
•  YARN
•  MapReduce

Page 24: Has Traditional MDM Finally Met its Match?

Reference Hadoop Architecture

Diagram labels:

•  Monitoring and Management Tools: AMBARI
•  Data Sources: RDBMS, EDW, DBs, JMS Queues, Files
•  Source Data: Sensor Logs, Clickstream, Flat Files, Unstructured, Sentiment, Customer, Inventory
•  Load: SQOOP, WebHDFS, Flume, NFS, SQOOP/Hive
•  Data Refinement: MAPREDUCE, HIVE, PIG
•  Structure: HCATALOG (metadata services)
•  Interactive: HIVE Server2
•  Stream
•  REST, HTTP
•  Query/Visualization/Reporting/Analytical Tools and Apps
•  YARN
•  HDFS

Page 25: Has Traditional MDM Finally Met its Match?

RedPoint Functional Footprint

Diagram labels:

•  Monitoring and Management Tools: AMBARI
•  Data Sources: RDBMS, EDW, DBs, JMS Queues, Files
•  Source Data: Sensor Logs, Clickstream, Flat Files, Unstructured, Sentiment, Customer, Inventory
•  Load: SQOOP, WebHDFS, Flume, NFS, SQOOP/Hive
•  Data Refinement: MAPREDUCE, HIVE, PIG
•  Structure: HCATALOG (metadata services)
•  Interactive: HIVE Server2
•  Stream
•  REST, HTTP
•  Query/Visualization/Reporting/Analytical Tools and Apps
•  YARN
•  HDFS

Page 26: Has Traditional MDM Finally Met its Match?

RedPoint  

Benchmarks – Project Gutenberg

Map  Reduce   Pig  

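Sample MapReduce (small subset of the entire code, which totals nearly 150 lines):

    import java.io.IOException;
    import java.util.StringTokenizer;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // WordOffset is not shown on the slide; presumably a custom key type from the full job.
    public static class MapClass extends Mapper<WordOffset, Text, Text, IntWritable> {
        // Characters treated as token separators.
        private final static String delimiters = "',./<>?;:\"[]{}-=_+()&*%^#$!@`~ \\|«»¡¢£¤¥¦©¬®¯±¶·¿";
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        // Emits (token, 1) for every token in the input line.
        public void map(WordOffset key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            StringTokenizer itr = new StringTokenizer(line, delimiters);
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }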

Sample Pig script without the UDF:

    SET pig.maxCombinedSplitSize 67108864
    SET pig.splitCombination true
    A = LOAD '/testdata/pg/*/*/*';
    B = FOREACH A GENERATE FLATTEN(TOKENIZE((chararray)$0)) AS word;
    C = FOREACH B GENERATE UPPER(word) AS word;
    D = GROUP C BY word;
    E = FOREACH D GENERATE COUNT(C) AS occurrences, group;
    F = ORDER E BY occurrences DESC;
    STORE F INTO '/user/cleonardi/pg/pig-count';

•  MapReduce: >150 lines of MR code; 6 hours of development; 6 minutes runtime; extensive optimization needed
•  Pig: ~50 lines of script code; 3 hours of development; 15 minutes runtime; User Defined Functions required prior to running script
•  RedPoint: 0 lines of code; 15 min. of development; 3 minutes runtime; no tuning or optimization required
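For readers who want to see what the benchmark job computes without a cluster, the following single-machine Java sketch mirrors the logic of the Pig script: tokenize, upper-case, count, and order by count descending. It is a simplified illustration, not the benchmark code. The class name `WordCountSketch`, the shortened delimiter set and the alphabetical tie-break are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.StringTokenizer;

public class WordCountSketch {

    // Shortened delimiter set (assumption); the slide's MapReduce sample uses a longer list.
    private static final String DELIMITERS = " \t\n\r',./<>?;:\"[]{}-=_+()&*%^#$!@`~\\|";

    // Tokenizes each line, upper-cases every token and counts occurrences.
    public static Map<String, Integer> countWords(List<String> lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            StringTokenizer itr = new StringTokenizer(line, DELIMITERS);
            while (itr.hasMoreTokens()) {
                counts.merge(itr.nextToken().toUpperCase(Locale.ROOT), 1, Integer::sum);
            }
        }
        return counts;
    }

    // Orders entries by count descending; ties are broken alphabetically for determinism.
    public static List<Map.Entry<String, Integer>> sortedByCount(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return entries;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = countWords(Arrays.asList("the cat, the dog", "The end."));
        for (Map.Entry<String, Integer> e : sortedByCount(counts)) {
            System.out.println(e.getValue() + "\t" + e.getKey());
        }
    }
}
```

The distributed versions on the slide do the same work, but split the counting across mappers and merge partial counts in reducers, which is where most of the extra code and tuning effort goes.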

Page 27: Has Traditional MDM Finally Met its Match?

Data Lake Architecture for MDM

Data Sources

•  CRM
•  ERP
•  Billing
•  Subscriber
•  Product
•  Network
•  Weather
•  Compete
•  Manuf.
•  Clickstream
•  Online Chat
•  Sensor Data
•  Social Media
•  Call Detail Records
•  Fabrication Logs
•  Sales Feedback
•  Field Feedback

Page 28: Has Traditional MDM Finally Met its Match?

Perceptions & Questions

Analyst: Robin Bloor

Page 29: Has Traditional MDM Finally Met its Match?

What Can You Do With a Data Lake?

Robin Bloor, Ph.D.

Page 30: Has Traditional MDM Finally Met its Match?

The Story So Far…

The old Data Warehouse World (environment) is fast dying – giving way to a dystopian future dominated by alien and mutant data, carried by vast unruly data streams that flow rapidly into dank and murky data lakes. This is Hadoop World.

HOW DO WE MAKE SENSE OF THIS?

Page 31: Has Traditional MDM Finally Met its Match?

The Big Data Architecture

Diagram labels:

•  Sources: Streams, IoT, Log files, DaaS, Mobile Devices, Desktops, Servers, The Cloud, Social media, etc.
•  Filtering, Replicating & Routing
•  Events Data Flow
•  Data Refinery and Processing Hub
•  Data Reservoir (Hadoop)
•  General Data Server(s)
•  Specialist Data Server(s)
•  Data Preparation
•  Data Flow (Optimize)
•  Local Workloads
•  ETL & Data Virt'n
•  Data Export
•  The Data Layer
•  The Application Layer
•  Streaming Apps: Local Data, Data Mart
•  Trans Apps: Local Data, Data Mart
•  BI Apps: Local Data, Data Mart
•  Office Apps: Local Data, Data Mart
•  Applications may use the Data Hub directly

Page 32: Has Traditional MDM Finally Met its Match?

The Main Point to Note

This is WAY more complicated than the old Data Warehouse world

Page 33: Has Traditional MDM Finally Met its Match?

The Governance of Data

It’s all GOVERNANCE!!

Diagram labels:

•  Corporate Data Hub
•  Data Reservoir (Hadoop)
•  General Data Server(s)
•  Specialist Data Server(s)
•  ETL & Data Virt'n
•  Local Workloads
•  Data Extracts
•  Data Security
•  Data Life Cycle Mgt
•  MDM & Business Glossary
•  Data Cleansing
•  Data Lineage
•  Data Mapping
•  Metadata Management
•  Metadata Discovery
•  System Management
•  Performance Monitoring & Mgt
•  Service Level Mgt

Page 34: Has Traditional MDM Finally Met its Match?

The Evolution of Hadoop

•  There were many components before YARN and Tez
•  But YARN and Tez have changed the picture
•  Hadoop is becoming the default scale-out file system and the OS for data flow

Page 35: Has Traditional MDM Finally Met its Match?

The Prognosis

The foundation is in place for a comprehensive Big Data Information Architecture…

But BUILDING such integrated systems will not be easy

Page 36: Has Traditional MDM Finally Met its Match?

•  How does RedPoint see the role of Hadoop (ingest point, ETL engine, MDM work area, analytical sandbox, database, etc.)? Some of these? All of these?
•  Often in the past, MDM implementations have proved to be disappointing. What makes RedPoint different, given that the data environment is more challenging than ever?
•  Which companies/technologies do you see as competitive with RedPoint?

Page 37: Has Traditional MDM Finally Met its Match?

•  Which verticals have shown the greatest interest in RedPoint?
•  How does a RedPoint engagement normally pan out?
•  If you are intent upon doing MDM, where is it best to start?

Page 38: Has Traditional MDM Finally Met its Match?

Page 39: Has Traditional MDM Finally Met its Match?

Upcoming Topics

www.insideanalysis.com

2014 Editorial Calendar at www.insideanalysis.com/webcasts/the-briefing-room

This Month: INTEGRATION & DATA FLOW

October: ANALYTIC PLATFORMS

November: DISCOVERY & VISUALIZATION

Page 40: Has Traditional MDM Finally Met its Match?

THANK YOU for your

ATTENTION!

Opening slide image courtesy of Wikimedia Commons