hadoop, oracle and the big data revolution collaborate 2013

Hadoop, Oracle and the Industrial Revolution of Data Guy Harrison, Dell Software Group

Upload: guy-harrison

Post on 09-Jun-2015


DESCRIPTION

Presentation given at Collaborate 2013

TRANSCRIPT

Page 1: Hadoop, Oracle and the big data revolution collaborate 2013

Hadoop, Oracle and the Industrial Revolution of Data

Guy Harrison, Dell Software Group

Page 2: Hadoop, Oracle and the big data revolution collaborate 2013

Hadoop, Oracle and the Industrial Revolution of Data

Guy Harrison

Executive Director, R&D, Information Management Group

Page 3: Hadoop, Oracle and the big data revolution collaborate 2013

3 Software Group

Introductions

www.guyharrison.net

[email protected]

http://twitter.com/guyharrison

Page 4: Hadoop, Oracle and the big data revolution collaborate 2013

4 Software Group

Dell, Quest and Toad


Page 11: Hadoop, Oracle and the big data revolution collaborate 2013

Star Trek shirt fatality analysis

[Bar chart: fatality percentage (Pct, axis 0 to 80) for Blue, Yellow and Red shirts]


Page 14: Hadoop, Oracle and the big data revolution collaborate 2013

14 Software Group

Quest Software is now part of Dell

Page 15: Hadoop, Oracle and the big data revolution collaborate 2013

15 Software Group

“Big” Data?

Page 16: Hadoop, Oracle and the big data revolution collaborate 2013

Three or Four “V”s

Volume
• Terabytes
• Petabytes
• Exabytes
• Zettabytes

Variety
• Structured
• Unstructured
• Human Generated
• Machine Generated

Velocity
• User populations x
• Transaction rates x
• Machine data

Value
• Competitive or Collective advantage

Page 17: Hadoop, Oracle and the big data revolution collaborate 2013

17 Software Group

Data volumes have always been increasing….

2006 Perspective

Page 18: Hadoop, Oracle and the big data revolution collaborate 2013

Though the absolute volumes are boggling…

[Bar chart, log scale from 1E+09 to 1E+23 bytes (gigabyte to zettabyte); values paired with labels in the order listed]

Human Brain: 2.81E+15
Google: 1.10E+17
Living Human Genomes: 5.48E+18
Digital information 2008: 4.87E+18
Total Digital capacity: 1.18E+21
Digital information created 2011: 2.13E+21

Page 19: Hadoop, Oracle and the big data revolution collaborate 2013

19 Software Group

Velocity


Page 21: Hadoop, Oracle and the big data revolution collaborate 2013

21 Software Group

Fail whales

Page 22: Hadoop, Oracle and the big data revolution collaborate 2013

22 Software Group

Variety, or the Industrial Revolution of Data


Page 30: Hadoop, Oracle and the big data revolution collaborate 2013

Data: now and then

1993
• Generated internally
• Key to operational efficiency

2013
• Generated externally
• Key to competitiveness
• Source of product innovation
• Changing our world

Page 31: Hadoop, Oracle and the big data revolution collaborate 2013

31 Software Group

“Big” data driven by the smallest devices

Page 32: Hadoop, Oracle and the big data revolution collaborate 2013

32 Software Group

Smartphone hardware

• Quad-core 1.4 GHz CPU

• 1GB RAM

• 64GB Storage

• 1080p display

• GSM/Bluetooth/WiFi Network

• 8MP Camera

• GPS & Compass

Page 33: Hadoop, Oracle and the big data revolution collaborate 2013

33 Software Group

Smartphone software


Page 38: Hadoop, Oracle and the big data revolution collaborate 2013

Name: Willy Bowman

Nationality: German

DON’T MENTION THE WAR

Page 39: Hadoop, Oracle and the big data revolution collaborate 2013

39 Software Group

Data Input


Page 41: Hadoop, Oracle and the big data revolution collaborate 2013

41 Software Group

Siri

From now on, I’ll call you ‘An Ambulance’. OK?

“Siri call me an ambulance”

I found 14 bridges nearby:

“I want to jump off a bridge”

Page 42: Hadoop, Oracle and the big data revolution collaborate 2013

42 Software Group

Sixth-Sense


Page 45: Hadoop, Oracle and the big data revolution collaborate 2013

45 Software Group

Brain Control


Page 51: Hadoop, Oracle and the big data revolution collaborate 2013

The instrumented human

• Bluetooth Personal Area Network
• 3G/WiFi Wide Area Network
• GPS
• Storage
• Pulse, temp monitor
• Silent alarms
• Pedometer, sleep monitoring
• Compass
• Camera
• Mike/earphones
• Heads up display

Page 52: Hadoop, Oracle and the big data revolution collaborate 2013

52 Software Group

All this requires and generates huge data sets

But what else are they good for?

Page 53: Hadoop, Oracle and the big data revolution collaborate 2013

53 Software Group

The data “exhaust” itself generates new opportunities

Companies want to generate competitive advantage through “Big Data analytics”

Page 54: Hadoop, Oracle and the big data revolution collaborate 2013

Big Data Analytics

Machine Learning: Programs that evolve with “experience”

Collective Intelligence: Programs that use inputs from “crowds” to seem intelligent

Predictive Analytics: Programs that extrapolate from existing data into the future


Page 66: Hadoop, Oracle and the big data revolution collaborate 2013

Collective Intelligence

Search Optimization

Recommendation Systems

Security
• Vulnerability
• Penetration Detection

Fraud Detection

Predictive Analytics
• Churn
• Defaults

Medical
• Risk analysis
• Diagnosis
• Prognosis

Game optimization

Advertising
• Targeting
• Tailoring

Page 67: Hadoop, Oracle and the big data revolution collaborate 2013

67 Software Group

Collective Intelligence beats Artificial Intelligence?


Page 73: Hadoop, Oracle and the big data revolution collaborate 2013

73 Software Group

For the last 40 years AI has been consistently disappointing


Page 76: Hadoop, Oracle and the big data revolution collaborate 2013

76 Software Group

In 2011 AI made a comeback


Page 84: Hadoop, Oracle and the big data revolution collaborate 2013

84 Software Group

Google: Pioneers of Big Data


Page 89: Hadoop, Oracle and the big data revolution collaborate 2013

Google Software Architecture

[Diagram labels]
• Google File System (GFS)
• Map Reduce
• BigTable
• Chubby
• Google Applications

Page 90: Hadoop, Oracle and the big data revolution collaborate 2013

Map Reduce

[Diagram labels: START; MAP (repeated many times); REDUCE]
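The map and reduce steps named on this slide can be illustrated with a minimal, single-process sketch. It is not Hadoop or Google code: the word-count task and the names `map_phase`, `shuffle`, `reduce_phase` and `run_job` are assumptions chosen for illustration, and a real cluster would run the map calls in parallel on many nodes.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key before reducing."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse the values for one key into a single total."""
    return key, sum(values)


def run_job(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence (hypothetical driver)."""
    mapped = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, vs) for k, vs in shuffle(mapped).items())


if __name__ == "__main__":
    print(run_job(["big data", "big hadoop"]))
```

Each map call only sees one line and each reduce call only sees one key, which is what allows the work to be spread across many machines.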

Page 91: Hadoop, Oracle and the big data revolution collaborate 2013

Multi-stage Map-Reduce

[Diagram labels: HDFS; MAPPER (repeated); SCAN; SORT; AGGREGATE; REDUCE; CLIENT]

Page 92: Hadoop, Oracle and the big data revolution collaborate 2013

92 Software Group

Schema on Read vs Schema on Write

Page 93: Hadoop, Oracle and the big data revolution collaborate 2013

Schema on Read vs Schema on Write

Schema on Write: Data, then Extract / Transform / Load (Analyse, Aggregate, Normalize, Cleanse, Code), then Data Warehouse, then Utilize

Schema on Read: Data, then Load, then Hadoop, then Analyse / Cleanse / Code, then Utilize
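The schema-on-read idea can be sketched in a few lines: raw records are stored untouched, and structure is imposed only when they are read. The sample records, the field names (`user`, `site`, `ms`) and the function `read_with_schema` are hypothetical and are not taken from the presentation.

```python
import json
from typing import Dict, Iterable, Iterator

# Raw data is loaded as-is; nothing is cleansed or rejected at write time.
RAW_EVENTS = [
    '{"user": "dick", "site": "Google", "ms": 120}',
    '{"user": "jane", "site": "Facebook"}',
    "not json at all",
]


def read_with_schema(raw_lines: Iterable[str]) -> Iterator[Dict[str, object]]:
    """Apply a schema at read time, skipping lines that cannot be parsed."""
    for line in raw_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # cleansing happens here, at read time
        if not isinstance(record, dict):
            continue
        yield {
            "user": str(record.get("user", "")),
            "site": str(record.get("site", "")),
            "ms": int(record.get("ms", 0)),  # missing field gets a default
        }


if __name__ == "__main__":
    for row in read_with_schema(RAW_EVENTS):
        print(row)
```

A different reader could apply a different schema to the same raw lines, which is the flexibility that schema on write gives up in exchange for clean data in the warehouse.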

Page 94: Hadoop, Oracle and the big data revolution collaborate 2013

94 Software Group

Hadoop: Open Source Map-Reduce Stack

Page 95: Hadoop, Oracle and the big data revolution collaborate 2013

95 Software Group

Hadoop at Yahoo

Yahoo! Hadoop cluster:
• 4000 nodes
• 16 PB disk
• 64 TB of RAM
• 32,000 Cores


Page 97: Hadoop, Oracle and the big data revolution collaborate 2013

Hadoop 1.0 Architecture

[Diagram labels]
• HADOOP CLIENT (JAVA, PIG, HIVE)
• MAP REDUCE (DISTRIBUTED PROCESSING): JOB TRACKER
• HDFS (DISTRIBUTED STORAGE): NAME NODE, SECONDARY NAME NODE
• DATA NODE + TASK TRACKER (repeated for each node in the cluster)

Page 98: Hadoop, Oracle and the big data revolution collaborate 2013

• Hadoop File System (HDFS)
• Hadoop Map Reduce
• HBase (Database)
• ZooKeeper (Locking)
• Sqoop (RDBMS loader)
• Hive (Query)
• Pig (Scripting)
• Flume (Log Loader)
• Oozie (Workflow manager)

Page 99: Hadoop, Oracle and the big data revolution collaborate 2013

HBase: A Real time database built on Hadoop

[Diagram comparing two storage stacks]

Oracle: Table, Buffer Cache, Log Buffer, Datafiles, Redo, ASM, Disks

HBase: Table, MemStore, HFile, WA Log, HDFS, Disks

Page 100: Hadoop, Oracle and the big data revolution collaborate 2013

HBase Data Model

Single table:

Name  Site             Counter
Dick  Ebay             507,018
Dick  Google           690,414
Jane  Google           716,426
Dick  Facebook         723,649
Jane  Facebook         643,261
Jane  ILoveLarry.com   856,767
Dick  MadBillFans.com  675,230

Relational tables:

NameId  Name
1       Dick
2       Jane

SiteId  SiteName
1       Ebay
2       Google
3       Facebook
4       ILoveLarry.com
5       MadBillFans.com

NameId  SiteId  Counter
1       1       507,018
1       2       690,414
2       2       716,426
1       3       723,649
2       3       643,261
2       4       856,767
1       5       675,230

HBase rows:

Id  Name  Ebay     Google   Facebook  (other columns)  MadBillFans.com
1   Dick  507,018  690,414  723,649   . . .            675,230

Id  Name  Google   Facebook  (other columns)  ILoveLarry.com
2   Jane  716,426  643,261   . . .            856,767
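The wide, sparse rows on this slide can be mimicked with nested dictionaries: one entry per row key, holding only the columns that row actually has. This is a plain-Python sketch of the shape of the data, not the HBase client API; `RECORDS` copies the slide's figures and `to_wide_rows` is a hypothetical helper.

```python
from typing import Dict, Iterable, Tuple

# (name, site, counter) records, as in the slide's first table.
RECORDS = [
    ("Dick", "Ebay", 507018),
    ("Dick", "Google", 690414),
    ("Jane", "Google", 716426),
    ("Dick", "Facebook", 723649),
    ("Jane", "Facebook", 643261),
    ("Jane", "ILoveLarry.com", 856767),
    ("Dick", "MadBillFans.com", 675230),
]


def to_wide_rows(records: Iterable[Tuple[str, str, int]]) -> Dict[str, Dict[str, int]]:
    """Pivot records into one row per name with one column per site."""
    wide: Dict[str, Dict[str, int]] = {}
    for name, site, counter in records:
        # Columns are created on demand, so rows may have different columns.
        wide.setdefault(name, {})[site] = counter
    return wide


if __name__ == "__main__":
    for row_key, columns in to_wide_rows(RECORDS).items():
        print(row_key, columns)
```

Jane's row simply has no Ebay column, rather than storing a null, which is the main contrast with the fixed-column relational tables.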

Page 101: Hadoop, Oracle and the big data revolution collaborate 2013

101 Software Group

Hive


Page 103: Hadoop, Oracle and the big data revolution collaborate 2013

[Diagram labels: SQL, JAVA, RESULTS]

Page 104: Hadoop, Oracle and the big data revolution collaborate 2013

104 Software Group

Other SQL-like Hadoop Interfaces

• Cloudera Impala

• MapR Drill

• Aster

• Greenplum (Pivotal HD)

• Paraccel

• Hadapt

• Oracle SQL Connector for Hadoop (External Table interface to HDFS)

Page 105: Hadoop, Oracle and the big data revolution collaborate 2013

105 Software Group

Pig

Page 106: Hadoop, Oracle and the big data revolution collaborate 2013

106 Software Group

Pig Latin

SQL or Hive QL

Page 107: Hadoop, Oracle and the big data revolution collaborate 2013

107 Software Group

Meanwhile, back at the Deathstar…


Page 110: Hadoop, Oracle and the big data revolution collaborate 2013

Oracle Exadata

Database servers
• 64 cores, 576 GB RAM

Storage Servers
• 112 cores
• 100 TB SAS or 336 TB SATA
• plus 5 TB SSD

Page 111: Hadoop, Oracle and the big data revolution collaborate 2013

Economies

Exadata vs Hadoop $$/TB (Hardware only)

[Bar chart, axis $0 to $6,000]
• Exadata: $4,911
• Hadoop: $750

Page 113: Hadoop, Oracle and the big data revolution collaborate 2013

Oracle Big Data Appliance

18 Sun X4270 M2 servers
− 48GB RAM per node (864GB total)
− 2x6 Core CPU per node (216 total)
− 12x2TB HDD per node (216 spindles, 864 TB)
− 40Gb/s Infiniband between nodes
− 10Gb/s Ethernet to datacentre

Competitive Pricing

www.oracle.com/us/bigdata/index.html
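As a quick check of the rack totals, the per-node figures on this slide can be multiplied out. The constant names and the `rack_totals` function are hypothetical. Under these inputs the RAM, core and spindle totals match the slide, while 216 disks of 2 TB come to 432 TB, so the slide's 864 TB figure may rest on a different disk size or way of counting.

```python
from typing import Dict

# Per-node figures as quoted on the slide.
NODES = 18
RAM_GB_PER_NODE = 48
CORES_PER_NODE = 2 * 6      # two 6-core CPUs
DISKS_PER_NODE = 12
TB_PER_DISK = 2


def rack_totals() -> Dict[str, int]:
    """Multiply the per-node figures by the node count."""
    spindles = NODES * DISKS_PER_NODE
    return {
        "ram_gb": NODES * RAM_GB_PER_NODE,
        "cores": NODES * CORES_PER_NODE,
        "spindles": spindles,
        "disk_tb": spindles * TB_PER_DISK,
    }


if __name__ == "__main__":
    for name, value in rack_totals().items():
        print(name, value)
```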

Page 114: Hadoop, Oracle and the big data revolution collaborate 2013

115 Software Group

Big Data Appliance Software

• Cloudera Enterprise

• Oracle Enterprise R

• Oracle NoSQL

• Oracle Big Data Connectors

Page 115: Hadoop, Oracle and the big data revolution collaborate 2013

[Diagram axes: Latency, Storage Costs]
• ORACLE EXADATA
• ORACLE EXALOGIC
• ORACLE BIG DATA APPLIANCE
• ORACLE NOSQL
• ORACLE LOADER FOR HADOOP
• APACHE HADOOP
• ORACLE RDBMS
• ORACLE WEBLOGIC
• ORACLE EXALYTICS
• ORACLE ESSBASE
• ORACLE TIMES TEN

Page 116: Hadoop, Oracle and the big data revolution collaborate 2013

117 Software Group

The following week at the Borg collective….


Page 119: Hadoop, Oracle and the big data revolution collaborate 2013

120 Software Group

Integrating Hadoop and RDBMS

Page 120: Hadoop, Oracle and the big data revolution collaborate 2013

121 Software Group

Scenario #1: Reference data in RDBMS

CUSTOMERS

WEBLOGS

PRODUCTS

HDFS

RDBMS

Page 121: Hadoop, Oracle and the big data revolution collaborate 2013

122 Software Group

Scenario #2: Hadoop for off-line analytics

CUSTOMERS

PRODUCTS

RDBMS

SALES HISTORY

HDFS

Page 122: Hadoop, Oracle and the big data revolution collaborate 2013

123 Software Group

Scenario #3: MapReduce output to RDBMS

WEBLOGS SUMMARY

RDBMS

DB QUERY TOOL

WEBLOGS

HDFS
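Scenario #3 can be sketched end to end on one machine: raw weblog lines are reduced to a small summary, the summary is loaded into a relational table, and ordinary SQL queries it. Everything here is an assumption for illustration: the log format, the `weblogs_summary` table layout, and the use of in-memory SQLite in place of both the MapReduce job and the production RDBMS.

```python
import sqlite3
from collections import Counter
from typing import Iterable

# Hypothetical raw weblog lines: "<date> <url> <status>".
WEBLOGS = [
    "2013-04-08 /home 200",
    "2013-04-08 /cart 200",
    "2013-04-09 /home 404",
    "malformed",
]


def summarise(lines: Iterable[str]) -> Counter:
    """Stand-in for the MapReduce job: count hits per URL."""
    hits: Counter = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip lines that do not match the assumed format
        hits[parts[1]] += 1
    return hits


def load_summary(conn: sqlite3.Connection, hits: Counter) -> None:
    """Load the (much smaller) summary into a relational table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS weblogs_summary "
        "(url TEXT PRIMARY KEY, hits INTEGER)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO weblogs_summary VALUES (?, ?)", hits.items()
    )
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load_summary(conn, summarise(WEBLOGS))
    query = "SELECT url, hits FROM weblogs_summary ORDER BY hits DESC"
    for row in conn.execute(query):
        print(row)
```

The detailed logs stay in bulk storage; only the aggregate reaches the database, where existing query tools can use it.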

Page 123: Hadoop, Oracle and the big data revolution collaborate 2013

124 Software Group

Scenario #4: Hadoop as RDBMS “active archive”

SALES 2011

HDFS

RDBMS

QUERY TOOL

SALES 2010

SALES 2009

SALES 2008

SALES 2009

SALES 2008

Page 124: Hadoop, Oracle and the big data revolution collaborate 2013

125 Software Group

The Big Data Stack

Page 125: Hadoop, Oracle and the big data revolution collaborate 2013

[Diagram labels]
• HDFS
• MAP-REDUCE
• HBASE
• PIG
• CASCADING
• MAHOUT
• JAVA API
• HIVE
• R (ET AL)
• JAVA API
• DATA SCIENTIST


Page 127: Hadoop, Oracle and the big data revolution collaborate 2013

[Diagram labels]
• HDFS
• MAP-REDUCE
• HBASE
• PIG
• CASCADING
• MAHOUT
• JAVA API
• HIVE
• R (ET AL)
• JAVA API
• DATA SCIENTIST
• BIG DATA ANALYTICS SOFTWARE

Page 128: Hadoop, Oracle and the big data revolution collaborate 2013

BIG DATA ANALYTICS

• INDEXING AND SEARCH
• VISUALIZATION
• RECOMMENDERS
• CLUSTERING
• CLASSIFICATION
• EXPERT SYSTEMS (LIKE WATSON)
• OPTIMIZATION
• MACHINE LEARNING
• PREDICTIVE ANALYTICS
• COLLECTIVE INTELLIGENCE
• BASKET ANALYSIS
• SENTIMENT ANALYSIS

Page 129: Hadoop, Oracle and the big data revolution collaborate 2013

130 Software Group

In Summary….

Page 130: Hadoop, Oracle and the big data revolution collaborate 2013

131 Software Group

Hadoop is….

Page 131: Hadoop, Oracle and the big data revolution collaborate 2013

Economical

Exadata vs Hadoop $$/TB (Hardware only)

[Bar chart, axis $0 to $6,000]
• Exadata: $4,911
• Hadoop: $750

Page 132: Hadoop, Oracle and the big data revolution collaborate 2013

133 Software Group

Proven at Scale

Page 133: Hadoop, Oracle and the big data revolution collaborate 2013

134 Software Group

A platform for Advanced analytics

Page 134: Hadoop, Oracle and the big data revolution collaborate 2013

ETL Free

Schema on Write: Data, then Extract / Transform / Load (Analyse, Aggregate, Normalize, Cleanse, Code), then Data Warehouse, then Utilize

Schema on Read: Data, then Load, then Hadoop, then Analyse / Cleanse / Code, then Utilize

Page 135: Hadoop, Oracle and the big data revolution collaborate 2013

136 Software Group

The most concrete technology enabling the Big Data revolution

Page 136: Hadoop, Oracle and the big data revolution collaborate 2013

137 Software Group

Hadoop is not….

Page 137: Hadoop, Oracle and the big data revolution collaborate 2013

138 Software Group

A replacement for RDBMS

But future Enterprise Data Architectures will likely incorporate Hadoop side by side with RDBMS

Page 138: Hadoop, Oracle and the big data revolution collaborate 2013

139 Software Group

Suitable for OLTP

Though OLTP systems can be built with Hadoop-compatible NoSQL systems such as HBase and Cassandra

Page 139: Hadoop, Oracle and the big data revolution collaborate 2013

140 Software Group

A complete solution

Hadoop alone only solves the storage challenge of Big Data

Page 140: Hadoop, Oracle and the big data revolution collaborate 2013

141 Software Group

Shameless plugs

Page 141: Hadoop, Oracle and the big data revolution collaborate 2013

142 Software Group

Toad for Cloud Databases

Page 142: Hadoop, Oracle and the big data revolution collaborate 2013

143 Software Group

Toad BI Suite

Business Intelligence solutions with first class support for Hadoop, Oracle and many other platforms

Page 143: Hadoop, Oracle and the big data revolution collaborate 2013

144 Software Group

Kitenga Analytics Suite

Page 144: Hadoop, Oracle and the big data revolution collaborate 2013

SharePlex® for Hadoop

[Diagram labels]
• Redo-logs
• Change Data Capture
• JMS Queue
• Hadoop Poster
• Batched HDFS File Copy
• Audit / Change Data
• HBase RealTime replication

Page 145: Hadoop, Oracle and the big data revolution collaborate 2013

146 Software Group

Toad for Hadoop

Hive Query IDE

Oracle <-> Hadoop data management

Basic Hadoop administration

Beta June
