lecture 11 hadoop & sparkece.uprm.edu/~wrivera/icom6025/lecture11.pdf · hbase pig r hive...

25
Dr. Wilson Rivera ICOM 6025: High Performance Computing Electrical and Computer Engineering Department University of Puerto Rico Lecture 11 Hadoop & Spark

Upload: others

Post on 27-Jul-2020

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Lecture 11 Hadoop & Sparkece.uprm.edu/~wrivera/ICOM6025/Lecture11.pdf · HBase PIG R Hive Cassandra MapReduce . Hadoop • Designed to reliably store data using ... High Performance

Dr. Wilson Rivera

ICOM 6025: High Performance ComputingElectrical and Computer Engineering Department

University of Puerto Rico

Lecture 11Hadoop & Spark

Page 2: Lecture 11 Hadoop & Sparkece.uprm.edu/~wrivera/ICOM6025/Lecture11.pdf · HBase PIG R Hive Cassandra MapReduce . Hadoop • Designed to reliably store data using ... High Performance

• Distributed File Systems• Hadoop Ecosystem• Hadoop Architecture and Features• Apache Spark

Outline

ICOM 6025: High Performance Computing 2

Page 3: Lecture 11 Hadoop & Sparkece.uprm.edu/~wrivera/ICOM6025/Lecture11.pdf · HBase PIG R Hive Cassandra MapReduce . Hadoop • Designed to reliably store data using ... High Performance

How do we get data to the workers?

Compute Nodes

NAS

SAN

Page 4: Lecture 11 Hadoop & Sparkece.uprm.edu/~wrivera/ICOM6025/Lecture11.pdf · HBase PIG R Hive Cassandra MapReduce . Hadoop • Designed to reliably store data using ... High Performance

Distributed File System

• Don’t move data to workers… move workers to the data!– Store data on the local disks of nodes in the cluster– Start up the workers on the node that has the data local

• Why?– Not enough RAM to hold all the data in memory– Disk access is slow, but disk throughput is reasonable

• A distributed file system is the answer– GFS (Google File System) – HDFS (Hadoop Distributed File System)

Page 5: Lecture 11 Hadoop & Sparkece.uprm.edu/~wrivera/ICOM6025/Lecture11.pdf · HBase PIG R Hive Cassandra MapReduce . Hadoop • Designed to reliably store data using ... High Performance

Apache Hadoop Ecosystem

ICOM 6025: High Performance Computing 5

Hadoop Distributed File System (HDFS)

HBase

PIG R Hive

Cassandra

MapReduce

Page 6: Lecture 11 Hadoop & Sparkece.uprm.edu/~wrivera/ICOM6025/Lecture11.pdf · HBase PIG R Hive Cassandra MapReduce . Hadoop • Designed to reliably store data using ... High Performance

Hadoop

• Designed to reliably store data using commodity hardware – Redundant storage

• Designed to expect hardware failures – Fault tolerance mechanism

• Intended for large files– Not suitable for small data sets

• Not suitable for low latency data access

COP67276

Page 7: Lecture 11 Hadoop & Sparkece.uprm.edu/~wrivera/ICOM6025/Lecture11.pdf · HBase PIG R Hive Cassandra MapReduce . Hadoop • Designed to reliably store data using ... High Performance

Hadoop

• Files are stored as a collection of blocks – Blocks are 64 MB chunks of a file (configurable) – Blocks are replicated on 3 nodes (configurable)

• NameNode (NN) – Manages metadata about files and blocks

• DataNodes (DN) – store and serve blocks

Page 8: Lecture 11 Hadoop & Sparkece.uprm.edu/~wrivera/ICOM6025/Lecture11.pdf · HBase PIG R Hive Cassandra MapReduce . Hadoop • Designed to reliably store data using ... High Performance

Jobs and Tasks in Hadoop

• Job: a user-submitted map and reduce implementation to apply to a data set

• Task: a single mapper or reducer task– Failed tasks get retried automatically – Tasks run local to their data, ideally

• JobTracker (JT) manages job submission and task delegation

• TaskTrackers (TT) ask for work and execute tasks

COP67278

Page 9: Lecture 11 Hadoop & Sparkece.uprm.edu/~wrivera/ICOM6025/Lecture11.pdf · HBase PIG R Hive Cassandra MapReduce . Hadoop • Designed to reliably store data using ... High Performance

Hadoop Architecture

9

Name NodeJob Tracker

Data Node Data Node Data Node

Task Tracker Task Tracker Task Tracker

SecondaryName Node

Client

……

Page 10: Lecture 11 Hadoop & Sparkece.uprm.edu/~wrivera/ICOM6025/Lecture11.pdf · HBase PIG R Hive Cassandra MapReduce . Hadoop • Designed to reliably store data using ... High Performance

Typical Hadoop Cluster

ICOM 6025: High Performance Computing 10

Switch

Name Node

Switch Switch Switch

Job Tracker Secondary NN DN +TT

DN +TTDN +TTDN +TTDN +TT

DN +TT DN +TT DN +TT DN +TT

Rack 1 Rack 2 Rack 3 Rack N

Page 11: Lecture 11 Hadoop & Sparkece.uprm.edu/~wrivera/ICOM6025/Lecture11.pdf · HBase PIG R Hive Cassandra MapReduce . Hadoop • Designed to reliably store data using ... High Performance

Hadoop Fault Tolerance

• If a Task crashes:– Retry on another node:

• OK for a map because it has no dependencies• OK for a reduce because map outputs are on disk

• If a node crashes:– Re-launch its current task on other nodes– Re-run any maps the node previously ran to get

output data• If a task is going slowly (straggler):

– Launch second copy of task on another node (“speculative execution”)

Page 12: Lecture 11 Hadoop & Sparkece.uprm.edu/~wrivera/ICOM6025/Lecture11.pdf · HBase PIG R Hive Cassandra MapReduce . Hadoop • Designed to reliably store data using ... High Performance

Hadoop Fault Tolerance

• Reactive way– Worker failure

• Heartbeat, Workers are periodically pinged by master

– NO response = failed worker• If the processor of a worker fails, the tasks of that

worker are reassigned to another worker.

– Master failure• Master writes periodic checkpoints• Another master can be started from the last

checkpointed state• If eventually the master dies, the job will be aborted

12

Page 13: Lecture 11 Hadoop & Sparkece.uprm.edu/~wrivera/ICOM6025/Lecture11.pdf · HBase PIG R Hive Cassandra MapReduce . Hadoop • Designed to reliably store data using ... High Performance

Hadoop Data Locality

• Move computation to the data – Hadoop tries to schedule tasks on nodes with

the data – When not possible TT has to fetch data from DN – Thousands of machines read input at local disk

speed. Without this, rack switches limit read rate and network bandwidth becomes the bottleneck.

COP672713

Page 14: Lecture 11 Hadoop & Sparkece.uprm.edu/~wrivera/ICOM6025/Lecture11.pdf · HBase PIG R Hive Cassandra MapReduce . Hadoop • Designed to reliably store data using ... High Performance

Hadoop Scheduling• Fair Sharing

– conducts fair scheduling using greedy method to maintain data locality

• Delay– uses delay scheduling algorithm to achieve good data

locality by slightly compromising fairness restriction• LATE (Longest Approximate Time to End)

– improves MapReduce application performance in heterogenous environment, like virtualized environment, through accurate speculative execution

• Capacity– Supports multiple queues for shared users and

guarantees each queue a fraction of the capacity of the cluster

14

Page 15: Lecture 11 Hadoop & Sparkece.uprm.edu/~wrivera/ICOM6025/Lecture11.pdf · HBase PIG R Hive Cassandra MapReduce . Hadoop • Designed to reliably store data using ... High Performance

MPI vs. Map Reduce

MPI Map ReduceProgramming in communication

Explicit Implicit

Fault tolerance No Yes

Main resources Memory I/O

Latency Low High

ICOM 6025: High Performance Computing 15

Page 16: Lecture 11 Hadoop & Sparkece.uprm.edu/~wrivera/ICOM6025/Lecture11.pdf · HBase PIG R Hive Cassandra MapReduce . Hadoop • Designed to reliably store data using ... High Performance

Other Frameworks

• Batch Processing– Hadoop

• Independent tasks – GraphLab

• Dependent tasks• Interactive Processing

– Drill • Low latency for Interactive data analysis

– Spark• Resilient distributed datasets• Primitives for in memory computing• 100x faster than Hadoop!!

• Stream processing – Storm, Apache S4

Page 17: Lecture 11 Hadoop & Sparkece.uprm.edu/~wrivera/ICOM6025/Lecture11.pdf · HBase PIG R Hive Cassandra MapReduce . Hadoop • Designed to reliably store data using ... High Performance

Apache Spark

ICOM 6025: High Performance Computing 17

HDFS

Page 18: Lecture 11 Hadoop & Sparkece.uprm.edu/~wrivera/ICOM6025/Lecture11.pdf · HBase PIG R Hive Cassandra MapReduce . Hadoop • Designed to reliably store data using ... High Performance

Use Memory Instead of Disk

iteration 1 iteration 2 . . .Input

HDFSread

HDFSwrite

HDFSread

HDFSwrite

Input

query 1

query 2

query 3

result 1

result 2

result 3

. . .

HDFSread

Page 19: Lecture 11 Hadoop & Sparkece.uprm.edu/~wrivera/ICOM6025/Lecture11.pdf · HBase PIG R Hive Cassandra MapReduce . Hadoop • Designed to reliably store data using ... High Performance

In-Memory Data Sharing

iteration 1 iteration 2 . . .Input

HDFSread

Input

query 1

query 2

query 3

result 1

result 2

result 3. . .

one-timeprocessing

Distributedmemory

10-100x faster than network and disk

Page 20: Lecture 11 Hadoop & Sparkece.uprm.edu/~wrivera/ICOM6025/Lecture11.pdf · HBase PIG R Hive Cassandra MapReduce . Hadoop • Designed to reliably store data using ... High Performance

Resilient Distributed Datasets (RDDs)

• Write programs in terms of operations on distributed datasets

• Partitioned collections of objects spread across a cluster, stored in memory or on disk

• RDDs built and manipulated through a diverse set of parallel transformations (map, filter, join)and actions (count, collect, save)

• RDDs automatically rebuilt on machine failure

Page 21: Lecture 11 Hadoop & Sparkece.uprm.edu/~wrivera/ICOM6025/Lecture11.pdf · HBase PIG R Hive Cassandra MapReduce . Hadoop • Designed to reliably store data using ... High Performance

The Spark Computing Framework

• Provides programming abstraction and parallel runtime to hide complexities of fault-tolerance and slow machines

• “Here’s an operation, run it on all of the data”– I don’t care where it runs (you schedule that)– In fact, feel free to run it twice on different

nodes

Page 22: Lecture 11 Hadoop & Sparkece.uprm.edu/~wrivera/ICOM6025/Lecture11.pdf · HBase PIG R Hive Cassandra MapReduce . Hadoop • Designed to reliably store data using ... High Performance

Memorybandwidth

Networkbandwidth

Tradeoff Space

Granularityof Updates

Write Throughput

Fine

Coarse

Low High

K-V stores,databases,RAMCloud

Best for batchworkloads

Best fortransactional

workloads

HDFS RDDs

Page 23: Lecture 11 Hadoop & Sparkece.uprm.edu/~wrivera/ICOM6025/Lecture11.pdf · HBase PIG R Hive Cassandra MapReduce . Hadoop • Designed to reliably store data using ... High Performance

Scalability

184

111

76

116

80

62

15 6 3

0

50

100

150

200

250

25 50 100

Iter

atio

n tim

e (s

)

Number of machines

HadoopHadoopBinMemSpark

274

157

106

197

121

87

143

61

33

0

50

100

150

200

250

300

25 50 100

Iter

atio

n tim

e (s

)

Number of machines

Hadoop HadoopBinMemSpark

Logistic Regression K-Means

Page 24: Lecture 11 Hadoop & Sparkece.uprm.edu/~wrivera/ICOM6025/Lecture11.pdf · HBase PIG R Hive Cassandra MapReduce . Hadoop • Designed to reliably store data using ... High Performance

Spark and Map Reduce Differences

HadoopMap Reduce

Spark

Storage Disk only In-memory or on diskOperations Map and

ReduceMap, Reduce, Join, Sample, etc…

Execution model Batch Batch, interactive, streaming

Programmingenvironments

Java Scala, Java, R, and Python

Page 25: Lecture 11 Hadoop & Sparkece.uprm.edu/~wrivera/ICOM6025/Lecture11.pdf · HBase PIG R Hive Cassandra MapReduce . Hadoop • Designed to reliably store data using ... High Performance

• Distributed File Systems• Hadoop Ecosystem• Hadoop Architecture and Features• Apache Spark

Summary

ICOM 6025: High Performance Computing 25