The Motivation For Hadoop; Hadoop: Basic Concepts; What Is...

Post on 09-Jun-2020

Page 1: The Motivation For Hadoop Hadoop: Basic Concepts What Is ...db.suvenconsultants.com/section_9_ApacheHadoop.pdf · Objectives of this Session The Motivation For Hadoop What problems

Compiled by Rocky Jagtiani, Tech Head for SCTPL, 9892544177

Section 9: Case Study #

Objectives of this Session

• The Motivation For Hadoop
– What problems exist with traditional large-scale computing systems
– What requirements an alternative approach should have
– How Hadoop addresses those requirements
• Hadoop: Basic Concepts
– What Is Hadoop?
– The Hadoop Distributed File System (HDFS)
– How Google's MapReduce algorithm works
– Anatomy of a Hadoop Cluster
• Who uses Hadoop?

db.suven.net

# Not a part of the 1Z0-061 or 1Z0-144 certification tests, but a very important technology in Big Data analysis.

Objectives of this Session (contd.)

• Hadoop Solutions
– The most common problems Hadoop can solve
– The types of analytics often performed with Hadoop
– Where the data comes from
– The benefits of analyzing data with Hadoop
– How some real-world companies use Hadoop
• Hadoop Ecosystem
• Cloudera Software (All Open-Source)

The Motivation For Hadoop

*MPI: Message Passing Interface
PVM: Parallel Virtual Machine

Major Problem

1 GB = 1000 MB, 1 TB = 1000 GB, 1 PB = 1000 TB, 1 EB = 1000 PB
PB => petabyte, TB => terabyte, EB => exabyte
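The unit ladder above can be checked with a few lines of Python. This is only a sketch: the `UNITS` list and the `convert` helper are hypothetical names introduced for illustration, and it assumes the decimal (factor of 1000) convention used on this slide, with PB as the standard abbreviation for petabyte.

```python
# Decimal storage units as listed on the slide; each is 1000 times the previous one.
# UNITS and convert() are illustrative names, not part of any Hadoop API.
UNITS = ["MB", "GB", "TB", "PB", "EB"]


def convert(value, from_unit, to_unit):
    """Convert a size between decimal storage units using powers of 1000."""
    steps = UNITS.index(from_unit) - UNITS.index(to_unit)
    factor = 1000 ** abs(steps)
    # Moving to a smaller unit multiplies; moving to a larger unit divides.
    return value * factor if steps >= 0 else value / factor


print(convert(1, "EB", "TB"))    # how many terabytes are in one exabyte
print(convert(500, "GB", "TB"))  # 500 GB expressed in terabytes
```

With this convention, one exabyte is a million terabytes, which gives a sense of the data volumes that motivate a distributed system.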

The Motivation For Hadoop

1.

2.

3.

4.

5.

Hadoop History

Core Hadoop Concepts

Hadoop Components

HDFS

HDFS Concepts

HDFS: How Files Are Stored

How Files Are Stored: Example

IMP :

How Does MapReduce Work?

MapReduce: The Mapper

Example :

Anatomy of a Hadoop Cluster

Who uses Hadoop?

Hadoop Solutions

A

B What is the problem if the data is coming?

C

The most common problems Hadoop can solve:

We will briefly look at how each problem is solved using Hadoop.

D

E

How some real-world companies use Hadoop

Hadoop Ecosystem

Cloudera Software (All Open-Source)

*enterprise data warehouse (EDW)

Conclusion :

Questions

1) The input to the mapper is:

"Google is one of the richest companies"

"one who works with the Google is technical expert"

What will be the output after reducing?

2) The input to the mapper is:

"Cat is eating milk"

"Cat is very sweet and she likes milk"

"milk is in bottle"

What will be the output after reducing?

3) The input to the mapper is:

"Dollar is national currency for USA"

"Rupee is national currency for India"

"Dollar is ahead of Rupee in economy"

"India is developing country"

What will be the output after mapping?

What will be the output after shuffling?

What will be the output after reducing?
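The questions above ask for the output of the map, shuffle and reduce stages. The sketch below simulates those three stages in plain Python on a single machine. It assumes the exercises refer to the classic word-count job, which the slides do not state explicitly, and that words are split on whitespace and lower-cased. The function names and the sample input are hypothetical and are not taken from the Hadoop API or from the slides.

```python
from collections import defaultdict


def map_phase(lines):
    """Mapper: emit a (word, 1) pair for every word in every input line."""
    pairs = []
    for line in lines:
        # Assumption: tokens are whitespace-separated and case is ignored.
        for word in line.lower().split():
            pairs.append((word, 1))
    return pairs


def shuffle_phase(pairs):
    """Shuffle: collect all values that share a key, ordered by key."""
    grouped = defaultdict(list)
    for word, count in pairs:
        grouped[word].append(count)
    return dict(sorted(grouped.items()))


def reduce_phase(grouped):
    """Reducer: sum the list of counts collected for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}


# Hypothetical sample input, deliberately different from the exercises.
sample_lines = ["big data is big", "data is everywhere"]

mapped = map_phase(sample_lines)
shuffled = shuffle_phase(mapped)
reduced = reduce_phase(shuffled)

print("after mapping:  ", mapped)
print("after shuffling:", shuffled)
print("after reducing: ", reduced)
```

Replacing `sample_lines` with the sentences from a question gives one way to check an answer, keeping in mind that a real job may treat case and punctuation differently.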