the motivation for hadoop hadoop: basic concepts what is...

Post on 09-Jun-2020

Compiled by Rocky Jagtiani, Tech Head for SCTPL, 9892544177

Section 9: Case Study #

Objectives of this Session

• The Motivation For Hadoop

– What problems exist with traditional large-scale computing systems

– What requirements an alternative approach should have

– How Hadoop addresses those requirements

• Hadoop: Basic Concepts

– What Is Hadoop?

– The Hadoop Distributed File System (HDFS)

– How the Google MapReduce algorithm works

– Anatomy of a Hadoop Cluster

• Who uses Hadoop?

db.suven.net

# Not part of the 1Z0-061 or 1Z0-144 certification tests, but a very important technology in Big Data analysis

• Hadoop Solutions

– The most common problems Hadoop can solve

– The types of analytics often performed with Hadoop

– Where the data comes from

– The benefits of analyzing data with Hadoop

– How some real-world companies use Hadoop

• Hadoop Ecosystem

• Cloudera Software (All Open-Source)

Objectives of this Session … contd…

The Motivation For Hadoop

*MPI: Message Passing Interface

PVM: Parallel Virtual Machine

Major Problem

1 GB = 1000 MB, 1 TB = 1000 GB, 1 PB = 1000 TB, 1 exabyte = 1000 PB

PB => petabyte, TB => terabyte
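The unit ladder above can be checked with a few lines of arithmetic. This is only a minimal sketch: the `convert` helper and the `EB` abbreviation for exabyte are illustrative names, not part of the slides, and it assumes the decimal (factor of 1000) convention used on the slide.

```python
# Decimal storage units, smallest to largest; each step is a factor of 1000,
# following the convention on the slide. "EB" (exabyte) is our abbreviation.
UNITS = ["MB", "GB", "TB", "PB", "EB"]

def convert(value, from_unit, to_unit):
    """Convert a size between decimal storage units (hypothetical helper)."""
    steps = UNITS.index(from_unit) - UNITS.index(to_unit)
    if steps >= 0:
        # Going to a smaller unit: multiply by 1000 per step.
        return value * 1000 ** steps
    # Going to a larger unit: divide by 1000 per step.
    return value / 1000 ** (-steps)

print(convert(1, "PB", "GB"))    # one petabyte expressed in gigabytes
print(convert(500, "GB", "TB"))  # 500 gigabytes expressed in terabytes
```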

The Motivation For Hadoop

1.

2.

3.

4.

5.

Hadoop History

Core Hadoop Concepts

Hadoop Components

HDFS

HDFS Concepts

HDFS: How Files Are Stored

How Files Are Stored: Example
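As a rough illustration of how HDFS stores a file, the sketch below splits a file into fixed-size blocks and assigns each block to several nodes. It assumes the commonly cited defaults of a 128 MB block size (Hadoop 2 and later) and a replication factor of 3. The node names, the `place_blocks` helper, and the round-robin placement are hypothetical; real HDFS placement is decided by the NameNode and takes rack layout into account.

```python
import math

BLOCK_SIZE_MB = 128   # assumed default block size (Hadoop 2 and later)
REPLICATION = 3       # assumed default replication factor

def place_blocks(file_size_mb, nodes,
                 block_size_mb=BLOCK_SIZE_MB, replication=REPLICATION):
    """Split a file into blocks and pick `replication` distinct nodes per block.

    Placement is plain round-robin, for illustration only.
    """
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    # The last block may be only partly full, so round up.
    num_blocks = math.ceil(file_size_mb / block_size_mb)
    placement = {}
    for b in range(num_blocks):
        placement[f"blk_{b}"] = [
            nodes[(b + r) % len(nodes)] for r in range(replication)
        ]
    return placement

nodes = ["node1", "node2", "node3", "node4"]   # hypothetical DataNodes
for block, replicas in place_blocks(300, nodes).items():
    print(block, "->", replicas)
```

With a 300 MB file the sketch produces three blocks, each held by three different nodes, so losing any single node still leaves two copies of every block.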

IMP :

How MapReduce Works

MapReduce: The Mapper

Example :

Anatomy of a Hadoop Cluster :

Who uses Hadoop?

Hadoop Solutions

A

B What is the problem if the data is coming?

C

The most common problems Hadoop can solve:

We look briefly at how Hadoop solves each problem.

D

E

How some real-world companies use Hadoop

Hadoop Ecosystem

Cloudera Software (All Open-Source)

*EDW: enterprise data warehouse

Conclusion:

Questions

1) Input to mapper is

"Google is one of the richest companies"

"one who works with the Google is technical expert"

What will be the output after reducing?

2) Input to mapper is

"Cat is eating milk"

"Cat is very sweet and she likes milk"

"milk is in bottle"

What will be the output after reducing?

3) Input to mapper is

"Dollar is national currency for USA"

"Rupee is national currency for India"

"Dollar is ahead of Rupee in economy"

"India is developing country"

What will be the output after mapping?

What will be the output after shuffling?

What will be the output after reducing?
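The three stages named in these questions (mapping, shuffling, reducing) can be sketched in plain Python. This is a minimal sketch that assumes the exercises describe a word-count job, which the transcript does not state explicitly. The function names and the sample input are hypothetical, and the code simulates the stages in one process rather than using the Hadoop API.

```python
from collections import defaultdict

def map_phase(lines):
    """Mapper: emit a (word, 1) pair for every word in every input line."""
    pairs = []
    for line in lines:
        for word in line.split():
            pairs.append((word, 1))
    return pairs

def shuffle_phase(pairs):
    """Shuffle: group all values by key, with keys in sorted order."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(sorted(grouped.items()))

def reduce_phase(grouped):
    """Reducer: sum the list of values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}

lines = ["the cat sat", "the cat ran"]   # hypothetical input, not from the exercises
mapped = map_phase(lines)
shuffled = shuffle_phase(mapped)
reduced = reduce_phase(shuffled)

print(mapped)     # one (word, 1) pair per word occurrence
print(shuffled)   # each word with its list of 1s
print(reduced)    # each word with its total count
```

Words are compared exactly as written, so "Cat" and "cat" would count as different keys unless the mapper lowercases its input first.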
