YARN Overview


Post on 09-Jul-2015


DESCRIPTION

Apache Hadoop YARN Overview

TRANSCRIPT


YARN Overview
BDS, gopass2002@gmail.com, 2014-04-18

Agenda: Apache Hadoop YARN, YARN Daemons, Why is YARN needed?, Scheduling, Fault Tolerance, Q & A

Apache Hadoop YARN

HADOOP 1 vs. 2 YARN

Apache Hadoop YARN (Yet Another Resource Negotiator). YARN's moving parts are the Resource Manager (which contains an Applications Manager and a Scheduler), the per-application Application Master, the per-node Node Manager, and Containers.

Releases: hadoop-0.23.0 (released 2011-11-11) through hadoop-2.4.0 (released 2014-04-07).

YARN Daemons

Resource Manager (RM): runs on the master node; the cluster's global resource scheduler; arbitrates resources among applications. Node Manager (NM): runs on every slave node; launches and monitors that node's containers and reports the node's resource usage to the RM.

Containers: leases on a slice of a slave node's resources (CPU cores, memory), granted by the RM and started by the NM; applications do their work inside containers. Application Master (AM): one per application; holds the application's resource spec, requests containers from the RM, and runs the application's tasks in the containers it receives.

Why is YARN needed?

Cluster scalability. In HADOOP 1 a single JobTracker manages every job in the cluster, which limits practical cluster size to roughly 4,000 nodes. In HADOOP 2 YARN the JobTracker's work is split: the Resource Manager handles resource allocation, while a per-application Application Master manages each application's tasks.

Division of responsibilities, HADOOP 1 vs. HADOOP 2 YARN:
- Hadoop 1: the JobTracker does job & task management and resource scheduling (platform and application framework in one process); TaskTrackers do the task computation.
- Hadoop 2 YARN: the ResourceManager does resource scheduling and NodeManagers do resource monitoring & enforcement (the platform); the Application Master does job & task management (the application framework).

Application variety. HADOOP 1 supports only MapReduce applications.

HADOOP 2 YARN can also run non-MapReduce applications: any framework that supplies its own Application Master can share the cluster.

Resource utilization. In HADOOP 1, each node's resources (memory, CPU cores) are pre-split into a fixed number of map slots and reduce slots; when a job's task mix does not match that split, slots sit idle and the job cannot use them.

In HADOOP 2 YARN, the application declares a resource spec per task, and each task runs in a container sized to that spec, so no capacity is fenced off in unused slot types.
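The utilization difference can be made concrete with some arithmetic. This is a sketch with hypothetical node and slot sizes (none of these numbers come from the slides):

```python
# Hypothetical node: 8 GB of memory. Hadoop 1 splits it statically into
# map and reduce slots; Hadoop 2 YARN hands out containers sized per task.
NODE_MEM_GB = 8

# Hadoop 1: 2 map slots + 2 reduce slots, 2 GB each, fixed at config time.
# A map-only job can use at most the 2 map slots, even while the
# 2 reduce slots sit idle.
map_slots, reduce_slots, slot_gb = 2, 2, 2
usable_by_map_only_job = map_slots * slot_gb        # 4 GB usable
idle = NODE_MEM_GB - usable_by_map_only_job         # 4 GB stranded

# Hadoop 2 YARN: the AM requests containers matching the task spec,
# so the same map-only job can fill the whole node with 1 GB containers.
container_gb = 1
containers = NODE_MEM_GB // container_gb            # 8 containers

print(usable_by_map_only_job, idle, containers)
```

Half the node is stranded under fixed slots, while container-based allocation uses all of it.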

HADOOP 1: slot-based resource allocation. A sequence of animation frames shows one diagram: a JobTracker, three TaskTrackers with two slots each, and arrows for MapReduce status, job submission, and resource allocation. The sequence they depict: a client submits a job to the JobTracker; the job's map tasks are placed in free slots (ready, then running, then done) and its reduce tasks likewise (ready, then running); while those slots are held, a second client's job is left pending; only after the first job finishes and releases its slots are the pending job's map and reduce tasks scheduled into them.

HADOOP 2 YARN: container-based resource allocation. The matching animation shows a ResourceManager, three NodeManagers, and arrows for MapReduce status, job submission, resource request, and resource allocation. The sequence: a client submits a job to the ResourceManager; an AppMaster is started in a container on one NodeManager; the AppMaster requests containers, runs the map tasks in them (ready, then running), releases them, then requests containers for the reduce tasks; while the first job's reduces are still running, a second client submits a job, and its own AppMaster and map containers start concurrently instead of waiting.

Scheduling

FIFO Scheduler (Hadoop v 1.x). The Hadoop 1.x default scheduler is FIFO with five job priority levels (very high, high, normal, low, very low); free slots go to the earliest, highest-priority job, so one large job can occupy the whole cluster, which is what the Fair Scheduler addresses.

Fair Scheduler (Hadoop v 1.x). Unlike FIFO, task slots are divided among pools: each job is assigned to a pool (typically one per user), every pool receives a share of the cluster's slots, and the jobs inside a pool divide that pool's slots among their tasks.

Fair Scheduler (Hadoop v 1.x), continued. Fairness is measured over time: slots (or CPU time) are balanced across pools so that each pool receives its share on average; within a pool, jobs can be scheduled fairly or FIFO.
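The per-pool slot division described above can be sketched as iterative even splitting: pools that demand less than an even share keep only what they need, and the surplus flows to the busier pools. This is a simplified model, not Hadoop's implementation, and the pool names are made up:

```python
def fair_shares(total_slots, demands):
    """Split slots evenly across pools; any share a pool cannot use
    (its demand is lower) is redistributed to the remaining pools."""
    shares = {p: 0 for p in demands}
    remaining = dict(demands)
    slots_left = total_slots
    while slots_left > 0 and remaining:
        even = slots_left / len(remaining)
        # Pools demanding no more than the even split are capped at demand.
        capped = {p: d for p, d in remaining.items() if d <= even}
        if not capped:
            # Every remaining pool wants at least the even split.
            for p in remaining:
                shares[p] += even
            return shares
        for p, d in capped.items():
            shares[p] += d
            slots_left -= d
            del remaining[p]
    return shares

# Three pools share 12 slots; pool "a" only needs 2, so its unused
# share flows to the two busy pools (5 slots each).
print(fair_shares(12, {"a": 2, "b": 8, "c": 9}))
```

The same water-filling idea applies whether the shared resource is slots (Hadoop 1) or CPU time.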

Preemption. If a pool goes without its share of slots for too long, the Fair Scheduler can preempt. It is enabled with mapred.fairscheduler.preemption (default: false); once the preemption timeout passes with a pool still under its share, the scheduler kills tasks from jobs in over-share pools to free slots for the starved pool's jobs, preferring recently started tasks so that the least work is lost.

Capacity Scheduler (Hadoop v 2.x). In Hadoop 2.x YARN the default scheduler is the Capacity Scheduler, selected with yarn.resourcemanager.scheduler.class = org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler. Queue capacities are configured in capacity-scheduler.xml; queues form a hierarchy under the root queue, whose capacity is the whole cluster, and Hadoop 2.x adds preemption support here as well.
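The preemption rule reduces to a timeout check per pool. A minimal sketch, assuming a simplified pool record (the field and pool names are illustrative, not Hadoop's internals):

```python
import time

def pools_to_preempt_for(pools, timeout_s, now=None):
    """Return names of pools that have been below their minimum share
    for longer than the preemption timeout; the scheduler would then
    kill tasks of over-share pools to free slots for these pools."""
    now = time.time() if now is None else now
    starved = []
    for p in pools:
        below_share = p["used"] < p["min_share"]
        timed_out = now - p["below_since"] >= timeout_s
        if below_share and timed_out:
            starved.append(p["name"])
    return starved

pools = [
    {"name": "etl",   "min_share": 4, "used": 1, "below_since": 0},
    {"name": "adhoc", "min_share": 4, "used": 4, "below_since": 0},
]
# "etl" has been under its share for 120s with a 60s timeout,
# so it is the pool preemption would act on behalf of.
print(pools_to_preempt_for(pools, timeout_s=60, now=120))
```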

Capacity Scheduler configuration properties:
- yarn.scheduler.capacity.<queue-path>.queues: the queue's child queues (comma-separated)
- yarn.scheduler.capacity.<queue-path>.capacity: the queue's capacity as a percentage of its parent (0~100)
- yarn.scheduler.capacity.<queue-path>.maximum-capacity: the ceiling the queue may elastically grow to (0~100)
- yarn.scheduler.capacity.<queue-path>.state: the queue state
- yarn.scheduler.capacity.<queue-path>.user-limit-factor: limit on the share of the queue a single user may take (0~1)

Capacity Scheduler configuration, worked example. The cluster has 10 nodes with 10GB per node (yarn.nodemanager.resource.memory-mb = 10240), so the root queue's capacity is 100GB.

- root.queues=A,B with root.A.capacity=60 and root.B.capacity=40: A gets 60GB and B gets 40GB.
- root.B.queues=b1,b2 with root.B.b1.capacity=20 and root.B.b2.capacity=80: of B's 40GB, b1 gets 8GB and b2 gets 32GB.

The slides then step through a utilization scenario:
- Queue A fills its entire capacity: A used 100% (0GB available), B used 0% (40GB available), root used 60% (40GB available).
- A 4GB job fills b1: b1 used 100% (0GB available), B used 20% (32GB available), root used 72%.
- b1 grows past its capacity with another 8GB job, borrowing B's idle capacity (elasticity): b1 used 200% (0GB available), B used 40% (24GB available), root used 86%.
- b2 then submits a 32GB job; not enough capacity is free, so the job waits for resources to be released.
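The capacity arithmetic in the example can be reproduced directly: a queue's absolute capacity is the product of the capacity percentages along its path from root. A sketch using the example's queue layout:

```python
# Cluster from the example: 10 nodes x 10 GB = 100 GB
# (yarn.nodemanager.resource.memory-mb = 10240 per node).
NODES, NODE_GB = 10, 10
cluster_gb = NODES * NODE_GB   # root queue capacity: 100 GB

# Capacity percentages as set in capacity-scheduler.xml.
pct = {"root.A": 60, "root.B": 40, "root.B.b1": 20, "root.B.b2": 80}

def absolute_gb(queue):
    """Multiply the capacity percentages along the queue's path."""
    gb = cluster_gb
    parts = queue.split(".")
    for i in range(2, len(parts) + 1):
        gb *= pct[".".join(parts[:i])] / 100
    return gb

for q in pct:
    print(q, round(absolute_gb(q), 2), "GB")
# A: 60 GB, B: 40 GB, b1: 8 GB, b2: 32 GB, matching the slides
```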