MapReduce: Simplified Data Processing on Large Clusters
Jeffrey Dean and Sanjay Ghemawat
To appear in OSDI 2004 (Operating Systems Design and Implementation)
Introduction
An important programming model for large-scale data-parallel applications
Motivation
- Parallel applications
  - Widely used
  - Special-purpose applications
- Common functionality
  - Parallelize computation
  - Distribute data
  - Handle failures
- Large-scale (big data) data processing
MapReduce?
- Programming model
  - Parallel
  - Generic
  - Scalable
- Data
  - Map (key/value) pairs
- Implementation
  - Commodity clusters
  - Commodity PCs
MapReduce?
# User-defined functions
# map(key, val) is run on each item in the input set
  - emits new-key / new-val pairs
# reduce(key, vals) is run for each unique key emitted by map()
  - emits final output
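The two user-defined functions can be illustrated with word counting. This is a minimal single-process sketch, not the C++ library itself: `run_mapreduce` is a hypothetical in-memory driver, and the names `map_fn` and `reduce_fn` are assumptions made for illustration.

```python
from collections import defaultdict


def map_fn(key, value):
    """key: document name (unused here); value: document text."""
    for word in value.split():
        yield word, 1


def reduce_fn(key, values):
    """key: a word; values: every count emitted for that word."""
    yield key, sum(values)


def run_mapreduce(inputs, map_fn, reduce_fn):
    """Hypothetical driver: run map on each item, group by key, then reduce."""
    groups = defaultdict(list)
    for key, value in inputs:
        for new_key, new_value in map_fn(key, value):
            groups[new_key].append(new_value)
    output = []
    for new_key in sorted(groups):  # reduce runs once per unique key
        output.extend(reduce_fn(new_key, groups[new_key]))
    return output


docs = [("doc1", "the quick fox"), ("doc2", "the lazy dog")]
counts = run_mapreduce(docs, map_fn, reduce_fn)
print(counts)  # [('dog', 1), ('fox', 1), ('lazy', 1), ('quick', 1), ('the', 2)]
```

The user writes only `map_fn` and `reduce_fn`; grouping by key is the part the framework provides.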
Example
# Distributed grep (Global / Regular Expression / Print)
# Count of URL access frequency (from logs of web page requests)
  - map: <URL, 1>
  - reduce: <URL, total count>
# Reverse web-link graph
  - map: <target (linked URL), source (web page)>
  - reduce: <target, list(source)>
Example (continued)
# Inverted index
  - map: <word, document ID>
  - reduce: <word, list(document ID)>
# Distributed sort
  - map: <key, record>
  - reduce: <key, record> (emits all pairs unchanged)
# Term vector per host (a term vector is a list of <word, frequency> pairs)
  - map: <hostname, term vector>
  - reduce: <hostname, term vector> (throws away infrequent terms and emits a final pair)
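Two of the listed examples, URL access frequency and the inverted index, can be sketched as map/reduce pairs. The driver `run_mapreduce`, the function names, and the sample data are all assumptions made for illustration; a real job would read its input from files.

```python
from collections import defaultdict


def run_mapreduce(inputs, map_fn, reduce_fn):
    """Hypothetical in-memory driver: map, group by key, reduce."""
    groups = defaultdict(list)
    for key, value in inputs:
        for k, v in map_fn(key, value):
            groups[k].append(v)
    output = []
    for k in sorted(groups):
        output.extend(reduce_fn(k, groups[k]))
    return output


# Count of URL access frequency: map emits <URL, 1>, reduce sums the counts.
def url_map(_log_name, requested_urls):
    for url in requested_urls:
        yield url, 1


def url_reduce(url, counts):
    yield url, sum(counts)


# Inverted index: map emits <word, document ID>, reduce collects the IDs.
def index_map(doc_id, text):
    for word in set(text.split()):
        yield word, doc_id


def index_reduce(word, doc_ids):
    yield word, sorted(doc_ids)


logs = [("log1", ["/a", "/b", "/a"]), ("log2", ["/a"])]
url_counts = run_mapreduce(logs, url_map, url_reduce)
print(url_counts)  # [('/a', 3), ('/b', 1)]

docs = [(1, "map reduce"), (2, "reduce sort")]
index = run_mapreduce(docs, index_map, index_reduce)
print(index)  # [('map', [1]), ('reduce', [1, 2]), ('sort', [2])]
```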
Execution overview
Typical cluster
# Machines: typically 100s or 1000s of dual-processor x86 machines running Linux, with 2-4 GB of memory
# Network: 100 megabits/second or 1 gigabit/second
# Storage: local IDE disks
# GFS: a distributed file system manages the data
# Job scheduling system
  - jobs are made up of tasks
  - the scheduler assigns tasks to machines
# Language: C++ library linked into user programs
Distributed execution (1)
#1
  - The input files are split into M pieces (16 MB to 64 MB each, controllable by the user via an optional parameter)
  - Many copies of the program are started on a cluster of machines
#2
  - Master (1): one of the copies of the program is special
  - Workers (n): assigned work by the master
  - There are M map tasks and R reduce tasks to assign
#3
  - A worker assigned a map task reads the content of its input split
  - It parses key/value pairs and passes each pair to the user-defined map function
  - The intermediate pairs are buffered in memory
#4 Map workers
  - Periodically, the buffered pairs are written to local disk
  - The locations of these pairs on local disk are passed back to the master
  - The master is responsible for forwarding these locations to the reduce workers
#5 Reduce workers
  - A reduce worker uses remote procedure calls to read the buffered data from the local disks of the map workers
Distributed execution (2)
#6
  - The reduce worker iterates over the intermediate data
  - For each unique intermediate key encountered, it passes the key and the corresponding values to the user's reduce function
  - The output of the reduce function is appended to a final output file for this reduce partition
#7
  - When all map tasks and reduce tasks have been completed, the master wakes up the user program
  - The MapReduce call in the user program returns back to the user code
#8
  - After successful completion, the output is available in the R output files (one per reduce task, with file names as specified by the user)
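The steps above can be sketched as a single-process simulation. Everything here is an assumption made for illustration: splits are in-memory lists rather than files, "local disk" regions are Python lists, and `crc32` stands in for the hash in the paper's default partitioning function, hash(key) mod R.

```python
from collections import defaultdict
from zlib import crc32


def partition(key, R):
    """Pick the reduce partition for a key; crc32 is a stand-in hash."""
    return crc32(key.encode("utf-8")) % R


def split_input(records, M):
    """Step 1: split the input into M pieces (round-robin over records)."""
    return [records[i::M] for i in range(M)]


def run_map_task(split, map_fn, R):
    """Steps 3-4: apply map_fn and bucket the output into R regions."""
    regions = [[] for _ in range(R)]
    for key, value in split:
        for k, v in map_fn(key, value):
            regions[partition(k, R)].append((k, v))
    return regions


def run_reduce_task(r, all_regions, reduce_fn):
    """Steps 5-6: read region r from every map task, group by key, reduce."""
    groups = defaultdict(list)
    for regions in all_regions:
        for k, v in regions[r]:
            groups[k].append(v)
    output = []
    for k in sorted(groups):
        output.extend(reduce_fn(k, groups[k]))
    return output


def mapreduce(records, map_fn, reduce_fn, M, R):
    splits = split_input(records, M)
    all_regions = [run_map_task(s, map_fn, R) for s in splits]
    # Step 8: one output "file" (a list here) per reduce task.
    return [run_reduce_task(r, all_regions, reduce_fn) for r in range(R)]


def word_map(_name, text):
    for word in text.split():
        yield word, 1


def word_reduce(word, counts):
    yield word, sum(counts)


records = [("d1", "a b a"), ("d2", "b c"), ("d3", "a")]
outputs = mapreduce(records, word_map, word_reduce, M=2, R=3)
merged = dict(pair for part in outputs for pair in part)
print(merged)
```

Because the partition depends only on the key, every value for a given key reaches the same reduce task, so each key appears in exactly one of the R outputs.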
Master Data Structures
# Task state
  - idle
  - in-progress
  - completed
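The per-task bookkeeping can be sketched as follows. The three states come from the slide, and the paper adds that the master records the worker identity for non-idle tasks. The class and method names (`Task`, `Master`, `assign`, `complete`, `count`) are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class TaskState(Enum):
    IDLE = "idle"
    IN_PROGRESS = "in-progress"
    COMPLETED = "completed"


@dataclass
class Task:
    task_id: int
    kind: str  # "map" or "reduce"
    state: TaskState = TaskState.IDLE
    worker: Optional[str] = None  # set only for non-idle tasks


@dataclass
class Master:
    tasks: List[Task] = field(default_factory=list)

    def assign(self, worker):
        """Give the first idle task to `worker`; return None if none is idle."""
        for task in self.tasks:
            if task.state is TaskState.IDLE:
                task.state = TaskState.IN_PROGRESS
                task.worker = worker
                return task
        return None

    def complete(self, task):
        task.state = TaskState.COMPLETED

    def count(self, state):
        return sum(1 for task in self.tasks if task.state is state)


master = Master(tasks=[Task(0, "map"), Task(1, "map"), Task(0, "reduce")])
first = master.assign("worker-a")
master.complete(first)
second = master.assign("worker-b")
print(first.state.value, second.state.value, master.count(TaskState.IDLE))
# completed in-progress 1
```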
Fault Tolerance
# Worker failure
  - The master pings every worker periodically
  - MapReduce is resilient to large-scale worker failures
# Master failure: the MapReduce computation stops
  - It is easy to make the master write periodic checkpoints of the master data structures described above
  - If the master task dies, a new copy can be started from the last checkpointed state
  - Clients can check for this condition and retry the MapReduce operation if they desire
# Semantics in the presence of failures
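Worker-failure handling, as the paper describes it, resets some of the failed worker's tasks to idle so they can be rescheduled. Completed map tasks are re-executed because their output sits on the failed machine's local disk. Completed reduce tasks are not, because their output is already in the global file system. A minimal sketch, assuming tasks are plain dictionaries (a hypothetical representation):

```python
def handle_worker_failure(tasks, failed_worker):
    """Reset tasks that must be re-executed after a worker stops answering pings.

    tasks: list of dicts with keys "kind", "state", and "worker".
    """
    for task in tasks:
        if task["worker"] != failed_worker:
            continue
        in_progress = task["state"] == "in-progress"
        # Completed map output lived on the failed machine's local disk.
        lost_map_output = task["kind"] == "map" and task["state"] == "completed"
        if in_progress or lost_map_output:
            task["state"] = "idle"
            task["worker"] = None
    return tasks


tasks = [
    {"kind": "map", "state": "completed", "worker": "w1"},
    {"kind": "map", "state": "in-progress", "worker": "w2"},
    {"kind": "reduce", "state": "in-progress", "worker": "w1"},
    {"kind": "reduce", "state": "completed", "worker": "w1"},
]
handle_worker_failure(tasks, "w1")
print([task["state"] for task in tasks])
# ['idle', 'in-progress', 'idle', 'completed']
```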
Locality
# GFS storage saves network bandwidth: GFS divides each file into 64 MB blocks and stores several copies of each block (typically 3 copies) on different machines.
# When running large MapReduce operations on a significant fraction of the workers in a cluster, most input data is read locally and consumes no network bandwidth.
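The paper describes the master as preferring a worker that holds a replica of a map task's input, and failing that, a worker near a replica. A minimal sketch of that preference order follows. The function name `pick_worker`, the `rack_of` mapping, and the machine names are hypothetical, and "near" is approximated here as "same rack".

```python
def pick_worker(replica_hosts, idle_workers, rack_of):
    """Choose an idle worker for a map task, preferring data locality."""
    # 1. A worker that stores a replica reads its input from local disk.
    for worker in idle_workers:
        if worker in replica_hosts:
            return worker
    # 2. Otherwise prefer a worker on the same rack as some replica.
    replica_racks = {rack_of[h] for h in replica_hosts if h in rack_of}
    for worker in idle_workers:
        if rack_of.get(worker) in replica_racks:
            return worker
    # 3. Fall back to any idle worker (the input crosses the network).
    return idle_workers[0] if idle_workers else None


rack_of = {"m1": "rackA", "m2": "rackA", "m3": "rackB", "m4": "rackC"}
replicas = {"m1", "m3", "m5"}  # three copies of the block, as in GFS

print(pick_worker(replicas, ["m4", "m3"], rack_of))  # m3
print(pick_worker(replicas, ["m4", "m2"], rack_of))  # m2
print(pick_worker(replicas, ["m4"], rack_of))        # m4
```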
Task Granularity
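# Ideally, the number of map tasks (M) and reduce tasks (R) is much larger than the number of worker machines
  - improves dynamic load balancing
  - speeds up recovery when a worker fails
# Master
  - makes O(M + R) scheduling decisions
  - keeps O(M * R) state in memory
  - so there are practical bounds on how large M and R can be
  - the O(M * R) state takes approximately one byte per map task/reduce task pair
# R is often constrained by the user (each reduce task is processed separately)
# MapReduce computations are often run with M = 200,000 and R = 5,000 on 2,000 worker machines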
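The configuration quoted above can be turned into rough numbers. This is back-of-the-envelope arithmetic. It takes the paper's statement that the master makes O(M + R) scheduling decisions and keeps O(M * R) state at about one byte per map task/reduce task pair, and treats the constants as exactly 1, which is an assumption.

```python
M = 200_000      # map tasks
R = 5_000        # reduce tasks
workers = 2_000  # worker machines

scheduling_decisions = M + R             # one decision per task
state_bytes = M * R * 1                  # about one byte per (map, reduce) pair
tasks_per_worker = (M + R) / workers     # average tasks each machine runs

print(scheduling_decisions)  # 205000
print(state_bytes)           # 1000000000
print(tasks_per_worker)      # 102.5
```

With roughly a hundred tasks per machine, a slow or failed worker holds only a small share of the work, which is what makes load balancing and recovery fast.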
Backup Tasks
# "Straggler" machines: a machine that takes an unusually long time to complete one of the last few map or reduce tasks in the computation
# When a MapReduce operation is close to completion, the master schedules backup executions of the remaining in-progress tasks.
# The task is marked as completed whenever either the primary or the backup execution completes.
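The effect of backup executions on total running time can be sketched with a toy calculation. All finish times below are made-up numbers, and `job_finish_time` is a hypothetical helper; the only rule taken from the slide is that a task completes when either copy finishes.

```python
def job_finish_time(primary_finish, backup_finish=None):
    """Return when the whole job is done.

    primary_finish: task name -> time at which the primary execution finishes.
    backup_finish:  task name -> time at which a backup execution finishes.
    """
    backup_finish = backup_finish or {}
    finish = {}
    for task, t in primary_finish.items():
        # The task is complete as soon as either execution completes.
        finish[task] = min(t, backup_finish.get(task, t))
    # The job is complete only when its slowest task is.
    return max(finish.values())


primary = {"map-1": 10, "map-2": 11, "map-3": 60}  # map-3 runs on a straggler
without_backup = job_finish_time(primary)
# Backup for map-3 launched at t=12 on a healthy machine, taking 10 time units.
with_backup = job_finish_time(primary, {"map-3": 12 + 10})

print(without_backup, with_backup)  # 60 22
```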
Combiner Function
[Figure: Master, map tasks, and reduce tasks on nodes N1, N2, and N3; labels: Network Traffic, CPU Performance]
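The paper describes the combiner as a function that partially merges map output on the map worker before it is sent over the network. A minimal sketch for word counting follows, where the combiner uses the same summing logic as the reduce function. The names `map_words` and `combine` and the sample text are assumptions.

```python
from collections import Counter


def map_words(text):
    """Map output for one input split: one <word, 1> pair per occurrence."""
    return [(word, 1) for word in text.split()]


def combine(pairs):
    """Partial merge on the map worker: sum the counts per word locally."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return sorted(totals.items())


split = "the cat and the dog and the bird"
raw = map_words(split)
combined = combine(raw)

print(len(raw), len(combined))  # 8 5
print(combined)
# [('and', 2), ('bird', 1), ('cat', 1), ('dog', 1), ('the', 3)]
```

Fewer pairs leave the map worker (5 instead of 8 here), trading a little CPU time on the map side for less network traffic.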
Status Information
# The master runs an internal HTTP server and exports a set of status pages for human consumption
# The status pages show the progress of the computation
  - how many tasks have been completed
  - how many are in progress
  - bytes of input, bytes of intermediate data, bytes of output, processing rates
# The user can use this data to predict how long the computation will take
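One simple way to make such a prediction is to divide the remaining input by the current processing rate. This is a rough sketch under the assumption that the rate stays constant. The function name and the figures are hypothetical, not something the status pages provide.

```python
def estimate_remaining_seconds(total_input_bytes, processed_bytes, bytes_per_second):
    """Estimate time left, assuming the processing rate stays constant."""
    if bytes_per_second <= 0:
        return None  # no progress yet, so no estimate
    remaining = max(total_input_bytes - processed_bytes, 0)
    return remaining / bytes_per_second


eta = estimate_remaining_seconds(
    total_input_bytes=1_000, processed_bytes=250, bytes_per_second=50
)
print(eta)  # 15.0
```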
Conclusions
# Reasons for success
  - First, the model is easy to use, even for programmers without experience with parallel and distributed systems.
  - Second, a large variety of problems are easily expressible as MapReduce computations.
  - Third, we have developed an implementation of MapReduce that scales to large clusters comprising thousands of machines.
# Lessons learned
  - First, restricting the programming model makes it easy to parallelize and distribute computations and to make such computations fault-tolerant.
  - Second, network bandwidth is a scarce resource.
  - Third, redundant execution can be used to reduce the impact of slow machines, and to handle machine failures and data loss.