MapReduce Programming Model
• Data type: key-value records
• Map function: (K_in, V_in) → list(K_inter, V_inter)
• Reduce function: (K_inter, list(V_inter)) → list(K_out, V_out)
Example: Word Count
def mapper(line):
    # Emit (word, 1) for every word in the line.
    for word in line.split():
        yield (word, 1)

def reducer(key, values):
    # Sum the counts collected for one word.
    yield (key, sum(values))
Word Count Execution
Input → Map → Shuffle & Sort → Reduce → Output

Input and Map output:
  Map 1: "the quick brown fox" → (the, 1), (brown, 1), (fox, 1), (quick, 1)
  Map 2: "the fox ate the mouse" → (the, 1), (fox, 1), (the, 1), (ate, 1), (mouse, 1)
  Map 3: "how now brown cow" → (how, 1), (now, 1), (brown, 1), (cow, 1)

Reduce output:
  Reduce 1: (brown, 2), (fox, 2), (how, 1), (now, 1), (the, 3)
  Reduce 2: (ate, 1), (cow, 1), (mouse, 1), (quick, 1)
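The execution above can be simulated on a single machine. The following is a minimal sketch, not a real Hadoop job: `run_mapreduce` is a hypothetical helper introduced here for illustration, and it does the shuffle and sort with an in-memory dictionary.

```python
from collections import defaultdict

def mapper(line):
    # Emit (word, 1) for every word in the input line.
    for word in line.split():
        yield (word, 1)

def reducer(key, values):
    # Sum the counts collected for one word.
    yield (key, sum(values))

def run_mapreduce(records, mapper, reducer):
    # Hypothetical single-process driver (an assumption, not a Hadoop API).
    groups = defaultdict(list)
    for record in records:                 # map phase
        for key, value in mapper(record):
            groups[key].append(value)      # shuffle: group values by key
    output = []
    for key in sorted(groups):             # sort keys, then reduce each group
        output.extend(reducer(key, groups[key]))
    return output

lines = ["the quick brown fox", "the fox ate the mouse", "how now brown cow"]
counts = dict(run_mapreduce(lines, mapper, reducer))
print(counts)
```

The resulting counts should agree with the Reduce output listed in the slide, for example 3 for "the" and 2 for "fox".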
Relational algebra operators in MapReduce
– Projection (π)
– Selection (σ)
– Intersect (∩)
– Cartesian product (×)
– Set union (∪)
– Set difference (−)
– Rename (ρ)
– Join (⋈)
– Group by… aggregation
– …
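Selection in MapReduce
• MAP: t → (t, t) if the condition is true; no output if the condition is not true
• REDUCE: (t, [t1, t2, ..., tn]) → (t, t)

Example: σ_{A=1}(R)

R:
A | B | C
1 | X |
3 | Y |
1 | Z |

MAP:
(1, X, ) → ((1), (X, ))
(3, Y, ) → no output
(1, Z, ) → ((1), (Z, ))

REDUCE output:
A | B | C
1 | X |
1 | Z |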
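The selection rule can be sketched in a few lines of Python. This is an illustration under assumptions: `run_mapreduce` is a hypothetical single-process driver, the condition is taken to be A = 1 as in the example, and the C values (`c1`, `c2`, `c3`) are placeholders because the slide's values are not available.

```python
from collections import defaultdict

def run_mapreduce(records, mapper, reducer):
    # Hypothetical single-process driver (an assumption, not a Hadoop API).
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    output = []
    for key in sorted(groups):
        output.extend(reducer(key, groups[key]))
    return output

def select_mapper(t):
    # Emit (t, t) only when the condition A = 1 holds; otherwise emit nothing.
    if t[0] == 1:
        yield (t, t)

def select_reducer(key, values):
    # Pass the selected tuple through unchanged.
    yield key

# Columns are (A, B, C); the C values are placeholders.
R = [(1, "X", "c1"), (3, "Y", "c2"), (1, "Z", "c3")]
selected = run_mapreduce(R, select_mapper, select_reducer)
print(selected)
```

Because all the filtering happens in the mapper, the reducer does no real work here, which is why selection is often run as a map-only job.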
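Projection in MapReduce
• MAP: t → (t', t'), where t' is t restricted to the projected attributes
• REDUCE: (t', [t'1, t'2, ..., t'n]) → (t', t')

Example: π_{A,B}(R)

R:
A | B | C
1 | X |
3 | Y |
1 | X |

MAP:
(1, X, ) → ((1, X), (1, X))
(3, Y, ) → ((3, Y), (3, Y))
(1, X, ) → ((1, X), (1, X))

REDUCE output:
A | B
1 | X
3 | Y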
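A small Python sketch of this projection follows. It assumes a hypothetical single-process driver `run_mapreduce` and placeholder C values; the point it illustrates is that the reducer removes the duplicates that projection creates.

```python
from collections import defaultdict

def run_mapreduce(records, mapper, reducer):
    # Hypothetical single-process driver (an assumption, not a Hadoop API).
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    output = []
    for key in sorted(groups):
        output.extend(reducer(key, groups[key]))
    return output

def project_mapper(t):
    # Keep only attributes A and B, and use the shortened tuple as the key.
    t_prime = (t[0], t[1])
    yield (t_prime, t_prime)

def project_reducer(key, values):
    # All values for one key are identical copies, so emit the key once.
    yield key

# Columns are (A, B, C); the C values are placeholders.
R = [(1, "X", "c1"), (3, "Y", "c2"), (1, "X", "c3")]
projected = run_mapreduce(R, project_mapper, project_reducer)
print(projected)
```

The two input rows that share (1, X) collapse into a single output row, matching the two-row result in the example.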
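Union in MapReduce
• MAP: t → (t, t)
• REDUCE: (t, [t1, t2, ...]) → (t, t)

Example: R ∪ S

R:
A | B
a | 2
b | 1
c | 3

S:
A | B
b | 1
d | 4

MAP on R:
(a, 2) → ((a, 2), (a, 2))
(b, 1) → ((b, 1), (b, 1))
(c, 3) → ((c, 3), (c, 3))

MAP on S:
(b, 1) → ((b, 1), (b, 1))
(d, 4) → ((d, 4), (d, 4))

REDUCE output:
A | B
a | 2
b | 1
c | 3
d | 4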
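The union can be sketched by feeding the tuples of both relations through the same mapper. This is a minimal illustration that assumes a hypothetical single-process driver `run_mapreduce`; the data is the R and S of the example.

```python
from collections import defaultdict

def run_mapreduce(records, mapper, reducer):
    # Hypothetical single-process driver (an assumption, not a Hadoop API).
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    output = []
    for key in sorted(groups):
        output.extend(reducer(key, groups[key]))
    return output

def union_mapper(t):
    # Every tuple, from R or from S, becomes its own key.
    yield (t, t)

def union_reducer(key, values):
    # A tuple present in both relations arrives twice; emit it once.
    yield key

R = [("a", 2), ("b", 1), ("c", 3)]
S = [("b", 1), ("d", 4)]
union = run_mapreduce(R + S, union_mapper, union_reducer)
print(union)
```

The tuple (b, 1) appears in both inputs but only once in the result.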
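Intersect in MapReduce
• MAP: for t ∈ R: t → (t, (t, R)); for t ∈ S: t → (t, (t, S))
• REDUCE: (t, [(t, R), (t, S)]) → (t, t); a key that arrives from only one relation produces no output

Example: R ∩ S

R:
A | B
a | 2
b | 1
c | 3

S:
A | B
b | 1
d | 4

MAP on R:
(a, 2) → ((a, 2), ((a, 2), (R)))
(b, 1) → ((b, 1), ((b, 1), (R)))
(c, 3) → ((c, 3), ((c, 3), (R)))

MAP on S:
(b, 1) → ((b, 1), ((b, 1), (S)))
(d, 4) → ((d, 4), ((d, 4), (S)))

REDUCE output:
A | B
b | 1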
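The intersection can be sketched by tagging each tuple with the name of its relation. The code below is an illustration under assumptions: `run_mapreduce` is a hypothetical single-process driver, and the input records are written as (relation name, tuple) pairs so the mapper knows where each tuple came from.

```python
from collections import defaultdict

def run_mapreduce(records, mapper, reducer):
    # Hypothetical single-process driver (an assumption, not a Hadoop API).
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    output = []
    for key in sorted(groups):
        output.extend(reducer(key, groups[key]))
    return output

def intersect_mapper(record):
    # record is (relation_name, t); emit the tuple as key, tagged with its origin.
    name, t = record
    yield (t, (t, name))

def intersect_reducer(key, values):
    # Emit the tuple only if it was seen in both R and S.
    origins = {name for _, name in values}
    if origins == {"R", "S"}:
        yield key

R = [("a", 2), ("b", 1), ("c", 3)]
S = [("b", 1), ("d", 4)]
tagged = [("R", t) for t in R] + [("S", t) for t in S]
intersection = run_mapreduce(tagged, intersect_mapper, intersect_reducer)
print(intersection)
```

Only (b, 1) is tagged with both R and S, so it is the only tuple that survives the reduce step.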
Natural Join Operation – Example
• Relations r, s:

r:
A | B | C | D
  | 1 |   | a
  | 2 |   | a
  | 4 |   | b
  | 1 |   | a
  | 2 |   | b

s:
B | D | E
1 | a |
3 | a |
1 | a |
2 | b |
3 | b |

r ⋈ s:
A | B | C | D | E
  | 1 |   | a |
  | 1 |   | a |
  | 1 |   | a |
  | 1 |   | a |
  | 2 |   | b |
Natural Join Example
Input relations R1 and S1:

ID | sname  | rating | age
22 | dustin | 7      | 45.0
31 | lubber | 8      | 55.5
58 | rusty  | 10     | 35.0

ID | bid | day
22 | 101 | 10/10/96
58 | 103 | 11/12/96

R1 ⋈ S1 =

ID | sname  | rating | age  | bid | day
22 | dustin | 7      | 45.0 | 101 | 10/10/96
58 | rusty  | 10     | 35.0 | 103 | 11/12/96
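The meaning of the natural join in this example can be checked with a short plain-Python sketch. It is a nested-loop illustration of the operator itself, not a MapReduce implementation, and the names `natural_join`, `left` and `right` are hypothetical.

```python
def natural_join(left, right):
    """Join two lists of dicts on every attribute name they have in common."""
    result = []
    for l_row in left:
        for r_row in right:
            common = set(l_row) & set(r_row)
            # Rows match when they agree on all shared attributes.
            if all(l_row[a] == r_row[a] for a in common):
                result.append({**l_row, **r_row})
    return result

left = [
    {"ID": 22, "sname": "dustin", "rating": 7, "age": 45.0},
    {"ID": 31, "sname": "lubber", "rating": 8, "age": 55.5},
    {"ID": 58, "sname": "rusty", "rating": 10, "age": 35.0},
]
right = [
    {"ID": 22, "bid": 101, "day": "10/10/96"},
    {"ID": 58, "bid": 103, "day": "11/12/96"},
]

joined = natural_join(left, right)
for row in joined:
    print(row)
```

The row with ID 31 has no partner in the second table, so it does not appear in the result.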
Reduce-side Join: many-to-many
In reducer…
keys → values: R1, R5, R8, S2, S3, S9
• Hold in memory: R1, R5, R8
• Cross with records from other set: S2, S3, S9
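A reduce-side join along these lines can be sketched as follows. This is a simplified illustration under assumptions: `run_mapreduce` is a hypothetical single-process driver, input records are written as (source, join key, payload) triples, and the join keys `"j"` and `"k"` are made up for the example.

```python
from collections import defaultdict

def run_mapreduce(records, mapper, reducer):
    # Hypothetical single-process driver (an assumption, not a Hadoop API).
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    output = []
    for key in sorted(groups):
        output.extend(reducer(key, groups[key]))
    return output

def join_mapper(record):
    # Tag each payload with the dataset it came from and key it by the join key.
    source, key, payload = record
    yield (key, (source, payload))

def join_reducer(key, values):
    # Hold the R records in memory, then cross them with every S record.
    r_records = [payload for source, payload in values if source == "R"]
    for source, payload in values:
        if source == "S":
            for r in r_records:
                yield (key, r, payload)

records = [
    ("R", "k", "R1"), ("S", "k", "S2"), ("R", "k", "R5"),
    ("S", "k", "S3"), ("R", "k", "R8"), ("S", "k", "S9"),
    ("R", "j", "R7"),  # no matching S record, so it joins with nothing
]
pairs = run_mapreduce(records, join_mapper, join_reducer)
print(pairs)
```

With three R records and three S records under the same key, the reducer emits nine joined pairs; the memory cost is the size of the R group for that key.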
Map-side Join: Basic Idea
Assume two datasets are sorted by the join key:
  R1, R2, R3, R4
  S1, S2, S3, S4
A sequential scan through both datasets to join (called a "merge join" in database terminology)
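The sequential scan can be sketched as a merge join over two lists that are already sorted by key. This is a minimal single-machine illustration; the function name `merge_join` and the numeric join keys are hypothetical, since the slide does not give key values.

```python
def merge_join(r, s):
    """Join two lists of (key, value) pairs that are already sorted by key."""
    out = []
    i = j = 0
    while i < len(r) and j < len(s):
        if r[i][0] < s[j][0]:
            i += 1                      # r is behind: advance r
        elif r[i][0] > s[j][0]:
            j += 1                      # s is behind: advance s
        else:
            key = r[i][0]
            # Find the run of records with this key on each side.
            i_end = i
            while i_end < len(r) and r[i_end][0] == key:
                i_end += 1
            j_end = j
            while j_end < len(s) and s[j_end][0] == key:
                j_end += 1
            # Pair every r record in the run with every s record in the run.
            for _, r_val in r[i:i_end]:
                for _, s_val in s[j:j_end]:
                    out.append((key, r_val, s_val))
            i, j = i_end, j_end
    return out

r = [(1, "R1"), (2, "R2"), (3, "R3"), (4, "R4")]
s = [(1, "S1"), (3, "S2"), (3, "S3"), (5, "S4")]
merged = merge_join(r, s)
print(merged)
```

Each list is read once from start to end, so no shuffle is needed when both inputs are sorted and partitioned the same way.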
References:
• Cloud Computing with MapReduce and Hadoop. Matei Zaharia, Electrical Engineering and Computer Sciences, University of California, Berkeley
• Database and Map Reduce. Based on slides from Jimmy Lin's lecture slides (http://www.umiacs.umd.edu/~jimmylin/cloud-2010-Spring/index.html) (licensed under Creative Commons Attribution 3.0 License)
• Mining of Massive Datasets. Anand Rajaraman (Kosmix, Inc.), Jeffrey D. Ullman (Stanford Univ.)