hadoopsdsdgs
TRANSCRIPT
-
8/9/2019 hadoopsdsdgs
1/29
In a large MapReduce job with m mappers and n reducers, how many distinct copy
operations will there be in the sort/shuffle phase?
A. m x n (i.e., m multiplied by n)
B. n
C. m
D. m + n (i.e., m plus n)
E. m^n (i.e., m to the power of n)
Answer: A
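The m x n count can be sketched as below. This is a minimal illustration, assuming every reducer fetches its own partition from every mapper's output; the class and method names are hypothetical and no Hadoop API is used.

```java
// Sketch: count shuffle copy operations for m mappers and n reducers.
class ShuffleCopyCount {
    // Each reducer fetches its own partition from every mapper's output,
    // so every (mapper, reducer) pair contributes one copy operation.
    static long copyOperations(int mappers, int reducers) {
        long copies = 0;
        for (int m = 0; m < mappers; m++) {
            for (int r = 0; r < reducers; r++) {
                copies++; // reducer r pulls partition r from mapper m
            }
        }
        return copies;
    }

    public static void main(String[] args) {
        System.out.println(copyOperations(4, 3)); // prints 12
    }
}
```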
For each intermediate key, each reducer task can emit:
A. As many final key-value pairs as desired. There are no restrictions on the types of those
key-value pairs (i.e., they can be heterogeneous).
B. As many final key-value pairs as desired, but they must have the same type as the
intermediate key-value pairs.
C. As many final key-value pairs as desired, as long as all the keys have the same type and
all the values have the same type.
D. One final key-value pair per value associated with the key; no restrictions on the type.
E. One final key-value pair per key; no restrictions on the type.
Answer: C
You need to move a file titled "weblogs" into HDFS. When you try to copy the file, you can't.
You know you have ample space on your DataNodes. Which action should you take
to relieve this situation and store more files in HDFS?
A. Increase the block si[...]
[...]uenceFile contains a binary encoding of an arbitrary number of
WritableComparable objects, in sorted order.
D. A SequenceFile contains a binary encoding of an arbitrary number key-value pairs. Each
key must be the same type. Each value must be the same type.
Answer: D
When is the earliest point at which the reduce method of a given Reducer can be
called?
A. As soon as at least one mapper has finished processing its input split.
B. As soon as a mapper has emitted at least one record.
C. Not until all mappers have finished processing all records.
D. It depends on the InputFormat used for the job.
Answer:
Explanation:
Which describes how a client reads a file from HDFS?
A. The client queries the NameNode for the block location(s). The NameNode returns the
block location(s) to the client. The client reads the data directly off the DataNode(s).
B. The client queries all DataNodes in parallel. The DataNode that contains the
requested data responds directly to the client. The client reads the data directly off
the DataNode.
C. The client contacts the NameNode for the block location(s). The NameNode then queries
the DataNodes for block locations. The DataNodes respond to the NameNode, and the
NameNode redirects the client to the DataNode that holds the requested data block(s). The
client then reads the data directly off the DataNode.
D. The client contacts the NameNode for the block location(s). The NameNode contacts the
DataNode that holds the requested data block. Data is transferred from the DataNode to the
NameNode, and then from the NameNode to the client.
Answer:
You are developing a combiner that takes as input Text keys, IntWritable values,
and emits Text keys, IntWritable values. Which interface should your class
implement?
A. Combiner <Text, IntWritable, Text, IntWritable>
B. Mapper <Text, IntWritable, Text, IntWritable>
C. Reducer <Text, Text, IntWritable, IntWritable>
D. Reducer <Text, IntWritable, Text, IntWritable>
E. Combiner <Text, Text, IntWritable, IntWritable>
Answer: D
Identify the utility that allows you to create and run MapReduce jobs with any
executable or script as the mapper and/or the reducer?
A. Oo[...]oop
C. Flume
D. Hadoop Streaming
E. mapred
Answer: D
Explanation: Hadoop Streaming is a utility that comes with the Hadoop distribution. The utility
allows you to create and run Map/Reduce jobs with any executable or script as the
mapper and/or the reducer.
How are keys and values presented and passed to the reducers during a standard
sort and shuffle phase of MapReduce?
A. Keys are presented to reducer in sorted order; values for a given key are not sorted.
B. Keys are presented to reducer in sorted order; values for a given key are sorted in
ascending order.
C. Keys are presented to a reducer in random order; values for a given key are not sorted.
D. Keys are presented to a reducer in random order; values for a given key are sorted in
ascending order.
Answer: A
Explanation:
Reducer has 3 primary phases:
1. Shuffle
The Reducer copies the sorted output from each Mapper using HTTP across the network.
2. Sort
The framework merge sorts Reducer inputs by keys (since different Mappers may have
output the same key).
The shuffle and sort phases occur simultaneously, i.e. while outputs are being fetched they
are merged.
SecondarySort
To achieve a secondary sort on the values returned by the value iterator, the
application should extend the key with the secondary key and define a grouping comparator.
The keys will be sorted using the entire key, but will be grouped using the
grouping comparator to decide which keys and values are sent in the same call to reduce.
3. Reduce
In this phase the reduce(Object, Iterable, Context) method is called for each <key, (collection of
values)> in the sorted inputs.
The output of the reduce task is typically written to a RecordWriter
via TaskInputOutputContext.write(Object, Object). The output of the Reducer is not
re-sorted.
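The secondary-sort idea in the explanation can be sketched in plain Java as below. This is a minimal illustration, not Hadoop code: CompositeKey, SORT and sortAndGroup are hypothetical names standing in for a composite key class, a sort comparator and a grouping comparator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SecondarySortSketch {
    // Hypothetical composite key: the natural key extended with a secondary part.
    static final class CompositeKey {
        final String natural;
        final int secondary;
        CompositeKey(String natural, int secondary) {
            this.natural = natural;
            this.secondary = secondary;
        }
    }

    // Sort comparator: orders by the entire key (natural part, then secondary part).
    static final Comparator<CompositeKey> SORT =
        Comparator.comparing((CompositeKey k) -> k.natural)
                  .thenComparingInt(k -> k.secondary);

    // Sorts on the full key, then groups on the natural key only,
    // which mimics one reduce call per natural key with ordered values.
    static Map<String, List<Integer>> sortAndGroup(List<CompositeKey> keys) {
        List<CompositeKey> sorted = new ArrayList<>(keys);
        sorted.sort(SORT);
        Map<String, List<Integer>> groups = new LinkedHashMap<>();
        for (CompositeKey k : sorted) {
            groups.computeIfAbsent(k.natural, n -> new ArrayList<>()).add(k.secondary);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<CompositeKey> keys = Arrays.asList(
            new CompositeKey("b", 2), new CompositeKey("a", 9),
            new CompositeKey("b", 1), new CompositeKey("a", 3));
        System.out.println(sortAndGroup(keys)); // prints {a=[3, 9], b=[1, 2]}
    }
}
```

In a real job the grouping step would be a comparator registered on the job rather than a map, so that values arrive in sorted order inside a single reduce call.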
Assuming default settings, which best describes the order of data provided to a
reducer's reduce method:
A. The keys given to a reducer aren't in a predictable order, but the values associated with
those keys always are.
B. Both the keys and values passed to a reducer always appear in sorted order.
C. Neither keys nor values are in any predictable order.
D. The keys given to a reducer are in sorted order but the values associated with each key
are in no predictable order.
Answer: D
Explanation:
Reducer has 3 primary phases:
1. Shuffle
The Reducer copies the sorted output from each Mapper using HTTP across the
network.
2. Sort
The framework merge sorts Reducer inputs by keys (since different Mappers may
have output the same key). The shuffle and sort phases occur simultaneously, i.e. while
outputs are being fetched they are merged.
SecondarySort
To achieve a secondary sort on the values returned by the value iterator, the
application should extend the key with the secondary key and define a grouping comparator.
The keys will be sorted using the entire key, but will be grouped using the
grouping comparator to decide which keys and values are sent in the same call to reduce.
3. Reduce
In this phase the reduce(Object, Iterable, Context) method is called for each <key, (collection of
values)> in the sorted inputs. The output of the reduce task is typically written to a
RecordWriter via TaskInputOutputContext.write(Object, Object). The output of the Reducer
is not re-sorted.
You wrote a map function that throws a runtime exception when it encounters a
control character in input data. The input supplied to your mapper contains twelve
such characters in total, spread across five file splits. The first four file splits each
have two control characters and the last split has four control characters.
Identify the number of failed task attempts you can expect when you run the job
with mapred.map.max.attempts set to 4:
A. You will have forty-eight failed task attempts
B. You will have seventeen failed task attempts
C. You will have five failed task attempts
D. You will have twelve failed task attempts
E. You will have twenty failed task attempts
Answer: E
Explanation:
There will be four failed task attempts for each of the five file splits.
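The counting in the explanation can be sketched as below. This is a rough model that only follows the explanation's arithmetic; it assumes every attempt on a split with at least one control character fails, and it does not model job abort or speculative execution. The names are hypothetical.

```java
// Sketch: failed task attempts when every split contains bad input.
class FailedAttemptsSketch {
    // A map task fails as soon as its split yields one control character,
    // so a split with any control characters fails on every attempt.
    static int failedAttempts(int[] controlCharsPerSplit, int maxAttempts) {
        int failed = 0;
        for (int count : controlCharsPerSplit) {
            if (count > 0) {
                failed += maxAttempts; // every retry hits the same bad input
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // Five splits: four with two control characters, one with four.
        System.out.println(failedAttempts(new int[] {2, 2, 2, 2, 4}, 4)); // prints 20
    }
}
```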
You want to populate an associative array in order to perform a map-side
join. You've decided to put this information in a text file, place that file into
the DistributedCache and read it in your Mapper before any records are
processed. Identify which method in the Mapper you should use to implement
code for reading the file and populating the associative array?
A. combine
B. map
C. init
D. configure
Answer: D
Explanation:
See below. Here is an illustrative example on how to use the DistributedCache:
Setting up the cache for the application:
1. Copy the requisite files to the FileSystem:
$ bin/hadoop fs -copyFromLocal lookup.dat /myapp/lookup.dat
$ bin/hadoop fs -copyFromLocal map.[...]
2. Setup the application's JobConf:
JobConf job = new JobConf();
DistributedCache.addCacheFile(new URI("/myapp/lookup.dat#lookup.dat"), job);
DistributedCache.addCacheArchive(new URI("/myapp/map.[...]
[...] help you reduce the amount of data that needs to be transferred across to the reducers. You
can use your reducer code as a combiner if the operation performed is commutative and
associative.
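Why the operation has to be commutative and associative can be seen in a small sketch. It is plain Java with hypothetical names, not a Hadoop Reducer: summing per-mapper partial results gives the same total as summing everything at once, while averaging per-mapper averages does not.

```java
import java.util.Arrays;
import java.util.List;

class CombinerSketch {
    // Reducer-style logic that is safe to reuse as a combiner: a sum.
    static int sum(List<Integer> values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    // Reducer-style logic that is NOT safe as a combiner: an arithmetic mean.
    static double mean(List<Double> values) {
        double total = 0;
        for (double v : values) {
            total += v;
        }
        return total / values.size();
    }

    public static void main(String[] args) {
        // Two mappers emit values 1, 2 and 3 for the same key.
        int combinedSum = sum(Arrays.asList(sum(Arrays.asList(1, 2)), sum(Arrays.asList(3))));
        System.out.println(combinedSum); // prints 6, same as summing 1, 2, 3 directly

        double directMean = mean(Arrays.asList(1.0, 2.0, 3.0));
        double meanOfMeans = mean(Arrays.asList(mean(Arrays.asList(1.0, 2.0)), mean(Arrays.asList(3.0))));
        System.out.println(directMean + " vs " + meanOfMeans); // prints 2.0 vs 2.25
    }
}
```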
Can you use MapReduce to perform a relational join on two large tables sharing
a key? Assume that the two tables are formatted as comma-separated files in HDFS.
A. Yes.
B. Yes, but only if one of the tables fits into memory.
C. Yes, so long as both tables fit into memory.
D. No, MapReduce cannot perform relational operations.
E. No, but it can be done with either Pig or Hive.
Answer: A
Explanation:
Join Algorithms in MapReduce
Reduce-side join
Map-side join
You've written a MapReduce job that will process 500 million input records and
generate 500 million key-value pairs. The data is not uniformly distributed. Your
MapReduce job will create a significant amount of intermediate data that it needs
to transfer between mappers and reducers, which is a potential bottleneck. A
custom implementation of which interface is most likely to reduce the amount of
intermediate data transferred across the network?
A. Partitioner
B. OutputFormat
C. WritableComparable
D. Writable
E. InputFormat
F. Combiner
You have just executed a MapReduce job. Where is intermediate data written to
after being emitted from the Mapper's map method?
A. Intermediate data is streamed across the network from Mapper to the Reducer and is
never written to disk.
B. Into in-memory buffers on the TaskTracker node running the Mapper that spill over and
are written into HDFS.
C. Into in-memory buffers that spill over to the local file system of the TaskTracker node
running the Mapper.
D. Into in-memory buffers that spill over to the local file system (outside HDFS) of the
TaskTracker node running the Reducer.
E. Into in-memory buffers on the TaskTracker node running the Reducer that spill over and
are written into HDFS.
You want to understand more about how users browse your public website, such
as which pages they visit prior to placing an order. You have a farm of 200 web
servers hosting your website. How will you gather this data for your analysis?
1. Ingest the server web logs into HDFS using Flume.
2. Write a MapReduce job, with the web servers for mappers, and the Hadoop
cluster nodes for reducers.
3. Import all users' clicks from your OLTP databases into Hadoop, using Sqoop.
4. Channel these clickstreams into Hadoop using Hadoop Streaming.
5. Sample the weblogs from the web servers, copying them into Hadoop using
curl.
MapReduce v2 (MRv2/YARN) is designed to address which two issues?
A. Single point of failure in the NameNode.
B. Resource pressure on the JobTracker.
C. HDFS latency.
D. Ability to run frameworks other than MapReduce, such as MPI.
E. Reduce complexity of the MapReduce APIs.
F. Standardi[...]
[...] Mapper for a given job.
1. The key and value types specified in the JobConf.setMapInputKeyClass
and JobConf.setMapInputValuesClass methods.
2. The data types specified in HADOOP_MAP_DATATYPES environment variable.
3. The mapper-specification.xml file submitted with the job determine the
mapper's input key and value types.
4. The InputFormat used by the job determines the mapper's input key and
value types.
Identify the MapReduce v2 (MRv2 / YARN) daemon responsible for launching
application containers and monitoring application resource usage?
A. ResourceManager
B. NodeManager
C. ApplicationMaster
D. ApplicationMasterService
E. TaskTracker
F. JobTracker
Which best describes how TextInputFormat processes input files and line breaks?
A. Input file splits may cross line breaks. A line that crosses file splits is read by the
RecordReader of the split that contains the beginning of the broken line.
B. Input file splits may cross line breaks. A line that crosses file splits is read by
the RecordReaders of both splits containing the broken line.
C. The input file is split exactly at the line breaks, so each RecordReader will read a series
of complete lines.
D. Input file splits may cross line breaks. A line that crosses file splits is ignored.
E. Input file splits may cross line breaks. A line that crosses file splits is read by the
RecordReader of the split that contains the end of the broken line.
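How a line-oriented reader can handle splits that cut through a line is sketched below. This is a simplified model, not Hadoop's reader: it assumes ASCII input (so character offsets equal byte offsets) and the rule that a line belongs to the split in which it begins, with the reader running past the split end to finish its last line. The names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class SplitLinesSketch {
    // Returns the lines a reader for the range [start, end) would emit,
    // under the simplified rule that a line belongs to the split in which it begins.
    static List<String> linesForSplit(String data, int start, int end) {
        List<String> lines = new ArrayList<>();
        int pos = 0;
        while (pos < data.length()) {
            int newline = data.indexOf('\n', pos);
            int lineEnd = (newline == -1) ? data.length() : newline;
            if (pos >= start && pos < end) {
                // The reader may run past 'end' to finish this line.
                lines.add(data.substring(pos, lineEnd));
            }
            pos = lineEnd + 1; // move to the first character of the next line
        }
        return lines;
    }

    public static void main(String[] args) {
        String data = "alpha\nbravo\ncharlie\n";
        // The boundary at offset 8 falls in the middle of "bravo".
        System.out.println(linesForSplit(data, 0, 8));  // prints [alpha, bravo]
        System.out.println(linesForSplit(data, 8, 20)); // prints [charlie]
    }
}
```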
For each input key-value pair, mappers can emit:
A. As many intermediate key-value pairs as designed. There are no restrictions on the types
of those key-value pairs (i.e., they can be heterogeneous).
B. As many intermediate key-value pairs as designed, but they cannot be of the same type
as the input key-value pair.
C. One intermediate key-value pair, of a different type.
D. One intermediate key-value pair, but of the same type.
E. As many intermediate key-value pairs as designed, as long as all the keys have the same
type and all the values have the same type.
You have the following key-value pairs as output from your Map task:
(the, 1)
(fox, 1)
(faster, 1)
(than, 1)
(the, 1)
(dog, 1)
How many keys will be passed to the Reducer's reduce method?
A. Six
B. Five
C. Four
D. Two
E. One
F. Three
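The grouping step behind this question can be sketched in plain Java. This is a minimal illustration with hypothetical names, assuming reduce is invoked once per distinct key with all of that key's values; a TreeMap stands in for the sorted shuffle output.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class ReduceCallsSketch {
    // Groups map output by key; each emitted pair carries the value 1.
    // The TreeMap keeps keys sorted, as the shuffle would present them.
    static Map<String, List<Integer>> group(String[] emittedKeys) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String key : emittedKeys) {
            grouped.computeIfAbsent(key, k -> new ArrayList<>()).add(1);
        }
        return grouped;
    }

    public static void main(String[] args) {
        String[] mapOutput = {"the", "fox", "faster", "than", "the", "dog"};
        Map<String, List<Integer>> grouped = group(mapOutput);
        System.out.println(grouped.size());   // prints 5: one reduce call per distinct key
        System.out.println(grouped.keySet()); // prints [dog, faster, fox, than, the]
    }
}
```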
You have user profile records in your OLTP database, that you want to join with
web logs you have already ingested into the Hadoop file system. How will you
obtain these user records?
A. HDFS command
B. Pig LOAD command
C. Sqoop import
D. Hive LOAD DATA command
E. Ingest with Flume agents
F. Ingest with Hadoop Streaming
Which two updates occur when a client application opens a stream to begin a file
write on a cluster running MapReduce v1 (MRv1)?
A. Once the write stream closes on the DataNode, the DataNode immediately initiates a
block report to the NameNode.
B. The change is written to the NameNode disk.
C. The metadata in the RAM on the NameNode is flushed to disk.
D. The metadata in RAM on the NameNode is flushed to disk.
E. The metadata in RAM on the NameNode is updated.
F. The change is written to the edits file.
Answer:
For a MapReduce job, on a cluster running MapReduce v1 (MRv1), what's the
relationship between tasks and task attempts?
A. There are always at least as many task attempts as there are tasks.
B. There are always at most as many task attempts as there are tasks.
C. There are always exactly as many task attempts as there are tasks.
D. The developer sets the number of task attempts on job submission.
What action occurs automatically on a cluster when a DataNode is marked as
dead?
A. The NameNode forces re-replication of all the blocks which were stored on the dead
DataNode.
B. The next time a client submits a job that requires blocks from the dead DataNode, the
JobTracker receives no heartbeats from the DataNode. The JobTracker tells the NameNode
that the DataNode is dead, which triggers block re-replication on the cluster.
C. The replication factor of the files which had blocks stored on the dead DataNode is
temporarily reduced, until the dead DataNode is recovered and returned to the cluster.
D. The NameNode informs the client which wrote the blocks that they are no longer available; the
client then re-writes the blocks to a different DataNode.
How does the NameNode know DataNodes are available on a cluster running
MapReduce v1 (MRv1)?
A. DataNodes are listed in the dfs.hosts file. The NameNode uses it as the definitive list of
available DataNodes.
B. DataNodes heartbeat in to the master on a regular basis.
C. The NameNode broadcasts a heartbeat on the network on a regular basis, and DataNodes
respond.
D. The NameNode sends a broadcast across the network when it first starts, and DataNodes
respond.
Which three distcp features can you utilize on a Hadoop cluster?
A. Use distcp to copy files only between two clusters or more. You cannot use distcp to copy
data between directories inside the same cluster.
B. Use distcp to copy HBase table files.
C. Use distcp to copy physical blocks from the source to the target destination in your
cluster.
D. Use distcp to copy data between directories inside the same cluster.
E. Use distcp to run an internal MapReduce job to copy files.
How does HDFS Federation help HDFS scale horizontally?
A. HDFS Federation improves the resiliency of HDFS in the face of network issues by
removing the NameNode as a single-point-of-failure.
B. HDFS Federation allows the Standby NameNode to automatically resume the services of
an active NameNode.
C. HDFS Federation provides cross-data center (non-local) support for HDFS, allowing a
cluster administrator to split the Block Storage outside the local cluster.
D. HDFS Federation reduces the load on any single NameNode by using multiple,
independent NameNodes to manage individual parts of the filesystem namespace.
Choose which best describes a Hadoop cluster's block size storage parameters once
you set the HDFS default block size to 64MB?
A. The block si[...]
Which MapReduce daemon instantiates user code, and executes map and reduce
tasks on a cluster running MapReduce v1 (MRv1)?
A. NameNode
B. DataNode
C. JobTracker
D. TaskTracker
E. ResourceManager
F. ApplicationMaster
G. NodeManager
What two processes must you do if you are running a Hadoop cluster with a single
NameNode and six DataNodes, and you want to change a configuration parameter
so that it affects all six DataNodes?
A. You must restart the NameNode daemon to apply the changes to the cluster.
B. You must restart all six DataNode daemons to apply the changes to the cluster.
C. You don't need to restart any daemon, as they will pick up changes automatically.
D. You must modify the configuration files on each of the six DataNode machines.
E. You must modify the configuration files on only one of the DataNode machines.
F. You must modify the configuration files on the NameNode only. DataNodes read their
configuration from the master nodes.
Identify the function performed by the Secondary NameNode daemon on a cluster
configured to run with a single NameNode.
A. In this configuration, the Secondary NameNode performs a checkpoint operation on the
files by the NameNode.
B. In this configuration, the Secondary NameNode is a standby NameNode, ready to failover
and provide high availability.
C. In this configuration, the Secondary NameNode performs real-time backups of the
NameNode.
D. In this configuration, the Secondary NameNode serves as an alternate data channel for
clients to reach HDFS, should the NameNode become too busy.
You install Cloudera Manager on a cluster where each host has 1GB of RAM. All of
the services show their status as concerning. However, all jobs submitted
complete without an error. Why is Cloudera Manager showing the concerning
status for the services?
A. A slave node's disk ran out of space.
B. The slave nodes haven't sent a heartbeat in 60 minutes.
C. The slave nodes are swapping.
D. DataNode service instance has crashed.
What is the recommended disk configuration for slave nodes in your Hadoop
cluster with 6 x 2 TB hard drives?
A. RAID 10
B. JBOD
C. RAID
D. RAID 1+0
You configure your cluster with HDFS High Availability (HA) using Quorum-Based
storage. You do not implement HDFS Federation.
What is the maximum number of NameNode daemons you should run on your
cluster in order to avoid a "split-brain" scenario with your NameNodes?
A. Unlimited. HDFS High Availability (HA) is designed to overcome limitations on the number
of NameNodes you can deploy.
B. Two active NameNodes and one Standby NameNode
C. One active NameNode and one Standby NameNode
D. Two active NameNodes and two Standby NameNodes
You configure a Hadoop cluster with both MapReduce frameworks, MapReduce v1
(MRv1) and MapReduce v2 (MRv2/YARN). Which two MapReduce (computational)
daemons do you need to configure to run on your master nodes?
A. JobTracker
B. ResourceManager
C. ApplicationMaster
D. JournalNode
E. NodeManager
You observe that the number of spilled records from map tasks far exceeds the
number of map output records. Your child heap size is 1GB and your io.sort.mb
value is set to 100MB. How would you tune your io.sort.mb value to achieve
maximum memory to disk I/O ratio?
A. Tune io.sort.mb value until you observe that the number of spilled records equals (or is
as close to equals) the number of map output records.
B. Decrease the io.sort.mb value below 100MB.
C. Increase the io.sort.mb as high as you can, as close to 1GB as possible.
D. For 1GB child heap si[...]
What happens when a client tries to write a file to /reports/myreport.txt?
A. The file successfully writes to /users/reports/myreports/myreport.txt.
B. The client throws an exception.
C. The file successfully writes to /report/myreport.txt. The metadata for the file is managed
by the first NameNode to which the client connects.
D. The file write fails silently; no file is written, no error is reported.
Identify two features/issues that MapReduce v2 (MRv2/YARN) is designed to
address:
A. Resource pressure on the JobTracker.
B. HDFS latency.
C. Ability to run frameworks other than MapReduce, such as MPI.
D. Reduce complexity of the MapReduce APIs.
E. Single point of failure in the NameNode.
F. Standardi[...]
What is MAP REDUCE?
MapReduce is a set of programs used to access and manipulate large data sets over a
Hadoop cluster.
What is the InputSplit in MapReduce software?
An InputSplit is the slice of data to be processed by a single Mapper. It generally is of the
block si[...]
On the Master Node: NameNode, JobTracker and Secondary NameNode.
On the Slave Nodes: DataNode and TaskTracker.
But it's recommended to run the Secondary NameNode on a separate machine which has
Master Node capacity.
What are compute and storage nodes?
Hadoop can be defined in 2 ways:
Distributed Processing: MapReduce
Distributed Storage: HDFS
The NameNode holds meta info, and the DataNode holds the exact data and its MR program.
Explain the input and output data formats of the Hadoop framework?
FileInputFormat, TextInputFormat, KeyValueTextInputFormat, SequenceFileInputFormat,
SequenceFileAsTextInputFormat, wholefileformat are file formats in the Hadoop framework.
How can we control that a particular key should go to a specific reducer?
By using a custom partitioner.
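What a partitioning rule computes can be sketched in plain Java. The first method mirrors the usual hash-partitioning formula (hash code masked to non-negative, modulo the number of reducers); the second is a hypothetical custom rule that pins one key to a chosen reducer. Neither method extends Hadoop's Partitioner class; the names and the "priority" key are assumptions for illustration.

```java
// Sketch: how a partitioner maps a key to a reducer index.
class PartitionerSketch {
    // Hash partitioning: mask off the sign bit, then take the remainder.
    static int hashPartition(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Hypothetical custom rule: send one chosen key to reducer 0, hash the rest.
    static int customPartition(String key, int numReducers) {
        if (key.equals("priority")) {
            return 0;
        }
        return hashPartition(key, numReducers);
    }

    public static void main(String[] args) {
        int reducers = 4;
        System.out.println(customPartition("priority", reducers)); // prints 0
        // Other keys land on a reducer index in the range 0..3.
        System.out.println(customPartition("fox", reducers));
    }
}
```

The same key always maps to the same index, which is what guarantees that all values for a key reach one reducer.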
What is the Reducer used for?
Reducer is used to combine the multiple outputs of mapper to one.
What are the primary phases of the Reducer?
Reducer has 3 primary phases: shuffle, sort and reduce.
What happens if number of reducers are 0?
It is legal to set the number of reduce-tasks to [...]
Can I set the number of reducers to zero?
[...] can be given as [...]
A. No, Hadoop does not provide techniques for custom datatypes.
B. Yes, but only for mappers.
C. Yes, custom data types can be implemented as long as they implement the Writable
interface.
D. Yes, but only for reducers.
Answer C
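What implementing the Writable interface involves can be sketched without Hadoop on the classpath. The nested Writable interface below is a local stand-in that mirrors the two methods of org.apache.hadoop.io.Writable, write(DataOutput) and readFields(DataInput); PointWritable and roundTrip are hypothetical names used only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class WritableSketch {
    // Local stand-in mirroring the two methods of Hadoop's Writable interface.
    interface Writable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    // Hypothetical custom type: a 2-D point.
    static final class PointWritable implements Writable {
        int x;
        int y;

        PointWritable() { } // no-arg constructor so an empty instance can be filled in

        PointWritable(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public void write(DataOutput out) throws IOException {
            out.writeInt(x);
            out.writeInt(y);
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            // Fields must be read in the same order they were written.
            x = in.readInt();
            y = in.readInt();
        }
    }

    // Serializes a point to bytes and reads it back into a fresh instance.
    static PointWritable roundTrip(PointWritable p) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            p.write(new DataOutputStream(bytes));
            PointWritable copy = new PointWritable();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        PointWritable copy = roundTrip(new PointWritable(3, -7));
        System.out.println(copy.x + "," + copy.y); // prints 3,-7
    }
}
```

A type used as a key would additionally need a comparison method, as in WritableComparable.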
The Hadoop API uses basic Java types such as LongWritable, Text, IntWritable.
They have almost the same features as default java classes. What are these
writable data types optimized for?
A. Writable data types are specifically optimi[...]
A. The distributed cache is a special component on the namenode that will cache frequently
used data for faster client response. It is used during the reduce step.
B. The distributed cache is a special component on the datanode that will cache frequently
used data for faster client response. It is used during the map step.
C. The distributed cache is a component that caches java objects.
D. The distributed cache is a component that allows developers to deploy jars for
Map-Reduce processing.
Answer D
Can you run Map-Reduce jobs directly on Avro data in Hadoop?
A. Yes, Avro was specifically designed for data processing via Map-Reduce
B. Yes, but additional extensive coding is required
C. No, Avro was specifically designed for data storage only
D. Avro specifies metadata that allows easier data access. This data cannot be used as
part of map-reduce execution, rather input specification only.
Answer A
What is AVRO in Hadoop?
A. Avro is a java seriali[...]
Answer C
What are the common problems with map-side join in Hadoop?
A. The most common problem with map-side joins is introducing a high level of code
complexity. This complexity has several downsides: increased risk of bugs and performance
degradation. Developers are cautioned to rarely use map-side joins.
B. The most common problem with map-side joins is lack of the available map slots since
map-side joins require a lot of mappers.
C. The most common problems with map-side joins are out of memory exceptions on slave
nodes.
D. The most common problem with map-side join is not clearly specifying primary index in
the join. This can lead to very slow performance on large datasets.
Answer C
How can you overwrite the default input format in Hadoop?
A. In order to overwrite default input format, the Hadoop administrator has to change
default settings in config file.
B. In order to overwrite default input format, a developer has to set new input format on
job config before submitting the job to a cluster.
C. The default input format is controlled by each individual mapper and each line needs to
be parsed individually.
D. None of these answers are correct.
Answer B
What is the default input format in Hadoop?
A. The default input format is xml. Developer can specify other input formats as
appropriate if xml is not the correct input.
B. There is no default input format. The input format always should be specified.
C. The default input format is a sequence file format. The data needs to be preprocessed
before using the default input format.
D. The default input format is TextInputFormat with byte offset as a key and entire line as
a value.
Answer D
Why would a developer create a map-reduce without the reduce step in Hadoop?
A. Developers should design Map-Reduce jobs without reducers only if no reduce slots are
available on the cluster.
B. Developers should never design Map-Reduce jobs without reducers. An error will occur
upon compile.
C. There is a CPU intensive step that occurs between the map and reduce steps. Disabling
the reduce step speeds up data processing.
D. It is not possible to create a map-reduce job without at least one reduce step. A
developer may decide to limit to one reducer for debugging purposes.
Answer C
How can you disable the reduce step in Hadoop?
A. The Hadoop administrator has to set the number of the reducer slot to [...]
[...]ue to eliminate data from initial data set at reduce step
B. Reduce-side join is a technique for merging data from different sources based on a
specific key. There are no memory restrictions
C. Reduce-side join is a set of API to merge data from different sources.
D. None of these answers are correct
Answer B
What is map-side join in Hadoop?
A. Map-side join is done in the map phase and done in memory
B. Map-side join is a technique in which data is eliminated at the map step
C. Map-side join is a form of map-reduce API which joins data from different locations
D. None of these answers are correct
Answer A
How can you use binary data in MapReduce in Hadoop?
A. Binary data can be used directly by a map-reduce job. Often binary data is added to a
sequence file.
B. Binary data cannot be used by the Hadoop framework. Binary data should be converted to a
Hadoop compatible format prior to loading.
C. Binary data can be used in map-reduce only with very limited functionality. It cannot be used
as a key, for example.
D. Hadoop can freely use binary files with map-reduce jobs so long as the files have
headers
Answer A
What are map files and why are they important in Hadoop?
A. Map files are stored on the namenode and capture the metadata for all blocks on a
particular rack. This is how Hadoop is "rack aware"
B. Map files are the files that show how the data is distributed in the Hadoop cluster.
C. Map files are generated by Map-Reduce after the reduce step. They show the task
distribution during job execution
D. Map files are sorted sequence files that also have an index. The index allows fast data
look up.
Answer D
What are sequence files [...]
A. Sequence files are binary format files that are compressed and are splitable. They are
often used in high-performance map-reduce jobs
B. Sequence files are a type of the file in the Hadoop framework that allow data to be
sorted
C. Sequence files are intermediate files that are created by Hadoop after the map step
D. Both B and C are correct
Answer A
How many states does the Writable interface define in Hadoop?
A. Two
B. Four
C. Three
C. Combine
Ans A
What is the input to the Reduce function in Hadoop?
A. One key and a list of all values associated with that key.
B. One key and a list of some values associated with that key.
C. An arbitrarily si[...]
B. False
Ans B
Which MapReduce stage serves as a barrier, where all previous stages must be
completed before it may proceed?
A. Combine
B. Group (a.k.a. 'shuffle')
C. Reduce
D. Write
Ans A
Which TACC resource has support for Hadoop MapReduce?
A. Ranger
B. Longhorn
C. Lonestar
D. Spur
Ans A
Which of the following scenarios makes HDFS unavailable in Hadoop?
A. JobTracker failure
B. TaskTracker failure
C. DataNode failure
D. NameNode failure
E. Secondary NameNode failure
Answer D
Which TACC resource has support for Hadoop MapReduce in Hadoop?
A. Ranger
B. Longhorn
C. Lonestar
D. Spur
Ans A
Which MapReduce stage serves as a barrier, where all previous stages must be
completed before it may proceed in Hadoop?
A. Combine
B. Group (a.k.a. 'shuffle')
C. Reduce
D. Write
Ans A
Which of the following scenarios makes HDFS unavailable in Hadoop?
A. JobTracker failure
B. TaskTracker failure
C. DataNode failure
D. NameNode failure
E. Secondary NameNode failure
Answer D
You are running a Hadoop cluster with all monitoring facilities properly
configured. Which scenario will go undetected in Hadoop?
A. Map or reduce tasks that are stuck in an infinite loop.
B. HDFS is almost full.
C. The NameNode goes down.
D. A DataNode is disconnected from the cluster.
E. MapReduce jobs that are causing excessive memory swaps.
Answer C
Which of the following utilities allows you to create and run MapReduce jobs with
any executable or script as the mapper and/or the reducer?
A. Oo[...]oop
C. Flume
D. Hadoop Streaming
Answer D
You need a distributed, scalable data store that allows you random, realtime
read/write access to hundreds of terabytes of data. Which of the following would
you use in Hadoop?
A. Hue
B. Pig
C. Hive
D. Oo[...]
F. Flume
G. Sqoop
Answer E
Workflows expressed in Oozie can contain in Hadoop?
A. Iterative repetition of MapReduce jobs until a desired answer or state is reached.
B. Sequences of MapReduce and Pig jobs. These are limited to linear sequences of actions
with exception handlers but no forks.
C. Sequences of MapReduce jobs only; no Pig or Hive tasks or jobs. These MapReduce
sequences can be combined with forks and path joins.
D. Sequences of MapReduce and Pig. These sequences can be combined with other actions
including forks, decision points, and path joins.
Answer D
You have an employee who is a Data Analyst and is very comfortable with SQL. He
would like to run ad-hoc analysis on data in your HDFS cluster. Which of the
following is a data warehousing software built on top of Apache Hadoop that
defines a simple SQL-like [...]oop
E. Oo[...]
In a MapReduce job, you want each of your input files processed by a single map
task. How do you configure a MapReduce job so that a single map task processes
each input file regardless of how many blocks the input file occupies?
A. Increase the parameter that controls minimum split si[...]
[...]ual to the number of input files you want to process.
D. Write a custom FileInputFormat and override the method isSplitable to always return
false.
Answer B
Which of the following best describes the workings of TextInputFormat in
Hadoop?
A. Input file splits may cross line breaks. A line that crosses file splits is ignored.
B. The input file is split exactly at the line breaks, so each RecordReader will read a series
of complete lines.
C. Input file splits may cross line breaks. A line that crosses file splits is read by the
RecordReaders of both splits containing the broken line.
D. Input file splits may cross line breaks. A line that crosses file splits is read by the
RecordReader of the split that contains the end of the broken line.
E. Input file splits may cross line breaks. A line that crosses file splits is read by the
RecordReader of the split that contains the beginning of the broken line.
Answer E