Lecture 17 & 18: Hadooping With AWS EC2

Upload: anonymous-nkrjog

Post on 05-Jul-2018


  • 8/16/2019 Lecture 17 & 18 Hadooping With AWS EC2

    1/42

    Configuring First Hadoop Cluster on Amazon EC2

    Harun ELKIRAN - 5525!"

    Lecturer: Dr. Muhammad FAHIM

    Department of Computer Engineering, Istanbul S. Zaim University, Istanbul, Turkey


    Notes / Assumptions

    • How to set up a small 4-node Hadoop cluster on the Amazon EC2 cloud.

    • I am new to Hadoop, and Linux documentation on th... limited and text dense, so I try to keep my slides simple a...

    • The slides assume basic familiarity with Linux, Java and S...

    • The cluster will be set up manually to demonstrate the concepts of Hadoop. In real life there are lots of configuration management tools, such as Cloudera, Puppet, Chef etc., to manage and automate larger clusters.

    • These slides are not production ready. Real Hadoop clusters n...


    Recap: What is Hadoop?

    An open source framework for "reliable, scalable, distributed computing".

    It gives the ability to process and work with large datasets distributed across clusters of commodity hardware.

    It allows you to parallelize computation and "move processing to the data" using the MapReduce framework.


    Recap: What is Amazon EC2?

    A "cloud" web host that allows you to dynamically add and remove compute server resources as you need them, allowing you to pay for only the capacity that you need.

    It is well suited for Hadoop computation: we can bring up enormous clusters within minutes and then spin them down when finished to reduce costs.

    EC2 is quick and cost effective for experimental and learning purposes, as well as being proven as a production Hadoop ...


    EC2 Configuration

    Part 1


    Access to EC2 Instance

    Part 2


    Apache Hadoop Installation and Cluster Setup

    Part 3


    1. Installing Java

    1. $ sudo apt-get update (update the packages and dependencies)

    2. $ sudo add-apt-repository ppa:webupd8team/java (install the latest Java)

    3. $ sudo apt-get update && sudo apt-get install oracle-jdk-...

    2. Download Hadoop

    I am going to use the Hadoop 1.2.1 stable release.

    1. $ wget http://apache.mirror.gtcomm.net/hadoop/common/hadoop-1.2.1/hadoop-1.2.1.tar.gz (download Hadoop)

    2. $ tar -xzvf hadoop-1.2.1.tar.gz (untar)

    3. $ mv hadoop-1.2.1 hadoop (rename)

    3. Setup Environment Variables

    I used WinSCP to update .bashrc with the important Hadoop paths and directories:

    export HADOOP_CONF=/home/ubu...
    export HADOOP_PREFIX=/home/ub...
    # Set JAVA_HOME
    export JAVA_HOME=/usr/lib/jvm/jav...
    # Add Hadoop bin/ directory to path
    export PATH=$PATH:$HADOOP_PRE...

    To verify:

    source ~/.bashrc
    echo $HADOOP_PREFIX
    echo $HADOOP_CONF

    4. Setup Password-less SSH on Servers

    1. The master server remotely starts services on the slave nodes, which requires password-less access to the slave servers. The AWS Ubuntu server comes with the OpenSSH server pre-installed.
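    The slide does not show the commands for this step, so the following is only a minimal sketch of how password-less access is commonly arranged: generate a key pair with an empty passphrase on the master and append the public half to authorized_keys on each slave. To stay harmless it works in a temporary directory; the names KEY_DIR, slave1 and my-ec2-key.pem are placeholders, not values from the lecture.

```shell
set -e

# Temporary demo directory (assumption); on a real master node the key
# pair would normally live in ~/.ssh instead.
KEY_DIR="$(mktemp -d)"
KEY_FILE="$KEY_DIR/id_rsa"

# Generate an RSA key pair with an empty passphrase so that logins
# do not stop to ask for one.
ssh-keygen -q -t rsa -N "" -f "$KEY_FILE"

# Authorising a key means appending the PUBLIC half to authorized_keys.
# Done here against a local file; on the cluster the target is
# ~/.ssh/authorized_keys on every slave.
cat "$KEY_FILE.pub" >> "$KEY_DIR/authorized_keys"
chmod 600 "$KEY_DIR/authorized_keys"

# On the real cluster (host name and .pem file are placeholders):
#   cat ~/.ssh/id_rsa.pub | ssh -i my-ec2-key.pem ubuntu@slave1 'cat >> ~/.ssh/authorized_keys'
#   ssh ubuntu@slave1 hostname   # should now work without a password prompt
```

    Only the public key is ever copied to the slaves; the private key stays on the master.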

    5. Hadoop Cluster Setup

    In this section we need to modify the following files:

    • hadoop-env.sh: This file contains some environment variable settings used by Hadoop. You can use these to affect some aspects of Hadoop daemon behavior, such as where log files are stored, the maximum amount of heap used, etc. The only variable you should need to change in this file at this point is JAVA_HOME, which specifies the path to the Java 1.x installation used by Hadoop.

    • core-site.xml: key property fs.default.name, for namenode configuration, e.g. hdfs://namenode/

    • hdfs-site.xml: key property dfs.replication, by default 3

    • mapred-site.xml: key property mapred.job.tracker, for jobtracker configuration, e.g. jobtracker:<port>
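    As a rough illustration of the three site files listed above, the sketch below writes one property into each file using the Hadoop 1.x configuration/property/name/value layout. The host names namenode and jobtracker come from the slide's examples; the ports 8020 and 8021 and the temporary CONF_DIR are assumptions for illustration only, and write_site_file is a hypothetical helper.

```shell
set -e

# Assumption: a scratch directory stands in for the Hadoop conf/ directory.
CONF_DIR="$(mktemp -d)"

# Hypothetical helper: write_site_file <file> <property name> <value>
# Produces a Hadoop site file containing a single property.
write_site_file() {
  cat > "$CONF_DIR/$1" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>$2</name>
    <value>$3</value>
  </property>
</configuration>
EOF
}

# Ports 8020 and 8021 are illustrative choices, not taken from the slides.
write_site_file core-site.xml   fs.default.name    "hdfs://namenode:8020"
write_site_file hdfs-site.xml   dfs.replication    "3"
write_site_file mapred-site.xml mapred.job.tracker "jobtracker:8021"

# Show what was generated.
cat "$CONF_DIR/core-site.xml"
```

    On a real node the generated files would replace the ones in the Hadoop conf/ directory, and the same files need to be present on every machine in the cluster.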


    Configuring Master / Slaves
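    The slide content for this step did not survive, so this is only a sketch of the usual Hadoop 1.x convention: conf/masters names the host that runs the SecondaryNameNode, and conf/slaves lists one worker host per line (each runs a DataNode and a TaskTracker). The host names and the one-master, three-slave split are assumptions chosen to match a 4-node cluster, and CONF_DIR is a scratch directory.

```shell
set -e

# Assumption: a scratch directory stands in for the Hadoop conf/ directory.
CONF_DIR="$(mktemp -d)"

# conf/masters: host for the SecondaryNameNode (placeholder name).
printf '%s\n' namenode > "$CONF_DIR/masters"

# conf/slaves: one worker host per line (placeholder names).
printf '%s\n' slave1 slave2 slave3 > "$CONF_DIR/slaves"

# Pushing the configuration to the other nodes might then look like this
# (hosts and paths are placeholders):
#   scp "$CONF_DIR"/* ubuntu@slave1:/home/ubuntu/hadoop/conf/
```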


    Status / Running one toy example on my deployed system

    Part 4


    To quickly verify my setup, I am going to run the Hadoop pi example.
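    The exact commands are only visible in the slide screenshots, so the following is a hedged sketch of what such a check typically looks like on Hadoop 1.2.1: start HDFS and MapReduce, list the Java daemons, then submit the pi job from the examples jar shipped with the release. It assumes HDFS has already been formatted and that Hadoop lives in $HOME/hadoop unless HADOOP_PREFIX says otherwise. The run helper and the DRY_RUN switch are hypothetical additions so that the commands are printed rather than executed by default.

```shell
set -e

# Assumed install location; the slides rename the unpacked directory to "hadoop".
HADOOP_PREFIX="${HADOOP_PREFIX:-$HOME/hadoop}"
EXAMPLES_JAR="$HADOOP_PREFIX/hadoop-examples-1.2.1.jar"

# Hypothetical helper: print each command unless DRY_RUN=0 is set,
# so the sketch can be read safely on a machine without Hadoop.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

run "$HADOOP_PREFIX/bin/start-dfs.sh"      # NameNode and DataNodes
run "$HADOOP_PREFIX/bin/start-mapred.sh"   # JobTracker and TaskTrackers
run jps                                    # list the running Java daemons

# pi <number of maps> <samples per map>; both values are arbitrary here.
run "$HADOOP_PREFIX/bin/hadoop" jar "$EXAMPLES_JAR" pi 10 1000
```

    Running the script with DRY_RUN=0 on the master node would execute the commands for real; more samples per map give a closer estimate of pi at the cost of a longer job.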


    What I Have Done?

    • Set up EC2, requested machines, configured the network and password-less SSH

    • Downloaded Java and Hadoop

    • Configured MapReduce and pushed the configuration around the cluster

    • Started MapReduce

    • Compiled a MapReduce job using the Hadoop Pi example

    • Submitted the job, ran it successfully and viewed the output

    • Hopefully I can see how this model of computation would be useful for very large datasets that we wish to perform processing on.

    • I'm also sold on EC2 as a distributed, fast, cost effective platform for using Hadoop for big-data work.


    Thank You

    Department of Computer Engineering, Istanbul S. Zaim University, Istanbul, Turkey