4 hive tutorial d03


Upload: gyan-sharma

Post on 05-Mar-2016


TRANSCRIPT

Page 1: 4 Hive Tutorial d03

7/21/2019 4 Hive Tutorial d03

http://slidepdf.com/reader/full/4-hive-tutorial-d03 1/115

© www.BitBootCamp.com© www.BitBootCamp.com

HIVE

Page 2: 4 Hive Tutorial d03

Overview of Hadoop Training

Unix

Introduction to Hadoop

Hive: working with Hive, cross-tab queries with Hive

Recommendation Engine

Page 3: 4 Hive Tutorial d03

Course Objectives

How Hive augments MapReduce

How to create tables and manipulate data using Hive

Advanced features of Hive

Page 4: 4 Hive Tutorial d03

Course Chapters

Introduction to Hive

Getting Data Into Hive

Manipulating Data in Hive

Partitioning and Bucketing

Advanced Hive

Page 5: 4 Hive Tutorial d03


Introduction to Hive

Page 6: 4 Hive Tutorial d03

Motivation for Hive

Makes it easy to write MapReduce jobs

Built for non-programmers: no Java required

Page 7: 4 Hive Tutorial d03

MapReduce: The Challenge

MapReduce is written in Java

Requires a good understanding of:

Java

The MapReduce paradigm

The Hadoop API

The business problem at hand

Page 8: 4 Hive Tutorial d03

Origin of Hive

Developed at Facebook

Open source project at the Apache Software Foundation

Language based on SQL

Declarative in nature

Page 9: 4 Hive Tutorial d03

What is Hive?

Code generator: SQL statement -> MapReduce job

Converts the Hive SQL to Java MapReduce

Submits the code to the cluster

Displays the result to the user

Hive advantage: SQL is much easier than Java

Writing a query in Hive is much faster than writing the equivalent MapReduce code

A lot of people already know SQL

Page 10: 4 Hive Tutorial d03

Hive vs. Java MR comparison

SELECT * FROM TableT JOIN TableB ON (TableT.a = TableB.a);

Page 11: 4 Hive Tutorial d03

Hive is not a Relational Database

Relational Database Management System:

Thousands of simultaneous clients

Very fast response time

Support for transactions (ACID)

Support for Update statements

Hive is not an RDBMS:

It will not turn the Hadoop cluster into a database

It basically converts HiveQL to MapReduce jobs

It will take some time to execute

You will never do: SELECT * FROM tableT

(Covered later: the HDFS import command)

Page 12: 4 Hive Tutorial d03

Hive vs. Relational DB

Page 13: 4 Hive Tutorial d03

Getting Data into Hive

The Hive architecture

How to create tables in Hive

Different column types

Importing data into Hive

Multiple Hive Databases

Page 14: 4 Hive Tutorial d03


Getting Data into HIVE

Page 15: 4 Hive Tutorial d03

How Hive works

Data is stored in HDFS as folders and files

Hive layers the table definition on top of the folders and files:

Folder = table

File = content of the table

Table DDL defines the layout of the file:

Column names and types

Column and row separators (CSV, TSV, etc.)

Default column separator: the Control-A character

This can be changed via a SerDe

The Hive Metastore contains all this information

Page 16: 4 Hive Tutorial d03

Hive Metastore

The Hive Metastore is stored in a set of tables

Derby: the default (single user)

MySQL can be configured to store this metadata (multi-user)

The Metastore includes the following:

Table DDL: table name, column names, data types, etc.

Location of the data in HDFS

Internal: /user/hive/warehouse

External: any location in HDFS

Row and column separators

Storage format: used by MapReduce, governs the InputFormat and OutputFormat

Page 17: 4 Hive Tutorial d03

Submit Hive Query to Cluster

Hive interpreter: converts the HiveQL code to MapReduce jobs

Job submission: the job is submitted to the cluster

Options such as the number of mappers/reducers are sent along with it

Page 18: 4 Hive Tutorial d03

Launch Hive

There are three ways to launch Hive

Hive command line interface: run hive to get the prompt

hive>

Launch HiveQL from the command line:

hive -e "select * from tableT limit 10"

Launch HiveQL from a Script.SQL file:

hive -f Script.SQL

Hive commands must be terminated by ";"

Page 19: 4 Hive Tutorial d03

Getting Data into Hive

The Hive architecture

How to create tables in Hive

Different column types

Importing data into Hive

Multiple Hive Databases

Page 20: 4 Hive Tutorial d03

Data Table in Hive

Syntax:

CREATE TABLE t (ColName Type, ...)
[ROW FORMAT DELIMITED
FIELDS TERMINATED BY char]
[STORED AS TEXTFILE | SEQUENCEFILE]

It will create a sub-directory t under /user/hive/warehouse/ in HDFS

/user/hive/warehouse/ is the Hive warehouse directory

Page 21: 4 Hive Tutorial d03

Create Table in Detail

CREATE TABLE t (ColName Type, ...)

Lists the name of the table

Lists the names of the columns and their data types

ROW FORMAT DELIMITED

Tells Hive that the data fields are delimited by some character

FIELDS TERMINATED BY char

Specifies the delimiter character (",", "\t")

Default is the Control-A character, '\001'

STORED AS [TEXTFILE | SEQUENCEFILE]

TextFile: plain text file layout (the default)

SequenceFile: the Hadoop binary file layout

Page 22: 4 Hive Tutorial d03

Help on Table Definition

Simple view:

desc t;

Detailed version:

desc extended t;

This information is displayed from the Hive metadata

Page 23: 4 Hive Tutorial d03

External table

If the data is stored outside the warehouse folder (/user/hive/warehouse), the table is called an external table

Data is still stored in HDFS

Example:

CREATE EXTERNAL TABLE external_table
(c1 STRING, c2 ARRAY<STRING>, c3 INT)
ROW FORMAT DELIMITED
COLLECTION ITEMS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/user/menish/external_table';

Page 24: 4 Hive Tutorial d03

Deleting tables

Tables can be dropped with:

DROP TABLE t;

Internal table:

All metadata is deleted

All data is lost

External table:

All metadata is lost

External data is not deleted

The data directory is not deleted

Page 25: 4 Hive Tutorial d03

Altering the Table Definition

Change the table definition:

Change the table location

Change column definitions

Rename the table / update properties

Add and remove partitions (more on partitions later)

Examples:

ALTER TABLE t SET LOCATION 'new_location';
ALTER TABLE t ADD COLUMNS (col_name type, ...);
ALTER TABLE t RENAME TO x;
ALTER TABLE t CHANGE old_name new_name new_type;
ALTER TABLE t DROP PARTITION (part_col='val');
ALTER TABLE t ADD PARTITION (part_col='val');

Page 26: 4 Hive Tutorial d03

Getting Data into Hive

The Hive architecture

How to create tables in Hive

Different column types

Importing data into Hive

Multiple Hive Databases

Page 27: 4 Hive Tutorial d03

Data Types in Hive

Hive data types map to Java primitive types

Standard data types

Integers:

TINYINT - 1 byte integer

SMALLINT - 2 byte integer

INT - 4 byte integer

BIGINT - 8 byte integer

Boolean type:

BOOLEAN - TRUE/FALSE

Floating point numbers:

FLOAT - single precision

DOUBLE - double precision

String type:

STRING - sequence of characters in a specified character set

Page 28: 4 Hive Tutorial d03

Data Types: What's Missing

No default type for date or time

Workaround: leverage the String type

Built-in functions are available to manage dates and times

No binary column type (Blob, etc.)

Cannot store documents as binary; leverage some sort of document database for this purpose

Hive is built to manage text data

Note: the overall file can be binary; we just cannot mix text and binary data in one table

Hive is a continuous build; future releases will have these

ETA TBD
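
As a rough sketch of the string workaround described above: dates can be kept in a STRING column in yyyy-MM-dd form and handled with Hive's built-in date functions. The table name events and its columns are hypothetical, chosen only for illustration.

```sql
-- Hypothetical table: the date is stored as a STRING such as '2014-03-15'.
CREATE TABLE events (event_id INT, event_date STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;

-- Built-in date functions accept yyyy-MM-dd strings directly.
SELECT event_id,
       year(event_date)                         AS event_year,
       month(event_date)                        AS event_month,
       datediff('2014-12-31', event_date)       AS days_to_year_end,
       unix_timestamp(event_date, 'yyyy-MM-dd') AS epoch_seconds
FROM events
WHERE event_date >= '2014-01-01';  -- plain string comparison; works because the format sorts chronologically
```

Keeping the year first in the stored string is what makes the WHERE comparison behave like a date comparison.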

Page 29: 4 Hive Tutorial d03

Example of a Hive Table

Example:

CREATE TABLE ngram (word STRING, year INT, wfreq INT, bfreq INT)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;

Each command in Hive needs to be terminated by ";"

ROW FORMAT DELIMITED: tells Hive to expect one record per line

Lines are terminated by "\n"

FIELDS TERMINATED BY: columns are separated by a tab, "\t"

Page 30: 4 Hive Tutorial d03

Complex Data Types

The following complex data types are supported:

Map: equivalent to a hash table (key/value pairs)

Array: list of elements

Struct: user-defined structure

These are used to store Java objects or JSON objects

Page 31: 4 Hive Tutorial d03

Map Data Type

Map:

Key-value pairs

Values are retrieved by referencing the key

Usage:

MAP<primitiveType, anyType>

Examples:

MAP<STRING, STRING>

MAP<INT, STRUCT>

User login and password:

Userlogin as a Map type; userid is the key (an int) and pass is the value (a string)

CREATE TABLE Password (
Userlogin MAP<INT, STRING>, ...)

Page 32: 4 Hive Tutorial d03

Array Data Type

Array:

Elements of the same type, as a list

Elements are accessed by index

Usage:

ARRAY<anyType>

Example: Emailaddr ARRAY<STRING>

Details: list of user email addresses

["FirstName@domain.com", "FirstName.LastName@domain.com"]

FirstName@domain.com is accessed by calling Emailaddr[0] (array indexes start at 0)

CREATE TABLE email (
emailaddr ARRAY<STRING>, ...)

Page 33: 4 Hive Tutorial d03

Struct

Struct data type:

Mix of data elements

Elements are accessed by dot (".") notation

Usage:

STRUCT<fieldName:anyType, ...>

Example: user STRUCT<userid:INT, name:STRING, email:STRING>

Example: Employee table with details

CREATE TABLE employee (
Empdesc STRUCT<HighSchool:STRING, College:STRING>,
...)

Page 34: 4 Hive Tutorial d03

Delimiters for Complex Data Types

Array and Struct:

COLLECTION ITEMS TERMINATED BY char

CREATE TABLE t1
(c1 STRING, c2 ARRAY<STRING>, c3 INT)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
COLLECTION ITEMS TERMINATED BY ','
STORED AS TEXTFILE;

The file will contain the following data (fields separated by tabs):

User1 "1","2","3" 100
User2 a,b,c 192
User3 x,y,z 2000

Map (key-value pairs):

MAP KEYS TERMINATED BY char
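
To tie the three complex types and their delimiters together, here is a minimal sketch. The table user_profile, its columns, and the key 'site1' are hypothetical names used only for illustration.

```sql
-- Hypothetical table combining ARRAY, MAP and STRUCT columns.
CREATE TABLE user_profile (
  name   STRING,
  emails ARRAY<STRING>,
  logins MAP<STRING, STRING>,
  school STRUCT<highschool:STRING, college:STRING>
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
COLLECTION ITEMS TERMINATED BY ','   -- separates array items, map entries and struct fields
MAP KEYS TERMINATED BY ':'           -- separates each map key from its value
STORED AS TEXTFILE;

-- A matching input line (tabs between the four fields):
-- alice    a@x.com,b@x.com    site1:pw1,site2:pw2    Central High,State College

SELECT name,
       emails[0]       AS first_email,   -- array: index, starting at 0
       logins['site1'] AS site1_login,   -- map: lookup by key
       school.college  AS college        -- struct: dot notation
FROM user_profile;
```

Note that one COLLECTION ITEMS delimiter is shared by all three types in a table, so the characters chosen must not appear inside the data values.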

Page 35: 4 Hive Tutorial d03

Getting Data into Hive

The Hive architecture

How to create tables in Hive

Different column types

Importing data into Hive

Multiple Hive Databases

Page 36: 4 Hive Tutorial d03

Loading Data in Hive

To load data in Hive, just move the data to the corresponding directory in HDFS

Multiple ways to load data:

hadoop fs -mv /path/to/file /user/hive/warehouse/tablename/ (moves a file that is already in HDFS)

hdfs dfs -copyFromLocal /path/to/localfile /user/hive/warehouse/tablename/

hive> LOAD DATA INPATH '/path/to/dir/in/hdfs' INTO TABLE t;

hive> LOAD DATA LOCAL INPATH '/path/to/dir/local' INTO TABLE t;

Overwrite the files:

LOAD DATA INPATH 'path/to/dir' OVERWRITE INTO TABLE t;

Fill a table from a query on the fly (t2 must already exist):

INSERT OVERWRITE TABLE t2 SELECT * FROM t1;

Page 37: 4 Hive Tutorial d03

Getting Data out of Hive

INSERT OVERWRITE DIRECTORY

Outputs query results to an HDFS directory

The LOCAL option brings the data to a local folder

Data is serialized as text

Columns are separated by the Ctrl-A character, \001

Rows are separated by the newline character

Good for getting large amounts of data

Example:

INSERT OVERWRITE LOCAL DIRECTORY 'path/to/local/file.dat' SELECT * FROM t;

Page 38: 4 Hive Tutorial d03

Things to watch out for

Drop table:

All data is lost, with no way to get it back

There is no rollback or undo

External table:

It is just a pointer to an HDFS folder outside of the warehouse folder

Very helpful for existing data: no need to move the data around

Dropping is safe, as only the metadata is deleted

Page 39: 4 Hive Tutorial d03

Loading data from Existing databases

We all need to import data from multiple databases into Hadoop

Use Sqoop: open source code

MySQL -> Sqoop -> Hadoop

Via JDBC

It will connect to any database, as long as you have the JDBC driver

Page 40: 4 Hive Tutorial d03

Sqoop Concepts

Launches MapReduce jobs to load data

Multiple connections to the database to pull data

Default: 4 connections

Each connection works in parallel, so data is imported faster

Each connection works on a different part of the data

JDBC is just like an ODBC connection

Creates the mapping file, from source to destination, based on the source metadata

Can create Hive tables without any specific config

Page 41: 4 Hive Tutorial d03

Sqoop Syntax

sqoop import \
--username user \
--password pass \
--connect jdbc:mysql://dbserver.example.com/db \
--hive-import \
--fields-terminated-by '\t' \
--table t1

Page 42: 4 Hive Tutorial d03

Getting Data into Hive

The Hive architecture

How to create tables in Hive

Different column types

Importing data into Hive

Multiple Hive Databases

Page 43: 4 Hive Tutorial d03

Multiple Databases

The default database name is default

SHOW DATABASES: lists the databases in the system

CREATE DATABASE db: creates a new database in the warehouse directory

USE db

SHOW TABLES

SHOW TABLES FROM db

Page 44: 4 Hive Tutorial d03

Course Chapters

Getting Data Into Hive

Manipulating Data in Hive

Partitioning and Bucketing

Advanced Hive

Page 45: 4 Hive Tutorial d03

Manipulating Data in Hive

Select Statement

Joins

Storing results in HDFS or local

Basic Functions

Sub-Advanced Hive

Page 46: 4 Hive Tutorial d03

Objectives

In this chapter you will learn how to use Hive to query data on Hadoop; all the jobs will launch MapReduce jobs in some capacity

Use select statements

Joins

Store results in HDFS

Use default functions

Page 47: 4 Hive Tutorial d03

HiveQL

Subset of the SQL-92 standard

Supported: Select, Join, aggregates and subqueries

No support for Update or Delete

Hive has some extensions:

Partitioning

Sampling

Complex data types (Array, Map, Struct), as we saw before

User-defined functions: many languages are supported

Multi-table insert: do more with one command

Page 48: 4 Hive Tutorial d03

Basic Select Syntax

Select syntax:

SELECT exp1, exp2, exp3, ... FROM tableT

An expression can be a column name, a function, or a custom function

FROM is required

Hive is not case sensitive

Please use the best coding practice: camel case or upper case

Page 49: 4 Hive Tutorial d03

Aliases in Select

Aliases are supported in Hive

No AS keyword needed:

SELECT exp1, exp2, ... FROM tablename t

Used if we have common names across tables:

SELECT t.exp1, t.exp2 FROM tablename t

Page 50: 4 Hive Tutorial d03

Filter: Limiting the Number of Records

Filter the results by using a WHERE clause:

SELECT exp1, exp2, exp3, ... FROM tableT
WHERE condition

The condition is any Boolean expression

Conditions can be combined with AND/OR and parentheses

LIMIT clause (same as TOP):

SELECT exp1, exp2, exp3, ... FROM tableT
WHERE condition
LIMIT n
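
A concrete instance of the WHERE and LIMIT pattern might look like the following sketch. It borrows the rest table with state and restid columns used elsewhere in this deck; the literal values are made up for illustration.

```sql
-- Combine conditions with AND / OR and parentheses, then cap the row count.
SELECT state, restid
FROM rest
WHERE (state = 'NY' OR state = 'NJ')
  AND restid > 100
LIMIT 10;
```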

Page 51: 4 Hive Tutorial d03

Order by, Sort by, etc.

There are multiple ways to sort, due to the parallel nature of Hadoop

Order By:

Overall sort across all the mappers

A single reducer is used to sort the data

Can be very slow; please use the LIMIT clause

On big datasets, it can blow up a single node

Sort By:

Sorting is only local to a given reducer

Uses multiple reducers to sort the data

The sort is local to each reducer, hence ordering is guaranteed only per reducer

May result in a partially ordered set

Page 52: 4 Hive Tutorial d03

Distribute by

Distributes the data based on the key: rows with the same key go to the same reducer

No guarantee on clustering or sorting properties

Useful if data needs to be partitioned for a given reducer

For example: SELECT c1, c2 FROM t DISTRIBUTE BY c1

Table 1: (1, a), (2, b), (3, c), (1, d), (3, e), (4, f)

Reducer 1: (1, a), (3, e), (3, c), (1, d)

Reducer 2: (2, b), (4, f)

Final output: (1, a), (3, e), (3, c), (1, d), (2, b), (4, f)

Page 53: 4 Hive Tutorial d03

Cluster by

Distribute by + Sort by in one command

Rows with the same key are distributed to the same reducer

Data is sorted per reducer

Example: SELECT c1, c2 FROM t CLUSTER BY c1

Table 1: (1, a), (2, b), (3, c), (1, d), (3, e), (4, f)

Reducer 1 receives: (1, a), (3, e), (3, c), (1, d)

Reducer 2 receives: (2, b), (4, f)

Final output: (1, a), (1, d), (3, c), (3, e), (2, b), (4, f)
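
The four ordering clauses can be compared side by side. This is a sketch on the same schematic table t with columns c1 and c2; how rows are spread over reducers depends on the cluster configuration, so no output is shown.

```sql
-- ORDER BY: one total order, produced by a single reducer (pair it with LIMIT).
SELECT c1, c2 FROM t ORDER BY c1 LIMIT 100;

-- SORT BY: each reducer sorts only its own rows; the overall result is partially ordered.
SELECT c1, c2 FROM t SORT BY c1;

-- DISTRIBUTE BY: rows with the same c1 land on the same reducer, unsorted.
SELECT c1, c2 FROM t DISTRIBUTE BY c1;

-- DISTRIBUTE BY + SORT BY on the same column ...
SELECT c1, c2 FROM t DISTRIBUTE BY c1 SORT BY c1;

-- ... is what CLUSTER BY abbreviates.
SELECT c1, c2 FROM t CLUSTER BY c1;
```

Spelling out DISTRIBUTE BY and SORT BY separately is still useful when the two should use different columns, which CLUSTER BY cannot express.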

Page 54: 4 Hive Tutorial d03

Group by

Group by is used to aggregate data:

SELECT state, count(DISTINCT restid)
FROM rest
GROUP BY state

Multiple aggregations in one statement:

SELECT state, count(DISTINCT restid), count(*) cnt
FROM rest
GROUP BY state

Page 55: 4 Hive Tutorial d03

Manipulating Data in Hive

Select Statement

Joins

Storing results in HDFS or local

Basic Functions

Sub-Advanced Hive

Page 56: 4 Hive Tutorial d03

Joining tables

Used to join tables

Supports:

Inner Join

Left Outer Join

Right Outer Join

Full Outer Join

Not all conditions are supported:

a.id = b.id is supported

a.id <> b.id is not supported

Page 57: 4 Hive Tutorial d03

Syntax

Hive uses the fully expressed version of the syntax:

SELECT col1 FROM t1 INNER JOIN t2 ON t1.col1 = t2.col2

Shortcut versions of joins are not supported:

SELECT col1 FROM t1, t2 WHERE ...
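
As an illustration of the fully expressed join syntax, here is a sketch with two hypothetical tables, customers(id, name) and orders(order_id, customer_id, amount). The equality condition goes in the ON clause; any other filter goes in WHERE.

```sql
-- Keep every customer; amount is NULL for customers without orders.
SELECT c.name, o.amount
FROM customers c
LEFT OUTER JOIN orders o ON (c.id = o.customer_id);

-- Inner join on equality, with a non-equality filter applied afterwards.
SELECT c.name, o.amount
FROM customers c
JOIN orders o ON (c.id = o.customer_id)
WHERE o.amount > 100;
```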

Page 58: 4 Hive Tutorial d03

Joins

Inner Join

Left Outer Join

Right Outer Join

Full Outer Join

[Figure: diagram of the four join types, with null marking unmatched rows]

Page 59: 4 Hive Tutorial d03

Manipulating Data in Hive

Select Statement

Joins

Storing results in HDFS or local

Basic Functions

Multiple Hive Databases

Page 60: 4 Hive Tutorial d03

Outputting data to HDFS

To dump data on HDFS with a given format:

Create an external table

Use an Insert statement to overwrite the table

CREATE EXTERNAL TABLE T (col1, col2, ...)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/path/to/hdfs';

INSERT OVERWRITE TABLE T SELECT * FROM ...

Page 61: 4 Hive Tutorial d03

Multi-table insert

A select statement can run for a long time

Can we do some processing in parallel?

FROM (SELECT col1, col2, col3 FROM ...) alias
INSERT OVERWRITE TABLE t1 SELECT col1
INSERT OVERWRITE TABLE t2 SELECT count(*)

We have to define the structure of t1 and t2 beforehand

Page 62: 4 Hive Tutorial d03

Manipulating Data in Hive

Select Statement

Joins

Storing results in HDFS or local

Basic Functions

Sub-Advanced Hive

Page 63: 4 Hive Tutorial d03

Hive Functions

Built-in functions:

Math

Date

Conditional

String

Aggregate

There is very powerful support for user-defined functions

Functions can be written in any language

Used to hide the business logic

Leverages Thrift to make this magic happen
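
A small sketch of built-in functions from several of these categories used in ordinary queries. The table people(name STRING, state STRING, score DOUBLE) is hypothetical.

```sql
-- String and math functions applied row by row.
SELECT upper(trim(name))            AS clean_name,
       length(name)                 AS name_len,
       concat(name, '@example.com') AS email,
       substr(name, 1, 3)           AS prefix,      -- positions start at 1
       ceil(sqrt(score))            AS rounded_root
FROM people;

-- Aggregate functions computed per group.
SELECT state, min(score), max(score), sum(score), stddev_pop(score)
FROM people
GROUP BY state;
```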

Page 64: 4 Hive Tutorial d03

Math Functions

Standard math functions: rand(), log(), sqrt(), ceil(), ...

String functions: length(), concat(), substring(), upper(), lower(), trim(), etc.

Date functions: unix_timestamp(), from_unixtime(), year(), month(), datediff(), ...

(Use the YYYY-MM-DD format)

Aggregate functions: min(), max(), sum(), stddev_pop(), stddev_samp(), etc.

Use GROUP BY to aggregate per group; without it, the aggregate runs over the whole table

Page 65: 4 Hive Tutorial d03

Manipulating Data in Hive

Select Statement

Joins

Storing results in HDFS or local

Basic Functions

Sub-Advanced Hive

Page 66: 4 Hive Tutorial d03

Creating tables from existing data

Create a table from a select statement:

CREATE TABLE tablename AS
SELECT col1, col2, col3 FROM existingtable

No need to define the metadata for tablename

Column definitions will be inherited from the existing table

Page 67: 4 Hive Tutorial d03

Sub Query

Hive supports subqueries:

SELECT col1 FROM (SELECT col11 + col22 AS col1 FROM t1) t2

The sub-query table must be given a name

One can have as many nested queries as possible

Advice: do not use them, as they get very hard to debug

Page 68: 4 Hive Tutorial d03

View

Syntax:

CREATE VIEW v (col1, col2, ...)
AS SELECT col1, col2 ...

Column names in v are optional; it will take them from the select statement

Views are not materialized; only the definition is stored

Cannot be used to insert data, etc.

Order by and Limit clauses are supported

To delete: DROP VIEW v
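
A minimal end-to-end sketch of a view's life cycle, reusing the rest table from the Group by slides. The view name state_counts and the threshold are made up for illustration.

```sql
-- Define the view; column names are taken from the SELECT.
CREATE VIEW state_counts AS
SELECT state, count(*) AS cnt
FROM rest
GROUP BY state;

-- Query it like a table; the underlying query runs each time.
SELECT state, cnt FROM state_counts WHERE cnt > 100;

-- Remove the definition; the data in rest is untouched.
DROP VIEW state_counts;
```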

Page 69: 4 Hive Tutorial d03

Union All

Combines data from multiple tables

The names of the columns in the tables must be identical

Only Union All is supported

Union is not supported, as it would do the dedupe

SELECT col1 FROM table1
UNION ALL
SELECT col12 AS col1 FROM table2

Page 70: 4 Hive Tutorial d03

Map-side vs. Reduce-side Joins

There are two types of joins

Map-side join:

The join happens on the mapper

Much faster, but only works when one table is small

Need to give a hint to Hive to make this work:

SELECT /*+ MAPJOIN(b) */ count(*) FROM
a JOIN b ON (a.a1 = b.a1)

b is a small table

Alternatively, SET hive.auto.convert.join=true; lets Hive convert the join automatically

Reduce-side join:

The join happens on the reducer

All the data is shipped all over the network

Can be very slow

Page 71: 4 Hive Tutorial d03

Map Join: What's Not Supported

The following is not supported:

Union followed by a MapJoin

Lateral View followed by a MapJoin

Reduce Sink (Group By / Join / Sort By / Cluster By / Distribute By) followed by a MapJoin

MapJoin followed by Union

MapJoin followed by Join

MapJoin followed by MapJoin

Page 72: 4 Hive Tutorial d03

Reduce Side Join

Reduce-side join:

Will do a full outer join

It will be converted to an inner join at the reducer

Each mapper sees a slice of the data

The mapper sees the join key, and only those pairs are emitted

The reducer joins the data

It is a multi-step MapReduce job

Good:

Parallel join that works at scale

Bad:

Data is shipped all over the network

Wasted cycles where few matches are possible

Page 73: 4 Hive Tutorial d03

Fastest Hive Query

In order of priority (fastest first):

Metadata only: DESCRIBE t;

HDFS read: SELECT * FROM t LIMIT 10;

Map only: SELECT * FROM f WHERE col1 = "value"

Reduce: SELECT count(*) FROM t

Multiple MapReduce: SELECT * FROM t JOIN t2 ON (t.a = t2.a) SORT BY t.a
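
One way to see which of these categories a query falls into is Hive's EXPLAIN statement, which prints the plan without running the query. This is a sketch on the same schematic tables t and t2; the plan text varies by Hive version, so it is not reproduced here.

```sql
-- Prints the stage dependencies and the operators of each stage.
EXPLAIN SELECT count(*) FROM t;

-- A join followed by a sort typically shows more stages than the count above.
EXPLAIN SELECT * FROM t JOIN t2 ON (t.a = t2.a) SORT BY t.a;
```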

Page 74: 4 Hive Tutorial d03

Course Chapters

Getting Data Into Hive

Manipulating Data with Hive

Partitioning and Bucketing

Advanced Hive

Page 75: 4 Hive Tutorial d03

Partitioning and Bucketing

Partitioning Data

Bucketing Data

Page 76: 4 Hive Tutorial d03

Partitioning Data

Partitioning data is equivalent to a horizontal split of the data, based on a column value

Data is stored in sub-directories of the main table

Works like an index on the data

Example: WHERE month='Jan' only reads the Jan folder

Both static and dynamic partitions are supported

Log data -> Hive database, partitioned by load date:

/user/hive/warehouse/log/Jan

/user/hive/warehouse/log/Feb

/user/hive/warehouse/log/March

Page 77: 4 Hive Tutorial d03

Static Partition Syntax

Partition by syntax:

CREATE TABLE t (col1 INT, ...)
PARTITIONED BY (columnName datatype)
ROW FORMAT DELIMITED ...

Example:

CREATE TABLE Log (UserID INT, Httpstring STRING)
PARTITIONED BY (loaddate STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t';

Note:

The partition column is a "virtual column"

Its data does not exist in the incoming data; it is specified by the user

It acts like a real column in the final table

5tatic ,artition $xamp+e

Syntax to load data

Load data inpath 'path/to/table'

Into table t

Partition ( col = value )

Example

Load Data inpath '/user/menish/log'

Into table Log

Partition ( loaddate = '01-01-2014' );

File location

/usr/hive/warehouse/log/loaddate=01-01-2014

Dynamic Partition

If the partition-by data already exists in the staging table

From Stagingtable s

Insert overwrite table t Partition ( partitioncol )

select s.col1, s.col2, s.col3, s.partitioncol;

Partitions are automatically created, based on the values in the column

New partitions will be created

Old partitions will be overwritten

Set the following properties to enable dynamic partitions

Set hive.exec.dynamic.partition=true;

Set hive.exec.max.created.files=100000;
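As a rough illustration of what dynamic partitioning does with the staging rows, the sketch below groups rows by their last field, which plays the role of the partition column. The function name and the sample rows are made up for this example; Hive does the equivalent inside its own MapReduce jobs.

```python
from collections import defaultdict


def split_by_partition(rows):
    """Group rows by their last field, mimicking dynamic partitioning.

    Each row is a tuple whose final element is the partition value; the
    remaining elements are the regular columns stored in that partition.
    """
    partitions = defaultdict(list)
    for *columns, partition_value in rows:
        partitions[partition_value].append(tuple(columns))
    return dict(partitions)


if __name__ == "__main__":
    # Hypothetical staging rows: (UserID, Httpstring, loaddate)
    staging = [
        (1, "GET /a", "01-01-2014"),
        (2, "GET /b", "01-02-2014"),
        (3, "GET /c", "01-01-2014"),
    ]
    for value, rows in split_by_partition(staging).items():
        print(f"loaddate={value}: {rows}")
```

One partition appears per distinct value, which is why a high-cardinality column produces a very large number of folders.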

Dynamic Partition Control

Note: If the partition column has many values, that many distinct subfolders will be created

Use the following properties to limit runaway code

hive.exec.max.dynamic.partitions.pernode

Max number of dynamic partitions to be created per node

Default 100

hive.exec.max.dynamic.partitions

Total number of partitions created per HiveQL statement

Default 1000

hive.exec.max.created.files

Max number of files created by mappers and reducers

Default 100,000

Watch out for temp data created; it can blow up the cluster!

Sub-Partition

Table can contain sub-partitions

Year

Month

Create table t ( colspecification )

Partitioned by ( CreateYear String, CreateMonth String )

CreateMonth will be the subdirectory under CreateYear

The dynamic partition columns must be at the end of the select list

Alter Table command can be used to drop or add partitions
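The nesting of sub-partitions can be sketched as a path builder. This assumes the column=value folder naming shown for the static partition file location; `partition_path` and the table directory used in the demo are hypothetical.

```python
def partition_path(table_dir, partition_spec):
    """Build the directory for an ordered list of (column, value) pairs.

    The order of the pairs is the order of the partition columns, so the
    second column becomes a sub-directory of the first.
    """
    parts = [f"{column}={value}" for column, value in partition_spec]
    return "/".join([table_dir.rstrip("/")] + parts)


if __name__ == "__main__":
    spec = [("createyear", "2014"), ("createmonth", "01")]
    print(partition_path("/usr/hive/warehouse/t", spec))
```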

Bucketing Data

Similar to partitioning data

Data is split by the hash value of the clustered-by column

Effective for even distribution of data across the nodes/folders

Can be used to sample the data

If we need a random sample of the data to work on

Bucketed Table Syntax

Create table t ( colspecification )

Clustered by ( col ) into n Buckets

Note

Since hash values are used to distribute the data, please ensure that the clustered-by column has an even distribution of data

Insert data in bucketed table

First insert the data in the staging table

Hive> set mapred.reduce.tasks=<number-of-buckets>;

Hive> set hive.enforce.bucketing=true;

Hive> Insert overwrite table finalbuckettable

Select * from stagingtable;
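The idea of hashing a column value into one of n buckets, and why a skewed column gives uneven buckets, can be sketched as follows. This is an illustration only: `zlib.crc32` is a stand-in hash picked because it is deterministic, not the hash function Hive uses, and both helper names are hypothetical.

```python
import zlib
from collections import Counter


def bucket_for(value, num_buckets):
    """Assign a value to a bucket: hash the value, then take it modulo n.

    zlib.crc32 is a stand-in hash chosen for being deterministic across
    runs; it is not the hash function Hive uses.
    """
    digest = zlib.crc32(str(value).encode("utf-8"))
    return digest % num_buckets


def bucket_sizes(values, num_buckets):
    """Count how many values land in each bucket, to check for skew."""
    counts = Counter(bucket_for(v, num_buckets) for v in values)
    return [counts.get(b, 0) for b in range(num_buckets)]


if __name__ == "__main__":
    print(bucket_sizes(range(1000), 4))      # many distinct values
    print(bucket_sizes(["same"] * 1000, 4))  # one value: a single full bucket
```

Equal values always hash to the same bucket, so a column dominated by one value fills one bucket and leaves the others nearly empty.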

Bucketing at High Level

[Diagram: Hive runs "Insert overwrite table finalbuckettable Select * from stagingtable" as Map() tasks and Reduce() tasks; hashing assigns the rows to Bucket1, Bucket2, and Bucket3]

Sampling Data

To sample the data, use the following

Select * from bucketedtable

TableSample ( bucket 1 out of 5 on col );

If the table has 5 buckets, it will return the data from bucket 1

If the table has 20 buckets, it will return the data from buckets 1, 6, 11, 16

We do not need to bucket the table to get a sample; we can use some other algorithm

However, without bucketing, a full table scan will be needed
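The bucket arithmetic behind the two cases above can be checked with a short sketch. The helper name is hypothetical, and it only covers tables whose bucket count is a multiple of the "out of" value, which is true of both cases described here.

```python
def sampled_buckets(x, y, total_buckets):
    """Return the 1-based buckets read by TABLESAMPLE(BUCKET x OUT OF y).

    Assumes total_buckets is a multiple of y, as in both cases described
    above; other combinations are outside this sketch.
    """
    if total_buckets % y != 0:
        raise ValueError("sketch only covers total_buckets divisible by y")
    return list(range(x, total_buckets + 1, y))


if __name__ == "__main__":
    print(sampled_buckets(1, 5, 5))
    print(sampled_buckets(1, 5, 20))
```

Every y-th bucket starting at x is read, so the sample is always about 1/y of the table.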

Advanced Hive

Hive Variables

Hive Command Line Interface

Thrift and Hive

Transform

User defined functions

SerDe

Hive Variables

Variable in the script

Hive> set varname=value;

Select * from table where col = ${hiveconf:varname};

Variable from outside of the script

hive --hiveconf varname=value

hive --hiveconf year=2014 -e \

'Load data inpath "/tmp/${hiveconf:year}" into table log partition (loaddate="${hiveconf:year}")'

Note: \ at the end of a line in Unix means the command continues on the next line
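Variable substitution is plain text replacement done before the query runs. The sketch below imitates it for the hiveconf namespace only; the `substitute` helper is hypothetical, and leaving unknown names untouched is a choice made for this example rather than a statement about Hive's behaviour.

```python
import re

# Matches ${hiveconf:name}; only the hiveconf namespace is handled here.
_PATTERN = re.compile(r"\$\{hiveconf:([A-Za-z_][A-Za-z0-9_.]*)\}")


def substitute(query, hiveconf):
    """Replace each ${hiveconf:name} with its value from the dictionary.

    Names missing from the dictionary are left as-is (a sketch choice).
    """
    def replace(match):
        name = match.group(1)
        return str(hiveconf.get(name, match.group(0)))

    return _PATTERN.sub(replace, query)


if __name__ == "__main__":
    print(substitute("select * from log where yr = ${hiveconf:year}", {"year": 2014}))
```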

Hive Command Line Interface (CLI)

Hive has a command line interface to run queries in interactive mode

The following are the command line options

Interactive Mode

Hive runs in interactive mode when started without the options

-e "Inline Queries"

-f "Code file"

Comments: --

All code lines need to end with ;

Command <query>; runs the query

Exit; or Quit; exits the interactive mode

Set lists the variables in the Hive environment

Set -v lists all possible standard variables in the system

Add file <value> adds the list of resources to the environment

List <File|Jar|archive> lists the resources of that type

Dfs executes a dfs command from Hive

Adding Resources to Hive

Hive can add resources to the session

Any local file

All the added resources will be distributed to all the Hadoop nodes

Add { file | Jar | Archive } <filePath>*

List { file | Jar | Archive } <filePath>*

Delete { file | Jar | Archive } <filePath>*

Example

Hive> add file /mycode/code.py;

Hive> list files;

/mycode/code.py

What is Apache Thrift?

Thrift is a glue between many languages

RPCs (Remote Procedure Calls) are used to call functions from other languages

Any language that supports RPC can be used with Thrift

Developed at Facebook

Cross-language serialization with lower overhead

Allows for the use of other languages

C++, Java, Python, PHP, Ruby, C#, Perl, JavaScript, etc.

Allows use of definition files

Inner Workings of Thrift

Create a *.thrift file to declare objects and procedures

These will be used to communicate between Hive and other languages

Execute the Thrift tool

Generate the Thrift platform code for your language of choice

Create the client-server application

Thrift will create transport classes and define objects and functions

Run the Hive Server

Note: Complex, and needs a lot of hacking to make it work

Using Transform

Hive allows the user to use scripting languages to modify the data

Can use any scripting language

Syntax

Hive> add file /mycode/script.pl;

Hive> Insert overwrite table result

Select transform(t.*) using '/mycode/script.pl' as (col1, col2)

From ( select * … ) t;

Transform

Input to Script

Data is received as tab-separated values

Output of Script

Data is emitted as tab-separated values

Script can be built in any language, so long as the cluster has the parser/interpreter for it

Add command will move the script to all the nodes on the cluster
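A transform script therefore reduces to a loop that reads tab-separated rows and writes tab-separated rows. The sketch below shows that shape; swapping the first two fields is only a placeholder transformation, and the function names are hypothetical.

```python
import io


def transform_line(line):
    """Turn one tab-separated input row into one tab-separated output row.

    As a placeholder transformation, the first two fields are swapped.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) >= 2:
        fields[0], fields[1] = fields[1], fields[0]
    return "\t".join(fields)


def run(source, sink):
    """Apply transform_line to every row of source and write it to sink."""
    for row in source:
        sink.write(transform_line(row) + "\n")


if __name__ == "__main__":
    # Demo on in-memory data; a deployed script would instead call
    # run(sys.stdin, sys.stdout) so Hive can stream rows through it.
    out = io.StringIO()
    run(io.StringIO("a\tb\n1\t2\t3\n"), out)
    print(out.getvalue(), end="")
```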

Custom Map and Reduce Scripts

Transform allows for custom map and reduce scripts

From (

  From users

  Map users.uid, users.date

  using 'mapsideScript'

  as dt, id

  cluster by dt ) mapoutput

Insert overwrite table usersreduce

  reduce mapoutput.dt, mapoutput.id

  using 'reducescript'

  as date, count;

Output type of transform

By default, the output fields will be converted to String and delimited by '\t'

Null values will be converted to the literal \N

The default can be changed by Row Format

Select Transform( … )

Using 'SomeScript'

As Col1, Col2

Select Transform( … )

Using 'SomeScript'

As (Col1 int, col2 int)
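The difference between the two forms is whether each tab-delimited field stays a string or is cast to a declared type. A minimal sketch of that reading step, with a hypothetical `parse_row` helper and the \N convention for nulls:

```python
def parse_row(line, types):
    """Split a tab-delimited row and cast each field; \\N becomes None.

    types is a list of callables such as str or int, one per column.
    """
    fields = line.rstrip("\n").split("\t")
    return [None if f == r"\N" else cast(f) for f, cast in zip(fields, types)]


if __name__ == "__main__":
    print(parse_row("1\t2\n", [str, str]))  # untyped form: strings
    print(parse_row("1\t2\n", [int, int]))  # typed form: integers
```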

Example

Data

Count the number of words in the book

Data is stored as one line per record

Mapper Level

Split the words from each line

Emit the word and 1 as cnt (word, 1)

Send the data for each word to the same reducer by using cluster by

Reducer

Count the number of words

Example Python Scripts

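mapperScript.py

#!/usr/bin/env python
import sys

# Emit "<word>\t1" for every word on every input line
for line in sys.stdin:
    words = line.strip().split()
    for word in words:
        print("%s\t1" % word.lower())

reducerscript.py

#!/usr/bin/env python
import sys

# Input arrives grouped by key (cluster by), so equal words are adjacent
(last_key, last_count) = (None, 0)
for line in sys.stdin:
    (key, count) = line.strip().split("\t")
    if last_key and last_key != key:
        print("%s\t%d" % (last_key, last_count))
        (last_key, last_count) = (key, int(count))
    else:
        last_key = key
        last_count += int(count)

# Flush the final key, which the loop above never prints
if last_key:
    print("%s\t%d" % (last_key, last_count))

Example Hive code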

FROM (
  FROM docs
  SELECT TRANSFORM (line)
  USING 'mapperScript.py'
  AS word, count
  CLUSTER BY word
) wc
INSERT OVERWRITE TABLE wordcount
SELECT TRANSFORM (wc.word, wc.count)
USING 'reducerscript.py'
AS word, count;
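The same pipeline can be tried without a cluster. The sketch below is a local stand-in, not what Hive executes: sorting plays the role of cluster by (it brings equal words next to each other), and all three function names are hypothetical.

```python
from itertools import groupby


def map_words(lines):
    """Mapper step: emit (word, 1) for every word on every line."""
    for line in lines:
        for word in line.strip().split():
            yield (word.lower(), 1)


def reduce_counts(pairs):
    """Reducer step: sum the counts of adjacent pairs sharing a key."""
    for word, group in groupby(pairs, key=lambda pair: pair[0]):
        yield (word, sum(count for _, count in group))


def word_count(lines):
    """Sorting stands in for cluster by, which brings equal keys together."""
    return list(reduce_counts(sorted(map_words(lines))))


if __name__ == "__main__":
    for word, count in word_count(["the cat", "The dog the"]):
        print("%s\t%d" % (word, count))
```

The reducer only works because equal keys are adjacent, which is the guarantee cluster by provides in the Hive version.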

User Defined Functions (UDF)

User defined functions can be used the same way as standard functions

UDFs are written in Java

Extend the class with the UDF class

Should contain an evaluate method

Three types of UDF are supported

Standard UDF

User Defined Table Function (UDTF)

Single input row is made into multiple output rows (Mapper)

User Defined Aggregate Function (UDAF)

Aggregates multiple values into one value (Reducer)
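The three kinds differ only in how many rows go in and come out. The plain Python functions below are analogues of those shapes, not Hive APIs; real UDFs are Java classes, and these names are made up for illustration.

```python
def upper(value):
    """UDF analogue: one input value -> one output value."""
    return None if value is None else value.upper()


def explode_words(row):
    """UDTF analogue: one input row -> zero or more output rows."""
    return row.split()


def total(values):
    """UDAF analogue: many input values -> one output value."""
    return sum(values)


if __name__ == "__main__":
    print(upper("hive"))
    print(explode_words("one row in"))
    print(total([1, 2, 3]))
```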

Custom UDF

package com.ulfberhtlab.hive.udf;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Simple UDF that upper-cases a string column
public final class Upper extends UDF {
  public Text evaluate(final Text s) {
    if (s == null) { return null; }  // pass NULL through unchanged
    return new Text(s.toString().toUpperCase());
  }
}

Deploy the Jar File

Compile the code into a Jar file

Hive> add jar mycode.jar;

Added mycode.jar to class path

Hive> list jars;

mycode.jar

Register the function

Hive> create temporary function myupper as

'com.ulfberhtlab.hive.udf.Upper';

Calling the UDF function

Hive> select myupper(word), sum(freq) from ngram

Group by myupper(word);

Serializer / Deserializer (SerDe)

There is no requirement for data to be in a Hive format

Data is not verified on insert or load

Data is just moved as bits

Files are simply stored

This allows for very fast movement of data

Errors are discovered when we query the data

Data may not be in a standard format

Logs, unstructured data, etc.

Hive uses SerDe to control how to read and write the files

SerDe

Default: LazySimpleSerDe

Parses the data into typed objects based on the delimiters

Uses lazy creation of objects for better performance

RegexSerDe

Uses a regular expression to parse the file

Users can create a custom SerDe

Read binary files, etc.

The process to deploy a SerDe is the same as for a UDF

Example

Combine the data from three CSV fields into two columns

Sample Data

{ movieid, year, title }

1234, 2014, The god Father

3435, 2014, Gravity..

Create table movies ( movieid string, details string )

Row format serde 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'

With serdeproperties ( "input.regex" = "(\\d*),(.*)" );
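The effect of the regular expression on the sample rows can be checked in Python. This is a sketch under two assumptions: that the pattern is applied to the whole line, and that Python's `re` behaves like the Java regex engine for this simple pattern. The `to_columns` helper is hypothetical.

```python
import re

# Same pattern as input.regex, written as a Python raw string.
INPUT_REGEX = re.compile(r"(\d*),(.*)")


def to_columns(line):
    """Apply the regex to a whole line; each capture group is one column.

    Returns None when the line does not match (a choice of this sketch).
    """
    match = INPUT_REGEX.fullmatch(line)
    if match is None:
        return None
    return match.groups()


if __name__ == "__main__":
    for row in ["1234, 2014, The god Father", "3435, 2014, Gravity.."]:
        print(to_columns(row))
```

The first group takes the digits before the first comma as movieid; everything after that comma, including the leading space, the year, and the title, lands in details.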

Appendix

RowNumber UDF

UDFRowSequence

package org.apache.hadoop.hive.contrib.udf;

import org.apache.hadoop.hive.ql.exec.Description;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.hive.ql.udf.UDFType;
import org.apache.hadoop.io.LongWritable;

/**
 * UDFRowSequence.
 */
@Description(name = "row_sequence",
    value = "_FUNC_() - Returns a generated row sequence number starting from 1")
@UDFType(deterministic = false, stateful = true)
public class UDFRowSequence extends UDF {
  private LongWritable result = new LongWritable();

  public UDFRowSequence() {
    result.set(0);
  }

  public LongWritable evaluate() {
    result.set(result.get() + 1);
    return result;
  }
}

// End UDFRowSequence.java