HBASE

Post on 08-Sep-2019

Page 1: HBASE - AIOUG · 38 Hbase Delete When Delete command is triggered actual data is not deleted A tombstone marker is set HBase periodically removes deleted cells during compactions

HBASE

V.Hariharaputhran

o Fourteen years in Oracle Development / DBA / Big Data / Cloud Technologies

o All India Oracle Users Group (AIOUG) Evangelist

o Passion to learn and share

o Blog: www.puthranv.com

Harish P

o Eight-plus years as an Oracle DBA

o Big Data / Cloud Technologies / RAC Specialist

o All India Oracle Users Group (AIOUG) Evangelist

o Passion to learn and share

Agenda

• Big Data Introduction

• Hadoop Components

• HBase Overview

• HBase in Hadoop

• Why HBase

• HBase Architecture

• HBase Read and Write

Data Data Data…Lots of Data

Twitter

Facebook

Google keeps track of you

World Population

Banking/Telecom/Energy… every industry contributes

No Data Archiving Logic

I am always online

Internet of People to Internet of Things

• QUALITY & CONSISTENCY

• MAINTAIN & REPAIR

• SMART SHOPPING

• MONITOR POLLUTION LEVELS

• WILDLIFE PROTECTION

• FARMING

• ENERGY

Devices TALK to each other as they become SMART & generate DATA

Hadoop Components

Hadoop Components

HDFS – Distributed File system

MapReduce – Distributed Data Processing Model

Hive – Provides SQL-Based Query Language

HBASE – Distributed column-based database

Pig – Data Flow Execution

HDFS - Daemon / Background Process

Name Node (NN)

Secondary Name Node (SNN)

Data Node (DN)

[Diagram: NN and SNN; DN1, DN2, DN3, DN4]

MapReduce - Daemon / Background Process

Job Tracker

Task Tracker

[Diagram: NN and SNN; DN1, DN2, DN3]

HBase – Daemon / Background Process

HBase Master (HM)

Region Server (RS)

[Diagram: HM and SNN; RS1, RS2, RS3]

SQL vs NoSQL

Relational table (SQL):

EMPID  NAME      SALARY
100    Karthick  50000
101    Shiva     40000

The same data as HBase cells (NoSQL):

Row  Column       Cell
100  CF – Name    Timestamp, value = Karthick
100  CF – Salary  Timestamp, value = 50000
101  CF – Name    Timestamp, value = Shiva
101  CF – Salary  Timestamp, value = 40000

Adding a CITY value for employee 100 only:

EMPID  NAME      SALARY  CITY
100    Karthick  50000   CHENNAI
101    Shiva     40000

Row  Column     Cell
100  CF – City  Timestamp, value = Chennai

Changing CITY for employee 100:

EMPID  NAME      SALARY  CITY
100    Karthick  50000   DELHI
101    Shiva     40000

Row  Column     Cell
100  CF – City  Timestamp, value = Delhi
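
The Chennai-to-Delhi change on this slide can be sketched in a few lines of Python. This is a toy in-memory model and not the HBase client API: the `put` and `get_latest` helpers, the column labels, and the small integer timestamps are all assumptions made for illustration.

```python
# Toy model of HBase cells: (row, column) -> list of (timestamp, value) versions.
# put/get_latest are hypothetical helpers, not HBase API calls.
from collections import defaultdict

cells = defaultdict(list)

def put(row, column, timestamp, value):
    """Record a new version; earlier versions are kept, not overwritten."""
    cells[(row, column)].append((timestamp, value))

def get_latest(row, column):
    """Return the value with the highest timestamp, or None if the cell is absent."""
    versions = cells.get((row, column))
    if not versions:
        return None
    return max(versions)[1]

put("100", "Name", 1, "Karthick")
put("100", "Salary", 1, "50000")
put("101", "Name", 1, "Shiva")
put("101", "Salary", 1, "40000")
put("100", "City", 2, "Chennai")  # new column, stored for row 100 only
put("100", "City", 3, "Delhi")    # an "update" is a newer version of the same cell

print(get_latest("100", "City"))  # newest version for row 100
print(get_latest("101", "City"))  # row 101 stores nothing for City
```

Row 101 never gets an empty City placeholder, which is what makes the table sparse, and the older Chennai version stays in the list alongside the newer one.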

NoSQL Databases

• Document databases

• Key-value stores

• Wide-column stores

HBase Keys & Column Families

         Personal Data      Demographic
Rowkey   Name   Address     DOB          Gender
100      Tom    SFO         01-01-1960   M
101      Mike   SFO         01-01-1970   M

Each row has a Key

Each record is divided into Column Families

Each column family consists of one or more Columns

HBase Overview

• Scalable, distributed data store

• Open-source implementation modeled on Google’s Bigtable

• Sparse

• Tightly integrated with Hadoop

• Not an RDBMS

HBase is

• Column family oriented database

    • Tables consisting of rows and columns

• Persisted Map

    • Sparse

    • Multi-dimensional

    • Sorted

    • Indexed by rowkey, column and timestamp

• Key-value store

    • [rowkey, col family, col qualifier, timestamp] -> cell value
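
The "sorted, multi-dimensional map" description can be made concrete with a short sketch. It is plain Python rather than HBase code: the `put` and `get` helpers and the sample values are assumptions, and timestamps are negated so that newer versions sort ahead of older ones within a column.

```python
# Sketch of a sorted map: (rowkey, family, qualifier, timestamp) -> cell value.
# Illustrative only; names and sample values are assumptions.
import bisect

store = []  # list of (key, value) pairs, always kept in sorted key order

def put(rowkey, family, qualifier, timestamp, value):
    # Negating the timestamp makes newer versions sort first within a column.
    key = (rowkey, family, qualifier, -timestamp)
    bisect.insort(store, (key, value))

def get(rowkey, family, qualifier):
    """Return the newest value of one column, or None if it was never written."""
    prefix = (rowkey, family, qualifier)
    # A 3-tuple prefix sorts just before every 4-tuple key that starts with it.
    i = bisect.bisect_left(store, (prefix,))
    if i < len(store) and store[i][0][:3] == prefix:
        return store[i][1]
    return None

put("1", "CF1", "COL1", 123, "INDIA")
put("1", "CF2", "COL3", 123, "12.6")
put("1", "CF1", "COL2", 126, "AIOUG")
put("1", "CF1", "COL1", 124, "27")

print(get("1", "CF1", "COL1"))  # newest version of CF1:COL1
for key, value in store:        # entries come back in key order, not insert order
    print(key, value)
```

Because the entries stay sorted, all cells of one row sit next to each other, and a lookup for one column is a binary search.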

HBase is not

• A relational database

    • No SQL query language

    • No joins

    • No built-in secondary indexes

    • No multi-row transactions (operations are atomic within a single row)

When to use HBase

• Data volume

• Application types

• Hardware environment

• No requirement for relational features

• Quick access to data

HBase Features

• Scalability

• Sharding

• Distributed storage

• Failover support

• API support

• MapReduce support

• Backup support

HBase vs RDBMS

HBase Shell

bin/hbase shell

• Create table

    • create 'mytable', 'cf1'

• List tables

    • list

• Describe table

    • describe 'mytable'

HBase Shell (cont.)

• Put a row

    • put 'mytable', 'row1', 'cf1:cq1', 'val1'

• Get a row

    • get 'mytable', 'row1'

• Put more

    • put 'mytable', 'row2', 'cf1:cq1', 'val2'

    • put 'mytable', 'row1', 'cf1:cq2', 'val3'

• Get a row

    • get 'mytable', 'row1'

• Scan table

    • scan 'mytable'
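
What the put, get and scan commands above return can be previewed with a small stand-in written in Python. `ToyTable` is a hypothetical class invented for this sketch, not the HBase shell or client API, and it ignores timestamps and versions.

```python
# Toy stand-in for the shell session above; ToyTable is a hypothetical class.
class ToyTable:
    def __init__(self, name, *families):
        self.name = name
        self.families = set(families)  # column families are fixed at create time
        self.rows = {}                 # rowkey -> {"family:qualifier": value}

    def put(self, rowkey, column, value):
        family = column.split(":", 1)[0]
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        self.rows.setdefault(rowkey, {})[column] = value

    def get(self, rowkey):
        """Return every column stored for one row."""
        return dict(self.rows.get(rowkey, {}))

    def scan(self):
        """Yield rows in sorted rowkey order."""
        for rowkey in sorted(self.rows):
            yield rowkey, dict(self.rows[rowkey])

table = ToyTable("mytable", "cf1")          # create 'mytable', 'cf1'
table.put("row1", "cf1:cq1", "val1")
table.put("row2", "cf1:cq1", "val2")
table.put("row1", "cf1:cq2", "val3")

print(table.get("row1"))                    # both columns of row1
for rowkey, columns in table.scan():        # row1, then row2
    print(rowkey, columns)
```

The second get on the slide returns two columns for row1 because column qualifiers such as cq2 can be added at write time, while the column family cf1 had to exist when the table was created.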

Demo

HBase – Column Families (cont.)

Key-Value Pair

Rowkey  ColumnFamily  Column  Timestamp  Value
1       CF1           COL1    123        INDIA
                      COL1    124        27
                      COL2    126        AIOUG
                      COL2    127        NI
        CF2           COL3    123        12.6
                      COL3    128        ORACLE

Row Format

                    CF1              CF2
Row Key  Timestamp  COL1    COL2     COL3
1        123        INDIA            12.6
1        124        27
1        126                AIOUG
1        127                NI
1        128                         ORACLE

HBase Read and Write

HBase Catalog Tables

• Keeps track of where the .META table is present

• Keeps track of all tables and regions that are present

Meta Table

HBase – Region and Region Servers

[Diagram: Table TBL holds row keys a to p, split into Region1, Region2, Region3 and Region4.]

Regions shown in the diagram: Table TBL, Region 1; Table TBL, Region 2; Table TBL, Region 3; Table T, Region 240; Table TBL, Region 4; Table A, Region 500

Region servers shown in the diagram: RS1210, RS1230, RS1260

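HBase Region

A table can be divided horizontally into one or more regions. A region contains a contiguous, sorted range of rows between a start key and an end key.

A region grows up to a configurable maximum size (1 GB in this example, set through hbase.hregion.max.filesize) and is split once it exceeds that limit.

A region of a table is served to the client by a RegionServer.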
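
Since each region covers a contiguous, sorted key range, finding the region for a row key is a binary search over region start keys. The sketch below assumes that the row keys a to p from the earlier diagram are split evenly across four regions; that split, the region names, and the `find_region` helper are illustrative assumptions.

```python
# Sketch: map a row key to its region by binary search over region start keys.
# The even split of keys a..p into four regions is an assumption.
import bisect

region_start_keys = ["", "e", "i", "m"]  # "" marks the start of the table
region_names = ["Region1", "Region2", "Region3", "Region4"]

def find_region(rowkey):
    """Return the region whose start key is the greatest one <= rowkey."""
    index = bisect.bisect_right(region_start_keys, rowkey) - 1
    return region_names[index]

for key in ["a", "d", "e", "k", "p"]:
    print(key, "->", find_region(key))
```

A region's end key is simply the next region's start key, so only the start keys need to be stored for the lookup.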

HBase Client – Locate Data

HBase Client – Read / Locate Data

[Diagram labels: Client (with META cache), ZooKeeper (META location), Region Server (META), Region Server (DATA), Data Nodes]
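
The lookup path in this diagram can be walked through with a toy simulation: the client asks ZooKeeper where META lives, reads META from that region server, caches it, and then knows which region server holds a given row. Every name here (`ToyClient`, `rs1`, `rs2`, the META rows) is hypothetical, and caching the whole META table is a simplification made for this sketch.

```python
# Toy simulation of locating a row; server names and META contents are hypothetical.
ZOOKEEPER = {"meta-location": "rs1"}  # ZooKeeper knows which server hosts META

# META rows per hosting server: (table, region start key, serving region server)
META_TABLES = {"rs1": [("TBL", "", "rs1"), ("TBL", "i", "rs2")]}

class ToyClient:
    def __init__(self):
        self.meta_cache = None  # filled on first lookup, reused afterwards
        self.zk_lookups = 0

    def _meta(self):
        if self.meta_cache is None:
            self.zk_lookups += 1
            meta_server = ZOOKEEPER["meta-location"]          # step 1: ask ZooKeeper
            self.meta_cache = list(META_TABLES[meta_server])  # step 2: read META
        return self.meta_cache

    def locate(self, table, rowkey):
        """Step 3: pick the region with the greatest start key <= rowkey."""
        candidates = [(start, server) for t, start, server in self._meta()
                      if t == table and start <= rowkey]
        return max(candidates)[1]

client = ToyClient()
print(client.locate("TBL", "a"))  # served by rs1
print(client.locate("TBL", "k"))  # served by rs2
print(client.zk_lookups)          # ZooKeeper was contacted once; the cache did the rest
```

The cache is the reason later reads can go straight to the right region server without visiting ZooKeeper or META again.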

Where does your data reside?

HBase Region Server Components

HBase Write

HBase Write

[Diagram labels: Client, HMaster, Region Server 102, WAL, Memstore, HFile, ACK; sample row keys 100, 1, 50]
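
The write path in this diagram (WAL, then Memstore, then HFile, with an ACK back to the client) can be sketched as a toy region server. `ToyRegionServer` and its count-based flush threshold are assumptions for illustration; a real region server flushes by memory size and does so in the background, not inside the put call.

```python
# Toy write path: WAL append -> MemStore update -> ACK, with flushes to HFiles.
# ToyRegionServer and the count-based threshold are assumptions for illustration.
class ToyRegionServer:
    def __init__(self, flush_threshold=3):
        self.wal = []        # append-only log, written first for durability
        self.memstore = {}   # in-memory buffer of recent writes
        self.hfiles = []     # immutable, sorted files produced by flushes
        self.flush_threshold = flush_threshold

    def put(self, rowkey, value):
        self.wal.append((rowkey, value))  # 1. record the edit in the WAL
        self.memstore[rowkey] = value     # 2. apply it to the MemStore
        if len(self.memstore) >= self.flush_threshold:
            self.flush()                  # 3. flush once the MemStore is "full"
        return "ACK"                      # 4. acknowledge the client

    def flush(self):
        # Rows are written out in sorted key order, then the MemStore is emptied.
        self.hfiles.append(sorted(self.memstore.items()))
        self.memstore = {}

server = ToyRegionServer()
for rowkey in ["100", "1", "50"]:         # row keys taken from the diagram
    print(server.put(rowkey, "value-" + rowkey))

print(server.hfiles[0])                   # keys come out as "1", "100", "50"
```

The flushed file lists the keys as 1, 100, 50 because row keys are compared as byte strings, not as numbers.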

How Data Is Stored in an HFile

Demo

HBase Delete

When a Delete command is triggered, the actual data is not deleted immediately.

A tombstone marker is set instead.

HBase removes the deleted cells, together with their tombstones, during major compactions.

Tombstone Marker

-> Version delete marker: marks a single version of a column for deletion

-> Column delete marker: marks all versions of a column for deletion

-> Family delete marker: marks all versions of all columns of a column family for deletion
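
The three tombstone markers and the effect of a compaction can be modelled in a short Python sketch. `ToyStore` is a hypothetical class, not HBase code, and it simplifies one detail: column and family markers here hide every version, whereas real markers carry a timestamp of their own.

```python
# Toy model of tombstones: deletes add markers, reads skip masked cells,
# and only a major compaction physically drops them. ToyStore is hypothetical.
class ToyStore:
    def __init__(self, cells):
        self.cells = list(cells)  # (row, family, qualifier, timestamp, value)
        self.tombstones = []      # (kind, row, family, qualifier, timestamp)

    def delete_version(self, row, family, qualifier, timestamp):
        self.tombstones.append(("version", row, family, qualifier, timestamp))

    def delete_column(self, row, family, qualifier):
        self.tombstones.append(("column", row, family, qualifier, None))

    def delete_family(self, row, family):
        self.tombstones.append(("family", row, family, None, None))

    def _masked(self, cell):
        row, family, qualifier, timestamp, _value = cell
        for kind, t_row, t_family, t_qualifier, t_timestamp in self.tombstones:
            if (t_row, t_family) != (row, family):
                continue
            if kind == "family":
                return True
            if t_qualifier == qualifier and (kind == "column" or t_timestamp == timestamp):
                return True
        return False

    def visible(self):
        """Values a read would return: stored cells minus masked ones."""
        return [cell[4] for cell in self.cells if not self._masked(cell)]

    def major_compact(self):
        """Physically drop masked cells and the markers themselves."""
        self.cells = [cell for cell in self.cells if not self._masked(cell)]
        self.tombstones = []

store = ToyStore([
    ("r1", "cf1", "a", 1, "v1"),
    ("r1", "cf1", "a", 2, "v2"),
    ("r1", "cf1", "b", 1, "v3"),
    ("r1", "cf2", "c", 1, "v4"),
])
store.delete_version("r1", "cf1", "a", 1)  # version delete marker
store.delete_family("r1", "cf2")           # family delete marker
print(store.visible(), len(store.cells))   # reads hide v1 and v4; 4 cells still stored
store.major_compact()
print(store.visible(), len(store.cells))   # same reads; only 2 cells remain stored
```

Reads give the same answer before and after the compaction; only the amount of stored data changes.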
