Page 1: October 2013 HUG: HBase 0.96

0.96.0

Bay Area Hadoop User Group, October 16th, 2013

Page 2: October 2013 HUG: HBase 0.96

Michael Stack <[email protected]>

• 0.96.0 Release Manager
• Chair of Apache HBase PMC*
• Apache Hadoop PMC
• Engineer at Cloudera in San Francisco

* Project Management Committee

Page 3: October 2013 HUG: HBase 0.96

HBase?

Page 4: October 2013 HUG: HBase 0.96

"...scalable, distributed datastore."

Page 5: October 2013 HUG: HBase 0.96

"...open source, distributed, scalable, consistent, low latency, random access non-relational database..."

Page 6: October 2013 HUG: HBase 0.96

Inspiration

A Google Technology described in a 2006 paper, by Chang et al.?

Page 7: October 2013 HUG: HBase 0.96

● Apache Top-level Project
  ○ hbase.apache.org
● Up out of Apache Hadoop contrib
● Project goal: “Billions of rows X millions of columns on clusters of ‘commodity hardware’”
● HBase persists all data to HDFS
● Uses Apache ZooKeeper
  ○ Cluster coordination

Page 8: October 2013 HUG: HBase 0.96

When would I use it?

Page 9: October 2013 HUG: HBase 0.96

BIG DATA

Random read/writes

Page 10: October 2013 HUG: HBase 0.96

SCALING!

Page 11: October 2013 HUG: HBase 0.96

Who uses it?

Page 13: October 2013 HUG: HBase 0.96

Who runs the project?

Page 14: October 2013 HUG: HBase 0.96

Diverse team*

* http://hbase.apache.org/team-list.html

COMMITTERS!

Preferably ALIVE!

Page 16: October 2013 HUG: HBase 0.96

• Release every month
• Each more stable
• & more performant
• Some features…
• Wire compatible between releases

• Currently at 0.94.12

Page 17: October 2013 HUG: HBase 0.96

http://www.flickr.com/photos/sysli/3026288256/sizes/o/in/photostream/

Page 20: October 2013 HUG: HBase 0.96

(Self-)Migration

Page 21: October 2013 HUG: HBase 0.96

Downstreamers
● Minimal API disturbance
  – None?
  – Last-minute feedback
● Hive, Sqoop, OpenTSDB
● Deprecations

Page 22: October 2013 HUG: HBase 0.96

Stats
● >2k issues fixed
  – >1500 in 0.96.x only
● Currently 6th Release Candidate
● Branched 7 months ago
● 18 months in the making

Page 23: October 2013 HUG: HBase 0.96

Requirements
● Hadoop 1.0.3+
● Hadoop 2.1.0-beta+
● Must choose one

Page 24: October 2013 HUG: HBase 0.96

Big Themes
● Stability
● Operability
  – Insight, tools
● Scalability
● Evolvability

Page 25: October 2013 HUG: HBase 0.96

http://www.flickr.com/photos/allspaw/5815258929/sizes/o/in/photostream/

Page 26: October 2013 HUG: HBase 0.96

http://www.flickr.com/photos/allspaw/5815258929/sizes/o/in/photostream/

Page 27: October 2013 HUG: HBase 0.96

http://www.flickr.com/photos/38595542@N02/3690830720/sizes/o/in/photostream/

Page 28: October 2013 HUG: HBase 0.96

• Dedicated meta WAL

• Don't put WAL replicas on local node
  – 33% of reads have to timeout

• Lowered ZK timeout
  – 30s instead of 180s

• Watcher script kills znode
  – Detection time approaches 0

• Faster assignment

HBase
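
The 33% figure and the effect of the shorter ZooKeeper timeout can be checked with back-of-envelope arithmetic. The sketch below is illustrative only. It assumes a replication factor of 3, exactly one WAL replica on the failed node, a reader that picks a replica uniformly at random, and a worst-case detection time equal to the session timeout. None of these assumptions is stated on the slide, and the function names are made up for this example.

```python
from fractions import Fraction


def dead_replica_read_fraction(replication: int, dead_replicas: int = 1) -> Fraction:
    """Fraction of reads that hit a dead replica when replicas are picked uniformly."""
    if not 0 <= dead_replicas <= replication:
        raise ValueError("dead_replicas must be between 0 and replication")
    return Fraction(dead_replicas, replication)


def detection_speedup(old_timeout_s: int, new_timeout_s: int) -> Fraction:
    """How many times sooner a dead server is noticed, assuming the
    worst-case detection time equals the ZooKeeper session timeout."""
    return Fraction(old_timeout_s, new_timeout_s)


# Assumed: 3 replicas, one of them on the node that just died.
fraction = dead_replica_read_fraction(3)
print(f"{float(fraction):.0%} of reads hit the dead replica")

# 180s -> 30s, the two timeout values quoted on the slide.
print(f"detection is up to {detection_speedup(180, 30)}x faster")
```

Under these assumptions, keeping WAL replicas off the local node removes the one replica that is guaranteed to be dead when the region server's machine goes down.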

Page 29: October 2013 HUG: HBase 0.96

• HDFS-4721 Speed up lease/block recovery when DN fails and a block goes into recovery
  – Do not recover on STALE DNs

• HDFS-3703 Decrease the datanode failure detection time
  – Avoid reading STALE DNs

• HDFS-3912 Detecting and avoiding stale datanodes for writing

HDFS

Page 30: October 2013 HUG: HBase 0.96

● Faster WAL replay/Distributed WAL Replay
  – No intermediate files
● No wait on NN
  – Committed
● Experimental
● Regions online immediately for Writes
  – Read older consistent view
● “Favored Nodes”

Coming...

Page 33: October 2013 HUG: HBase 0.96

One rationale for pb: http://goo.gl/N0HO6n

Page 34: October 2013 HUG: HBase 0.96

• System tables
• Filesystem
• Up in zookeeper
• Over the wire

Page 35: October 2013 HUG: HBase 0.96

RPC
• Implements Protobuf Service
  ● Specification!
• Data on the side
  o Encoding
  o Compression

PB DATA

Page 36: October 2013 HUG: HBase 0.96

Scalability
• e.g. Replicating 1k to 1k & heading north

• HBASE-8778 Region assignments scan table directory making them slow for huge tables

• HBASE-9208 ReplicationLogCleaner slow at large scale

• HBASE-8877 Reentrant row locks
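
For readers unfamiliar with the term, a reentrant lock lets the thread that already holds it acquire it again without deadlocking. The sketch below illustrates that property with Python's `threading.RLock`. It is a conceptual toy, not HBase code: the `RowLocks` class and both helper functions are hypothetical names invented for this example.

```python
import threading


class RowLocks:
    """Hands out one reentrant lock per row key (hypothetical helper)."""

    def __init__(self):
        self._guard = threading.Lock()
        self._locks = {}

    def lock_for(self, row):
        # The guard only protects the dictionary, not the row itself.
        with self._guard:
            return self._locks.setdefault(row, threading.RLock())


def increment(locks, row, store):
    """Takes the row lock, even if the caller already holds it."""
    with locks.lock_for(row):
        store[row] = store.get(row, 0) + 1


def init_then_increment(locks, row, store):
    """Holds the row lock and then calls another operation on the same row."""
    with locks.lock_for(row):
        store.setdefault(row, 0)
        # With a non-reentrant lock this nested acquisition would deadlock.
        increment(locks, row, store)


store = {}
locks = RowLocks()
init_then_increment(locks, "row-1", store)
print(store)
```

The design point is that nested operations on the same row, issued by the same thread, can share one lock instead of needing special-case code paths.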

Page 38: October 2013 HUG: HBase 0.96

Snapshots
• By Table
  o Snapshot, clone, restore, export
• Inexpensive
  o Just metadata
• Good for...
  o Backups
  o Replication
  o Offline processing
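
Why "just metadata" makes a snapshot inexpensive can be shown with a toy model. The sketch below assumes a table is a list of immutable store files, so a snapshot only needs to record references to them rather than copy any data. The `Table` and `Snapshot` classes and the three functions are hypothetical and do not mirror HBase's actual classes or on-disk layout.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    name: str
    # Names of store files; assumed immutable once written.
    store_files: list = field(default_factory=list)


@dataclass(frozen=True)
class Snapshot:
    table_name: str
    file_refs: tuple  # references only, no data is copied


def take_snapshot(table: Table) -> Snapshot:
    """Record which files make up the table right now."""
    return Snapshot(table.name, tuple(table.store_files))


def clone_snapshot(snapshot: Snapshot, new_name: str) -> Table:
    """Create a new table that points at the same files."""
    return Table(new_name, list(snapshot.file_refs))


def restore_snapshot(table: Table, snapshot: Snapshot) -> None:
    """Roll the table back to the files recorded in the snapshot."""
    table.store_files = list(snapshot.file_refs)


orders = Table("orders", ["f1", "f2"])
snap = take_snapshot(orders)
orders.store_files.append("f3")      # later writes add new files
copy = clone_snapshot(snap, "orders_copy")
restore_snapshot(orders, snap)
print(copy.store_files, orders.store_files)
```

In this model the cost of a snapshot grows with the number of files referenced, not with the volume of data they hold.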

Page 39: October 2013 HUG: HBase 0.96

Integration Tests
• Cluster test module
  o Standalone or cluster
  o Sizeable
    x data
    x runtime
• "Borrows" test types from all over
  o Netflix "ChaosMonkey"
  o Apache Accumulo linked-list dataloss checker

hbase-it/src/test/java/org/apache/hadoop/hbase/mapreduce/IntegrationTestBulkLoad.java

hbase-it/src/test/java/org/apache/hadoop/hbase/mapreduce/IntegrationTestImportTsv.java

hbase-it/src/test/java/org/apache/hadoop/hbase/mttr/IntegrationTestMTTR.java

hbase-it/src/test/java/org/apache/hadoop/hbase/test/IntegrationTestBigLinkedList.java

hbase-it/src/test/java/org/apache/hadoop/hbase/test/IntegrationTestLoadAndVerify.java

hbase-it/src/test/java/org/apache/hadoop/hbase/trace/IntegrationTestSendTraceRequests.java

Page 40: October 2013 HUG: HBase 0.96

StochasticLoadBalancer

• Region Count

• Locality

• Movement Cost

• Table Count

• Regions/Table/RegionServer

• Read/Write Counts

• Memstore Size

• Storefile Size
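
The factors above are the inputs the balancer weighs. A minimal sketch of the general idea, not HBase's actual implementation: each factor becomes a cost function scaled to [0, 1], the costs are combined with weights, and random single-region moves are kept only when they lower the total. Only two of the listed factors (region count and movement cost) are modeled here, and the function names, weights, and step count are all made up for illustration.

```python
import random


def region_count_cost(assignment, num_servers):
    """Spread between the busiest and idlest server, scaled to [0, 1]."""
    if not assignment:
        return 0.0
    counts = [0] * num_servers
    for server in assignment.values():
        counts[server] += 1
    return (max(counts) - min(counts)) / len(assignment)


def move_cost(assignment, original):
    """Fraction of regions that are no longer on their original server."""
    if not assignment:
        return 0.0
    moved = sum(1 for region, server in assignment.items()
                if original[region] != server)
    return moved / len(assignment)


def total_cost(assignment, original, num_servers, weights):
    """Weighted sum of the individual cost functions."""
    return (weights["region_count"] * region_count_cost(assignment, num_servers)
            + weights["move"] * move_cost(assignment, original))


def balance(original, num_servers, weights, steps=2000, seed=0):
    """Try random single-region moves and keep only those that lower the cost."""
    rng = random.Random(seed)
    current = dict(original)
    best = total_cost(current, original, num_servers, weights)
    regions = sorted(current)
    if not regions:
        return current, best
    for _ in range(steps):
        region = rng.choice(regions)
        previous = current[region]
        current[region] = rng.randrange(num_servers)
        cost = total_cost(current, original, num_servers, weights)
        if cost < best:
            best = cost                  # keep the improving move
        else:
            current[region] = previous   # undo the move
    return current, best


# Eight regions all start on server 0 of a four-server cluster.
start = {f"region-{i}": 0 for i in range(8)}
weights = {"region_count": 10.0, "move": 1.0}  # illustrative weights
plan, cost = balance(start, 4, weights)
print(cost, plan)
```

Because only improving moves are accepted, a search like this can settle in a local minimum. The weights let an operator decide how much imbalance is worth tolerating to avoid moving regions.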

Page 41: October 2013 HUG: HBase 0.96

Tracing
• Review HDFS-5274 Add Tracing to HDFS!

Page 42: October 2013 HUG: HBase 0.96

Namespaces
• Grouping of tables
  – Like database in MySQL
• System/User
  – hbase:meta
• Quota
• Coming
  – Security by ns
  – Grouping on cluster by ns
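
The `hbase:meta` example shows the naming scheme: a namespace, a colon, then the table qualifier. The sketch below splits such a name in plain Python. It assumes that a table written without a namespace belongs to a namespace called `default` and that system tables live in the `hbase` namespace. The slide only shows `hbase:meta`, so treat both constants and both function names as assumptions made for this example.

```python
DEFAULT_NAMESPACE = "default"   # assumed home of tables with no prefix
SYSTEM_NAMESPACE = "hbase"      # namespace of hbase:meta


def split_table_name(full_name: str):
    """Return (namespace, qualifier) for names like 'hbase:meta' or 'users'."""
    namespace, separator, qualifier = full_name.partition(":")
    if not separator:
        # No colon: the whole string is the table qualifier.
        return DEFAULT_NAMESPACE, full_name
    if not namespace or not qualifier:
        raise ValueError(f"malformed table name: {full_name!r}")
    return namespace, qualifier


def is_system_table(full_name: str) -> bool:
    """True when the table lives in the system namespace."""
    return split_table_name(full_name)[0] == SYSTEM_NAMESPACE


for name in ("hbase:meta", "sales:orders", "users"):
    print(name, split_table_name(name), is_system_table(name))
```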

Page 43: October 2013 HUG: HBase 0.96

Metrics2
● Radical revamp
● Module of Interfaces
  – H1 and H2 Impls modules
● Categories/Naming/Patterns

Page 44: October 2013 HUG: HBase 0.96

API
● Client/Dev
● Hadoop Annotations
  – Stable/Evolving/Private
● Cell Interface
  – KeyValue deprecated

Page 46: October 2013 HUG: HBase 0.96

Miscellaneous
• X-Row (in-region) Transactions
• Hardened Assignment
• Hardened Replication
• New UI
• Online Merge
• Finer grained ACLs
• More Coprocessor hooks

Page 47: October 2013 HUG: HBase 0.96

More Misc.
• Maven modularized
• Client-side Types
• Revamped defaults
• Compactions
  o Pluggable
  o Smarter triggers
• Windows!

Page 48: October 2013 HUG: HBase 0.96

0.96.1, 0.96.2, etc.
● Bug fixes
● Performance fixes
● ONLY!
● No features!

Page 50: October 2013 HUG: HBase 0.96

• Right after 0.96.0
  – Month or two
• Rolling upgrade from 0.96.0
• In-line Cell-tags
• Quota/Groupings
• Reverse Scan

Page 51: October 2013 HUG: HBase 0.96

1.0.0?

Page 52: October 2013 HUG: HBase 0.96

Thank you! [email protected]

