Transcript

You’ve got HBase

How AOL Mail handles Big Data

May 22, 2012

Presented at HBaseCon 2012

The AOL Mail System

Over 15 years old

Constantly evolving

10,000+ hosts

70+ Million mailboxes

50+ Billion emails

A technology stack that runs the gamut

What that means…

Lots of data

Lots of moving parts

Tight SLAs

Mature system + Young software = Tough marriage

We don’t buy “commodity” hardware

Ingrained Dev/QA/Prod product lifecycle

Somewhat “version locked” to tried-and-true platforms

Expect service outages to be quickly mitigated by our NOC w/out waiting for an on-call

So where does HBase fit?

It’s a component, not the foundation

Currently used in two places

Being evaluated for more

It will remain a tool in our diverse Big Data arsenal

An Activity Profiler

An “Activity Profiler”

Watches for particular behaviors

Designed and built in 6/2010

Originally “vanilla” Hadoop 0.20.2 + HBase 0.90.2

Currently CDH3

1.4+ Million Events/min

60x 24TB (raw) DataNodes w/ local RegionServers

15x application hosts

Is an internal-only tool

Used by automated anti-abuse systems

Leveraged by data analysts for ad hoc queries/MapRed
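To put the figures on this slide in per-second and per-host terms, here is a rough back-of-the-envelope calculation. It treats "1.4+ Million" as a lower bound and assumes load is spread evenly across hosts, which the slide does not claim.

```python
# Figures quoted on the slide; "1.4+ Million" is treated as a lower bound.
EVENTS_PER_MIN = 1_400_000
DATANODES = 60          # each also runs a local RegionServer
RAW_TB_PER_NODE = 24
APP_HOSTS = 15

events_per_sec = EVENTS_PER_MIN / 60
per_regionserver = events_per_sec / DATANODES  # assumes an even spread
per_app_host = events_per_sec / APP_HOSTS      # assumes an even spread
raw_capacity_tb = DATANODES * RAW_TB_PER_NODE  # raw disk, before replication

print(f"{events_per_sec:,.0f} events/s cluster-wide")
print(f"{per_regionserver:,.0f} events/s per RegionServer")
print(f"{per_app_host:,.0f} events/s per application host")
print(f"{raw_capacity_tb:,} TB raw capacity")
```

That works out to a little over 23,000 events per second and 1,440 TB of raw disk. Usable capacity is lower once HDFS replication is taken into account.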

An “Activity Profiler” (diagram slide; image not included in this transcript)

Why the “Event Catcher” layer?

Has to “speak the language” of our existing systems

Easy to plug an HBase translator into existing data feeds

Hard to modify the infrastructure to speak HBase

Flume was too young at the time
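As an illustration of the "translator" idea, here is a minimal sketch that turns one line of an existing feed into an HBase-style write. The feed format (`epoch|account|action|detail`), the `Put` tuple, the `e:` column family, and the row-key layout are all hypothetical; the talk does not describe the real formats.

```python
from typing import NamedTuple


class Put(NamedTuple):
    """Hypothetical stand-in for an HBase put: one cell in one row."""
    row_key: bytes
    column: bytes
    value: bytes


def translate(line: str) -> Put:
    """Turn one line of an assumed 'epoch|account|action|detail' feed into a Put."""
    epoch, account, action, detail = line.rstrip("\n").split("|", 3)
    # Assumed key layout: account first, then a zero-padded timestamp,
    # so that one account's events sort together in time order.
    row_key = f"{account}:{int(epoch):010d}".encode()
    column = f"e:{action}".encode()
    return Put(row_key, column, detail.encode())


put = translate("1337644800|user123|login|ip=192.0.2.1\n")
print(put)
```

The existing systems keep emitting the feed they already produce, and only this small layer needs to know anything about HBase.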

Why batch load via MapRed?

Real time is not currently a requirement

Allows filtering at different points

Allows us to “trigger” events

Designed before coprocessors

Early data integrity issues necessitated “replaying”

Missing append support early on

Holes in the Meta table

Long splits and GC pauses caused client timeouts

Can sample data into a “sandbox” for job development

Makes Pig, Hive, and other MapRed jobs easy and stable

We keep the raw data around as well
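One way to sample raw data into a sandbox is a deterministic, hash-based filter, sketched below. This is an assumption about how such sampling could work, not a description of AOL's job; the event format and the `in_sample` helper are made up for the example.

```python
import hashlib


def in_sample(key: str, percent: int) -> bool:
    """Deterministically select roughly `percent`% of keys using a stable hash."""
    digest = hashlib.sha256(key.encode()).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < percent


# Hypothetical raw events in an 'account|action' format.
raw_events = [f"user{i}|login" for i in range(10_000)]

# Sampling on the account means a sampled account keeps all of its events,
# and a replay of the raw data selects the same accounts again.
sandbox = [e for e in raw_events if in_sample(e.split("|")[0], 5)]
print(len(sandbox), "of", len(raw_events), "events sampled")
```

Because the choice depends only on the key, a replay of the same raw data yields the same sandbox, which keeps job development repeatable.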

HBase and MapRed can live in harmony

Bigger than “average” hardware

36+GB RAM

8+ cores

Proper system tuning is essential

Good information on tuning Hadoop is plentiful, but…

XFS > EXT

JBOD > RAID

As far as HBase is concerned…

Just go buy Lars’ book

Careful job development and optimization are key!

Contact History API

Contact History API

Services a member-facing API

Designed and built in 10/2010

Modeled after the previous application

Built by a different Engineering team

Used to solve a very different problem

250K+ Inserts/min

3+ Million Inserts/min during MapRed

20x 24TB (raw) DataNodes w/ local RegionServers

14x application hosts

Leverages Memcached to reduce query load on HBase
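The Memcached point is commonly implemented as a cache-aside read path. The sketch below shows that pattern with plain dictionaries standing in for Memcached and for an HBase table; the class and method names are hypothetical, and the talk does not say which caching strategy AOL actually used.

```python
class ContactHistoryReader:
    """Cache-aside reads: try the cache first, fall back to the store on a miss."""

    def __init__(self, cache: dict, store: dict):
        self.cache = cache    # stand-in for Memcached
        self.store = store    # stand-in for an HBase table keyed by account
        self.store_reads = 0  # counts how often the store is actually queried

    def get(self, account: str) -> list:
        cached = self.cache.get(account)
        if cached is not None:
            return cached
        self.store_reads += 1
        contacts = list(self.store.get(account, []))
        self.cache[account] = contacts
        return contacts

    def add(self, account: str, contact: str) -> None:
        self.store.setdefault(account, []).append(contact)
        self.cache.pop(account, None)  # invalidate so the next read is fresh


reader = ContactHistoryReader(cache={}, store={"user123": ["a@example.com"]})
first = reader.get("user123")   # miss: reads the store
second = reader.get("user123")  # hit: served from the cache
reader.add("user123", "b@example.com")
third = reader.get("user123")   # miss again after invalidation
print(reader.store_reads, third)
```

Repeated reads of the same account reach the store only once per invalidation, which is where the reduction in query load comes from.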

Contact History API (diagram slide; image not included in this transcript)

Where we go from here

Amusing mistakes to learn from

Exploding regions

Batch inserts via MapRed result in fast, symmetrical key space growth

Attempting to split every region at the same time is a bad idea

Turning off region splitting and using a custom “rolling region splitter” is a good idea

Take time and load into consideration when selecting regions to split
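The selection step of a "rolling region splitter" might look like the sketch below. It only decides which regions to split on a given pass; the `Region` fields, the thresholds, the quiet-hours window, and the batch size are all hypothetical, and triggering the actual split through HBase's admin tooling is left out.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    size_gb: float
    requests_per_sec: float


def pick_regions_to_split(regions, hour, *, max_size_gb=4.0, max_rps=500.0,
                          quiet_hours=range(1, 5), batch=2):
    """Return at most `batch` oversized, lightly loaded regions, largest first."""
    if hour not in quiet_hours:
        return []  # time: only split during the assumed off-peak window
    candidates = [
        r for r in regions
        if r.size_gb > max_size_gb and r.requests_per_sec < max_rps  # size and load
    ]
    candidates.sort(key=lambda r: r.size_gb, reverse=True)
    return candidates[:batch]  # rolling: a few per pass, never all at once


regions = [
    Region("a", 6.0, 100), Region("b", 9.0, 900), Region("c", 5.0, 50),
    Region("d", 2.0, 10), Region("e", 7.5, 200),
]
night = [r.name for r in pick_regions_to_split(regions, hour=2)]
day = [r.name for r in pick_regions_to_split(regions, hour=14)]
print(night, day)
```

Running a pass like this on a schedule spreads the splits out over time, so symmetrical key space growth no longer makes every region split at once.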

Backups, backups, backups!

You can never have too many

Large, non-splittable regions tell you things

Our key space maps to accounts

Excessively large keys equal excessively “active” accounts
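HBase splits regions on row boundaries, so a single row that outgrows the target region size leaves a region that cannot be split any further. Assuming one row per account, the sketch below flags such accounts; the per-account sizes and the size limit are invented for the example.

```python
GB = 1024 ** 3


def oversized_accounts(row_bytes: dict, region_limit_bytes: int) -> list:
    """Return accounts whose single row exceeds the region size limit, largest first."""
    flagged = [acct for acct, size in row_bytes.items() if size > region_limit_bytes]
    return sorted(flagged, key=lambda acct: row_bytes[acct], reverse=True)


# Hypothetical per-account row sizes, e.g. gathered by a MapRed job.
row_bytes = {"alice": 2 * GB, "bob": 30 * GB, "carol": 12 * GB}
suspects = oversized_accounts(row_bytes, region_limit_bytes=10 * GB)
print(suspects)
```

A list like this is a natural input for a closer look at unusually “active” accounts.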

Next-generation model (diagram slide; image not included in this transcript)

Thanks!

