Apache Hadoop India Summit 2011 Keynote Talk "Hadoop & the Future of Cloud Computing"...

Post on 06-May-2015

TRANSCRIPT

Todd Papaioannou VP, Cloud Architecture

By SearchNetMedia

HADOOP & THE FUTURE OF CLOUD COMPUTING

WHAT’S HAPPENING

More publicly available human-generated content

More interactions being tracked (e.g. clickstream data)

More business processes are being digitized

More history being kept

= The Data Exhaust!

Flickr : sub_lime79

BigData is here!

CUTTING THROUGH THE NOISE

Flickr : Lomo-Cam

Location

Social

Relationships

Science

Understanding

User Interests

access audience blogs communication

computer internet mass media

people networking technology

TURNING DATA INTO INSIGHTS

machine learning

time series

content clustering

factorization models

logic regression

Flickr : NASA Goddard Photo and Video

algorithms

user interest prediction

Ad inventory modeling

MAKING IT RELEVANT

Flickr : ogimogi

HADOOP: LIGHTNING-FAST

science + big data + insight = personal relevance = VALUE

TECHNOLOGY

Flickr : DDFic

BEHIND EVERY CLICK

HADOOP

Flickr : Got Sarah

THE PLATFORM EFFECT

THE HADOOP ECOSYSTEM

and other Early Adopters: Scale and productize Hadoop

Apache Hadoop

Orgs with Internet Scale Problems: Add tools / frameworks, enhance Hadoop

Mainstream / Enterprise adoption: Fund further development, enhancements

Enhance Hadoop Ecosystem

Service Providers: Grow ecosystem - Training, support, enhancements

Virtuous Circle!
• Investment -> Adoption
• Adoption -> Investment

HADOOP AT YAHOO!

“Where Science meets Data”

HADOOP CLUSTERS

Tens of thousands of servers

DATA PIPELINES

CONTENT

DIMENSIONAL DATA

PRODUCTS

APPLIED SCIENCE

» Data Analytics
» Content Optimization
» Content Enrichment
» Yahoo! Mail Anti-Spam
» Advertising Products
» Ad Optimization
» Ad Selection
» Big Data Processing & ETL

» User Interest Prediction
» Ad inventory prediction
» Machine learning - search ranking
» Machine learning - ad targeting
» Machine learning - spam filtering

FROM PROJECT TO CORE PLATFORM

Today

38K Servers

170 PB Storage

1M+ Monthly Jobs

[Chart, 2006 to 2010: Thousands of Servers (axis 0 to 90) and Petabytes (axis 0 to 250)]

Research

Science Impact

Daily Production

“Behind every click”

YAHOO!’S VISION

OPEN SOURCE CLOUD

Open Source Benefits

» Avoid technological dead ends

» Leverage community contributions

» Workforce already trained

Ongoing contributions

Yahoo!’s adoption of open source

Future contributions

Cloud serving

Storage

WHAT DOES THE FUTURE HOLD?

By Elsie

MORE BIG

By BionicTeaching

DATA IN THE CLOUD

By Fadilfb

PRIVATE CLOUDS

By Zachstern

HYBRID CLOUDS

By Calop

AUTOMATION

CLOUD FABRICS

QUESTIONS?
