HBase at Bloomberg // The Evolution of Bloomberg Data Systems
![Page 1: OF BLOOMBERG DATA SYSTEMS HBASE AT BLOOMBERG · HBASE AT BLOOMBERG // THE EVOLUTION OF BLOOMBERG DATA SYSTEMS MEDIUM DATA NEEDS FOR THE FINANCIAL INDUSTRY MAY // 07 // 2015 . HBASE](https://reader034.vdocuments.site/reader034/viewer/2022052215/5f0f74c47e708231d4444045/html5/thumbnails/1.jpg)
HBASE AT BLOOMBERG //
THE EVOLUTION OF BLOOMBERG DATA SYSTEMS
MEDIUM DATA NEEDS FOR THE FINANCIAL INDUSTRY
MAY // 07 // 2015
![Page 2: OF BLOOMBERG DATA SYSTEMS HBASE AT BLOOMBERG · HBASE AT BLOOMBERG // THE EVOLUTION OF BLOOMBERG DATA SYSTEMS MEDIUM DATA NEEDS FOR THE FINANCIAL INDUSTRY MAY // 07 // 2015 . HBASE](https://reader034.vdocuments.site/reader034/viewer/2022052215/5f0f74c47e708231d4444045/html5/thumbnails/2.jpg)
BLOOMBERG
Leading Data and Analytics provider to the financial industry
![Page 3: OF BLOOMBERG DATA SYSTEMS HBASE AT BLOOMBERG · HBASE AT BLOOMBERG // THE EVOLUTION OF BLOOMBERG DATA SYSTEMS MEDIUM DATA NEEDS FOR THE FINANCIAL INDUSTRY MAY // 07 // 2015 . HBASE](https://reader034.vdocuments.site/reader034/viewer/2022052215/5f0f74c47e708231d4444045/html5/thumbnails/3.jpg)
DATA IS OUR BUSINESS
![Page 4: OF BLOOMBERG DATA SYSTEMS HBASE AT BLOOMBERG · HBASE AT BLOOMBERG // THE EVOLUTION OF BLOOMBERG DATA SYSTEMS MEDIUM DATA NEEDS FOR THE FINANCIAL INDUSTRY MAY // 07 // 2015 . HBASE](https://reader034.vdocuments.site/reader034/viewer/2022052215/5f0f74c47e708231d4444045/html5/thumbnails/4.jpg)
DATA FOR GOOD EXCHANGE: GOVERNMENT INNOVATION, PUBLIC HEALTH, ENVIRONMENT, EDUCATION
September 28: Full Workshop at Bloomberg
September 30: Showcase at Strata Hadoop
Call for papers at: bloomberglabs.com/data-science
![Page 5: OF BLOOMBERG DATA SYSTEMS HBASE AT BLOOMBERG · HBASE AT BLOOMBERG // THE EVOLUTION OF BLOOMBERG DATA SYSTEMS MEDIUM DATA NEEDS FOR THE FINANCIAL INDUSTRY MAY // 07 // 2015 . HBASE](https://reader034.vdocuments.site/reader034/viewer/2022052215/5f0f74c47e708231d4444045/html5/thumbnails/5.jpg)
• We have a “medium data” problem…
• Speed and availability are paramount
• Hundreds of thousands of users with expensive requests
We’ve built many systems to address these needs
DATA MANAGEMENT TODAY
![Page 6: OF BLOOMBERG DATA SYSTEMS HBASE AT BLOOMBERG · HBASE AT BLOOMBERG // THE EVOLUTION OF BLOOMBERG DATA SYSTEMS MEDIUM DATA NEEDS FOR THE FINANCIAL INDUSTRY MAY // 07 // 2015 . HBASE](https://reader034.vdocuments.site/reader034/viewer/2022052215/5f0f74c47e708231d4444045/html5/thumbnails/6.jpg)
DATA MANAGEMENT CHALLENGES
• Single security analytics on Big Iron
• Replication of Systems and Data
• Complexity kills
Top 500 Supercomputer list, 2013
>96% Linux. 100% of top 40.
![Page 7: OF BLOOMBERG DATA SYSTEMS HBASE AT BLOOMBERG · HBASE AT BLOOMBERG // THE EVOLUTION OF BLOOMBERG DATA SYSTEMS MEDIUM DATA NEEDS FOR THE FINANCIAL INDUSTRY MAY // 07 // 2015 . HBASE](https://reader034.vdocuments.site/reader034/viewer/2022052215/5f0f74c47e708231d4444045/html5/thumbnails/7.jpg)
DATA MANAGEMENT TOMORROW
• Simplicity and performance
• Benefit from external developments
• Retain our independence
• Details matter
![Page 8: OF BLOOMBERG DATA SYSTEMS HBASE AT BLOOMBERG · HBASE AT BLOOMBERG // THE EVOLUTION OF BLOOMBERG DATA SYSTEMS MEDIUM DATA NEEDS FOR THE FINANCIAL INDUSTRY MAY // 07 // 2015 . HBASE](https://reader034.vdocuments.site/reader034/viewer/2022052215/5f0f74c47e708231d4444045/html5/thumbnails/8.jpg)
THE PREMISE
• We can apply big data techniques to our medium data problem by addressing gaps in existing open systems
• HBase is a good bet
• Part of a broader whole
• The biggest community wins
![Page 9: OF BLOOMBERG DATA SYSTEMS HBASE AT BLOOMBERG · HBASE AT BLOOMBERG // THE EVOLUTION OF BLOOMBERG DATA SYSTEMS MEDIUM DATA NEEDS FOR THE FINANCIAL INDUSTRY MAY // 07 // 2015 . HBASE](https://reader034.vdocuments.site/reader034/viewer/2022052215/5f0f74c47e708231d4444045/html5/thumbnails/9.jpg)
CHALLENGES
Our requirements from HBase are:
• Read performance – fast with low variability
• High availability
• Operational simplicity
• Efficient use of good hardware
• Expressive power
Bloomberg has been investing in all these aspects of HBase
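The "fast with low variability" requirement is usually judged on tail latency rather than on the average. The following is a minimal sketch, assuming a hypothetical list of read latencies in milliseconds (the sample values and the `percentile` and `summarize` helpers are illustrative and not taken from the talk). It shows how a single slow read leaves the median untouched while dominating the 99th percentile.

```python
import math
import statistics


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(latencies_ms):
    """Summarize typical latency (p50) alongside measures of variability (p99, max, stdev)."""
    return {
        "p50": percentile(latencies_ms, 50),
        "p99": percentile(latencies_ms, 99),
        "max": max(latencies_ms),
        "stdev": statistics.pstdev(latencies_ms),
    }


# Hypothetical samples: nine fast reads and one slow outlier.
latencies_ms = [2, 3, 2, 4, 3, 2, 3, 2, 3, 250]
summary = summarize(latencies_ms)
print(summary)
```

Tracking the p99 and the maximum next to the median makes variability visible even when typical reads look healthy.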
![Page 10: OF BLOOMBERG DATA SYSTEMS HBASE AT BLOOMBERG · HBASE AT BLOOMBERG // THE EVOLUTION OF BLOOMBERG DATA SYSTEMS MEDIUM DATA NEEDS FOR THE FINANCIAL INDUSTRY MAY // 07 // 2015 . HBASE](https://reader034.vdocuments.site/reader034/viewer/2022052215/5f0f74c47e708231d4444045/html5/thumbnails/10.jpg)
WE’VE MADE THAT BET
![Page 11: OF BLOOMBERG DATA SYSTEMS HBASE AT BLOOMBERG · HBASE AT BLOOMBERG // THE EVOLUTION OF BLOOMBERG DATA SYSTEMS MEDIUM DATA NEEDS FOR THE FINANCIAL INDUSTRY MAY // 07 // 2015 . HBASE](https://reader034.vdocuments.site/reader034/viewer/2022052215/5f0f74c47e708231d4444045/html5/thumbnails/11.jpg)
WE’RE NOT THE ONLY ONES
Google Cloud Bigtable
![Page 12: OF BLOOMBERG DATA SYSTEMS HBASE AT BLOOMBERG · HBASE AT BLOOMBERG // THE EVOLUTION OF BLOOMBERG DATA SYSTEMS MEDIUM DATA NEEDS FOR THE FINANCIAL INDUSTRY MAY // 07 // 2015 . HBASE](https://reader034.vdocuments.site/reader034/viewer/2022052215/5f0f74c47e708231d4444045/html5/thumbnails/12.jpg)
AIMING HIGHER
We can make things better by working together
Let’s be the gold standard
![Page 13: OF BLOOMBERG DATA SYSTEMS HBASE AT BLOOMBERG · HBASE AT BLOOMBERG // THE EVOLUTION OF BLOOMBERG DATA SYSTEMS MEDIUM DATA NEEDS FOR THE FINANCIAL INDUSTRY MAY // 07 // 2015 . HBASE](https://reader034.vdocuments.site/reader034/viewer/2022052215/5f0f74c47e708231d4444045/html5/thumbnails/13.jpg)
![Page 14: OF BLOOMBERG DATA SYSTEMS HBASE AT BLOOMBERG · HBASE AT BLOOMBERG // THE EVOLUTION OF BLOOMBERG DATA SYSTEMS MEDIUM DATA NEEDS FOR THE FINANCIAL INDUSTRY MAY // 07 // 2015 . HBASE](https://reader034.vdocuments.site/reader034/viewer/2022052215/5f0f74c47e708231d4444045/html5/thumbnails/14.jpg)
CALL TO ACTION
![Page 15: OF BLOOMBERG DATA SYSTEMS HBASE AT BLOOMBERG · HBASE AT BLOOMBERG // THE EVOLUTION OF BLOOMBERG DATA SYSTEMS MEDIUM DATA NEEDS FOR THE FINANCIAL INDUSTRY MAY // 07 // 2015 . HBASE](https://reader034.vdocuments.site/reader034/viewer/2022052215/5f0f74c47e708231d4444045/html5/thumbnails/15.jpg)
FURTHER BOLSTER RELIABILITY
Great strides such as HBASE-10070, but there is more to do
• Improved reconciliation of state between Master, META and ZK
• More determinism in Admin/Master operations
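To make the reconciliation bullet concrete, here is a toy sketch of comparing three views of region assignment. It assumes each view can be flattened into a plain mapping from region name to server name. The dictionaries, region names, and the `find_inconsistencies` helper are all hypothetical and do not reflect how HBase actually stores or repairs this state.

```python
def find_inconsistencies(master, meta, zk):
    """Return {region: {source: server}} for every region whose three views disagree.

    A region missing from a view is reported as None for that view.
    """
    regions = set(master) | set(meta) | set(zk)
    diffs = {}
    for region in sorted(regions):
        views = {
            "master": master.get(region),
            "meta": meta.get(region),
            "zk": zk.get(region),
        }
        # More than one distinct value means the sources disagree.
        if len(set(views.values())) > 1:
            diffs[region] = views
    return diffs


# Hypothetical snapshots of the three views.
master_view = {"r1": "rs-a", "r2": "rs-b", "r3": "rs-c"}
meta_view = {"r1": "rs-a", "r2": "rs-c", "r3": "rs-c"}
zk_view = {"r1": "rs-a", "r2": "rs-b"}

diffs = find_inconsistencies(master_view, meta_view, zk_view)
for region, views in diffs.items():
    print(region, views)
```

Detecting disagreement is the easy half. Deciding which source wins, and doing so deterministically, is the harder problem the slide points at.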
![Page 16: OF BLOOMBERG DATA SYSTEMS HBASE AT BLOOMBERG · HBASE AT BLOOMBERG // THE EVOLUTION OF BLOOMBERG DATA SYSTEMS MEDIUM DATA NEEDS FOR THE FINANCIAL INDUSTRY MAY // 07 // 2015 . HBASE](https://reader034.vdocuments.site/reader034/viewer/2022052215/5f0f74c47e708231d4444045/html5/thumbnails/16.jpg)
BENEFIT FROM MODERN HARDWARE
• 32 cores, 256 GB RAM, SSD: untapped potential
• CPU load max 20%, inadequate throughput
• Multi-RS administratively painful
• Much better story with memory
![Page 17: OF BLOOMBERG DATA SYSTEMS HBASE AT BLOOMBERG · HBASE AT BLOOMBERG // THE EVOLUTION OF BLOOMBERG DATA SYSTEMS MEDIUM DATA NEEDS FOR THE FINANCIAL INDUSTRY MAY // 07 // 2015 . HBASE](https://reader034.vdocuments.site/reader034/viewer/2022052215/5f0f74c47e708231d4444045/html5/thumbnails/17.jpg)
IMPROVE MULTI-TENANCY
• Mixed workloads challenging
  – interactive vs batch
  – read vs write
  – different read access patterns
• Many solutions in progress
• Administrative simplicity is key
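One common way to keep batch work from starving interactive requests is to give each class its own queue and serve them in a fixed ratio. The sketch below illustrates that general idea only. The `schedule` function and its `interactive_share` parameter are hypothetical, and this is not a description of HBase's RPC scheduler or of any specific solution the talk refers to.

```python
from collections import deque


def schedule(interactive, batch, interactive_share=3):
    """Drain two queues, serving up to `interactive_share` interactive requests per batch request."""
    interactive, batch = deque(interactive), deque(batch)
    order = []
    while interactive or batch:
        # Serve a burst of interactive requests first.
        for _ in range(interactive_share):
            if not interactive:
                break
            order.append(interactive.popleft())
        # Then let one batch request through so batch work still progresses.
        if batch:
            order.append(batch.popleft())
    return order


order = schedule(["i1", "i2", "i3", "i4"], ["b1", "b2"], interactive_share=2)
print(order)
```

A single ratio knob is deliberately simple, in line with the slide's point that administrative simplicity matters as much as the scheduling policy itself.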
![Page 18: OF BLOOMBERG DATA SYSTEMS HBASE AT BLOOMBERG · HBASE AT BLOOMBERG // THE EVOLUTION OF BLOOMBERG DATA SYSTEMS MEDIUM DATA NEEDS FOR THE FINANCIAL INDUSTRY MAY // 07 // 2015 . HBASE](https://reader034.vdocuments.site/reader034/viewer/2022052215/5f0f74c47e708231d4444045/html5/thumbnails/18.jpg)
SPARK INTEGRATION
• Analytical frameworks need a distributed database
• Columnar file format != column database
• Integrate with HBase to move towards the universal database
![Page 19: OF BLOOMBERG DATA SYSTEMS HBASE AT BLOOMBERG · HBASE AT BLOOMBERG // THE EVOLUTION OF BLOOMBERG DATA SYSTEMS MEDIUM DATA NEEDS FOR THE FINANCIAL INDUSTRY MAY // 07 // 2015 . HBASE](https://reader034.vdocuments.site/reader034/viewer/2022052215/5f0f74c47e708231d4444045/html5/thumbnails/19.jpg)
ANALYTICS: EFFICIENCY
• Choice of row and columnar storage engines
• Expose primitives for efficiency:
  – Column pruning
  – Predicate pushdowns
  – Data locality
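Column pruning and predicate pushdown both aim to discard data at the storage layer instead of shipping it to the analytics engine. The sketch below models that with an in-memory table. The `scan` function, the column names, and all values are made up for illustration and are not a real HBase or Spark API.

```python
def scan(rows, columns=None, predicate=None):
    """Storage-side scan sketch: filter rows and project columns before returning anything."""
    for key, cells in rows.items():
        if predicate is not None and not predicate(key, cells):
            continue  # predicate pushdown: the row is rejected at the source
        if columns is None:
            yield key, dict(cells)
        else:
            # column pruning: only the requested columns leave the store
            yield key, {c: cells[c] for c in columns if c in cells}


# Hypothetical table keyed by security, with made-up values.
table = {
    "SEC1": {"px:last": 170.0, "px:volume": 3_000_000, "ref:name": "Example One"},
    "SEC2": {"px:last": 125.0, "px:volume": 50_000_000, "ref:name": "Example Two"},
    "SEC3": {"px:last": 9.5, "px:volume": 10_000, "ref:name": "Example Three"},
}

result = list(
    scan(
        table,
        columns=["px:last"],
        predicate=lambda key, cells: cells["px:volume"] > 1_000_000,
    )
)
print(result)
```

Here only two rows and one column per row cross the boundary, rather than the full table. Data locality, the third primitive on the slide, is about running that scan on the node that already holds the rows and is not modeled in this sketch.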
![Page 20: OF BLOOMBERG DATA SYSTEMS HBASE AT BLOOMBERG · HBASE AT BLOOMBERG // THE EVOLUTION OF BLOOMBERG DATA SYSTEMS MEDIUM DATA NEEDS FOR THE FINANCIAL INDUSTRY MAY // 07 // 2015 . HBASE](https://reader034.vdocuments.site/reader034/viewer/2022052215/5f0f74c47e708231d4444045/html5/thumbnails/20.jpg)
THE FUTURE IS BRIGHT
• The state of the “Hadoop Database” union is strong
  – Increasing adoption
  – Strong foundation
  – Great community
• Prominent role in the data & analytics platform of the future
• Let’s go create the future
![Page 21: OF BLOOMBERG DATA SYSTEMS HBASE AT BLOOMBERG · HBASE AT BLOOMBERG // THE EVOLUTION OF BLOOMBERG DATA SYSTEMS MEDIUM DATA NEEDS FOR THE FINANCIAL INDUSTRY MAY // 07 // 2015 . HBASE](https://reader034.vdocuments.site/reader034/viewer/2022052215/5f0f74c47e708231d4444045/html5/thumbnails/21.jpg)
THANK YOU