Avatara
OLAP for Web-scale Analytics Products
Lili Wu, Roshan Sumbaly, Chris Riccomini, Gordon Koo, Hyung Jin Kim, Jay Kreps, Sam Shah
http://www.linkedin.com/in/liliwu
About LinkedIn
World’s largest professional social network
175+ million members in August 2012
2 new members per second
26th most visited web site in June 2012*
* Based on comScore, Q2 2012
Structured user data: Industry, Country
Activity data: View profile, Apply Job
Structured user data + Activity data = Analytical Insights
Who Viewed My Profile (WVMP)
If we only have Member and Industry attributes:
Member   Computer Software   Recruiting & Staffing   Mobile Internet
Alice    260                 152                     293
Bob      233                 186                     121
If we include the country attribute, the table gains a third dimension.
If we add viewing time…
OLAP (Online Analytical Processing) is an approach to quickly answer multi-dimensional analytical queries.
OLAP Cube
Store data in multi-dimensional form
Dimensions: the axes of the cube (e.g. Member)
Measure: the value stored in a cell (e.g. 5)
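As a minimal sketch of the cube idea, the Member x Industry table from the earlier slide can be held as a map from a (member, industry) cell to a measure, and a roll-up over one dimension is then a sum. The class and method names below are hypothetical and are not part of Avatara.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not Avatara code.
class TinyCube {
    // Cell key is "member|industry"; the value is the measure (a count).
    private final Map<String, Long> cells = new LinkedHashMap<>();

    void put(String member, String industry, long count) {
        cells.merge(member + "|" + industry, count, Long::sum);
    }

    // Read a single cell of the cube.
    long cell(String member, String industry) {
        return cells.getOrDefault(member + "|" + industry, 0L);
    }

    // Roll up over the Industry dimension: total for one member.
    long totalForMember(String member) {
        long total = 0;
        for (Map.Entry<String, Long> e : cells.entrySet()) {
            if (e.getKey().startsWith(member + "|")) {
                total += e.getValue();
            }
        }
        return total;
    }

    // Values copied from the Member x Industry table in the slides.
    static TinyCube fromSlide() {
        TinyCube cube = new TinyCube();
        cube.put("Alice", "Computer Software", 260);
        cube.put("Alice", "Recruiting & Staffing", 152);
        cube.put("Alice", "Mobile Internet", 293);
        cube.put("Bob", "Computer Software", 233);
        cube.put("Bob", "Recruiting & Staffing", 186);
        cube.put("Bob", "Mobile Internet", 121);
        return cube;
    }

    public static void main(String[] args) {
        TinyCube cube = fromSlide();
        System.out.println(cube.cell("Bob", "Mobile Internet"));
        System.out.println(cube.totalForMember("Alice"));
    }
}
```

Adding Country or viewing time would extend the cell key with one more component per dimension, which is why the number of cells grows so quickly.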
Our challenge: Web-scale OLAP
• Horizontally scalable: 175+ million members, adding 2 new members / second
• Low latency query: in the request/response loop, tens of milliseconds
• Highly available: 26th most visited web site
• High read & write throughput: billions of monthly page views
Our Options …
• Traditional OLAP
• Distributed OLAP
• Materialize Cubes
Traditional OLAP
For Business Intelligence (offline analysis)
• SAP
• Oracle Hyperion
• MicroStrategy
• …
• Few concurrent users
• High latency for web traffic
Not well-suited for web-scale online traffic
Distributed OLAP
[Diagram: Query, Query Distribution and Processing Layer, Result]
Materialize Cubes
Materialize: pre-compute all combinations
Combination                  Count
{Alice}                      55
{Alice, Internet}            21
{Alice, Recruiting}          22
{Alice, Internet, U.S.}      10
{Alice, Internet, Canada}    11
…
{Bob}                        60
{Bob, Internet}              34
…
Combination levels: { Member }, { Member, Industry }, { Member, Industry, Country }
• Materialize: requires more space & time
175 million members, average 10 industries, average 5 countries, 90 days:
  175 million
+ 175 million x 10 industries
+ 175 million x 5 countries
+ 175 million x 90 days
+ 175 million x 10 industries x 5 countries
+ 175 million x 10 industries x 90 days
+ 175 million x 5 countries x 90 days
+ 175 million x 10 industries x 5 countries x 90 days
+ …
~ 1 trillion keys
Each profile view is turned into 8 writes. With billions of page views, the load is too high.
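The arithmetic on this slide can be checked with a short sketch. The eight listed terms factor as 175 million x (1 + 10) x (1 + 5) x (1 + 90), because each optional dimension is either absent or present with one of its values. One reading of the "8 writes" figure, which is an assumption here, is that a single view touches one key per subset of the three optional dimensions, that is 2^3 keys. The class and method names are hypothetical.

```java
// Hypothetical illustration of the key-count arithmetic; not Avatara code.
class MaterializationCost {
    // Keys needed when every subset of the optional dimensions is pre-computed:
    // members * product over dimensions of (1 + average cardinality).
    static long totalKeys(long members, long... cardinalities) {
        long keys = members;
        for (long c : cardinalities) {
            keys *= (1 + c);
        }
        return keys;
    }

    // Assumed reading of "8 writes": one write per subset of the
    // optional dimensions, so 2^d writes for d dimensions.
    static long writesPerEvent(int optionalDimensions) {
        return 1L << optionalDimensions;
    }

    public static void main(String[] args) {
        // 10 industries, 5 countries, 90 days, as on the slide.
        System.out.println(totalKeys(175_000_000L, 10, 5, 90));
        System.out.println(writesPerEvent(3));
    }
}
```

The eight terms alone come to a little over one trillion keys, which matches the slide's estimate.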
Traditional OLAP
Distributed OLAP
Materialize Cubes
Our use cases…
• Data can be sharded: by member id for “Who Viewed My Profile”
• Data size per shard is small ( < 2MB )
• Goal: single disk I/O
• Many small cubes
• Can tolerate some data staleness: within a few hours
Our challenges + Our use cases → Avatara
Avatara: OLAP for Web-scale Analytical Products
• Used in production for 2+ years
• Powers several analytical products
Agenda
• Architecture
• Related work
• Conclusion
Avatara Architecture
An OLAP system:
• Compute cubes
• Serve queries
These can run together, or be split:
• Offline: compute cubes. Goal: high throughput. Batch processing (Hadoop)
• Online: serve queries. Goal: low latency, high availability. Key-value store (Voldemort)
Avatara Overview
[Architecture diagram: Activity data; Offline Batch Engine (Preprocessing, Projection + Join, Cubification); Key-value storage; Online Query Engine; Site]
Offline Batch Engine
Controlled by a configuration file
Phase 1: Preprocessing
  input.profile_views = /profile_views
  input.member_info = /member_info
Phase 2: Projection + Join
  dimensions = member_info.member_id,
               member_info.industry,
               member_info.country
  facts = profile_views.viewee_id,
          profile_views.viewer_id,
          profile_views.time
  measure = profile_views.visit
  join = profile_views.viewer_id,
         member_info.member_id
Phase 3: Cubification
  cube.name = wvmp-cube-profile-views
  cube.shard_key = profile_views.viewee_id
Configuration File
  input.profile_views = /profile_views
  input.member_info = /member_info
  dimensions = member_info.member_id,
               member_info.industry,
               member_info.country
  facts = profile_views.viewee_id,
          profile_views.viewer_id,
          profile_views.time
  measure = profile_views.visit
  join = profile_views.viewer_id,
         member_info.member_id
  cube.name = wvmp-cube-profile-views
  cube.shard_key = profile_views.viewee_id
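To make the three phases concrete, here is a minimal in-memory sketch of what the configuration describes: join profile_views to member_info on viewer_id = member_id, then group the joined rows by the shard key (viewee_id) so that each member ends up with one small cube. In Avatara this work runs as Hadoop batch jobs; the plain Java below, its class names, and its sample data are illustrative assumptions, and only the industry dimension is kept for brevity.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for the batch jobs; not Avatara code.
class BatchSketch {
    // One fact row from profile_views.
    static final class View {
        final String vieweeId;
        final String viewerId;
        final long visit;

        View(String vieweeId, String viewerId, long visit) {
            this.vieweeId = vieweeId;
            this.viewerId = viewerId;
            this.visit = visit;
        }
    }

    // Projection + join, then cubification: the result maps each shard key
    // (viewee_id) to a small cube of industry -> summed visits.
    static Map<String, Map<String, Long>> cubify(List<View> views,
                                                 Map<String, String> industryByMember) {
        Map<String, Map<String, Long>> cubes = new TreeMap<>();
        for (View v : views) {
            // Join on profile_views.viewer_id = member_info.member_id.
            String industry = industryByMember.get(v.viewerId);
            if (industry == null) {
                continue; // inner join: skip views from unknown members
            }
            cubes.computeIfAbsent(v.vieweeId, k -> new TreeMap<>())
                 .merge(industry, v.visit, Long::sum);
        }
        return cubes;
    }

    // Made-up sample data for illustration.
    static Map<String, Map<String, Long>> demo() {
        Map<String, String> memberInfo = new HashMap<>();
        memberInfo.put("m1", "Internet");
        memberInfo.put("m2", "Recruiting");
        memberInfo.put("m3", "Internet");

        List<View> views = new ArrayList<>();
        views.add(new View("Alice", "m1", 1));
        views.add(new View("Alice", "m3", 1));
        views.add(new View("Alice", "m2", 1));
        views.add(new View("Bob", "m1", 1));
        return cubify(views, memberInfo);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Each value in the resulting map corresponds to what would be stored under one key in the key-value store.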
Blob Format
Key      Value
Alice    [small cube, stored as one blob]
...
Offline Batch Engine Recap
[Architecture diagram: Offline Batch Engine (Preprocessing, Projection + Join, Cubification)]
Controlled by a configuration file
Online Query Engine
Key-value Storage: cubes are bulk loaded* as key-value pairs, one key per member (Alice, Bob, ...)
* R. Sumbaly, J. Kreps, L. Gao, A. Feinberg, C. Soman, and S. Shah. Serving Large-scale Batch Computed Data with Project Voldemort. In FAST, pages 223–235, 2012.
• Look up a key (e.g. Alice) in key-value storage: single disk I/O
• Supported operations: Select, Where, Group-by, Having, Order, Limit, Count / Percent / Sum / Average
// Top 10 viewer industries by total visits, read from one member's cube
AvataraQuery query = new AvataraSqlishBuilder()
    .setCube("wvmp-cube-profile-views")
    .setShardKey($member_id)
    .select("visit")
    .select("member_info.industry")
    .group("member_info.industry")
    .sum("visit")
    .order("visit", "desc")
    .limit(10)
    .build();
AvataraResult result = queryEngine.getCube(query);
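The following sketch shows, under stated assumptions, what evaluating such a query amounts to once the cube is fetched: one lookup by shard key, then group-by, sum, order, and limit in memory over a small cube. It does not use the Avatara API. The in-memory map standing in for the key-value store and all names below are hypothetical, and the sample counts reuse the Alice figures from the materialization slide with the country dimension omitted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of query evaluation; not the Avatara API.
class QuerySketch {
    // One row of a member's small cube.
    static final class Row {
        final String industry;
        final long visit;

        Row(String industry, long visit) {
            this.industry = industry;
            this.visit = visit;
        }
    }

    // Fetch one cube by shard key, then group by industry, sum visits,
    // order by the sum descending, and keep the first `limit` groups.
    static List<Map.Entry<String, Long>> topIndustries(Map<String, List<Row>> store,
                                                       String shardKey, int limit) {
        List<Row> cube = store.getOrDefault(shardKey, new ArrayList<>());
        Map<String, Long> sums = new HashMap<>();
        for (Row r : cube) {
            sums.merge(r.industry, r.visit, Long::sum);
        }
        List<Map.Entry<String, Long>> out = new ArrayList<>(sums.entrySet());
        out.sort((a, b) -> {
            int byVisits = Long.compare(b.getValue(), a.getValue()); // descending
            return byVisits != 0 ? byVisits : a.getKey().compareTo(b.getKey());
        });
        return new ArrayList<>(out.subList(0, Math.min(limit, out.size())));
    }

    // A map standing in for the key-value store: shard key -> cube rows.
    static Map<String, List<Row>> demoStore() {
        Map<String, List<Row>> store = new HashMap<>();
        store.put("Alice", Arrays.asList(
                new Row("Internet", 10),
                new Row("Internet", 11),
                new Row("Recruiting", 22)));
        return store;
    }

    public static void main(String[] args) {
        System.out.println(topIndustries(demoStore(), "Alice", 10));
    }
}
```

Because each cube is small, the whole aggregation runs over a handful of rows after a single fetch, which is what keeps query latency low.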
Cube Thinning
For heavy hitters:
• Roll up data to coarse granularity
• Drop data from a dimension
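As one possible illustration of rolling up to a coarser granularity, the sketch below collapses a per-day measure into per-week buckets, which shrinks the time dimension by roughly a factor of seven. The slides do not say which dimension or granularity Avatara uses, so the day-to-week choice, the class name, and the sample data are assumptions.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of a roll-up; not Avatara code.
class ThinningSketch {
    // Roll up visits keyed by day index (0, 1, 2, ...) into week buckets
    // (day / 7). Day indices are assumed to be non-negative.
    static Map<Integer, Long> rollUpToWeeks(Map<Integer, Long> visitsByDay) {
        Map<Integer, Long> byWeek = new TreeMap<>();
        for (Map.Entry<Integer, Long> e : visitsByDay.entrySet()) {
            byWeek.merge(e.getKey() / 7, e.getValue(), Long::sum);
        }
        return byWeek;
    }

    // Made-up sample data: five daily entries spread over three weeks.
    static Map<Integer, Long> demo() {
        Map<Integer, Long> days = new TreeMap<>();
        days.put(0, 3L);
        days.put(6, 2L);
        days.put(7, 4L);
        days.put(13, 1L);
        days.put(14, 5L);
        return rollUpToWeeks(days);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The totals are preserved while the number of cells drops, at the cost of losing per-day detail.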
Predicate Push-Down
• Key-value storage nodes: I/O-bound
• Our data: one blob
• Computation done on storage nodes
• Decrease data transfer
Avatara Architecture Recap
[Architecture diagram: Activity data; Offline Batch Engine (Preprocessing, Projection + Join, Cubification); Key-value storage; Online Query Engine; Site]
Related Work
• Distributed OLAP: scatter-gather
• MR Cube [Nandi11]: materializes cubes for holistic measures (median, distinct); utilizes MapReduce; no query engine
• Key-value stores: Amazon Dynamo [DeCandia07], Yahoo PNUTS [Cooper08]
Experiences
• In production for 2+ years
• Powers several analytical products
• Hadoop + Voldemort
• Cost-effective: commodity hardware
• Horizontally scalable
Future work
• Near real-time cubing
• Scaling reads while sustaining high write throughput
• Streaming joins
• Multitenant issues
• Dimension and schema changes
Conclusion
• Problem: Web-scale OLAP with high throughput, low latency, high availability
• Insight: many small cubes, sharded by key
• Solution: mix of batch computation and online serving
Selected Bibliography
[Cooper08] B. F. Cooper, R. Ramakrishnan, U. Srivastava, A. Silberstein, P. Bohannon, H. Jacobsen, N. Puz, D. Weaver, and R. Yerneni. PNUTS: Yahoo!’s Hosted Data Serving Platform. PVLDB, 1(2):1277–1288, 2008.
[DeCandia07] G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin, S. Sivasubramanian, P. Vosshall, and W. Vogels. Dynamo: Amazon’s Highly Available Key-Value Store. SIGOPS Operating Systems Review, 41(6):205–220, 2007.
[Nandi11] A. Nandi, C. Yu, P. Bohannon, and R. Ramakrishnan. Distributed Cube Materialization on Holistic Measures. In ICDE, pages 183–194, 2011.
[Sumbaly12] R. Sumbaly, J. Kreps, L. Gao, A. Feinberg, C. Soman, and S. Shah. Serving Large-scale Batch Computed Data with Project Voldemort. In FAST, pages 223–235, 2012.
Questions?
Thank you!