velox at sf data mining meetup
TRANSCRIPT
Algorithms, Machines, and People
Algorithms, Machines, and People
“deep questions over dirty and heterogenous data”
Algorithms, Machines, and People
“deep questions over dirty and heterogenous data”
Algorithms, Machines, and People
“deep questions over dirty and heterogenous data”
Algorithms, Machines, and People
“deep questions over dirty and heterogenous data”
GraphX
Algorithms, Machines, and People
“deep questions over dirty and heterogenous data”
GraphX
Algorithms, Machines, and People
“deep questions over dirty and heterogenous data”
GraphX
Algorithms, Machines, and People
“deep questions over dirty and heterogenous data”
BERKELEY DATA ANALYTICS STACK (BDAS)
Spark
SparkStreaming Spark SQL
BlinkDBGraphX
MLlib
MLBase
HDFS, S3, … Tachyon
Mesos Hadoop Yarn
Catify: Music for Cats
MODELING TASK
Rating
Songs
MODELING TASK
Ratings
Songs
Prediction
Catify: Music for Cats
Catify: Music for CatsCatID Song Score
1 16 2.1
1 14 3.7
3 273 4.2
4 14 1.9
Catify: Music for Cats
Pipeline
CatID Song Score
1 16 2.1
1 14 3.7
3 273 4.2
4 14 1.9
Catify: Music for Cats
Tachyon + HDFS
Pipeline
CatID Song Score
1 16 2.1
1 14 3.7
3 273 4.2
4 14 1.9
Catify: Music for Cats
Tachyon + HDFS
Pipeline
CatID Song Score
1 16 2.1
1 14 3.7
3 273 4.2
4 14 1.9
Pipeline
Tachyon + HDFS
Node.js App Server
Apache Web Server
Catify: Music for Cats
Catify: Music for Cats
Songs
Users
Songs
Users
O(users * songs)
Catify: Music for Cats
Pipeline
Tachyon + HDFS
Node.js App Server
Apache Web Server
Catify: Music for Cats
Pipeline
Tachyon + HDFS
Node.js App Server
Apache Web Server
PrecomputedRatings
Catify: Music for Cats
Pipeline
Tachyon + HDFS
Node.js App Server
Apache Web Server
PrecomputedRatings
Catify: Music for Cats
Black box
Pipeline
Tachyon + HDFS
Node.js App Server
Apache Web Server
Training Data
PrecomputedRatings
Catify: Music for Cats
Black box
Pipeline
Tachyon + HDFS
Node.js App Server
Apache Web Server
Training Data
PrecomputedRatings
Catify: Music for Cats
Black box
What’s wrong?
1. Serving system: low-latency but high staleness
What’s wrong?
1. Serving system: low-latency but high staleness
2. Batch training: slow incremental maintenance, no serving
What’s wrong?
1. Serving system: low-latency but high staleness
2. Batch training: slow incremental maintenance, no serving
3. Ad-hoc model management
What’s wrong?
VELOX GOALS
VELOX GOALS
1. Low latency and fresh predictions
VELOX GOALS
1. Low latency and fresh predictions2. Break the abstraction: model-
specific optimizations
VELOX GOALS
1. Low latency and fresh predictions2. Break the abstraction: model-
specific optimizations3. Unified system eases operation
Spark
SparkStreaming Spark SQL
BlinkDBGraphX
MLlib
MLBase
HDFS, S3, … Tachyon
THE MISSING PIECE IN BDAS
Mesos
Spark Streaming Spark
SQL
Graph X ML
library
BlinkDB MLbase
Training
THE MISSING PIECE IN BDAS
Spark
HDFS, S3, … Tachyon
Mesos
Spark Streaming Spark
SQL
Graph X ML
library
BlinkDB MLbase
Training Management + Serving
THE MISSING PIECE IN BDAS
Spark
HDFS, S3, … Tachyon
Mesos
Spark Streaming Spark
SQL
Graph X ML
library
BlinkDB MLbase VeloxTraining Management + Serving
THE MISSING PIECE IN BDAS
Spark
HDFS, S3, … Tachyon
Mesos
Spark Streaming Spark
SQL
Graph X ML
library
BlinkDB MLbase VeloxTraining Management + Serving
Spark
HDFS, S3, … Tachyon
THE MISSING PIECE IN BDAS
Mesos
Spark Streaming Spark
SQL
Graph X ML
library
BlinkDB MLbase VeloxTraining Management + Serving
Spark
HDFS, S3, … Tachyon
ModelManager
THE MISSING PIECE IN BDAS
Mesos
Spark Streaming Spark
SQL
Graph X ML
library
BlinkDB MLbase VeloxTraining Management + Serving
Spark
HDFS, S3, … Tachyon
ModelManager
PredictionService
THE MISSING PIECE IN BDAS
Mesos
VELOX ARCHITECTURE
VELOX ARCHITECTUREStandalone Scala
Service
VELOX ARCHITECTUREStandalone Scala
Service
Automatic Integration with Spark
VELOX ARCHITECTUREStandalone Scala
Service
Personalized Predictions as a Service
Automatic Integration with Spark
VELOX ARCHITECTUREStandalone Scala
Service
Shared-Nothing Serving Cluster
Personalized Predictions as a Service
Automatic Integration with Spark
SYSTEM ARCHITECTURE
uuid: 01-10
uuid: 11-20
uuid: 20-30
SYSTEM ARCHITECTURE
frontend.js
uuid: 01-10
uuid: 11-20
uuid: 20-30
SYSTEM ARCHITECTURE
frontend.js
uuid: 01-10
uuid: 11-20
uuid: 20-30
uuid: 4
SYSTEM ARCHITECTURE
Predictions via RESTfrontend.js
uuid: 01-10
uuid: 11-20
uuid: 20-30
uuid: 4
SYSTEM ARCHITECTURE
Predictions via RESTfrontend.js
Returns score
uuid: 01-10
uuid: 11-20
uuid: 20-30
uuid: 4
SYSTEM ARCHITECTURE
frontend.js
uuid: 01-10
uuid: 11-20
uuid: 20-30
uuid: 4
SYSTEM ARCHITECTURE
frontend.js
uuid: 01-10
uuid: 11-20
uuid: 20-30
Feedback via REST
uuid: 4
SYSTEM ARCHITECTURE
frontend.js
uuid: 01-10
uuid: 11-20
uuid: 20-30
Feedback via REST
Model updated
in realtime
uuid: 4
SYSTEM ARCHITECTURE
frontend.js
uuid: 01-10
uuid: 11-20
uuid: 20-30
SYSTEM ARCHITECTURE
master
workerworker
worker
frontend.js
uuid: 01-10
uuid: 11-20
uuid: 20-30
SYSTEM ARCHITECTURE
master
workerworker
worker
Batch train RPC
frontend.js
uuid: 01-10
uuid: 11-20
uuid: 20-30
SYSTEM ARCHITECTURE
master
workerworker
worker
Batch train RPC
frontend.js
uuid: 01-10
uuid: 11-20
uuid: 20-30Returns batch trained model
Mesos Mesos
HDFS, S3, … Tachyon
Hadoop Yarn
Spark Straming Shark
SQL
Graph X ML
library
BlinkDB MLbase
Spark
VeloxTraining Management + Serving
PREDICTION SERVICE
ModelManager
PredictionService
PREDICTION API
GET /velox/catify/predict?userid=22&song=27632Simple point queries:
PREDICTION API
GET /velox/catify/predict_top_k?userid=22&k=100
GET /velox/catify/predict?userid=22&song=27632Simple point queries:
More complex ordering queries:
PREDICTION API
GET /velox/catify/predict_top_k?userid=22&k=100
GET /velox/catify/predict?userid=22&song=27632Simple point queries:
More complex ordering queries:
Low-latency andscalable partitioning
Personalized Predictions
PREDICTION API
GET /velox/catify/predict_top_k?userid=22&k=100
GET /velox/catify/predict?userid=22&song=27632Simple point queries:
More complex ordering queries:
Low-latency andscalable partitioning
Personalized Predictions
Intelligent Caching
Sharing and re-use of model partial-state
SYSTEM ARCHITECTURE
Predictions via RESTfrontend.js
uuid: 01-10
uuid: 11-20
uuid: 20-30
uuid: 4
PREDICTION EXECUTION
def predict( u: UUID, x: Context )
uuid model
PREDICTION EXECUTION
def predict( u: UUID, x: Context )
uuid model
Look up user model
Read
PREDICTION EXECUTION
def predict( u: UUID, x: Context )
uuid model
Look up user model
Primary key lookup
Read
PREDICTION EXECUTION
def predict( u: UUID, x: Context )
uuid model
Look up user model
Primary key lookup
Partition queries by user : always local
Read
Compute Features
PREDICTION EXECUTION
def predict( u: UUID, x: Context )
user independent
}f( )
Compute Features
PREDICTION EXECUTION
def predict( u: UUID, x: Context )
Feature computation could be costly
user independent
}f( )
Compute Features
PREDICTION EXECUTION
def predict( u: UUID, x: Context )
Feature computation could be costly
user independent
}Cache features forreuse across users
f( )
TOP-K QUERIESQuery predicate to pre-filter candidate set
All Songs
TOP-K QUERIESQuery predicate to pre-filter candidate set
All Songs Playlist Keywords
TOP-K QUERIESQuery predicate to pre-filter candidate set
All Songs Playlist Keywords CandidateSongs
TOP-K QUERIESQuery predicate to pre-filter candidate set
All Songs Playlist Keywords CandidateSongs
Score andrank allcandidates
TOP-K QUERIESQuery predicate to pre-filter candidate set
All Songs Playlist Keywords CandidateSongs
By exploiting split model design we can leverage:
Score andrank allcandidates
TOP-K QUERIESQuery predicate to pre-filter candidate set
All Songs Playlist Keywords CandidateSongs
By exploiting split model design we can leverage:
Score andrank allcandidates
A. Shrivastava, P. Li. “Asymmetric LSH (ALSH) for Sublinear Time Maximum Inner Product Search (MIPS).” NIPS’14 Best Paper
TOP-K QUERIESQuery predicate to pre-filter candidate set
All Songs Playlist Keywords CandidateSongs
By exploiting split model design we can leverage:
Score andrank allcandidates
A. Shrivastava, P. Li. “Asymmetric LSH (ALSH) for Sublinear Time Maximum Inner Product Search (MIPS).” NIPS’14 Best Paper
Y. Low and A. X. Zheng. “Fast Top-K Similarity Queries Via Matrix Compression.” CIKM 2012
SYSTEM ARCHITECTURE
frontend.js
uuid: 01-10
uuid: 11-20
uuid: 20-30
uuid: 4
SYSTEM ARCHITECTURE
frontend.js
Returns score
uuid: 01-10
uuid: 11-20
uuid: 20-30
uuid: 4
Mesos Mesos
HDFS, S3, … Tachyon
Hadoop Yarn
Spark Straming Shark
SQL
Graph X ML
library
BlinkDB MLbase
Spark
VeloxTraining Management + Serving
ModelManager
PredictionService
MODEL MANAGER
PERSONALIZED MODELING
PERSONALIZED MODELING
PERSONALIZED MODELING
A Separate Model for Each User?
PERSONALIZED MODELING
Computationally Inefficient many complex models
A Separate Model for Each User?
PERSONALIZED MODELING
Statistically Inefficient not enough data per user
Computationally Inefficient many complex models
A Separate Model for Each User?
Input(Song) Rating
Input(Song) Rating
Input(Song) Rating
Split
Rating
Split
Input(Song)
PERSONALIZED SPLIT MODEL
Input(Song)
PERSONALIZED SPLIT MODEL
Input(Song)
Shared Basis Feature Model
PERSONALIZED SPLIT MODEL
Input(Song)
Shared Basis Feature ModelBig Data
PERSONALIZED SPLIT MODEL
Input(Song)
Shared Basis Feature ModelBig Data
Changes Slowly
PERSONALIZED SPLIT MODEL
Input(Song)
Shared Basis Feature ModelBig Data
Changes SlowlyTrain in Batch!
PERSONALIZED SPLIT MODEL
Input(Song)
Shared Basis Feature ModelBig Data
Changes SlowlyTrain in Batch!
PersonalizedUser Model
PERSONALIZED SPLIT MODEL
Input(Song)
Shared Basis Feature ModelBig Data
Changes SlowlyTrain in Batch!
Small Data
PersonalizedUser Model
PERSONALIZED SPLIT MODEL
Input(Song)
Shared Basis Feature ModelBig Data
Changes SlowlyTrain in Batch!
Small DataChanges Quickly
PersonalizedUser Model
PERSONALIZED SPLIT MODEL
Input(Song)
Shared Basis Feature ModelBig Data
Changes SlowlyTrain in Batch!
Small DataChanges Quickly
Train Online!
PersonalizedUser Model
Input(Song)
PersonalizedUser Model
Shared Basis Feature Model
PERSONALIZED SPLIT MODEL
Input(Song)
PersonalizedUser Model
Input(Song)
Shared Basis Feature Model
PERSONALIZED SPLIT MODEL
Input(Song)
PersonalizedUser Model
Input(Song)
Shared Basis Feature Model
PERSONALIZED SPLIT MODEL
Input(Song)
PersonalizedUser Model
Meow
Input(Song)
Shared Basis Feature Model
PERSONALIZED SPLIT MODEL
PersonalizedUser Model
Meow
Input(Song)Input
(Song)
Shared Basis Feature Model
PERSONALIZED SPLIT MODEL
PersonalizedUser Model
Meow
Terrible
Input(Song)Input
(Song)
Shared Basis Feature Model
PERSONALIZED SPLIT MODEL
MATHEMATICAL FORMULATION
Input(Song)
MATHEMATICAL FORMULATION
Input(Song)
x
Shared BasisFeature Models
Changes slowly
MATHEMATICAL FORMULATION
Input(Song)
x
Shared BasisFeature Models
Changes slowly
MATHEMATICAL FORMULATION
Input(Song)
f(x; ✓)x
Shared BasisFeature Models
PersonalizedUser Model
Changes slowly
Highly dynamic
MATHEMATICAL FORMULATION
Input(Song)
f(x; ✓)x
Shared BasisFeature Models
PersonalizedUser Model
Changes slowly
Highly dynamic
MATHEMATICAL FORMULATION
Input(Song)
f(x; ✓) ·wu
x
Shared BasisFeature Models
PersonalizedUser Model
Changes slowly
Highly dynamic
= Rating
MATHEMATICAL FORMULATION
Input(Song)
f(x; ✓) ·wu
x
Meow
FEEDBACK API
POST /velox/catify/observe?userid=22&song=27&score=3.7
Simple direct value feedback:
FEEDBACK API
POST /velox/catify/observe?userid=22&song=27&score=3.7
Simple direct value feedback:
Continuously update user models in Velox
Online Learning
FEEDBACK API
POST /velox/catify/observe?userid=22&song=27&score=3.7
Simple direct value feedback:
Continuously update user models in Velox
Online Learning Offline LearningLogged to Tachyon for
feature learning in Spark
FEEDBACK API
POST /velox/catify/observe?userid=22&song=27&score=3.7
Simple direct value feedback:
Continuously update user models in Velox
Online Learning Offline LearningLogged to Tachyon for
feature learning in Spark
EvaluationContinuously assessmodel performance
SYSTEM ARCHITECTURE
frontend.js
uuid: 01-10
uuid: 11-20
uuid: 20-30
Feedback via REST
Model updated
in realtime
uuid: 4
ONLINE LEARNING
velox.jar
user model
def observe(u: UUID, x: Context, y: Score)
ONLINE LEARNING
velox.jar
user model
def observe(u: UUID, x: Context, y: Score)
Update user model with new
training data Write
ONLINE LEARNING
velox.jar
user model
def observe(u: UUID, x: Context, y: Score)
Stochastic gradient descent
Update user model with new
training data Write
ONLINE LEARNING
velox.jar
user model
def observe(u: UUID, x: Context, y: Score)
Stochastic gradient descent
Incremental linear algebra
Update user model with new
training data Write
SYSTEM ARCHITECTURE
master
workerworker
worker
Batch train RPC
frontend.js
uuid: 01-10
uuid: 11-20
uuid: 20-30
OFFLINE OR NEARLINE LEARNING
def retrain(trainingData: RDD)
Spark BasedTraining Algs.
wu · f(x; ✓)
Automated retraining policies
Efficient batch training using Spark
Incremental learning using Spark Streaming
Data Model
Data Model
Sample Bias: model affects the training data.
ALWAYS SERVE THE BEST SONG?
Songs
PredictedRating
ALWAYS SERVE THE BEST SONG?
Songs
PredictedRating
VELOX SOLUTION
PredictedRating
Songs
With prob. 1- ϵ serve the best predicted song
VELOX SOLUTION
PredictedRating
Songs
With prob. 1- ϵ serve the best predicted song
PredictedRating
Songs
With prob. 1- ϵ serve the best predicted songWith prob. ϵ pick a random song
VELOX SOLUTION
PredictedRating
Songs
With prob. 1- ϵ serve the best predicted songWith prob. ϵ pick a random song
VELOX SOLUTION
PredictedRating
Songs
With prob. 1- ϵ serve the best predicted songWith prob. ϵ pick a random song
Epsilon Greedy
VELOX SOLUTION
PredictedRating
Songs
With prob. 1- ϵ serve the best predicted songWith prob. ϵ pick a random song
Epsilon Greedy
Active Learning Opportunity to explore new systems for
this emerging analytics workload
VELOX SOLUTION
BEYOND RECOMMENDER SYSTEMS
1. Spam and anomaly detection
BEYOND RECOMMENDER SYSTEMS
1. Spam and anomaly detection
2. Device/location specific modeling
BEYOND RECOMMENDER SYSTEMS
1. Spam and anomaly detection
2. Device/location specific modeling
3. YOUR machine learning application
BEYOND RECOMMENDER SYSTEMS
Spark Streaming Spark
SQL
Graph X ML
library
BlinkDB MLbase VeloxTraining Management + Serving
Spark
HDFS, S3, … Tachyon
ModelManager
PredictionService
THE MISSING PIECE IN BDAS
Mesos
SUMMARY
Today: model training and serving relies on ad-hoc, manual processes spread across multiple systems
SUMMARY
Today: model training and serving relies on ad-hoc, manual processes spread across multiple systems
The Velox system automatically maintains multiple models while providing low latency, fresh, and personalized predictions
SUMMARY
Today: model training and serving relies on ad-hoc, manual processes spread across multiple systems
The Velox system automatically maintains multiple models while providing low latency, fresh, and personalized predictions
Velox will be open-source: coming soon to BDAS
SUMMARY
Today: model training and serving relies on ad-hoc, manual processes spread across multiple systems
The Velox system automatically maintains multiple models while providing low latency, fresh, and personalized predictions
Velox will be open-source: coming soon to BDAShttps://amplab.cs.berkeley.edu/projects/velox/
SUMMARY
Today: model training and serving relies on ad-hoc, manual processes spread across multiple systems
The Velox system automatically maintains multiple models while providing low latency, fresh, and personalized predictions
Velox will be open-source: coming soon to BDAShttps://amplab.cs.berkeley.edu/projects/velox/[email protected]
SUMMARY
QUESTIONS?