velox at sf data mining meetup

VELOX: MODELS IN ACTION

Dan Crankshaw UC Berkeley AMPLab

[email protected]

Marin Software 2015

Algorithms, Machines, and People


“deep questions over dirty and heterogenous data”

GraphX


“deep questions over dirty and heterogenous data”

BERKELEY DATA ANALYTICS STACK (BDAS)

Spark

SparkStreaming Spark SQL

BlinkDBGraphX

MLlib

MLBase

HDFS, S3, … Tachyon

Mesos Hadoop Yarn

Catify: Music for Cats

MODELING TASK

Rating

Songs

MODELING TASK

Ratings

Songs

Prediction

Catify: Music for CatsCatID Song Score

1 16 2.1

1 14 3.7

3 273 4.2

4 14 1.9


Pipeline

CatID Song Score

1 16 2.1

1 14 3.7

3 273 4.2

4 14 1.9


Tachyon + HDFS

Pipeline

CatID Song Score

1 16 2.1

1 14 3.7

3 273 4.2

4 14 1.9

Pipeline

Tachyon + HDFS

Node.js App Server

Apache Web Server



Songs

Users

Songs

Users

O(users * songs)


Pipeline

Tachyon + HDFS

Node.js App Server

Apache Web Server


Pipeline

Tachyon + HDFS

Node.js App Server

Apache Web Server

PrecomputedRatings


Pipeline

Tachyon + HDFS

Node.js App Server

Apache Web Server

PrecomputedRatings


Black box

Pipeline

Tachyon + HDFS

Node.js App Server

Apache Web Server

Training Data

PrecomputedRatings


Black box

What’s wrong?

1. Serving system: low-latency but high staleness

What’s wrong?


2. Batch training: slow incremental maintenance, no serving

What’s wrong?


2. Batch training: slow incremental maintenance, no serving

3. Ad-hoc model management

What’s wrong?

VELOX GOALS

VELOX GOALS

1. Low latency and fresh predictions

VELOX GOALS

1. Low latency and fresh predictions2. Break the abstraction: model-

specific optimizations

VELOX GOALS

1. Low latency and fresh predictions2. Break the abstraction: model-

specific optimizations3. Unified system eases operation

Spark

SparkStreaming Spark SQL

BlinkDBGraphX

MLlib

MLBase


THE MISSING PIECE IN BDAS

Mesos

Spark Streaming Spark

SQL

Graph X ML

library

BlinkDB MLbase

Training


Spark


Mesos


SQL

Graph X ML

library

BlinkDB MLbase

Training Management + Serving


Spark


Mesos


SQL

Graph X ML

library

BlinkDB MLbase VeloxTraining Management + Serving


Spark


Mesos


SQL

Graph X ML

library


Spark



Mesos


SQL

Graph X ML

library


Spark


ModelManager


Mesos


SQL

Graph X ML

library


Spark


ModelManager

PredictionService


Mesos

VELOX ARCHITECTURE

VELOX ARCHITECTUREStandalone Scala

Service


Service

Automatic Integration with Spark


Service

Personalized Predictions as a Service



Service

Shared-Nothing Serving Cluster

Personalized Predictions as a Service


SYSTEM ARCHITECTURE

uuid: 01-10

uuid: 11-20

uuid: 20-30

SYSTEM ARCHITECTURE

frontend.js

uuid: 01-10

uuid: 11-20

uuid: 20-30

SYSTEM ARCHITECTURE

frontend.js

uuid: 01-10

uuid: 11-20

uuid: 20-30

uuid: 4

SYSTEM ARCHITECTURE

Predictions via RESTfrontend.js

uuid: 01-10

uuid: 11-20

uuid: 20-30

uuid: 4

SYSTEM ARCHITECTURE


Returns score

uuid: 01-10

uuid: 11-20

uuid: 20-30

uuid: 4

SYSTEM ARCHITECTURE

frontend.js

uuid: 01-10

uuid: 11-20

uuid: 20-30

uuid: 4

SYSTEM ARCHITECTURE

frontend.js

uuid: 01-10

uuid: 11-20

uuid: 20-30

Feedback via REST

uuid: 4

SYSTEM ARCHITECTURE

frontend.js

uuid: 01-10

uuid: 11-20

uuid: 20-30

Feedback via REST

Model updated

in realtime

uuid: 4

SYSTEM ARCHITECTURE

frontend.js

uuid: 01-10

uuid: 11-20

uuid: 20-30

SYSTEM ARCHITECTURE

master

workerworker

worker

frontend.js

uuid: 01-10

uuid: 11-20

uuid: 20-30

SYSTEM ARCHITECTURE

master

workerworker

worker

Batch train RPC

frontend.js

uuid: 01-10

uuid: 11-20

uuid: 20-30

SYSTEM ARCHITECTURE

master

workerworker

worker

Batch train RPC

frontend.js

uuid: 01-10

uuid: 11-20

uuid: 20-30Returns batch trained model

Mesos Mesos


Hadoop Yarn

Spark Straming Shark

SQL

Graph X ML

library

BlinkDB MLbase

Spark

VeloxTraining Management + Serving

PREDICTION SERVICE

ModelManager

PredictionService

PREDICTION API

GET /velox/catify/predict?userid=22&song=27632Simple point queries:

PREDICTION API

GET /velox/catify/predict_top_k?userid=22&k=100


More complex ordering queries:

PREDICTION API




Low-latency andscalable partitioning

Personalized Predictions

PREDICTION API




Low-latency andscalable partitioning

Personalized Predictions

Intelligent Caching

Sharing and re-use of model partial-state

SYSTEM ARCHITECTURE


uuid: 01-10

uuid: 11-20

uuid: 20-30

uuid: 4

PREDICTION EXECUTION

def predict( u: UUID, x: Context )

uuid model



uuid model

Look up user model

Read



uuid model

Look up user model

Primary key lookup

Read



uuid model

Look up user model

Primary key lookup

Partition queries by user : always local

Read

Compute Features



user independent

}f( )

Compute Features



Feature computation could be costly

user independent

}f( )

Compute Features



Feature computation could be costly

user independent

}Cache features forreuse across users

f( )

TOP-K QUERIESQuery predicate to pre-filter candidate set

All Songs


All Songs Playlist Keywords


All Songs Playlist Keywords CandidateSongs



Score andrank allcandidates



By exploiting split model design we can leverage:






A. Shrivastava, P. Li. “Asymmetric LSH (ALSH) for Sublinear Time Maximum Inner Product Search (MIPS).” NIPS’14 Best Paper

http://nips.cc/Conferences/2014/Program/speaker-info.php?ID=9123






A. Shrivastava, P. Li. “Asymmetric LSH (ALSH) for Sublinear Time Maximum Inner Product Search (MIPS).” NIPS’14 Best Paper

Y. Low and A. X. Zheng. “Fast Top-K Similarity Queries Via Matrix Compression.” CIKM 2012



SYSTEM ARCHITECTURE

frontend.js

uuid: 01-10

uuid: 11-20

uuid: 20-30

uuid: 4

SYSTEM ARCHITECTURE

frontend.js

Returns score

uuid: 01-10

uuid: 11-20

uuid: 20-30

uuid: 4

Mesos Mesos


Hadoop Yarn

Spark Straming Shark

SQL

Graph X ML

library

BlinkDB MLbase

Spark

VeloxTraining Management + Serving

ModelManager

PredictionService

MODEL MANAGER

PERSONALIZED MODELING


A Separate Model for Each User?


Computationally Inefficient many complex models



Statistically Inefficient not enough data per user

Computationally Inefficient many complex models


Input(Song) Rating

Input(Song) Rating

Split

Rating

Split

Input(Song)

PERSONALIZED SPLIT MODEL

Input(Song)


Input(Song)

Shared Basis Feature Model


Input(Song)

Shared Basis Feature ModelBig Data


Input(Song)


Changes Slowly


Input(Song)


Changes SlowlyTrain in Batch!


Input(Song)



PersonalizedUser Model


Input(Song)



Small Data



Input(Song)



Small DataChanges Quickly



Input(Song)



Small DataChanges Quickly

Train Online!


Input(Song)




Input(Song)


Input(Song)



Input(Song)


Meow

Input(Song)




Meow

Input(Song)Input

(Song)




Meow

Terrible

Input(Song)Input

(Song)



MATHEMATICAL FORMULATION

Input(Song)


Input(Song)

x

Shared BasisFeature Models

Changes slowly


Input(Song)

x


Changes slowly


Input(Song)

f(x; ✓)x



Changes slowly

Highly dynamic


Input(Song)

f(x; ✓)x



Changes slowly

Highly dynamic


Input(Song)

f(x; ✓) ·wu

x



Changes slowly

Highly dynamic

= Rating


Input(Song)

f(x; ✓) ·wu

x

Meow

FEEDBACK API

POST /velox/catify/observe?userid=22&song=27&score=3.7

Simple direct value feedback:

FEEDBACK API



Continuously update user models in Velox

Online Learning

FEEDBACK API




Online Learning Offline LearningLogged to Tachyon for

feature learning in Spark

FEEDBACK API




Online Learning Offline LearningLogged to Tachyon for

feature learning in Spark

EvaluationContinuously assessmodel performance

SYSTEM ARCHITECTURE

frontend.js

uuid: 01-10

uuid: 11-20

uuid: 20-30

Feedback via REST

Model updated

in realtime

uuid: 4

ONLINE LEARNING

velox.jar

user model

def observe(u: UUID, x: Context, y: Score)

ONLINE LEARNING

velox.jar

user model


Update user model with new

training data Write

ONLINE LEARNING

velox.jar

user model


Stochastic gradient descent


training data Write

ONLINE LEARNING

velox.jar

user model


Stochastic gradient descent

Incremental linear algebra


training data Write

SYSTEM ARCHITECTURE

master

workerworker

worker

Batch train RPC

frontend.js

uuid: 01-10

uuid: 11-20

uuid: 20-30

OFFLINE OR NEARLINE LEARNING

def retrain(trainingData: RDD)

Spark BasedTraining Algs.

wu · f(x; ✓)

Automated retraining policies

Efficient batch training using Spark

Incremental learning using Spark Streaming

Data Model

Data Model

Sample Bias: model affects the training data.

ALWAYS SERVE THE BEST SONG?

Songs

PredictedRating

VELOX SOLUTION

PredictedRating

Songs

With prob. 1- ϵ serve the best predicted song

PredictedRating

Songs

With prob. 1- ϵ serve the best predicted songWith prob. ϵ pick a random song

VELOX SOLUTION

PredictedRating

Songs


Epsilon Greedy

VELOX SOLUTION

PredictedRating

Songs


Epsilon Greedy

Active Learning Opportunity to explore new systems for

this emerging analytics workload

VELOX SOLUTION

BEYOND RECOMMENDER SYSTEMS

1. Spam and anomaly detection



2. Device/location specific modeling



2. Device/location specific modeling

3. YOUR machine learning application



SQL

Graph X ML

library


Spark


ModelManager

PredictionService


Mesos

SUMMARY

Today: model training and serving relies on ad-hoc, manual processes spread across multiple systems

SUMMARY

https://amplab.cs.berkeley.edu/projects/velox/

mailto:[email protected]


The Velox system automatically maintains multiple models while providing low latency, fresh, and personalized predictions

SUMMARY





Velox will be open-source: coming soon to BDAS

SUMMARY





Velox will be open-source: coming soon to BDAShttps://amplab.cs.berkeley.edu/projects/velox/

SUMMARY





Velox will be open-source: coming soon to BDAShttps://amplab.cs.berkeley.edu/projects/velox/[email protected]

SUMMARY



QUESTIONS?

velox at sf data mining meetup

Data & Analytics

peopledeep questions

low latency

catspipeline tachyon

heterogenous dataalgorithms

catspipeline catid song

catsblack boxpipeline

tachyonthe missing piece

catscatid song score1