Multiple Domain User Personalization
Yucheng Low Multiple Domain User Personalization
Multiple Domain User Personalization
Deepak Agarwal, Yahoo! Research
Yucheng Low, Carnegie Mellon University
Alexander J. Smola, Yahoo! Research
Information Flood

Personalization
Golf Reader / Tech. Reader
Can we provide personalization to new users?

One Domain Cold-Start
Movies: User 1, User 2
Impossible when you have only one domain. The best you can do is to have a good baseline.

Multiple Domains Cold-Start
Movies, News, Music
Possible when you have many domains.

Personalization Across All Domains
User: reads golf news, watches MTV.
Combine tokens from all spaces, ignoring the source domain: Golf, Tiger, Music, Song
Expand the token space to include the source domain: Golf:1, Tiger:1, Music:2, Song:2
Either token set feeds Your Favorite Personalization Algorithm.
Domains with more observations will swamp out all other domains
What is a good personalization algorithm that will work for all domains?
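The two token strategies on this slide can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the `(domain, token)` event format, and the example counts are assumptions made here.

```python
from collections import Counter

def combine_tokens(events):
    """Pool tokens from every domain, discarding the source domain."""
    return Counter(token for _domain, token in events)

def tag_tokens(events):
    """Expand the token space by suffixing each token with its domain id."""
    return Counter(f"{token}:{domain}" for domain, token in events)

def domain_share(events):
    """Fraction of all observations contributed by each domain."""
    per_domain = Counter(domain for domain, _token in events)
    total = sum(per_domain.values())
    return {domain: count / total for domain, count in per_domain.items()}

# Hypothetical user: reads golf news (domain 1) and watches MTV (domain 2).
events = [(1, "Golf"), (1, "Tiger"), (2, "Music"), (2, "Song")]
pooled = combine_tokens(events)   # keys: Golf, Tiger, Music, Song
tagged = tag_tokens(events)       # keys: Golf:1, Tiger:1, Music:2, Song:2

# Illustrative imbalance: 1000 news tokens against 10 music tokens.
skewed = [(1, "Golf")] * 1000 + [(2, "Music")] * 10
shares = domain_share(skewed)     # domain 2 holds under 1% of the observations
```

With either strategy a single model sees one token stream, so a domain that contributes under 1% of the observations has little influence on what is learned.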

Solution: Meta-Profile
Diagram: User Meta Profile; User Music Profile (Personalized Music); User News Profile (Personalized News)
Isolates each domain: prevents larger domains from swamping out smaller domains.

Solution: Meta-Profile
Diagram: User Meta Profile; User Music Profile; User News Profile; User Movie Profile
Extensible: domains can be added or removed easily.

Latent Dirichlet Allocation
Topic 1: Basketball, NBA, hoop, Train, 3-point
Topic 2: Golf, Tiger, Woods, Club, Green, Hole-in-one
Topic 3: Machine, Learning, Neural, Network, Train
Document: "Michael I. Jordan trains a Neural Network to play golf"
Topic assignments shown: Golf -> Topic 2, Network -> Topic 3

Latent Dirichlet Allocation
[Plate diagram; labels: N, Document]
1. Each document has a mixture over topics.
2. For each word in each document:
   a) Draw a topic.
   b) Draw a word from the topic.
A document is a bag of words. A topic is a mixture of words.
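The two-step generative process above can be sketched in a few lines. This is a minimal sketch using only the standard library; the topic-word probabilities, the Dirichlet parameter, and all function names are illustrative assumptions, not values from the talk.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw from Dirichlet(alpha) by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def sample_categorical(probs, rng):
    """Draw an index with probability proportional to probs."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def generate_document(topics, alpha, n_words, rng):
    """Step 1: draw the document's topic mixture. Step 2: per word, draw a
    topic from the mixture, then draw a word from that topic."""
    theta = sample_dirichlet(alpha, rng)
    assignments, words = [], []
    for _ in range(n_words):
        z = sample_categorical(theta, rng)
        vocab, probs = zip(*topics[z].items())
        assignments.append(z)
        words.append(vocab[sample_categorical(probs, rng)])
    return theta, assignments, words

# Hypothetical topics: each maps words to made-up probabilities.
TOPICS = [
    {"basketball": 0.4, "nba": 0.3, "hoop": 0.2, "train": 0.1},
    {"golf": 0.4, "tiger": 0.3, "woods": 0.2, "green": 0.1},
    {"machine": 0.3, "learning": 0.3, "neural": 0.2, "network": 0.2},
]

theta, assignments, words = generate_document(
    TOPICS, alpha=[0.5, 0.5, 0.5], n_words=8, rng=random.Random(0)
)
```

Inference runs this process in reverse: given only the words, it recovers the per-document mixtures and the per-topic word distributions.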
Example topics:
Topic 1: Basketball, Michael, Jordan
Topic 2: Golf, Tiger, Woods, Club, Green
Topic 3: Machine, Learning, Neural
Topics which make up each document
Words which make up each topic

Single Domain Personalization
1. Each user has a mixture over topics.
2. For each word in each document:
   a) Draw a topic.
   b) Draw a word from the topic.
A user’s interaction with a domain is a bag of words. A topic is a mixture of words.
Words which make up each topic
Topics each user is interested in

Multiple Domain Personalization
[Plate diagram: user u’s interaction with domain d]
A user’s interaction with a domain is a bag of words. A topic is a mixture of words.
Each user has a meta-profile. Each domain has a latent matrix.
User’s prior interest in a domain is [formula not preserved in this transcript]
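The transcript does not preserve how the meta-profile and the domain latent matrix combine. The sketch below shows one plausible form, purely as an assumption: a log-linear link that turns the product of the domain matrix and the meta-profile into positive Dirichlet pseudo-counts. The function name, the link function, and the numbers are hypothetical and should not be read as the authors' model.

```python
import math

def domain_prior(meta_profile, domain_matrix):
    """Map a user's meta-profile to positive per-topic prior weights.

    ASSUMPTION (not from the slides): alpha_k = exp(sum_j W[k][j] * x[j]),
    where x is the meta-profile and W is the domain's latent matrix.
    """
    return [
        math.exp(sum(w * x for w, x in zip(row, meta_profile)))
        for row in domain_matrix
    ]

# Hypothetical 2-dimensional meta-profile shared across domains.
user_meta_profile = [1.0, -1.0]

# Hypothetical latent matrices: one row per topic in that domain.
news_matrix = [[0.0, 0.0], [1.0, 0.0]]
music_matrix = [[0.0, 1.0], [0.5, 0.5], [0.0, 0.0]]

news_alpha = domain_prior(user_meta_profile, news_matrix)
music_alpha = domain_prior(user_meta_profile, music_matrix)
```

Under this assumed form, a new user with activity in only one domain still gets a non-trivial prior in every other domain, because the meta-profile is shared and each domain applies its own matrix.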

Solution: Meta-Profile
Diagram: User Meta Profile; User Music Profile; User News Profile; User Movie Profile
Diagram: Users; domains Music, News, and Movies, each with its own Topic->word table

Gibbs Sampling
[Plate diagram: user u’s interaction with domain p; one part of the model is labeled LDA]
Inference cycles through three steps. Each step updates one block of variables and holds the other two constant:
1: Sample, using the LDA sampler.
2: Sample, using Langevin diffusion.
3: Optimize, using LBFGS.
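For readers unfamiliar with the Langevin step, here is a minimal sketch of unadjusted Langevin dynamics on a toy one-dimensional Gaussian target. The target, step size, and function names are assumptions chosen for illustration; the slides do not say which variables the talk samples this way or with which settings.

```python
import math
import random

def langevin_step(x, grad_log_p, step, rng):
    """One unadjusted Langevin update: drift along the gradient of the
    log-density, plus Gaussian noise scaled by sqrt(step)."""
    noise = rng.gauss(0.0, 1.0)
    return x + 0.5 * step * grad_log_p(x) + math.sqrt(step) * noise

def run_chain(grad_log_p, x0, step, n_steps, burn_in, rng):
    """Run the chain and keep the states visited after burn-in."""
    x = x0
    samples = []
    for t in range(n_steps):
        x = langevin_step(x, grad_log_p, step, rng)
        if t >= burn_in:
            samples.append(x)
    return samples

# Toy target: a Gaussian with mean 2 and unit variance,
# so the gradient of the log-density is -(x - 2).
TARGET_MEAN = 2.0
samples = run_chain(
    grad_log_p=lambda x: -(x - TARGET_MEAN),
    x0=0.0,
    step=0.1,
    n_steps=20000,
    burn_in=1000,
    rng=random.Random(0),
)
sample_mean = sum(samples) / len(samples)
```

Because the update needs only a gradient, it suits continuous variables for which no closed-form conditional distribution is available.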

Experiments

Experiments @ Yahoo!
2-domain dataset: Frontpage and News clicks of 5.6 million users. Frontpage/News: article text for each click.
3-domain dataset: Frontpage, News, and MyYahoo clicks of 5.6 million users. MyYahoo: only article IDs for each click, with no text. These IDs are not semantically meaningful.
All user information was anonymized.

Test Protocol
Hold out a proportion of the users who appear in more than one domain. Hide one of those domains and try to predict its words.
Prediction metric is cosine similarity. Baseline is “mean prediction”.
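The metric and baseline can be sketched as below. The slides do not define "mean prediction"; this sketch assumes it means predicting the average word counts of the training users for every held-out user. The function names and the toy data are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-weight dictionaries."""
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def mean_prediction(training_counts):
    """ASSUMED baseline: average word counts over all training users."""
    totals = {}
    for counts in training_counts:
        for word, count in counts.items():
            totals[word] = totals.get(word, 0.0) + count
    n_users = len(training_counts)
    return {word: total / n_users for word, total in totals.items()}

# Hypothetical hidden-domain word counts for training users.
training = [{"golf": 2, "tiger": 1}, {"iphone": 3}, {"golf": 1, "senate": 2}]
baseline = mean_prediction(training)

# Hypothetical held-out user whose hidden domain is about golf.
held_out = {"golf": 3, "tiger": 2}
baseline_score = cosine_similarity(baseline, held_out)
model_score = cosine_similarity({"golf": 0.6, "tiger": 0.4}, held_out)
```

Cosine similarity ignores the overall scale of a prediction, so a model is rewarded for getting the relative weights of words right rather than the total click volume.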

Implementation
Distributed implementation in C++ using Memcached for communication.
Alex Smola, Shravan Narayanamurthy, “An Architecture for Parallel Topic Models”, VLDB 2010.
Distributed LBFGS line search: implement standard MPI-like primitives in Memcached: Broadcast, Reduce, Barrier.
Takes 2-3 days for 500 iterations on 30 machines.
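To illustrate how MPI-like primitives can sit on top of a key-value store, here is a toy sketch. It is not the authors' C++ implementation: `InMemoryKV` is a hypothetical thread-safe stand-in for a Memcached client that exposes only `set`, `get`, and an atomic `incr`, workers are simulated with threads, and every key name is made up. The reduce shown returns the sum to every worker.

```python
import threading
import time

class InMemoryKV:
    """Hypothetical stand-in for a Memcached client (set/get/incr only)."""
    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def set(self, key, value):
        with self._lock:
            self._data[key] = value

    def get(self, key):
        with self._lock:
            return self._data.get(key)

    def incr(self, key, delta=1):
        # Simplification: a missing key starts at 0 in this stand-in.
        with self._lock:
            self._data[key] = self._data.get(key, 0) + delta
            return self._data[key]

def barrier(kv, name, n_workers, poll=0.001):
    """Single-use barrier: count arrivals, then poll until all have arrived."""
    kv.incr(name)
    while kv.get(name) < n_workers:
        time.sleep(poll)

def broadcast(kv, name, rank, root, value=None, poll=0.001):
    """The root publishes a value; every other worker polls until it appears."""
    if rank == root:
        kv.set(name, value)
        return value
    while kv.get(name) is None:
        time.sleep(poll)
    return kv.get(name)

def all_reduce_sum(kv, name, rank, n_workers, value):
    """Each worker writes its part, waits at a barrier, then sums all parts."""
    kv.set(f"{name}/part/{rank}", value)
    barrier(kv, f"{name}/done", n_workers)
    return sum(kv.get(f"{name}/part/{r}") for r in range(n_workers))

def run_demo(n_workers=4):
    """Simulate workers with threads sharing one store."""
    kv = InMemoryKV()
    results = [None] * n_workers

    def worker(rank):
        step = broadcast(kv, "step_size", rank, root=0,
                         value=0.5 if rank == 0 else None)
        total = all_reduce_sum(kv, "loss", rank, n_workers, float(rank + 1))
        results[rank] = (step, total)

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

In a line search, a pattern like this would let one machine broadcast a candidate step size and all machines agree on the summed objective before the next step is tried; each round would need fresh key names because the barrier here is single-use.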

2 Property Sanity Check [figure]
2 Property [figures]
3 Property [figures]

Frontpage -> News
Celebrity: sandra, oscar, oscars, red, carpet, bullock, golden, gown, bullocks, nominee, bestactress, sparkles, stunning
Entertainment: vienna, bachelor, jake, pavelka, giraldi, finale, show, stars, dancing, love, season, time, abc
Science: bacteria, fight, super, struggling, developed, doctors, resistant, lethal, virtually, drugs, antibiotic, competitors, chad
Science Fiction: film, movie, movies, films, director, story, avatar, james, time, hollywood, big, make, hes, star

News -> Frontpage
Topic labels shown on the slide: Politics, Devices, College
iphone, apple, app, apps, ipod, google, store, apples, android, mac, mobile, touch, ipad, device, phone
college, year, earn, years, 000, bestpaid, average, 129, colleges, graduates, ten, alums, schools, actor, likes
health, care, bill, obama, president, rep, house, republican, senate, news, sen, democrats, fox, congress, reform
drafts, player, nfl, scouts, team, riskiest, peril, bryant, dez, pick, talented, nba, james, news
home, bank, facing, ceo, gomez, eviction, rosalina, bought, cleaning, foreclosed, client, janitor, offices, surprising, video
captured, inside, mountain, terrorist, observers, impresses, alqaidas, complexity, base, features, hideout, size, special, secret, struck

Extension
Diagram: User Meta Profile; User Music Profile, User News Profile, and User Movie Profile, each modeled with Latent Dirichlet Allocation
Flexible: allows a different algorithm for each domain.
Algorithms shown: Linear Model, Matrix Factorization, fLDA

It Is How You Use It
Diagram: User Meta Profile; User Music Profile; personalized with Algorithm X
Use the Meta Profile for initialization.
Periodically update the Meta Profile and the domain latent matrix.

Conclusion
A generic, extensible model for combining domain personalization schemes.
Scalable inference procedure that extends to millions of users.
Demonstrated strong predictive performance on a large real-world dataset.

Questions?