Browsemap: Collaborative Filtering at LinkedIn
DESCRIPTION
Many web properties make extensive use of item-based collaborative filtering, which showcases relationships between pairs of items based on the wisdom of the crowd. This paper presents LinkedIn’s horizontal collaborative filtering infrastructure, known as browsemaps. The platform enables rapid development, deployment, and computation of collaborative filtering recommendations for almost any use case on LinkedIn. In addition, it provides centralized management of scaling, monitoring, and other operational tasks for online serving. We also present case studies on how LinkedIn uses this platform in various recommendation products, as well as lessons learned in the field over the several years this system has been in production.
TRANSCRIPT
1
Browsemap: Collaborative Filtering At LinkedIn
Lili Wu, Sam Shah, Sean Choi, Mitul Tiwari, Christian Posse
RSWeb 2014 with RecSys
2
Agenda
• Motivation
• Architecture
• Applications
• Lessons Learned
3
Collaborative filtering for member profile
Profile Browsemap: People who viewed this profile also viewed… Count co-views
4
Collaborative filtering for job page
Job Browsemap: People who viewed this job also viewed… Count co-views
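The "count co-views" idea behind both slides can be sketched as follows. This is a minimal illustration, not LinkedIn's implementation: the input shape of (viewer, item) pairs and the names `count_coviews` and `also_viewed` are assumptions, and time windows, weighting, and filtering are left out.

```python
from collections import Counter, defaultdict
from itertools import combinations


def count_coviews(views):
    """Count, for every pair of items, how many distinct viewers saw both.

    `views` is an iterable of (viewer_id, item_id) pairs (assumed input shape).
    """
    items_by_viewer = defaultdict(set)
    for viewer_id, item_id in views:
        items_by_viewer[viewer_id].add(item_id)

    coviews = defaultdict(Counter)
    for items in items_by_viewer.values():
        # Every unordered pair seen by the same viewer counts once per viewer.
        for a, b in combinations(sorted(items), 2):
            coviews[a][b] += 1
            coviews[b][a] += 1
    return coviews


def also_viewed(coviews, item_id, k=3):
    """Return up to k items most often co-viewed with item_id."""
    return [other for other, _ in coviews[item_id].most_common(k)]


# Toy data, made up for illustration.
views = [
    ("u1", "job_a"), ("u1", "job_b"),
    ("u2", "job_a"), ("u2", "job_b"), ("u2", "job_c"),
    ("u3", "job_a"), ("u3", "job_c"),
    ("u4", "job_b"),
]
coviews = count_coviews(views)
```

Calling `also_viewed(coviews, "job_b")` then gives the "people who viewed this job also viewed" list for that job, ranked by co-view count.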
5
Company, group, portfolio, … many CF-based recommenders
6
Challenges
• Many different entities
• Similar problems with different requirements
• Fast product development cycle
• Hybrid recommender systems
• Handle LinkedIn data volume and traffic
7
Challenges
• Many different entities
• Similar problems with different requirements
• Fast product development cycle
• Hybrid recommender systems
• Handle LinkedIn data volume and traffic
→ Horizontal Platform
8
Browsemap
Collaborative Filtering Platform at LinkedIn
9
Browsemap Platform
• Scalability
  - Online/offline architecture
  - Hundreds of millions of entities, billions of monthly page views
• Browsemap Domain Specific Language (DSL)
  - Code reuse through modular components
  - Flexible computation workflow construction
• Data are used by hybrid recommenders
10
Browsemap Architecture
[Diagram: User Activity Data (HDFS) → Browsemap Engine with Browsemap DSL (Hadoop) → Key-value storage (Voldemort) → Online Query API ⇄ Frontend Services (queries, results)]
11
Browsemap Architecture
[Same diagram, annotated "High Throughput"]
12
Browsemap Architecture
[Same diagram, annotated "Low Latency"]
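The split between an offline batch side and an online serving side can be sketched roughly as below. This is an illustration under assumptions, not the platform's code: a plain dict stands in for the key-value store, and `offline_batch_job`, `QueryApi`, and `get_browsemap` are hypothetical names.

```python
from collections import Counter


def offline_batch_job(coview_counts, top_n=2):
    """Offline step: precompute a ranked list per item from pair counts.

    `coview_counts` maps (item, other_item) -> count and is assumed to be
    the output of an earlier counting step.
    """
    per_item = {}
    for (item, other), count in coview_counts.items():
        per_item.setdefault(item, Counter())[other] = count
    return {
        item: [other for other, _ in counts.most_common(top_n)]
        for item, counts in per_item.items()
    }


class QueryApi:
    """Online step: serve precomputed results with a single key lookup."""

    def __init__(self, store):
        self._store = store  # a plain dict standing in for the key-value store

    def get_browsemap(self, item_id):
        # No computation at request time; unknown items get an empty list.
        return self._store.get(item_id, [])


# Toy pair counts, made up for illustration.
coview_counts = {
    ("p1", "p2"): 5, ("p1", "p3"): 9, ("p1", "p4"): 1,
    ("p2", "p1"): 5,
}
store = offline_batch_job(coview_counts)
api = QueryApi(store)
```

The heavy work happens once per batch run, so a request such as `api.get_browsemap("p1")` only reads a precomputed list.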
13
Browsemap Domain Specific Language (DSL)
Module Collection: Co-view counting, Spam User Filtering, Expired Job Filtering, Cold-start techniques, …
[Diagram: modules from the collection are combined into per-entity workflows, shown for Job browsemap and Company browsemap]
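The idea of assembling workflows from a shared module collection can be sketched in plain Python. This is not the Browsemap DSL, whose syntax the slides do not show; the function names are hypothetical, and which modules go into which workflow is an assumption made for the example.

```python
from collections import Counter
from itertools import combinations


def spam_user_filter(spam_users):
    """Module: drop events from known spam users."""
    def run(events):
        return [e for e in events if e[0] not in spam_users]
    return run


def expired_item_filter(expired_items):
    """Module: drop events that point at expired items (e.g. closed jobs)."""
    def run(events):
        return [e for e in events if e[1] not in expired_items]
    return run


def coview_counting(events):
    """Module: turn (user, item) events into co-view pair counts."""
    items_by_user = {}
    for user, item in events:
        items_by_user.setdefault(user, set()).add(item)
    counts = Counter()
    for items in items_by_user.values():
        for pair in combinations(sorted(items), 2):
            counts[pair] += 1
    return counts


def workflow(*modules):
    """Chain modules so each one consumes the previous module's output."""
    def run(data):
        for module in modules:
            data = module(data)
        return data
    return run


# Two workflows assembled from the same module collection (assumed composition).
job_browsemap = workflow(
    spam_user_filter({"bot"}), expired_item_filter({"j3"}), coview_counting
)
company_browsemap = workflow(spam_user_filter({"bot"}), coview_counting)

# Toy events, made up for illustration.
events = [("u1", "j1"), ("u1", "j2"), ("u1", "j3"), ("bot", "j1"), ("bot", "j2")]
```

Both workflows reuse the spam filter and the counting module; only the job workflow adds the expiry filter, which is the kind of per-product adjustment the slide describes.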
14
Recap
• Support all entity types
• Adjust to each product requirement
• Scale
Voldemort
15
Agenda
✓ Motivation
✓ Architecture
• Applications
• Lessons Learned
16
Applications – CF based recommenders
• Profile Browsemap
• Portfolio Browsemap
• Job Browsemap
• Group Browsemap
• Hiring Browsemap
• Company Browsemap
• Influencer Browsemap
17
Applications – Hybrid Recommender Systems
Suggested Profile Update
18
Applications – Hybrid Recommender Systems
Suggested Profile Update
Goal: for each member, find companies he may want to follow
19
Applications – Hybrid Recommender Systems
Member followed companies: Google, Cisco, …
Companies user may be interested in: LinkedIn, Facebook, Juniper, Arista, …
Member info:
• Content-based features: title, industry, location, …
• Collaborative filtering feature: Co-follow Browsemaps (People who follow this company also follow these companies)
20
Applications – Hybrid Recommender Systems
Question: For a company C, will member M like it?
Approach: Logistic regression
Features:
• Member location = company location? 1 if yes, 0 if no
• Company is in the list of the co-follow browsemaps? 1 if yes, 0 if no
• …
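A minimal sketch of how the two binary features could feed a logistic regression score. The weights are hypothetical values chosen for illustration (a real model would learn them from data), the data layout and function names are assumptions, and the co-follow lists below are toy data rather than real browsemap output.

```python
import math


def features(member, company, cofollow_browsemap):
    """Build the two binary features named on the slide, plus a bias term."""
    same_location = 1.0 if member["location"] == company["location"] else 0.0
    # Companies co-followed with anything the member already follows.
    candidates = set()
    for followed in member["follows"]:
        candidates.update(cofollow_browsemap.get(followed, []))
    in_cofollow = 1.0 if company["name"] in candidates else 0.0
    return [1.0, same_location, in_cofollow]


def predict(weights, x):
    """Logistic regression score: sigmoid of the weighted feature sum."""
    z = sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical weights (bias, same location, in co-follow list).
weights = [-2.0, 0.5, 1.5]
# Toy co-follow browsemaps, made up for illustration.
cofollow = {"Cisco": ["Juniper", "Arista"], "Google": ["LinkedIn", "Facebook"]}
member = {"location": "Bay Area", "follows": ["Google", "Cisco"]}

juniper = {"name": "Juniper", "location": "Bay Area"}
other = {"name": "ExampleCo", "location": "Elsewhere"}
p_juniper = predict(weights, features(member, juniper, cofollow))
p_other = predict(weights, features(member, other, cofollow))
```

With these made-up weights, a company that appears in the member's co-follow lists scores higher than one that does not, which is how the collaborative filtering signal enters the hybrid model alongside the content-based features.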
21
Applications – Hybrid Recommender Systems
Collaborative filtering is important:
• Surfaces implicit connections between companies
• Based on members’ preferences
22
Agenda
✓ Motivation
✓ Architecture
✓ Applications
• Lessons Learned
Lesson 1: Tall oaks grow from little acorns
26
A generic horizontal platform is essential
Lesson 2: One hand washes the other
27
• Job Browsemap: collaborative filtering, “Follower audience”
• Similar Jobs: content based, “Leader audience”
Lesson 3: You can’t get blood out of a stone
28
Need to handle cold start problem
[Diagram: Job 1, Job 2, Job 3 (new) along a view-time axis, with a merge step]
Leverage Browsing History
Personalized Backfill
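The slide does not say how the merge works, so the following is only one plausible scheme, with hypothetical names: keep whatever collaborative filtering results exist and pad the list with backfill candidates (for instance, items drawn from the viewer's browsing history) until it reaches the desired length.

```python
def merge_with_backfill(cf_results, backfill, k=4):
    """Keep collaborative-filtering results first, then pad with backfill.

    Backfill items already present are skipped so the merged list has no
    duplicates (cf_results is assumed to be duplicate-free already).
    """
    merged = list(cf_results[:k])
    seen = set(merged)
    for item in backfill:
        if len(merged) >= k:
            break
        if item not in seen:
            merged.append(item)
            seen.add(item)
    return merged


# A new job has no co-view data yet, so its CF list is empty.
cf_for_new_job = []
cf_for_old_job = ["job_1", "job_2"]
# Hypothetical backfill candidates for this viewer.
backfill = ["job_2", "job_7", "job_9"]
```

For the new job the result consists entirely of backfill, while for the older job the co-view results stay on top and backfill only fills the remaining slots.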
Lesson 4: A chain is only as strong as its weakest link
29
CF relies solely on user activities: good data is crucial
• Mistakes can be hard to detect / debug
• Simple mistakes can have big impact, e.g. “jobid” → “id”
• Need prevention mechanism
  - Improve tracking
  - Unit test
  - Browsemap platform data-check: input volume, coverage/metrics analysis
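A data check of the kind listed above might look like the sketch below. The thresholds, the record layout, and the name `check_input` are assumptions for illustration; the platform's actual checks are not described on the slide.

```python
def check_input(records, required_field, expected_count, tolerance=0.5,
                min_coverage=0.95):
    """Return a list of problems found in one batch of tracking records.

    Volume check: record count must be within `tolerance` of the expected count.
    Coverage check: the share of records carrying `required_field` must be
    at least `min_coverage`.
    """
    problems = []
    n = len(records)
    if abs(n - expected_count) > tolerance * expected_count:
        problems.append(f"volume: got {n}, expected about {expected_count}")
    covered = sum(1 for r in records if r.get(required_field) is not None)
    coverage = covered / n if n else 0.0
    if coverage < min_coverage:
        problems.append(f"coverage: {required_field} present in {coverage:.0%}")
    return problems


# Toy records, made up for illustration.
good = [{"member": i, "jobid": 100 + i} for i in range(10)]
# Same events after an upstream rename from "jobid" to "id".
renamed = [{"member": i, "id": 100 + i} for i in range(10)]
```

Running `check_input(renamed, "jobid", expected_count=10)` flags the field rename as a coverage drop before the batch job consumes the data, instead of letting it silently produce empty results.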
Lesson 5: User experience matters
30
50% CTR
500% more applications
• Put recommendations in user’s flow
31
Conclusion
• Collaborative filtering is important for LinkedIn
• Browsemap has been in production for 3+ years
• Horizontal platform is crucial
32
Questions?
Thank you!