Beyond Ratings and Followers (RecSys 2012)

Post on 28-Aug-2014

DESCRIPTION

Industry session talk on LinkedIn Recommender System @ RecSys 2012

TRANSCRIPT

Anmol Bhasin

Sr. Manager, Analytics Engineering

www.linkedin.com

Beyond Ratings & Followers

LinkedIn

In a social (professional) networking context, it's about building a...

Recommender Ecosystem

50%

The answer is

Similar Profiles

Events You May Be Interested In

News

The Recommender Ecosystem

Network updates

Connections

Frameworks are revolutions

evolutions

LinkedIn Recommendation Engine

Behavior Analysis, Collaborative Filtering, Popularity

Recommendation Types

Shared, Dynamic, Unified Core Service

Recommendation Entities and Products:

People: Similar Profiles, Referral Center, TalentMatch, People Browse Map

Jobs: Jobs Browse Map, Similar Jobs, Jobs You May Be Interested In

Groups: GYML, Groups Browse Map, Similar Groups

… Ads, Companies, Searches, News, Events … and more

User Feedback

API

(R-T) Feature Extraction, Entity Resolution & Enrichment

(R-T) matching computations

A/B

Offline data munging (Hadoop)

different strokes for different folks

Cloning

Possible Approaches

Naïve K Nearest Neighbor solution Complexity is

Clustering: Latent Factor Models like PLSI or LDA; Hierarchical Agglomerative Clustering; Self Organizing Maps

Item based Collaborative Filtering: find pairs of users viewed in the same session
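
The item based collaborative filtering idea above (pairs of profiles viewed in the same session) can be sketched as a simple co-view count. This is a minimal illustration, not the talk's implementation; the function name `co_view_counts` and the session format (a list of profile ids per session) are assumptions.

```python
from collections import Counter
from itertools import combinations


def co_view_counts(sessions):
    """Count how often two profiles were viewed within the same session.

    `sessions` is a hypothetical input: an iterable of lists of profile ids.
    """
    counts = Counter()
    for viewed in sessions:
        # Deduplicate and sort so each unordered pair has one canonical key.
        for a, b in combinations(sorted(set(viewed)), 2):
            counts[(a, b)] += 1
    return counts
```

Pairs with high counts become candidate "similar profiles"; a production system would also normalize by how popular each profile is.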

Challenges

Scale: 175+ M profiles

Dimensionality: ~2M companies, ~200K schools, ~147 industries, ~200 countries, ~25K titles, ~40K skills, ~200 job functions

"Similar" means different things to different people

Similar behavior doesn’t mean you can replace me at my job

Accuracy vs Relevance (me & my boss...)

Realtime... It’s a problem of accuracy, not recall

Approach

Rank

FILTER

Cluster

Focus attention only on pairs likely to be similar

Filter out the possibly dissimilar pairs

Run similarity functions on the filtered-in pairs

LSH function family for Cosine Distance

Locality Sensitive Hashing
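
One common LSH family for cosine distance is random hyperplane hashing: each hash bit records which side of a random hyperplane a vector falls on, and only vectors that share a full signature are compared. The sketch below assumes dense feature vectors and uses hypothetical helper names (`make_hyperplanes`, `signature`, `candidate_pairs`); the slides do not say which variant was actually used.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw `num_bits` random Gaussian hyperplanes in `dim` dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def signature(vec, hyperplanes):
    """One bit per hyperplane: 1 if the dot product is non-negative, else 0."""
    return tuple(
        1 if sum(h_i * v_i for h_i, v_i in zip(h, vec)) >= 0 else 0
        for h in hyperplanes
    )


def candidate_pairs(vectors, hyperplanes):
    """Bucket vectors by signature and return pairs that share a bucket."""
    buckets = defaultdict(list)
    for key, vec in vectors.items():
        buckets[signature(vec, hyperplanes)].append(key)
    pairs = set()
    for members in buckets.values():
        for i in range(len(members)):
            for j in range(i + 1, len(members)):
                pairs.add(tuple(sorted((members[i], members[j]))))
    return pairs
```

Only the returned candidate pairs go on to the more expensive similarity functions, which is the "filter" step of the approach. Using several independent signature tables would reduce the chance of missing a truly similar pair.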

Similarity Functions

Different bands of attributes: Boolean, Jaccard or Cosine similarities across attribute pairs.

• Logistic Regression with Elastic Net Penalty

Learn model params on a set of hand labeled data points

Predicted value interpreted as score
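
A rough sketch of how per-attribute similarities can feed a logistic model whose output is read as a score. The profile fields (`industry`, `skills`, `title_terms`) and the weights passed to `score` are hypothetical; in the talk the parameters are learned from hand-labeled pairs, which this sketch does not do.

```python
import math


def boolean_match(a, b):
    """1.0 if the two attribute values are equal, else 0.0."""
    return 1.0 if a == b else 0.0


def jaccard(a, b):
    """Jaccard similarity of two collections treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cosine(a, b):
    """Cosine similarity of two sparse vectors given as term -> weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def pair_features(p, q):
    """One similarity per attribute band (field names are assumptions)."""
    return [
        boolean_match(p["industry"], q["industry"]),
        jaccard(p["skills"], q["skills"]),
        cosine(p["title_terms"], q["title_terms"]),
    ]


def score(features, weights, bias):
    """Logistic function of the weighted features, interpreted as a score."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))
```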

Impedance Mismatch

Ad Ranking Given

Objective

Goal: Increase revenue; respect daily budgets of advertisers; good user experience

Campaign creation

Virtual Profiling

Targeted Segment Population

Title: Eng Mgr; Company: LinkedIn; Location: CA, USA; Skills: ML, RecSys

Title: Vice President; Company: Twitter; Location: CA, USA; Skills: DM, ML, RecSys

……………….

Virtual Profiling

Title: Eng Mgr; Company: LinkedIn; Location: CA, USA; Skills: ML, RecSys

Title: Sr. SE; Company: Google; Location: PA, USA; Skills: ML, DM

Title: Eng Dir; Company: LinkedIn; Location: PA, USA; Skills: ML, Stats, DM

Title: Sr. SE<1>, Eng Mgr<1>, Eng Dir<1>

Company: LinkedIn<2>, Google<1>

Location: CA, USA<2>, PA, USA<1>

Skills: ML<2>, RecSys<1>, Stats<1>, DM<1>

Clicker Feature Distribution
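
The clicker feature distribution is a per-field tally over the profiles of members who clicked an ad. A minimal sketch, assuming each profile is a dict whose values are either a single string or a list of strings; the function name `feature_distribution` is hypothetical.

```python
from collections import Counter


def feature_distribution(profiles):
    """Tally feature values per field across a list of profile dicts."""
    dist = {}
    for profile in profiles:
        for field, value in profile.items():
            # Multi-valued fields (e.g. skills) contribute one count per value.
            values = value if isinstance(value, list) else [value]
            dist.setdefault(field, Counter()).update(values)
    return dist


# Illustrative clicker records in the shape used on the slide.
clickers = [
    {"title": "Eng Mgr", "company": "LinkedIn", "skills": ["ML", "RecSys"]},
    {"title": "Sr. SE", "company": "Google", "skills": ["ML", "DM"]},
    {"title": "Eng Dir", "company": "LinkedIn", "skills": ["ML", "Stats", "DM"]},
]
dist = feature_distribution(clickers)
```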

Virtual Profiling

Information Gain

Pick Top K overrepresented features from the clicker distribution vs the target segment

A representative projection of the item in the member feature space
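
Selecting the top K overrepresented features can be sketched by comparing each feature's share among clickers with its share in the target segment. The smoothed share ratio below is a simple stand-in, not the information gain measure named on the slide; the function name, the `smoothing` parameter and the input format (feature -> count dicts) are all assumptions.

```python
def top_k_overrepresented(clicker_counts, target_counts, k, smoothing=1.0):
    """Return the k features whose clicker share most exceeds their target share."""
    vocab = set(clicker_counts) | set(target_counts)
    clicker_total = sum(clicker_counts.values())
    target_total = sum(target_counts.values())

    def share(counts, total, feature):
        # Additive smoothing keeps the ratio finite for features unseen on one side.
        return (counts.get(feature, 0) + smoothing) / (total + smoothing * len(vocab))

    lift = {
        f: share(clicker_counts, clicker_total, f) / share(target_counts, target_total, f)
        for f in clicker_counts
    }
    # Highest lift first; ties broken by feature name for determinism.
    return sorted(lift, key=lambda f: (-lift[f], f))[:k]
```

The selected features form the ad creative's virtual profile, which can then be matched against member features like any other profile.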

CTR Prediction – CF Similarity

Ranker

MEMBER FEATURES

Score to pCTR correction

L2 regularized Logistic Regression (Liblinear, VW, Mahout, ADMM)

For new ad creatives back-off to the advertiser / ad category nodes till they reach critical impression/click volume (explore/exploit)
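
The back-off for new creatives can be sketched as a walk up a creative -> advertiser -> ad category hierarchy until a node has enough traffic to give a usable estimate. The `Node` class, the `min_impressions` threshold and its default value are hypothetical, and the sketch covers only the back-off, not the explore/exploit policy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """A node in the back-off hierarchy with its observed traffic."""
    name: str
    impressions: int
    clicks: int
    parent: Optional["Node"] = None


def backoff_ctr(node, min_impressions=1000):
    """Empirical CTR from the closest ancestor (or self) with enough impressions."""
    current = node
    # Climb while the current node is under-observed and a parent exists.
    while current.impressions < min_impressions and current.parent is not None:
        current = current.parent
    if current.impressions == 0:
        return 0.0
    return current.clicks / current.impressions
```

For example, a creative with 20 impressions under an advertiser with 5,000 impressions would borrow the advertiser's click-through rate until its own volume crosses the threshold.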

AD CREATIVE VIRTUAL PROFILE

Creative features

the magic is in the models

features

Feature Engineering – Entity Resolution

Companies

Huge impact on the business and UE: Ad targeting, TalentMatch, Referrals

‘IBM’ has 8000+ variations:
- ibm – ireland
- ibm research
- T J Watson Labs
- International Bus. Machines
- Deep Blue

K-Ambiguous

ASONAM’11, KDD’11

Feature Engineering – Sticky Locations

Open to relocation?

Region similarity based on profiles or network

Region transition probability

Predict an individual's propensity to migrate and the most likely migration target

Impact on job recommendations: 20% lift in views/viewers/applications/applicants
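
Region transition probabilities can be estimated from observed (origin, destination) region pairs, for instance taken from consecutive positions on member profiles. This is an illustrative sketch under that assumption; the function names and the input format are hypothetical, and the talk's actual model may use more signals.

```python
from collections import Counter, defaultdict


def transition_probabilities(moves):
    """Estimate P(destination | origin) from (origin, destination) pairs."""
    counts = defaultdict(Counter)
    for origin, destination in moves:
        counts[origin][destination] += 1
    return {
        origin: {d: c / sum(dests.values()) for d, c in dests.items()}
        for origin, dests in counts.items()
    }


def stickiness(probs, region):
    """Probability that a member in `region` stays in `region`."""
    return probs.get(region, {}).get(region, 0.0)


def most_likely_target(probs, region):
    """Most probable destination other than the region itself, or None."""
    movers = {d: p for d, p in probs.get(region, {}).items() if d != region}
    return max(movers, key=movers.get) if movers else None
```

A region with high stickiness suggests down-weighting job recommendations located elsewhere, while the most likely target indicates which out-of-region jobs are still worth showing.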

What should you transition to, and when?

Chart: probability of switch vs. months since graduation

rethinking delivery

Social Referral

Social Referral

Mohammad Amin, Baoshi Yan, Sripad Sriram, Anmol Bhasin, Christian Posse. Social Referral: Using network connections to deliver recommendations. To appear in Proceedings of the Sixth ACM Conference on Recommender Systems (RecSys '12)

> 2X Conversion

Linkedin Group: Text Analytics

I found this group interesting, and I think you will too

Deepak

Linkedin Group: Text Analytics

From: Deepak Agarwal – Engineering Director, LinkedIn

2X conversion

Big Data A/B is the

new

Orthogonality in A/B

Beware of some A/B testing pitfalls

1. Novelty effect. E.g., new job recommendation algorithms have a week-long novelty effect that shows lifts twice the stationary (real) one

2. Cannibalization. Zero-sum game or real lift?

3. Random sampling destroys network effect

Chart labels: 1 week lifts, 2 weeks lifts

Tech Stack

Open Source Technologies

Zoie, Bobo

Kafka, Voldemort

http://data.linkedin.com

It takes a village

Credits

Engineering : Abhishek Gupta, Adam Smyczek, Adil Aijaz, Alan Li, Baoshi Yan, Bee-Chung Chen, Deepak Agarwal, Ethan Zhang, Haishan Liu, Igor Perisic, Jonathan Traupman, Liang Zhang, Lokesh Bajaj, Mario Rodriguez, Mitul Tiwari, Mohammad Amin, Monica Rogati, Parul Jain, Paul Ogilvie, Sam Shah, Sanjay Dubey, Tarun Kumar, Trevor Walker, Utku Irmak

Product : Andrew Hill, Christian Posse, Gyanda Sachdeva, Mike Grishaver, Parker Barrile, Sachit Kamat

(Alphabetically sorted)

You

Picture yourself with this New Job:

Applied Researcher / Research Engineer

A Recommendation for you..

Contact:

abhasin@linkedin.com

http://data.linkedin.com/
