florian douetteau @ dataiku

write your own data story!

Upload: papisio

Posted on 06-Aug-2015


TRANSCRIPT

Page 1: Florian Douetteau @ Dataiku

write your own data story!

Page 2:

Using the historical logs of a search engine (QUERIES, RESULTS, CLICKS) and a set of new QUERIES and RESULTS, re-rank the RESULTS in order to optimize relevance.

Personalized Web Search challenge: Fri 11 Oct 2013 – Fri 10 Jan 2014

194 teams, $9,000 cash prize

34,573,630 sessions with user id, 21,073,569 queries, 64,693,054 clicks (~15 GB)

Page 3:

A METRIC FOR RELEVANCE RIGHT FROM THE LOG? ASSUMING WE SEARCH FOR "FRENCH NEWSPAPER", WE TAKE A LOOK AT THE LOGS.

Page 4:

WE COMPUTE THE SO-CALLED DWELL TIME OF A CLICK, I.E. THE TIME ELAPSED BEFORE THE NEXT ACTION.

DWELL TIME

Page 5:

DWELL TIME HAS BEEN SHOWN TO BE CORRELATED WITH RELEVANCE.
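As a sketch, dwell time falls out of consecutive log timestamps and can be bucketed into the contest's three relevance grades. The thresholds below (50 and 400 time units, with the last click of a session counting as grade 2) are my reading of the Yandex contest convention, so treat them as an assumption:

```python
def dwell_time(click_time, next_action_time):
    # Dwell time: time elapsed between a click and the next logged action
    return next_action_time - click_time

def relevance_grade(dwell, is_last_click_of_session=False):
    # Bucket dwell time into 3 relevance grades (assumed contest thresholds)
    if is_last_click_of_session or dwell >= 400:
        return 2   # highly relevant: long dwell, or the session ended here
    if dwell >= 50:
        return 1   # relevant
    return 0       # irrelevant: quick bounce back to the results page
```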

Page 6:

GOOD, WE HAVE A MEASURE OF RELEVANCE! CAN WE NOW GET AN OVERALL SCORE FOR OUR SEARCH ENGINE?

Page 7:

Emphasis on relevant documents + a discount per rank: Discounted Cumulative Gain (DCG).

Just normalize between 0 and 1: Normalized Discounted Cumulative Gain (NDCG).
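The DCG/NDCG idea above can be sketched in a few lines of Python (a minimal illustration using the common 2^rel − 1 gain and log2 rank discount, not the contest's official scorer):

```python
import math

def dcg(relevances):
    # Discounted Cumulative Gain: emphasize relevant documents (gain 2^rel - 1),
    # discount each position by log2(rank + 1)
    return sum((2 ** rel - 1) / math.log2(i + 2) for i, rel in enumerate(relevances))

def ndcg(relevances):
    # Normalize between 0 and 1 by dividing by the DCG of the ideal ordering
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0
```

An ideally ordered result list scores 1.0, e.g. `ndcg([2, 1, 0])`; any worse ordering of the same labels scores below 1.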

Page 8:

PERSONALIZED RE-RANKING IS ABOUT REORDERING THE N-BEST RESULTS BASED ON THE USER'S PAST SEARCH HISTORY.

Results obtained in the contest:

Original NDCG: 0.79056
Re-ranked NDCG: 0.80714

Equivalent to:
- raising the rank of a relevant (relevancy = 2) result from rank #6 to rank #5 on every query, or
- raising the rank of a relevant (relevancy = 2) result from rank #6 to rank #2 in 20% of the queries.

Page 9:

The Team

No researchers. No experience in re-ranking. Not much experience in ML for most of us. Not exactly our job. No expectations.

- Kenji Lefevre, 37: algebraic geometry, learning Python
- Christophe Bourguignat, 37: signal processing engineer, learning scikit-learn
- Mathieu Scordia, 24: data scientist
- Paul Masurel, 33: software engineer

Page 10:

A-Team?

Page 11:

Data Hobbits

Page 12:

Understanding The Problem

Page 13:

53% OF THE COMPETITORS COULD NOT IMPROVE THE BASELINE.

(Pie chart: worse 53%, better 47%)

Page 14:

IDEAL SETUP

1. Compute the non-personalized rank.
2. Select the 10 best hits and serve them in order.
3. Re-rank using log analysis.
4. Put the new ranking algorithm in prod (yeah right!).
5. Compute NDCG on the new logs.
6. …
7. Profit!!

Page 15:

REAL SETUP

1. Compute the non-personalized rank.
2. Select the 10 best hits.
3. Serve the 10 best hits ranked in random order.
4. Re-rank using log analysis, including the non-personalized rank as a feature.
5. Compute the score against the log with the former rank.

IDEAL SETUP (for comparison)

1. Compute the non-personalized rank.
2. Select the 10 best hits and serve them in order.
3. Re-rank using log analysis.
4. Put the new ranking algorithm in prod (yeah right!).
5. Compute NDCG on the new logs.
6. …
7. Profit!!

Page 16:

PROBLEM

Users tend to click on the first few URLs, so the user-satisfaction metric is influenced by the display rank. We cannot discriminate the effect of the non-personalized rank signal from the effect of the display rank.

Our score is not aligned with our goal.

Page 17:

THIS PROMOTES AN OVER-CONSERVATIVE RE-RANKING POLICY.

Even if we knew for sure that the URL at rank 9 would be clicked by the user if it were presented at rank 1, it would probably be a bad idea to re-rank it to rank 1 in this contest.

(Chart: average per session of the max position jump)

Page 18:

Simple, pointwise approach: for each (URL, session) pair, predict the relevance (0, 1, or 2).

(Diagram: relevance labels across sessions)
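The pointwise approach can be sketched as follows (toy random features and labels; the random forest and the expected-relevance scoring rule are illustrative assumptions, not the exact contest model):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Toy data: one row per (URL, session) pair, labels are relevance grades 0/1/2
rng = np.random.default_rng(0)
X = rng.random((200, 5))
y = rng.integers(0, 3, size=200)

model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

def rerank(urls, feats, base_rank):
    # Score each URL by its expected relevance, break ties with the original rank
    proba = model.predict_proba(feats)   # columns follow model.classes_
    expected = proba @ model.classes_    # E[relevance] per URL
    order = sorted(range(len(urls)), key=lambda i: (-expected[i], base_rank[i]))
    return [urls[i] for i in order]
```

The tie-break on `base_rank` keeps the re-ranking conservative: when the model has no opinion, the original order wins.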

Page 19:

Supervised Learning on History

We split the 27 days of the train dataset into 24 days (history) + 3 days (annotated). Within the last 3 days, we stop randomly at a "test" session (like Yandex does).

(Diagram: train set (24 days of history) | train set (annotation) | test set)
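A minimal sketch of that time-based split (the session dicts with a `day` field are a hypothetical representation of the log):

```python
def split_train(sessions, history_days=24, total_days=27):
    # First 24 days build per-user history; the last 3 days carry the annotations
    history = [s for s in sessions if s["day"] < history_days]
    annotated = [s for s in sessions if history_days <= s["day"] < total_days]
    return history, annotated
```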

Page 20:

How They Did It

Page 21:

Feature construction: team members work independently.

Split train & validation.

Page 22:

FEATURES

- The existing rank (base rank)
- Revisit (query-(user)-URL) features and variants
- Query features
- Cumulative features
- User click-habit features
- Collaborative filtering features
- Seasonality features

Page 23:

REVISITS

In the past, when the user was displayed this URL with the exact same query, what is the probability that:

- satisfaction = 2
- satisfaction = 1
- satisfaction = 0
- miss (not clicked)
- skipped (after the last click)

5 conditional probability features

Plus: 1 overall display counter, 4 mean reciprocal ranks (kind of a harmonic mean of the rank), 1 snippet quality score (a twisted formula used to estimate snippet quality).

11 base features
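A sketch of the five conditional-probability features plus the display counter (the outcome labels are hypothetical names for the five cases listed above):

```python
from collections import Counter

OUTCOMES = ("sat2", "sat1", "sat0", "miss", "skip")

def revisit_features(past_outcomes):
    # past_outcomes: outcome of every past display of this URL to this user
    # with the exact same query; returns 5 conditional probabilities + a counter
    n = len(past_outcomes)
    counts = Counter(past_outcomes)
    probs = {o: (counts[o] / n if n else 0.0) for o in OUTCOMES}
    return probs, n
```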

Page 24:

MANY VARIATIONS (2 × 3 × 2)

- (in the past | within the same session),
- (with this very query | whatever query | a subquery | a super-query),
- and was offered (this URL | this domain)

= 12 variants with the same user.

Without being the same user (URL-query features):

- same domain
- same URL
- same query and same URL

= 3 variants.

15 variants × 11 base features = 165 features

Page 25:

Feature construction: team members work independently. > 200 potential features on 30 days of labelled data.

Learning: team members work independently.

Split train & validation.

Page 26:

Short Story

Pointwise, Random Forest, 30 features: 4th place (*). Optimized & trained in ~1 hour (12 cores), 24 trees.

Listwise, LambdaMART, 90 features: 1st place (*). Trained in 2 days, 1,135 trees.

(*) A Yandex "PaceMaker" team also displayed results on the leaderboard and held first place during the whole competition, even though it was not officially a contestant.

Page 27:

LambdaMART

Gradient boosted trees with a special gradient called the "LambdaRank" gradient.

(Figure: original ranking, 13 errors, vs re-ranked, 11 errors; high-quality vs low-quality hits; RankNet gradient vs LambdaRank "gradient")

From RankNet to LambdaRank to LambdaMART: An Overview. Christopher J.C. Burges, Microsoft Research Technical Report MSR-TR-2010-82.
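As a sketch of that gradient (following the Burges overview report cited on this slide): for a pair of results $(i, j)$ where $i$ is more relevant than $j$ and the model assigns scores $s_i, s_j$, the lambda is

```latex
\lambda_{ij} = \frac{-\sigma}{1 + e^{\sigma (s_i - s_j)}} \,\bigl|\Delta \mathrm{NDCG}_{ij}\bigr|
```

i.e. the RankNet pairwise gradient scaled by $|\Delta\mathrm{NDCG}_{ij}|$, the change in NDCG obtained by swapping $i$ and $j$. LambdaMART uses these lambdas as the pseudo-gradients fitted by boosted regression trees.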

Page 28:

Grid Search

We are not doing typical classification here, so it is extremely important to perform the grid search directly against the final NDCG score.

NDCG "conservatism" ends up favoring a large "min samples per leaf" (between 40 and 80).
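A sketch of grid-searching "min samples per leaf" directly against NDCG rather than a classification metric (toy data; a gradient-boosted regressor stands in for the actual model, and for brevity the score is computed on the training queries rather than a held-out set):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def ndcg_at_list(rels):
    # NDCG of one result list: gains 2^rel - 1, discounted by log2(rank + 1)
    r = np.asarray(rels, dtype=float)
    discounts = np.log2(np.arange(2, len(r) + 2))
    gains = ((2.0 ** r - 1) / discounts).sum()
    ideal = ((2.0 ** np.sort(r)[::-1] - 1) / discounts).sum()
    return gains / ideal if ideal > 0 else 0.0

rng = np.random.default_rng(1)
X = rng.random((300, 4))
y = rng.integers(0, 3, size=300)
groups = np.repeat(np.arange(30), 10)      # 30 toy queries, 10 results each

best_leaf, best_score = None, -1.0
for leaf in (20, 40, 60, 80):              # grid over min_samples_leaf
    model = GradientBoostingRegressor(min_samples_leaf=leaf, random_state=0)
    scores = model.fit(X, y).predict(X)
    per_query = []
    for q in np.unique(groups):
        mask = groups == q
        order = np.argsort(-scores[mask])  # rank results by predicted score
        per_query.append(ndcg_at_list(y[mask][order]))
    mean_ndcg = float(np.mean(per_query))
    if mean_ndcg > best_score:
        best_leaf, best_score = leaf, mean_ndcg
```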

Page 29:

Feature Selection

Top-down approach: starting from a high number of features, iteratively remove subsets of features. This approach led to the subset of 90 features for the winning LambdaMART solution. (A similar strategy is now implemented by sklearn.feature_selection.RFECV.)

Bottom-up approach: starting from a low number of features, add the features that produce the best marginal improvement. This gave the 30 features that led to the best solution with the pointwise approach.
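The top-down strategy maps onto RFECV roughly as follows (toy data; the estimator and step size are arbitrary choices for the sketch):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV

rng = np.random.default_rng(2)
X = rng.random((150, 12))
y = rng.integers(0, 3, size=150)

# Recursively drop the weakest features, keeping the subset with the best CV score
selector = RFECV(
    estimator=RandomForestClassifier(n_estimators=25, random_state=0),
    step=2,   # remove 2 features per iteration
    cv=3,
).fit(X, y)

kept = int(selector.support_.sum())   # number of features that survived
```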

Page 30:

Take Away

- Set up a valid and solid cross-validation scheme.
- Prototype with fast ML methods, optimize with boosting.
- Be systematic about feature selection.
- Set up reproducible workflows early on.
- Split tasks when running as a team.

Page 31:

Special Offer

We offer a free server (with DSS) to teams running in Kaggle competitions.

Conditions:
- Be at least 3 people.
- Up to 3 teams max sponsored per competition.

Florian Douetteau: [email protected]
[email protected]

Page 32:

References

Contest URL:
https://www.kaggle.com/c/yandex-personalized-web-search-challenge

Paper with a detailed description:
http://research.microsoft.com/en-us/um/people/nickcr/wscd2014/papers/wscdchallenge2014dataiku.pdf

Blog posts about the solution:
http://blog.kaggle.com/2014/02/06/winning-personalized-web-search-team-dataiku/
http://www.dataiku.com/blog/2014/01/14/winning-kaggle.html

Blog post about additive smoothing:
http://fumicoton.com/posts/bayesian_rating

Research papers:
- McRank: Learning to Rank Using Multiple Classification and Gradient Boosting. P. Li, C. J. C. Burges, and Q. Wu. In NIPS, 2007.
- From RankNet to LambdaRank to LambdaMART: An Overview. Christopher J.C. Burges. Microsoft Research Technical Report MSR-TR-2010-82.

RankLib (implementation of LambdaMART):
http://sourceforge.net/p/lemur/wiki/RankLib/

These slides:
http://www.slideshare.net/Dataiku