using implicit preference relations to improve content-based recommendations, ec-web 2015

Using Implicit Preference Relations to Improve Content-Based Recommending
Ladislav Peška and Peter Vojtáš
Department of Software Engineering, Charles University in Prague, Czech Republic



Peska, Vojtas. Using IPR to Improve Content-Based Recommending

Recommender Systems

Propose relevant items to the right persons at the right time

Machine learning application

Expose otherwise hard-to-find, unknown items

Complementary to catalogues, search engines, etc.

"Win-win strategy"

EC-WEB 2015, Valencia

[Diagram: inputs and output of the RECOMMENDER SYSTEM]

User feedback: rating, clickstream, time on page, buys…

User and object profiles: object attributes

(Context): time, location, possible choices…

Output: top-K recommended objects


Recommender Systems

User feedback

Explicit feedback (rating)

Implicit feedback (user behavior)

Dwell time, clickstream, scrolling, mouse moves etc.

Often used as a proxy for the user rating

Recommending algorithms

Collaborative filtering

(Users A and B were similar so far, so they should like similar things in the future too)

Cold start problem

Content-based filtering

(User A should like similar items to the ones he liked so far)

Overspecialization, lack of diversity, obvious recommendations…
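
To make the content-based idea concrete, here is a minimal sketch of recommending objects similar to one the user liked, using TF-IDF weights and cosine similarity (the model named later in the Experiments slide). The function names, the token lists and the exact TF-IDF weighting are assumptions for illustration, not the authors' implementation.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """documents: {obj_id: list of tokens}. Returns {obj_id: {term: weight}}."""
    n_docs = len(documents)
    # Document frequency: in how many objects each term occurs.
    doc_freq = Counter(term for tokens in documents.values() for term in set(tokens))
    vectors = {}
    for obj_id, tokens in documents.items():
        counts = Counter(tokens)
        vectors[obj_id] = {
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(liked_id, vectors, k=3):
    """Return the k objects most similar to the liked one."""
    scores = {
        obj_id: cosine(vectors[liked_id], vec)
        for obj_id, vec in vectors.items()
        if obj_id != liked_id
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy catalogue of three books described by keywords.
books = {
    "b1": ["czech", "history", "prague"],
    "b2": ["czech", "history", "castle"],
    "b3": ["cooking", "recipes", "pasta"],
}
vectors = tfidf_vectors(books)
top = recommend("b1", vectors, k=1)
```

The sketch also shows where overspecialization comes from: the ranking is driven only by overlap with what the user already liked.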


Challenge

Recommending for small e-commerce websites

Tens of similar vendors; the user can choose whichever she likes

(Almost) no explicit feedback

(No incentives for users)

Few visited pages (users often arrive via external search engines and land directly on object detail pages)

Low user loyalty (New vs. Returning visitors ratio 80:20)

Not enough data for collaborative filtering

Focus on Implicit Feedback & Content-based recommendations

Gather as much user feedback as possible; the sooner the better

Gather external content to improve CB recommendations (other papers)


User Feedback

Explicit feedback (provided via website GUI)

Rating an object via Likert Scale

Comparing objects explicitly is not so common

Implicit feedback (Virtually any JS event could be used)

Actions related to evaluation of a single object

Dwell time on the object detail page

Number of page views

Scrolling, mouse events

Select / copy text, printing, purchase process etc.

Actions related to evaluation of a list of objects

Analyze user behavior on the category pages, search results etc.

Search related actions etc.


[Figure: an "A or B" comparison; a results list with selected object IDs 1, 4 and ignored object IDs 2, 3, 5, 6, 7, 8]


Our Working Hypothesis

Users are often evaluating lists of objects

Search results, category pages, recommended items etc.

If the user selects some objects from the list, we take it as evidence of his/her positive preference.

The user prefers the selected object(s) over the other displayed but ignored objects

We can form preference relations:

IPRrel (selected obj. > ignored obj.)

We can extend such relations along the content-based similarity of objects

Some objects could be ignored because the user was not aware of them, not because he/she did not like them

E.g. they were displayed below the visible area
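
The hypothesis can be sketched in a few lines: every selected object is paired with every ignored object, optionally keeping only ignored objects the user could actually see. The function name and the `visible` parameter are hypothetical; the object IDs reuse the "selected 1, 4" example from the previous slide.

```python
from itertools import product


def form_preference_relations(selected, displayed, visible=None):
    """Return (preferred, ignored) pairs for one displayed list.

    selected:  object IDs the user clicked
    displayed: all object IDs shown in the list
    visible:   optional set of IDs that appeared in the browser window;
               ignored objects outside it produce no relation
    """
    selected_set = set(selected)
    ignored = [obj for obj in displayed if obj not in selected_set]
    if visible is not None:
        ignored = [obj for obj in ignored if obj in visible]
    return list(product(selected, ignored))


# Objects 1..8 were displayed, 1 and 4 were selected,
# but only objects 1..4 were ever scrolled into view (assumed).
relations = form_preference_relations([1, 4], range(1, 9), visible={1, 2, 3, 4})
```

Here objects 5 to 8 generate no relations, which reflects the caveat that unseen objects were not necessarily disliked.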


Outline of Our Approach


Collecting User Behavior

IPIget component for collecting user behavior

Browser visible area size

List of all objects and their positions on the page

Listener on scrolling events

Compute the visible time for each displayed object and use it as a proxy for the level of user evaluation

More refined approaches are possible (e.g. registering mouse moves or visual focus for different quadrants)

Listener on clicking events (which object(s) were selected by the user)

IPIget component download: http://ksi.mff.cuni.cz/~peska/ipiget.zip
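
As an illustration of the visible-time computation, the sketch below replays a log of scroll positions against object positions and the viewport height. It is not the IPIget code; the data layout (pixel offsets, timestamps in seconds) and the rule "any overlap with the viewport counts as visible" are assumptions.

```python
def visible_time(objects, scroll_events, viewport_height, end_time):
    """Return {obj_id: seconds the object was at least partially visible}.

    objects:        {obj_id: (top, height)} in page pixels
    scroll_events:  list of (timestamp, scroll_top), sorted by time;
                    the first entry is the page load
    end_time:       timestamp at which the user left the page
    """
    totals = {obj_id: 0.0 for obj_id in objects}
    # Each scroll position stays current until the next event (or page exit).
    next_times = [t for t, _ in scroll_events[1:]] + [end_time]
    for (start, scroll_top), stop in zip(scroll_events, next_times):
        view_bottom = scroll_top + viewport_height
        for obj_id, (top, height) in objects.items():
            # The object overlaps the viewport vertically.
            if top < view_bottom and top + height > scroll_top:
                totals[obj_id] += stop - start
    return totals


# Hypothetical page: three objects, a 600 px high window,
# one scroll to offset 400 after 5 s, page left after 8 s.
times = visible_time(
    objects={1: (0, 200), 2: (200, 200), 3: (700, 200)},
    scroll_events=[(0.0, 0), (5.0, 400)],
    viewport_height=600,
    end_time=8.0,
)
```

Objects with zero or very small visible time are natural candidates to exclude when forming preference relations.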


Collecting User Behavior – Example


Extending IPR Relations

IPR(O_x, O_y, int_{x,y})
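
The transcript does not preserve how a relation is extended or how its intensity changes. One plausible reading, offered purely as an assumption: a relation "O_x is preferred over O_y" also holds, with reduced intensity, for objects sufficiently similar to O_y. The function name, the similarity dictionary and the intensity-times-similarity rule below are all hypothetical.

```python
def extend_relations(relations, similarity, min_similarity=0.5):
    """Extend (preferred, ignored, intensity) triples along object similarity.

    similarity: {(a, b): value in [0, 1]} for pairs of objects.
    An extended relation gets intensity * similarity (assumed rule).
    """
    extended = list(relations)
    for preferred, ignored, intensity in relations:
        for (a, b), sim in similarity.items():
            # Objects similar enough to the ignored one inherit the relation.
            if a == ignored and b != preferred and sim >= min_similarity:
                extended.append((preferred, b, intensity * sim))
    return extended


# O4 was preferred over O2; O7 is similar to O2, O9 is not similar enough.
extended = extend_relations(
    relations=[(4, 2, 1.0)],
    similarity={(2, 7): 0.8, (2, 9): 0.3, (2, 4): 0.9},
    min_similarity=0.5,
)
```

The `min_similarity` argument plays the role of the "sim" threshold that appears in the results table; whether the original method combines intensity and similarity this way is not stated in the slides.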


Using IPR to Re-rank a List of Objects


Using IPR to Re-rank a List of Objects – Algorithm


Using IPR to Re-rank a List of Objects – Conflict Strategies

IPR(O4>O2): O4 is better than O2

Forward:

Move O4 just before O2

Do not miss relevant objects

Backward:

Move O2 just after O4

Do not show irrelevant objects

Swap:

Change positions of O4 and O2

Keep objects well separated
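
The three strategies can be sketched as a single function that repairs one violated relation in a ranked list. The function name and signature are illustrative; only the resulting orderings come from the slide.

```python
def resolve_conflict(ranking, preferred, other, strategy):
    """Return a new ranking in which `preferred` is placed before `other`."""
    result = list(ranking)
    i_pref, i_other = result.index(preferred), result.index(other)
    if i_pref < i_other:
        return result  # the relation already holds, nothing to fix
    if strategy == "forward":
        # Move the preferred object just before the other one.
        result.pop(i_pref)
        result.insert(i_other, preferred)
    elif strategy == "backward":
        # Move the other object just after the preferred one.
        result.pop(i_other)
        result.insert(i_pref, other)
    elif strategy == "swap":
        # Exchange the two positions.
        result[i_pref], result[i_other] = result[i_other], result[i_pref]
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return result


original = ["O1", "O2", "O3", "O4", "O5", "O6"]
forward = resolve_conflict(original, "O4", "O2", "forward")
backward = resolve_conflict(original, "O4", "O2", "backward")
swap = resolve_conflict(original, "O4", "O2", "swap")
```

Forward promotes the preferred object as far as needed, backward demotes the other object as little as needed, and swap changes only the two objects involved.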


Original list: O1, O2, O3, O4, O5, O6

+ IPR(O4,O2,int)

Forward: O1, O4, O2, O3, O5, O6

Backward: O1, O3, O4, O2, O5, O6

Swap: O1, O4, O3, O2, O5, O6


Our Approach - Example


Experiments

Off-line experiments on a Czech second-hand bookshop dataset

1760 users, train set (2/3 of user data), test set (1/3)

Recommender systems try to predict visited objects

Vector Space Model (VSM) with TF-IDF & cosine similarity

SimCat (recommending similar categories based on Collaborative Filtering)

Stochastic Gradient Descent Matrix Factorization (SGD MF)

nDCG and Presence@top-k metrics
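
For readers unfamiliar with the two metrics, here is a minimal sketch. It assumes binary relevance (an object is relevant if the user visited it in the test set) and reads Presence@top-k as the share of visited test objects found among the first k recommendations; the slides do not give the exact definitions, so both choices are assumptions.

```python
import math


def ndcg(ranking, relevant):
    """Binary-relevance nDCG of a ranked list against a set of relevant objects."""
    dcg = sum(
        1.0 / math.log2(position + 2)
        for position, obj in enumerate(ranking)
        if obj in relevant
    )
    # Ideal ranking: all relevant objects placed at the top.
    ideal_hits = min(len(relevant), len(ranking))
    idcg = sum(1.0 / math.log2(position + 2) for position in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


def presence_at_k(ranking, relevant, k):
    """Share of relevant objects that appear in the top-k of the ranking."""
    if not relevant:
        return 0.0
    return len(set(ranking[:k]) & set(relevant)) / len(relevant)


ranking = ["a", "b", "c", "d"]
perfect = ndcg(ranking, {"a"})
second_place = ndcg(ranking, {"b"})
half_present = presence_at_k(ranking, {"a", "d"}, k=2)
```

nDCG rewards placing visited objects near the top of the whole list, while presence at top-k only checks whether they made the cut.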


Method                                                   nDCG    p@5     p@10    p@50
VSM + best IPR-rerank (sim:0.5, int:0.1, swap)           0.475   13.6%   15.7%   20.7%
VSM                                                      0.464   13.2%   15.1%   19.6%
Best IPR-rank (sim:0.5, int:0.1, swap)                   0.247   7.1%    7.7%    8.5%
SimCat + best IPR-rerank (sim:0.01, int:0.1, forward)    0.219   4.7%    6.3%    10.0%
SimCat                                                   0.136   0.9%    1.5%    5.4%
SGD MF (500 lat. factors, max 500 iterations)            0.126   0.89%   1.2%    3.3%
Random recommendations                                   0.085   0.09%   0.14%   0.27%

MinSimilarity threshold, VSM
Threshold   0.2     0.3     0.5     0.8
Score       0.465   0.470   0.473   0.472

Conflict resolving, VSM
Strategy    Forward   Backward   Swap
Score       0.465     0.460      0.466

Conclusions, Future Work

Implicit feedback could be more than just a substitution for user rating

Collecting feedback on lists of objects could give us insight into the user's decision process

We used user behavior on lists of objects to create Implicit Preference Relations (IPR) between selected and ignored objects

IPR can be extended along the object similarity axis

We showed an algorithm to update a linear list of objects with IPRs

IPR re-ranked recommendations outperformed the original ones in an off-line experiment

Open Problems, Challenges

How much was an object really evaluated by the user? (Going beyond visibility)

Which object features make it desirable for the user? (Tailored object similarities)

On-line deployment


Thank you!

Questions, comments?


Recommending in a Czech Second-hand Bookshop

Mostly single item in stock

Few content-based attributes (low information value)

- Title, author, price, category, textual description

- Hard to define informative attributes

- Title (and author name) in Czech

- No common book identifier (ISBN mostly inapplicable)

No explicit feedback

Page-view, time on page, buys…

Users identified through cookies

Approx. 9500 active books

50-100 visitors / day

2-4 purchases


[Screenshot labels: RECOMMENDED OBJECTS, CATEGORIES, Attributes search, CATALOGUE]