
How to Interpret Implicit User Feedback?

Ladislav Peška, Department of Software Engineering, Charles University in Prague, Czech Republic
Peter Vojtáš, Department of Software Engineering, Charles University in Prague, Czech Republic

ABSTRACT

We focus on interpreting user preference from his/her implicit behavior. There are many Relevant Behavior Types (RBTs), e.g. dwell time, scrolling, clickstream etc. RBTs vary both in quality and occurrence, and thus we may need different approaches to process them. In this early work we focus on how to interpret each RBT separately. We selected a number of common RBTs and proposed several approaches to interpret RBT values as a user rating. We conducted a series of off-line experiments and an A/B test on real-world users of a Czech travel agency. The experiments, although preliminary, showed the importance of considering multiple RBTs and of various methods to treat them.

METHODS FOR INTERPRETING USER BEHAVIOR

• BINARY: for all visited objects set 𝑟 = 1 (baseline)

• LINEAR: user-based linear normalization (the more the better)

• COLLABORATIVE (did users with a similar RBT value purchase the product?)

Select user visits with a similar RBT value:
- KNN: use the k nearest neighbors according to the RBT
- Distance: use all records from an interval around the RBT value

Then compute the purchase ratio and apply a sigmoid function.

• COMBINED (single rating based on multiple RBTs; see the sketch after this list):

$r = \ln(\mathit{DwellTime} + 1) + \ln(\mathit{ScrollTime} + 1) + \ln(\mathit{MouseTime} + 1)$
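A minimal Python sketch of the four interpretation methods, reconstructed from the descriptions above; the function names, data layout, and the neutral fallback in the collaborative method are our own assumptions, not taken from the poster:

```python
import math

def binary_rating(visited_objects):
    """BINARY baseline: every visited object gets rating 1."""
    return {obj: 1.0 for obj in visited_objects}

def linear_rating(rbt_values):
    """LINEAR: normalize one RBT by this user's maximum value of that RBT."""
    max_value = max(rbt_values.values())
    if max_value == 0:
        return {obj: 0.0 for obj in rbt_values}
    return {obj: value / max_value for obj, value in rbt_values.items()}

def collaborative_rating(rbt_value, past_records, epsilon=0.2):
    """COLLABORATIVE, Distance variant: take all recorded visits whose RBT value
    lies within epsilon of the current value, compute the purchase ratio among
    them and squash it with a sigmoid."""
    similar = [rec for rec in past_records if abs(rec["rbt"] - rbt_value) <= epsilon]
    if not similar:
        return 0.5  # no evidence at all; neutral value (our choice, not from the poster)
    purchase_ratio = sum(rec["purchased"] for rec in similar) / len(similar)
    return 1.0 / (1.0 + math.exp(-purchase_ratio))  # sigmoid

def combined_rating(dwell_time, scroll_time, mouse_time):
    """COMBINED: r = ln(DwellTime + 1) + ln(ScrollTime + 1) + ln(MouseTime + 1)."""
    return math.log(dwell_time + 1) + math.log(scroll_time + 1) + math.log(mouse_time + 1)
```

For the KNN variant, `similar` would instead be the k records with RBT values closest to `rbt_value`.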

OFFLINE EVALUATION: Pairwise comparison of purchased and non-purchased objects visited by each user (8400 pairs of objects from 380 users with 450 purchases), counting pairs ordered correctly, incorrectly, and with the same value.

$Q_{pair} := \alpha \times \frac{\mathit{Correct}}{\mathit{All\ pairs}} + (1 - \alpha) \times \frac{\mathit{Correct} + \mathit{Equal}}{\mathit{All\ pairs}}$
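In code, with Correct, Incorrect and Equal denoting the pair counts defined above, the metric might be computed as follows (a sketch; the example counts are illustrative only, not the paper's data):

```python
def q_pair(correct, incorrect, equal, alpha=0.5):
    """Q_pair: alpha weights the strict score (correctly ordered pairs only)
    against the lenient score that also credits ties."""
    all_pairs = correct + incorrect + equal
    return alpha * correct / all_pairs + (1 - alpha) * (correct + equal) / all_pairs

# illustrative call (made-up counts summing to 8400 pairs):
# q_pair(6000, 1500, 900)  ->  ~0.768
```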

LESSONS LEARNED and POSSIBLE EXTENSIONS

It is important to consider more refined implicit user feedback than simple Binary visits.
- We proposed several methods to transform raw implicit feedback (RBTs) into a user rating.
- The methods succeeded in the off-line experiments; however, we are not yet able to confirm this in A/B testing (longer experiments are needed).

There are multiple options to combine the per-RBT ratings $r_i$, e.g. a weighting scheme, prioritization, or T-(co)norms; a small illustration follows below.
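As an illustration only (none of these variants is evaluated in this work), such combinations of per-RBT ratings could look like:

```python
def weighted_combination(ratings, weights):
    """Weighting scheme: fixed weights over the per-RBT ratings r_i."""
    return sum(w * r for w, r in zip(weights, ratings)) / sum(weights)

def t_norm_min(ratings):
    """Goedel T-norm (minimum) as one example of a T-norm combination."""
    return min(ratings)
```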

Also, many RBTs were neglected in this study, and other recommending algorithms should also be considered in future work.

USED RELEVANT BEHAVIOR TYPES

RBT         Triggered event            Coverage
Pageview    JavaScript Load()          99%
Mouse       JavaScript MouseOver()     44%
ScrollTime  JavaScript Scroll()        49%
DwellTime   Total time spent on page   69%
Purchase    Object was purchased       0.5%

Offline evaluation results (Q_pair values, α = 0.5):

RBT         LINEAR   DIST, ε=0.2   DIST, 0.9   KNN, 0.01   KNN, 0.7
Pageview    0.797    0.695         0.850       0.753       0.825
Mouse       0.772    0.561         0.799       0.695       0.822
Scroll      0.569    0.555         0.578       0.582       0.573
DwellTime   0.791    0.502         0.589       0.632       0.649

RESULTS OF A/B TESTING

[Figure: click-through rate (0.2% to 1.2%) vs. minimal number of visited objects (1 to 10) for the BINARY, LINEAR (avg), BEST OFFLINE (avg) and COMBINED methods.]

17 days of real deployment (ongoing), 4 groups of users, in total 4700 users, 135K recommended objects (6 or 12 objects in a list), 1260 clicks. VSM recommender system, click-through rate as the metric.
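If the click-through rate is computed per recommended object, the overall rate is roughly 1260 / 135,000 ≈ 0.93%, consistent with the 0.2–1.2% range shown in the plot above.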

For users with more visited objects, the Best Offline method gets the best results.

EXAMPLE: User U visited objects O1, O2, O3:

User U   PAGEVIEW   DWELL TIME   SCROLL TIME
O1       1          10 sec       0 sec
O2       1          60 sec       0 sec
O3       2          200 sec      10 sec

Binary user rating r: r_O1 = 1, r_O2 = 1, r_O3 = 1

Linear local ratings r_i for each RBT, normalized to the user's maximum:

r_i   PAGEVIEW   DWELL TIME   SCROLL
O1    0.5        0.05         0
O2    0.5        0.3          0
O3    1          1            1

User rating r (AVG of the local ratings): r_O1 = 0.183, r_O2 = 0.267, r_O3 = 1
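A short sketch reproducing the Linear part of this example (our own code; it only re-derives the numbers shown above):

```python
# raw RBT values for user U, taken from the example table
rbt_values = {
    "pageview":   {"O1": 1,  "O2": 1,  "O3": 2},
    "dwelltime":  {"O1": 10, "O2": 60, "O3": 200},
    "scrolltime": {"O1": 0,  "O2": 0,  "O3": 10},
}

# LINEAR: normalize each RBT by the user's maximum value for that RBT ...
local = {rbt: {obj: val / max(vals.values()) for obj, val in vals.items()}
         for rbt, vals in rbt_values.items()}

# ... then average the local ratings across RBTs to get the user rating r
rating = {obj: sum(local[rbt][obj] for rbt in local) / len(local)
          for obj in ("O1", "O2", "O3")}
# rating -> {"O1": 0.183, "O2": 0.267, "O3": 1.0} (rounded)
```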

Based on the interpretation method used, the same recommender (VSM) should propose a different list of objects to the user.

[Figure: example recommendation lists for a regular user and for a new user.]

Best Offline in A/B testing is a combination (AVG) of the methods with the best offline results for each RBT.