TRANSCRIPT
ENHANCING MATRIX FACTORIZATION
THROUGH INITIALIZATION FOR
IMPLICIT FEEDBACK DATABASES
Balázs Hidasi
Domonkos Tikk
Gravity R&D Ltd.
Budapest University of Technology and Economics
OUTLINE
Matrix factorization
Initialization concept
Methods
Naive
SimFactor
Results
Discussion
MATRIX FACTORIZATION
Collaborative Filtering
One of the most common approaches
Approximates the rating matrix as a product of low-rank matrices
[Figure: R ≈ P × Q, where the rows of R are users and the columns are items]
MATRIX FACTORIZATION
Initialize P and Q with small random numbers
Train P and Q
Alternating Least Squares
Gradient Descent
Etc.
Transforms the data to a feature space
Separately for users and items
Noise reduction
Compression
Generalization
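This baseline can be sketched in a few lines of Python: plain SGD matrix factorization with the small random initialization described above. All names and parameter values here are illustrative assumptions, not the talk's exact setup.

```python
import numpy as np

def mf_sgd(R, k=10, lr=0.02, reg=0.02, epochs=300, seed=0):
    """Factorize a (users x items) rating matrix R ~ P @ Q.T with SGD.
    Zero cells of R are treated as missing (explicit-feedback setting)."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    # Baseline: initialize P and Q with small random numbers
    P = 0.1 * rng.standard_normal((n_users, k))
    Q = 0.1 * rng.standard_normal((n_items, k))
    users, items = np.nonzero(R)
    for _ in range(epochs):
        for u, i in zip(users, items):
            err = R[u, i] - P[u] @ Q[i]
            p_u = P[u].copy()                      # use pre-update value for Q's step
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * p_u - reg * Q[i])
    return P, Q
```

The initialization methods discussed later replace only the random starting point; the training loop stays the same.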
IMPLICIT FEEDBACK
No ratings
User-item interactions (events)
Much noisier
Presence of an event might not be positive feedback
Absence of an event does not mean negative feedback
No negative feedback is available!
More common problem
MF for implicit feedback
Less accurate results due to noise
Mostly ALS is used
Scalability problems (rating matrix is dense)
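A minimal sketch of this ALS variant for implicit feedback (in the spirit of confidence-weighted ALS; the 1 + alpha·e confidence weighting and every parameter value are illustrative assumptions). The per-row solve runs over the full dense preference matrix, which is exactly where the scalability problem comes from:

```python
import numpy as np

def implicit_als(E, k=10, alpha=40.0, reg=0.1, iters=15, seed=0):
    """ALS for implicit feedback: E holds event counts (0 = no event).
    Preference p = 1 if an event occurred, else 0; confidence c = 1 + alpha*e,
    so the loss runs over every cell of the (dense) preference matrix."""
    rng = np.random.default_rng(seed)
    P = 0.01 * rng.standard_normal((E.shape[0], k))
    Q = 0.01 * rng.standard_normal((E.shape[1], k))
    Pref, C = (E > 0).astype(float), 1.0 + alpha * E
    for _ in range(iters):
        # alternate: fix item factors and solve for users, then vice versa
        for X, Y, Cm, Pr in ((P, Q, C, Pref), (Q, P, C.T, Pref.T)):
            YtY = Y.T @ Y
            for u in range(X.shape[0]):
                Cu = np.diag(Cm[u])
                A = YtY + Y.T @ (Cu - np.eye(len(Y))) @ Y + reg * np.eye(k)
                X[u] = np.linalg.solve(A, Y.T @ Cu @ Pr[u])
    return P, Q
```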
CONCEPT
Good MF model
The feature vectors of similar entities are similar
If the data is too noisy, similar entities won't be similar by their features
Start MF from a "good" point
Feature vector similarities are OK
Data is more than just events
Metadata
Info about items/users
Contextual data
In what context did the event occur
Can we incorporate these to help implicit MF?
NAIVE APPROACH
Describe items using any data we have (detailed later)
Long, sparse vectors for item description
Compress these vectors to dense feature vectors
PCA, MLP, MF, …
Length of desired vectors = Number of features in MF
Use these features as starting points
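One way to sketch this compression step, using truncated SVD as the compressor (the slide lists PCA, MLP and MF as options, so this is just one illustrative choice):

```python
import numpy as np

def naive_item_init(D, k):
    """Compress the long, sparse item-description matrix D (items x descriptors)
    into k dense item features, to be used as the MF starting point.
    Truncated SVD is one possible compressor; the descriptor-side
    factors are simply discarded."""
    U, s, _ = np.linalg.svd(D, full_matrices=False)
    return U[:, :k] * s[:k]          # items x k
```

With k set to the number of MF features, each row becomes the initial feature vector of the corresponding item.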
NAIVE APPROACH
Compression and also noise reduction
Does not really care about similarities
But often feature similarities are not that bad
If MF is used
Half of the results (the descriptor-side features) are thrown out
[Figure: description of items (items × descriptors) ≈ item features × descriptor features]
SIMFACTOR ALGORITHM
Try to preserve similarities better
Starting from an MF of item description
Similarities of items: DD’
Some similarity metrics require a transformation of D first
[Figure: description of items D (items × descriptors) ≈ X × Y' (item features × descriptor features); item similarities S = D × D']
SIMFACTOR ALGORITHM
Similarity approximation
Y'Y is a K×K symmetric matrix
Eigendecomposition: Y'Y = U λ U'
[Figure: item similarities S = D D' ≈ X (Y'Y) X', i.e. item features × (Y'Y) × item features']
SIMFACTOR ALGORITHM
λ is diagonal, so λ = SQRT(λ) * SQRT(λ)
F = X * U * SQRT(λ) = (SQRT(λ) * U' * X')'
F is an M×K matrix
S ≈ F * F', and F is used for initialization
[Figure: S ≈ X U λ U' X' = (X U SQRT(λ)) * (SQRT(λ) U' X') = F F']
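The whole derivation fits in a few lines of Python. This sketch assumes the MF of the description matrix, D ≈ X·Y', is already available; eigh is a natural choice since Y'Y is symmetric:

```python
import numpy as np

def simfactor(X, Y):
    """Given D ~ X @ Y.T (X: M x K item features, Y: descriptor features),
    return F (M x K) such that S = D @ D.T ~ F @ F.T."""
    G = Y.T @ Y                            # K x K, symmetric
    lam, U = np.linalg.eigh(G)             # eigendecomposition: G = U diag(lam) U.T
    lam = np.clip(lam, 0.0, None)          # guard against tiny negative eigenvalues
    return X @ U @ np.diag(np.sqrt(lam))   # F = X * U * SQRT(lam)
```

By construction F @ F.T equals X (Y'Y) X' exactly, so F preserves the approximated similarities while keeping only K columns.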
CREATING THE DESCRIPTION MATRIX
"Any" data about the entity
Vector-space representation
For Items:
Metadata vector (title, category, description, etc)
Event vector (who bought the item)
Context-state vector (in which context state it was bought)
Context-event (in which context state who bought it)
For Users:
All above except metadata
Currently: Choose one source for D matrix
Context used: seasonality
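As an illustration, the item "event vector" option could be built as below (a hypothetical helper, not from the talk; the context-state and context-event matrices are built the same way, with context states, or user-context pairs, as the columns):

```python
import numpy as np

def item_event_matrix(events, n_items, n_users):
    """Item description matrix D: row i flags which users had an event on item i."""
    D = np.zeros((n_items, n_users))
    for user, item in events:   # events: iterable of (user, item) pairs
        D[item, user] = 1.0
    return D
```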
EXPERIMENTS: SIMILARITY PRESERVATION
Real-life dataset: online grocery shopping events
SimFactor approximates similarities better
SimFactor RMSE improvement over naive in similarity approximation:
Item context state: 52.36%
User context state: 10.81%
Item context-event: 16.86%
User context-event: 26.22%
Item event data: 13.39%
User event data: 48.70%
Item metadata: 12.38%
EXPERIMENTS: INITIALIZATION
Using different description matrices
And both naive and SimFactor initialization
Baseline: random init
Evaluation metric: recall@50
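The metric itself can be sketched as follows (an illustrative implementation; scores are the predicted user-item values from the trained MF, and test_items maps each user to their held-out items):

```python
import numpy as np

def recall_at_k(scores, test_items, k=50):
    """Mean over users of |top-k recommendations ∩ held-out items| / |held-out items|."""
    recalls = []
    for u, relevant in test_items.items():
        top_k = np.argsort(-scores[u])[:k]      # indices of the k highest-scored items
        hits = len(set(top_k) & set(relevant))
        recalls.append(hits / len(relevant))
    return float(np.mean(recalls))
```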
EXPERIMENTS: GROCERY DB
Up to 6% improvement
Best methods use SimFactor and user context data
Top 5 methods on Grocery DB:
User context state (SimFactor): 5.71%
User context state (Naive): 4.88%
User context event (SimFactor): 4.30%
User event data (SimFactor): 4.12%
User context event (Naive): 4.04%
EXPERIMENTS: „IMPLICITIZED” MOVIELENS
Keeping 5-star ratings as implicit events
Up to 10% improvement
Best methods use SimFactor and item context data
Top 5 methods on MovieLens DB:
Item context state (SimFactor): 10.00%
User context state (SimFactor): 9.17%
Item context event (SimFactor): 9.17%
Item context event (Naive): 9.17%
Item context state (Naive): 9.17%
DISCUSSION OF RESULTS
SimFactor yields better results than naive
Context information yields better results than other descriptions
Context information separates well between entities
Grocery: User context
People’s routines
Different types of shopping at different times
MovieLens: Item context
Different types of movies watched on different hours
Context-based similarity
WHY CONTEXT?
Grocery example
Correlation between users' context states is low
Why can context aware algorithms be efficient?
Different recommendations in different context states
Context differentiates well between entities
Easier subtasks
ITEM
        Mon   Tue   Wed   Thu   Fri   Sat   Sun
Mon    1.00  0.79  0.79  0.78  0.76  0.70  0.74
Tue    0.79  1.00  0.79  0.78  0.76  0.69  0.73
Wed    0.79  0.79  1.00  0.79  0.76  0.70  0.74
Thu    0.78  0.78  0.79  1.00  0.76  0.71  0.74
Fri    0.76  0.76  0.76  0.76  1.00  0.71  0.72
Sat    0.70  0.69  0.70  0.71  0.71  1.00  0.71
Sun    0.74  0.73  0.74  0.74  0.72  0.71  1.00

USER
        Mon   Tue   Wed   Thu   Fri   Sat   Sun
Mon    1.00  0.36  0.34  0.34  0.35  0.29  0.19
Tue    0.36  1.00  0.34  0.34  0.33  0.29  0.19
Wed    0.34  0.34  1.00  0.36  0.35  0.27  0.17
Thu    0.34  0.34  0.36  1.00  0.39  0.30  0.16
Fri    0.35  0.33  0.35  0.39  1.00  0.32  0.16
Sat    0.29  0.29  0.27  0.30  0.32  1.00  0.33
Sun    0.19  0.19  0.17  0.16  0.16  0.33  1.00
CONCLUSION & FUTURE WORK
SimFactor: similarity-preserving compression
Similarity based MF initialization:
Description matrix from any data
Apply SimFactor
Use output as initial features for MF
Context differentiates between entities well
Future work:
Mixed description matrix (multiple data sources)
Multiple description matrix
Using different context information
Using different similarity metrics
THANKS FOR YOUR ATTENTION!
Questions?