Shanda Innovations
![Page 1: Shanda Innovations](https://reader036.vdocuments.site/reader036/viewer/2022062501/56816711550346895ddb7ae0/html5/thumbnails/1.jpg)
Shanda Innovations
Context-aware Ensemble of Multifaceted Factorization Models for Recommendation
Kevin Y. W. Chen
![Page 2: Shanda Innovations](https://reader036.vdocuments.site/reader036/viewer/2022062501/56816711550346895ddb7ae0/html5/thumbnails/2.jpg)
Performance
• 0.43959 (public score) / 0.41874 (private score)
• 2nd place, Honorable Mention
![Page 3: Shanda Innovations](https://reader036.vdocuments.site/reader036/viewer/2022062501/56816711550346895ddb7ae0/html5/thumbnails/3.jpg)
New Challenges
• Richer features in the social networks
  – follower/followee, actions
• Items are complicated
  – items are specific users
• Cold-start problem
  – 77.1% of users do not have training records
• Training data is quite noisy
  – ratio of negative samples is 92.82%
![Page 4: Shanda Innovations](https://reader036.vdocuments.site/reader036/viewer/2022062501/56816711550346895ddb7ae0/html5/thumbnails/4.jpg)
Outline
• Preprocessing
  – denoise
  – supplement
• Pairwise Training
  – Max-margin optimization problem
• Multifaceted Factorization Models
  – Extend the SVD++
• Context-aware Ensemble
  – Logistic Regression
![Page 5: Shanda Innovations](https://reader036.vdocuments.site/reader036/viewer/2022062501/56816711550346895ddb7ae0/html5/thumbnails/5.jpg)
Preprocess: Session analysis
• Negative : Positive = 92 : 8?
  – not every negative rating implies that the user refused to follow the recommended item
• Eliminating these “omitted” records is necessary
  – these negative samples cannot indicate users' interests
![Page 6: Shanda Innovations](https://reader036.vdocuments.site/reader036/viewer/2022062501/56816711550346895ddb7ae0/html5/thumbnails/6.jpg)
Preprocess: Session analysis
• Session slicing according to the time interval
• Select the right samples from the right session:
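The selection rule itself appears only on the slide image and is not reproduced in this transcript. As a rough illustration of session slicing, the sketch below splits one user's records wherever the gap between consecutive timestamps exceeds a threshold. The record layout, the `max_gap` value, and the `keep_informative` filter (keep only sessions with at least one positive record) are assumptions made for the example, not the team's actual rule.

```python
from typing import List, Tuple

# Hypothetical record layout: (timestamp_seconds, item_id, label),
# where label is +1 (followed) or -1 (not followed).
Record = Tuple[int, int, int]


def slice_sessions(records: List[Record], max_gap: int = 3600) -> List[List[Record]]:
    """Split one user's records into sessions wherever the time gap exceeds max_gap."""
    sessions: List[List[Record]] = []
    for rec in sorted(records):  # tuples sort by timestamp first
        if sessions and rec[0] - sessions[-1][-1][0] <= max_gap:
            sessions[-1].append(rec)
        else:
            sessions.append([rec])
    return sessions


def keep_informative(sessions: List[List[Record]]) -> List[List[Record]]:
    """Assumed filter: keep sessions in which the user followed at least one item."""
    return [s for s in sessions if any(label == 1 for _, _, label in s)]


records = [(0, 10, -1), (30, 11, 1), (10000, 12, -1), (10020, 13, -1)]
sessions = slice_sessions(records)
kept = keep_informative(sessions)
```

Here the second session contains only negatives, so under the assumed filter it would be treated as "omitted" rather than as evidence of disinterest.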
![Page 7: Shanda Innovations](https://reader036.vdocuments.site/reader036/viewer/2022062501/56816711550346895ddb7ae0/html5/thumbnails/7.jpg)
Preprocess: Session analysis
• Training dataset after preprocessing
  – Negative: 67,955,449 -> 7,594,443 (11.2%)
  – Positive: 5,253,828 -> 4,999,118
• Benefits
  – improve precision (0.0037)
  – reduce computational complexity
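The counts above can be checked with a few lines of arithmetic. The snippet recomputes the share of negatives that survive filtering and the negative share of the training set before and after; the variable names are illustrative only.

```python
neg_before, neg_after = 67_955_449, 7_594_443
pos_before, pos_after = 5_253_828, 4_999_118

# Fraction of the original negative samples that survive preprocessing.
neg_kept = neg_after / neg_before

# Share of negatives in the whole training set, before and after.
neg_share_before = neg_before / (neg_before + pos_before)
neg_share_after = neg_after / (neg_after + pos_after)

print(f"negatives kept:        {neg_kept:.1%}")
print(f"negative share before: {neg_share_before:.2%}")
print(f"negative share after:  {neg_share_after:.2%}")
```

The "before" share agrees with the 92.82% quoted earlier in the deck, and after filtering negatives make up roughly 60% of the remaining samples.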
![Page 8: Shanda Innovations](https://reader036.vdocuments.site/reader036/viewer/2022062501/56816711550346895ddb7ae0/html5/thumbnails/8.jpg)
Pairwise-training
• MAP
  – pairwise ranking job
• Training pair
  – (u, i) and (u, j)
• Objective function
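The objective function is shown only on the slide image. Since the outline describes pairwise training as a max-margin problem, one plausible form is a hinge loss on the score difference between a followed item i and a rejected item j. The sketch below assumes a plain dot-product model and illustrative hyperparameters (`lr`, `reg`, `margin`); it is not the team's exact objective.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def hinge_pair_step(p_u, q_i, q_j, lr=0.05, reg=0.01, margin=1.0):
    """One SGD step on max(0, margin - (p_u.q_i - p_u.q_j)) with L2 regularization.

    Item i is the followed item, item j the rejected one.
    Vectors are updated in place; the loss before the update is returned.
    """
    loss = max(0.0, margin - (dot(p_u, q_i) - dot(p_u, q_j)))
    active = 1.0 if loss > 0 else 0.0  # hinge gradient is zero once the margin is met
    for f in range(len(p_u)):
        pu, qi, qj = p_u[f], q_i[f], q_j[f]  # use pre-update values for all three
        p_u[f] += lr * (active * (qi - qj) - reg * pu)
        q_i[f] += lr * (active * pu - reg * qi)
        q_j[f] += lr * (-active * pu - reg * qj)
    return loss


# Toy vectors (assumed values) for one user and one (positive, negative) item pair.
p_u = [0.1, -0.2, 0.05]
q_i = [0.0, 0.1, 0.1]
q_j = [0.1, 0.0, -0.1]
losses = [hinge_pair_step(p_u, q_i, q_j) for _ in range(200)]
```

Repeating the step on the same pair drives the followed item's score above the rejected item's score, which is the ordering that a ranking metric such as MAP rewards.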
![Page 9: Shanda Innovations](https://reader036.vdocuments.site/reader036/viewer/2022062501/56816711550346895ddb7ae0/html5/thumbnails/9.jpg)
Preprocess: Supply positive samples
• Lack of positive samples
  – ideal pairwise training requires a good balance between the numbers of negative and positive samples
• Choose the users
  – users who have far fewer positive samples than negative samples
• Generate the positive samples
  – infer them from the social graph
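The transcript does not say how positives are derived from the social graph. One possible reading, given that items are themselves users, is to treat items a user already follows as extra positives. The sketch below works under that assumption; the `min_ratio` threshold and all names are hypothetical.

```python
def supplement_positives(pos, neg, followees, items, min_ratio=0.5):
    """Propose extra positive items for users with too few positives.

    pos, neg:   dict user -> set of item ids from the training log
    followees:  dict user -> set of user ids the user already follows
    items:      set of ids that are recommendable items
    min_ratio:  assumed threshold on positives / negatives
    """
    extra = {}
    for user, negatives in neg.items():
        have = pos.get(user, set())
        if len(have) >= min_ratio * len(negatives):
            continue  # already balanced enough
        # Followed accounts that are items, excluding anything already labeled.
        candidates = (followees.get(user, set()) & items) - have - negatives
        if candidates:
            extra[user] = candidates
    return extra


pos = {"u1": {1}}
neg = {"u1": {2, 3, 4, 5}, "u2": {2}}
followees = {"u1": {6, 7, 2}, "u2": {8}}
items = {1, 2, 6, 8}
extra = supplement_positives(pos, neg, followees, items)
```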
![Page 10: Shanda Innovations](https://reader036.vdocuments.site/reader036/viewer/2022062501/56816711550346895ddb7ae0/html5/thumbnails/10.jpg)
The procedure of data preprocessing
![Page 11: Shanda Innovations](https://reader036.vdocuments.site/reader036/viewer/2022062501/56816711550346895ddb7ae0/html5/thumbnails/11.jpg)
Multifaceted Factorization Models
• Latent Factor Model
  – stochastic gradient descent
• MFM extends the SVD++
  – integrates all kinds of valuable features in social networks
![Page 12: Shanda Innovations](https://reader036.vdocuments.site/reader036/viewer/2022062501/56816711550346895ddb7ae0/html5/thumbnails/12.jpg)
MFM: Demographic features
• User and item profiles
  – age(u), age(i)
  – gender(u), gender(i)
  – tweetnum(u)
• Combinations
  – uid*gender(i)
  – uid*age(i)
  – gender(u)*iid
  – age(u)*iid
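One way to read the list above is that each profile value and each crossed pair becomes a sparse feature with its own learned bias. The sketch below builds those feature keys for a single (user, item) pair and sums their biases. The profile dictionaries, key names, and the additive-bias form are assumptions; in practice continuous values such as age or tweet count would likely be bucketed first.

```python
def demographic_features(uid, iid, user_profile, item_profile):
    """Return the sparse feature keys listed on the slide for one (user, item) pair."""
    u, i = user_profile[uid], item_profile[iid]
    return [
        ("age_u", u["age"]),
        ("age_i", i["age"]),
        ("gender_u", u["gender"]),
        ("gender_i", i["gender"]),
        ("tweetnum_u", u["tweetnum"]),
        # Crossed features: an id combined with the other side's attribute.
        ("uid*gender_i", uid, i["gender"]),
        ("uid*age_i", uid, i["age"]),
        ("gender_u*iid", u["gender"], iid),
        ("age_u*iid", u["age"], iid),
    ]


def bias_score(features, bias):
    """Sum the learned bias of every active feature; unseen features contribute 0."""
    return sum(bias.get(f, 0.0) for f in features)


user_profile = {"u1": {"age": 25, "gender": "f", "tweetnum": 120}}
item_profile = {"i9": {"age": 40, "gender": "m"}}
feats = demographic_features("u1", "i9", user_profile, item_profile)
score = bias_score(feats, {("uid*gender_i", "u1", "m"): 0.3, ("age_u", 25): -0.1})
```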
![Page 13: Shanda Innovations](https://reader036.vdocuments.site/reader036/viewer/2022062501/56816711550346895ddb7ae0/html5/thumbnails/13.jpg)
MFM: Integrate Social Relationships
• Influence of social relations
• Cold start:
  – 77.1% of users do not have any rating records in the training set
• User feature vector:
  – incorporate SNS relations and actions
• Brings significant improvement
  – MAP: 0.3495 -> 0.3688 -> 0.3701
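The user feature vector formula is only on the slide image. In standard SVD++, the user vector is the explicit factor p_u plus a normalized sum of implicit factors y_j, and the prediction is mu + b_u + b_i + q_i · (p_u + |N(u)|^(-1/2) Σ y_j). The sketch below assumes the implicit set N(u) is the set of accounts the user follows, which is one natural way to bring in SNS relations; the deck's exact formulation may differ.

```python
import math


def user_vector(p_u, followee_factors):
    """p_u plus the normalized sum of implicit factors y_j of the accounts u follows."""
    vec = list(p_u)
    if followee_factors:
        norm = 1.0 / math.sqrt(len(followee_factors))
        for y in followee_factors:
            for f, val in enumerate(y):
                vec[f] += norm * val
    return vec


def predict(mu, b_u, b_i, q_i, p_u, followee_factors):
    """SVD++-style score: global mean + biases + item factors . user vector."""
    vec = user_vector(p_u, followee_factors)
    return mu + b_u + b_i + sum(a * b for a, b in zip(q_i, vec))


# A cold-start user: no learned explicit factors, but four followees.
cold_p_u = [0.0, 0.0]
followee_factors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
score = predict(0.1, 0.0, 0.0, [0.5, 0.25], cold_p_u, followee_factors)
```

Even with an all-zero p_u, the followee factors give the cold-start user a non-trivial vector, which is why social relations help the 77.1% of users without training records.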
![Page 14: Shanda Innovations](https://reader036.vdocuments.site/reader036/viewer/2022062501/56816711550346895ddb7ae0/html5/thumbnails/14.jpg)
MFM: Utilizing Keywords and Tags
• Share common interests
  – explicit feedback
• User feature vectors:
![Page 15: Shanda Innovations](https://reader036.vdocuments.site/reader036/viewer/2022062501/56816711550346895ddb7ae0/html5/thumbnails/15.jpg)
MFM: Date-Time Dependent Biases
• Users' actions differ over time
• The popularity of items changes over time
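The transcript does not give the bias formulas. A common way to model such effects is to add bias terms indexed by a time bin. The sketch below assumes a per-user bias by hour of day and a per-item bias by calendar day (both in UTC); the binning and the dictionary layout are illustrative choices, not the deck's definition.

```python
from datetime import datetime, timezone


def time_bias(ts, uid, iid, user_hour_bias, item_day_bias):
    """Sum of an (assumed) user-by-hour bias and an item-by-day bias for a Unix timestamp."""
    t = datetime.fromtimestamp(ts, tz=timezone.utc)
    b_user = user_hour_bias.get((uid, t.hour), 0.0)               # when this user tends to act
    b_item = item_day_bias.get((iid, t.date().isoformat()), 0.0)  # item popularity that day
    return b_user + b_item


user_hour_bias = {("u", 0): 0.2}
item_day_bias = {("i", "1970-01-01"): -0.05}
b = time_bias(0, "u", "i", user_hour_bias, item_day_bias)
```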
![Page 16: Shanda Innovations](https://reader036.vdocuments.site/reader036/viewer/2022062501/56816711550346895ddb7ae0/html5/thumbnails/16.jpg)
k-Nearest Neighbors
• Similar to SVD++
• Find the neighbors
  – calculate the distance based on keywords and tags
• Intersection of explicit and implicit feedback
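The distance measure is not named in the transcript. As a minimal sketch, the code below assumes each user has a sparse keyword-weight profile and uses cosine similarity to pick the k most similar users; the profiles and the choice of cosine are assumptions for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse keyword-weight dicts."""
    shared = set(a) & set(b)
    num = sum(a[k] * b[k] for k in shared)
    den = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0


def nearest_neighbors(target, profiles, k=2):
    """Return the ids of the k users most similar to target, most similar first."""
    sims = [(cosine(profiles[target], prof), other)
            for other, prof in profiles.items() if other != target]
    sims.sort(key=lambda pair: (-pair[0], pair[1]))  # ties broken by id for determinism
    return [other for _, other in sims[:k]]


profiles = {
    "a": {"music": 1.0, "film": 1.0},
    "b": {"music": 1.0},
    "c": {"sport": 1.0},
    "d": {"music": 1.0, "film": 0.5},
}
neighbors = nearest_neighbors("a", profiles)
```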
![Page 17: Shanda Innovations](https://reader036.vdocuments.site/reader036/viewer/2022062501/56816711550346895ddb7ae0/html5/thumbnails/17.jpg)
Ensemble
• When will the user follow an item?
  – the user pays attention to the item
  – the user is interested in the item
• User behavior modeling
  – predict whether the user noticed the recommendation area at that time
• User interest modeling
  – whether an item meets the user's tastes (MFM)
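The outline names logistic regression as the ensemble method. A minimal sketch of that idea is a logistic regression over two inputs: an interest score (as the MFM would supply) and an attention score (as the behavior model would supply). The feature layout, toy data, and hyperparameters below are assumptions for illustration; the deck's actual feature set is not given in the transcript.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logreg(X, y, lr=0.5, epochs=1000):
    """Plain SGD on the logistic loss; returns (weights, bias)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            p = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))
            g = p - t  # gradient of the log loss with respect to the logit
            b -= lr * g
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
    return w, b


def follow_probability(w, b, interest, attention):
    return sigmoid(b + w[0] * interest + w[1] * attention)


# Toy data: [interest score, attention score]; the user follows only when both are high.
X = [[0.9, 0.9], [0.9, 0.1], [0.1, 0.9], [0.1, 0.1], [0.8, 0.8], [0.2, 0.7]]
y = [1, 0, 0, 0, 1, 0]
w, b = train_logreg(X, y)
p_both = follow_probability(w, b, 0.9, 0.9)
p_unnoticed = follow_probability(w, b, 0.9, 0.1)
```

On this toy data, an item the user would like but probably did not notice ends up with a lower follow probability than one that scores high on both signals, which is the motivation for combining the two models.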
![Page 18: Shanda Innovations](https://reader036.vdocuments.site/reader036/viewer/2022062501/56816711550346895ddb7ae0/html5/thumbnails/18.jpg)
User Behavior Modeling
• The time a user spends on each recommendation is a very valuable clue
• Context of durations
![Page 19: Shanda Innovations](https://reader036.vdocuments.site/reader036/viewer/2022062501/56816711550346895ddb7ae0/html5/thumbnails/19.jpg)
Experiment
![Page 20: Shanda Innovations](https://reader036.vdocuments.site/reader036/viewer/2022062501/56816711550346895ddb7ae0/html5/thumbnails/20.jpg)
Experiment
![Page 21: Shanda Innovations](https://reader036.vdocuments.site/reader036/viewer/2022062501/56816711550346895ddb7ae0/html5/thumbnails/21.jpg)
The framework
![Page 22: Shanda Innovations](https://reader036.vdocuments.site/reader036/viewer/2022062501/56816711550346895ddb7ae0/html5/thumbnails/22.jpg)
Summary
• Proper data preprocessing is necessary
• Pairwise training (top-N recommendation)
• Social relations and actions can be used as implicit feedback
• Integrate all kinds of valuable features
• Users' interests and users' behaviors both need to be considered
![Page 23: Shanda Innovations](https://reader036.vdocuments.site/reader036/viewer/2022062501/56816711550346895ddb7ae0/html5/thumbnails/23.jpg)
Shanda Innovations Team
Yunwen Chen, Zuotao Liu, Daqi Ji, Yingwei Xin, Wenguang Wang, Lu Yao, Yi Zou