pog:personalizedoutfitgenerationfor ... · pog: personalized outfit generation for fashion...

1
POG : Personalized Outfit Generation for Fashion Recommendation at Alibaba iFashion Wen Chen, Pipei Huang, Jiaming Xu, Xin Guo, Cheng Guo, Fei Sun, Chao Li, Andreas Pfadler, Huan Zhao, Binqiang Zhao Alibaba Group, Beijing, China. (Applied Data Science Track, Poster No.5, August 6) Introduction There exist two requirements for fashion out- fit recommendation: the Compatibility of the generated fashion outfits, and the Per- sonalization in the recommendation pro- cess. We attempt to build the bridge between fash- ion outfit generation and recommendation. For the Compatibility requirement, we pro- pose a Fashion Outfit Model (FOM) by set- ting up a masked item prediction task based on the self-attention mechanism. For the Personalization requirement, we propose a Personalized Outfit Generation (POG) model, which uses a Transformer architec- ture to model both signals from user pref- erence and outfit compatibility. This is the first study to generate personalized outfits based on users’ historical behaviors with an encoder-decoder framework in an industrial- scale application at Alibaba iFashion (Fig- ure 1). Figure 1:Alibaba iFashion. Dataset We provide three datasets describing the outfits, the items, and the user behaviors separately. To the best of our knowledge, our dataset is the largest publicly available dataset of fashion items with rich informa- tion compared to existing datasets. Table 1:Statistics of the datasets. Dataset #Outfits #Users #Items Outfit data 1,013,136 - 583,464 Item data - - 4,747,039 User data 127,169 3,569,112 4,463,302 FOM Masking items one at a time in the outfits, we require the model to fill in the blank with the correct item according to the context. Since every item in the outfit is masked to fuse its left and right context, the compatibility between each item and all the other items within the outfit can be learned from the self- attention mechanism. Given an outfit F = {f 1 ,...,f t ,...,f n }, where f t is the t-th item, we use a particu- lar embedding [MASK] for the masked item. Non-masked items are represented by their multi-modal embeddings. We then represent the set of input embeddings as F mask . POG Take the advantage of encoder-decoder structure, we aim to translate an user’s his- torical behaviors to a personalized outfit. Let U denote the set of all users and F be the set of all outfits. We use a sequence of user behaviors U = {u 1 ,...,u i ,...,u m } to characterize an user, where u i are the clicked items by the user. F = {f 1 ,...,f t ,...,f n } is the clicked outfit from the same user, where f t are the items in the outfit. At each time step, we predict the next outfit item given previous outfit items and user’s click sequence on items U . Figure 2:The architecture of FOM. We mask the items in the outfit one at a time. Figure 3:The architecture of POG, which is an encoder-decoder architecture with a Per network and a Gen network. We minimize the following loss function: L F = - 1 n n X mask =1 log Pr (f mask |F mask ; Θ F ) where Θ F denotes the model parameters, and Pr (·) is the probability of choosing the correct item conditioned on the non-masked items. The model architecture is shown in Figure 2. For pair (U, F ), the objective function of POG can be written as: L (U,F ) = - 1 n n X t=1 log Pr f t+1 |f 1 ,...,f t ,U ; Θ (U,F ) where Θ (U,F ) denotes the model parameters. Pr (·) is the probability of seeing f t+1 conditioned on both previous outfit items and user clicked items. The model architecture is shown in Figure 3. Dida Platform We develop a platform named Dida which is able to generate personalized outfits auto- matically. Dida is widely used by more than one million operators at Alibaba. About 6 million personalized outfits are generated ev- eryday with high quality. So far, the outfits have been recommended to more than 5.4 million users. Figure 4:Dida platform. Experiment Our model significantly outperforms other alternative methods through outfit compat- ibility experiments, including pushing the FITB (Fill In The Blanks) benchmark to 68.79% (5.98% relative improvement) and CP (Compatibility Prediction) benchmark to 86.32% (25.81% relative improvement). Through extensive online experiments, we show that POG clearly outperforms the CF method by 70% increase in CTR (Click- Through-Rate) metric. The online cases can be found in Figure 5. Figure 5:The online cases of POG. Contact Information Email: [email protected]

Upload: others

Post on 09-Jul-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: POG:PersonalizedOutfitGenerationfor ... · POG: Personalized Outfit Generation for Fashion Recommendation at Alibaba iFashion Author: Wen Chen, Pipei Huang, Jiaming Xu, Xin Guo,

POG: Personalized Outfit Generation forFashion Recommendation at Alibaba iFashion

Wen Chen, Pipei Huang, Jiaming Xu, Xin Guo, Cheng Guo, Fei Sun, Chao Li, Andreas Pfadler, Huan Zhao, Binqiang ZhaoAlibaba Group, Beijing, China. (Applied Data Science Track, Poster No.5, August 6)

Introduction

There exist two requirements for fashion out-fit recommendation: the Compatibility ofthe generated fashion outfits, and the Per-sonalization in the recommendation pro-cess.We attempt to build the bridge between fash-ion outfit generation and recommendation.For the Compatibility requirement, we pro-pose a Fashion Outfit Model (FOM) by set-ting up a masked item prediction task basedon the self-attention mechanism. For thePersonalization requirement, we proposea Personalized Outfit Generation (POG)model, which uses a Transformer architec-ture to model both signals from user pref-erence and outfit compatibility. This is thefirst study to generate personalized outfitsbased on users’ historical behaviors with anencoder-decoder framework in an industrial-scale application at Alibaba iFashion (Fig-ure 1).

Figure 1:Alibaba iFashion.

Dataset

We provide three datasets describing theoutfits, the items, and the user behaviorsseparately. To the best of our knowledge,our dataset is the largest publicly availabledataset of fashion items with rich informa-tion compared to existing datasets.

Table 1:Statistics of the datasets.

Dataset #Outfits #Users #ItemsOutfit data 1,013,136 - 583,464Item data - - 4,747,039User data 127,169 3,569,112 4,463,302

FOM

Masking items one at a time in the outfits, werequire the model to fill in the blank with thecorrect item according to the context. Sinceevery item in the outfit is masked to fuseits left and right context, the compatibilitybetween each item and all the other itemswithin the outfit can be learned from the self-attention mechanism.Given an outfit F = {f1, . . . , ft, . . . , fn},where ft is the t-th item, we use a particu-lar embedding [MASK] for the masked item.Non-masked items are represented by theirmulti-modal embeddings. We then representthe set of input embeddings as Fmask.

POG

Take the advantage of encoder-decoderstructure, we aim to translate an user’s his-torical behaviors to a personalized outfit.Let U denote the set of all users and F bethe set of all outfits. We use a sequence ofuser behaviors U = {u1, . . . , ui, . . . , um} tocharacterize an user, where ui are the clickeditems by the user. F = {f1, . . . , ft, . . . , fn}is the clicked outfit from the same user,where ft are the items in the outfit. At eachtime step, we predict the next outfit itemgiven previous outfit items and user’s clicksequence on items U .

Figure 2:The architecture of FOM. We mask the items in the outfit one at a time.

Figure 3:The architecture of POG, which is an encoder-decoder architecture with a Per network and a Gen network.

We minimize the following loss function:

LF = −1n

n∑mask=1

log Pr(fmask|Fmask; ΘF )

where ΘF denotes the model parameters,and Pr(·) is the probability of choosing thecorrect item conditioned on the non-maskeditems. The model architecture is shown inFigure 2.

For pair (U, F ), the objective function ofPOG can be written as:L(U,F ) = −1

n

n∑t=1

log Pr(ft+1|f1, . . . , ft, U ; Θ(U,F )

)

where Θ(U,F ) denotes the model parameters.Pr(·) is the probability of seeing ft+1conditioned on both previous outfit itemsand user clicked items. The modelarchitecture is shown in Figure 3.

Dida Platform

We develop a platform named Dida whichis able to generate personalized outfits auto-matically. Dida is widely used by more thanone million operators at Alibaba. About 6million personalized outfits are generated ev-eryday with high quality. So far, the outfitshave been recommended to more than 5.4million users.

Figure 4:Dida platform.

Experiment

Our model significantly outperforms otheralternative methods through outfit compat-ibility experiments, including pushing theFITB (Fill In The Blanks) benchmark to68.79% (5.98% relative improvement) andCP (Compatibility Prediction) benchmarkto 86.32% (25.81% relative improvement).Through extensive online experiments, weshow that POG clearly outperforms the CFmethod by 70% increase in CTR (Click-Through-Rate) metric. The online cases canbe found in Figure 5.

Figure 5:The online cases of POG.

Contact Information

•Email: [email protected]