Paper reading: Effective LSTMs for Target-Dependent Sentiment Classification (Duyu Tang et al., COLING 2016)


Page 1:

Effective LSTMs for Target-Dependent Sentiment Classification

Duyu Tang, Bing Qin, Xiaocheng Feng, and Ting Liu

@COLING2016

Presenter: Masatoshi Kurihara

Page 2:

Overview of this paper

・Tackled target-dependent sentiment classification.

・Built target-dependent LSTM models that capture the semantic relatedness between a target word and its context words.

・The models achieve (near) state-of-the-art results. (Claim: without using a syntactic parser or a sentiment lexicon!)

Abstract

Page 3:

Intro: Target-dependent sentiment classification

“I bought a new camera. The picture quality is amazing but the battery life is too short”

Input: a sentence and a target mention
Output: the sentiment polarity (e.g. positive, negative, neutral) of the sentence towards the target

target = picture quality → Output = positive

target = battery life → Output = negative
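To make the input/output concrete, here is a toy encoding of the two cases above in Python (the data structure is mine, not from the paper):

    # The same sentence appears twice; only the target differs, and so does the label.
    examples = [
        {"sentence": "I bought a new camera. The picture quality is amazing "
                     "but the battery life is too short",
         "target": "picture quality", "label": "positive"},
        {"sentence": "I bought a new camera. The picture quality is amazing "
                     "but the battery life is too short",
         "target": "battery life", "label": "negative"},
    ]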

We want a good model of the semantic relatedness between a target word and its context words

→ Proposal: target-dependent LSTM models

(but feature engineering is labor-intensive, so we want to avoid it)

Page 4:

Approach: basic LSTM

Even if multiple targets appear in the same sentence, this model can only predict the same polarity for all of them.
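A minimal sketch of such a target-independent baseline in PyTorch (my own naming, not the authors' code). The target never enters the forward pass, which is exactly why every target in a sentence receives the same prediction:

    import torch
    import torch.nn as nn

    class BasicLSTM(nn.Module):
        def __init__(self, vocab_size, emb_dim=100, hidden_dim=100, num_classes=3):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb_dim)
            self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
            self.fc = nn.Linear(hidden_dim, num_classes)

        def forward(self, sentence_ids):              # (batch, seq_len); no target input
            emb = self.embed(sentence_ids)            # (batch, seq_len, emb_dim)
            _, (h_n, _) = self.lstm(emb)              # h_n: (1, batch, hidden_dim)
            return self.fc(h_n.squeeze(0))            # class logits; softmax in the loss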

Page 5:

Approach: target-dependent LSTM (TD-LSTM)

Separate LSTMs are built over the words before and after the target words; their final hidden states are concatenated and passed through softmax to predict the polarity.

This allows the model to capture target-dependent context information, so an accuracy improvement can be expected.
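A rough PyTorch sketch of TD-LSTM under the slide's description (variable names are mine, not the authors' implementation):

    import torch
    import torch.nn as nn

    class TDLSTM(nn.Module):
        def __init__(self, vocab_size, emb_dim=100, hidden_dim=100, num_classes=3):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb_dim)
            self.lstm_l = nn.LSTM(emb_dim, hidden_dim, batch_first=True)  # left-to-right
            self.lstm_r = nn.LSTM(emb_dim, hidden_dim, batch_first=True)  # right-to-left
            self.fc = nn.Linear(2 * hidden_dim, num_classes)

        def forward(self, left_ids, right_ids_reversed):
            # left_ids: words from the sentence start through the target words
            # right_ids_reversed: words from the sentence end back through the target words
            _, (h_l, _) = self.lstm_l(self.embed(left_ids))
            _, (h_r, _) = self.lstm_r(self.embed(right_ids_reversed))
            h = torch.cat([h_l.squeeze(0), h_r.squeeze(0)], dim=-1)  # concat final states
            return self.fc(h)  # logits; softmax is applied in the loss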

Page 6:

Approach: target-connection LSTM (TC-LSTM)

v_target: the (arithmetic) mean of the vectors of the target words.

Each input word vector is concatenated with v_target, and the model is then trained following the same procedure as the target-dependent LSTM.

This allows the model to capture even more target-dependent context information, so a further accuracy improvement can be expected.
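A sketch of TC-LSTM under the same assumptions (names are mine): the mean of the target-word embeddings, v_target, is concatenated to every input word embedding, and the rest mirrors TD-LSTM:

    import torch
    import torch.nn as nn

    class TCLSTM(nn.Module):
        def __init__(self, vocab_size, emb_dim=100, hidden_dim=100, num_classes=3):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb_dim)
            self.lstm_l = nn.LSTM(2 * emb_dim, hidden_dim, batch_first=True)
            self.lstm_r = nn.LSTM(2 * emb_dim, hidden_dim, batch_first=True)
            self.fc = nn.Linear(2 * hidden_dim, num_classes)

        def forward(self, left_ids, right_ids_reversed, target_ids):
            v_target = self.embed(target_ids).mean(dim=1)          # (batch, emb_dim)

            def with_target(ids):                                  # append v_target to each word
                emb = self.embed(ids)                              # (batch, len, emb_dim)
                tgt = v_target.unsqueeze(1).expand(-1, emb.size(1), -1)
                return torch.cat([emb, tgt], dim=-1)               # (batch, len, 2*emb_dim)

            _, (h_l, _) = self.lstm_l(with_target(left_ids))
            _, (h_r, _) = self.lstm_r(with_target(right_ids_reversed))
            return self.fc(torch.cat([h_l.squeeze(0), h_r.squeeze(0)], dim=-1))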

Page 7:

Experiment: experimental setup

Dataset [Dong et al., 2014]
・Tweets mentioning products, celebrities, etc. were collected
・Manually labeled as negative / neutral / positive
・Two annotators (agreement rate: 82.5%)
・negative : neutral : positive = 25% : 50% : 25%
・train: 6248 tweets, test: 692 tweets

Compared methods
・SVM-indep, SVM-dep [Jiang et al., 2011]
・Recursive NN, AdaRNN [Dong et al., 2014]
・Target-dep [Vo and Zhang, 2015]

LSTM
・100-dimensional word vectors trained on Twitter with GloVe
・Parameters randomly initialized from the uniform distribution U(-0.003, 0.003)
・Clipping threshold of the softmax layer: 200
・Learning rate: 0.01 (a sketch of this setup follows below)
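A minimal sketch of this training setup in PyTorch (helper names are mine; interpreting the clipping threshold of 200 as gradient-norm clipping on the softmax layer is my assumption):

    import torch

    model = TDLSTM(vocab_size=100_000)     # the TDLSTM sketch from the Approach slide
    for p in model.parameters():           # random init from U(-0.003, 0.003)
        torch.nn.init.uniform_(p, -0.003, 0.003)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = torch.nn.CrossEntropyLoss()

    def train_step(left_ids, right_ids_reversed, labels):
        optimizer.zero_grad()
        loss = loss_fn(model(left_ids, right_ids_reversed), labels)
        loss.backward()
        # assumption: clip the softmax (output) layer's gradient norm at 200
        torch.nn.utils.clip_grad_norm_(model.fc.parameters(), 200.0)
        optimizer.step()
        return loss.item()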

Page 8:

Experiment: overview of compared methods

[Jiang et al., 2011]

・SVM-indep: unigrams, bigrams, punctuation, emoticons, hashtags, and counts of positive/negative words (using a lexicon); sketched below

・SVM-dep: the SVM-indep features plus target-dependent features (e.g. predicates that take the target as an argument)
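A rough sketch of the target-independent feature types listed above (helper and lexicon names are hypothetical, not from [Jiang et al., 2011]):

    import re
    from collections import Counter

    def svm_indep_features(tokens, pos_lexicon, neg_lexicon):
        feats = Counter()
        for tok in tokens:                                 # unigrams
            feats[f"uni={tok}"] += 1
        for a, b in zip(tokens, tokens[1:]):               # bigrams
            feats[f"bi={a}_{b}"] += 1
        feats["n_hashtags"] = sum(t.startswith("#") for t in tokens)
        feats["n_punct"] = sum(bool(re.fullmatch(r"[!?.,;:]+", t)) for t in tokens)
        feats["n_emoticons"] = sum(t in {":)", ":(", ":D", ";)"} for t in tokens)
        feats["n_pos_words"] = sum(t.lower() in pos_lexicon for t in tokens)  # lexicon counts
        feats["n_neg_words"] = sum(t.lower() in neg_lexicon for t in tokens)
        return feats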

Page 9:

Experiment: overview of compared methods

[Dong et al., 2014]

・AdaRNN-w/oE: does not use dependency relations when selecting composition functions
・AdaRNN-w/E: uses dependency relations when selecting composition functions
・AdaRNN-comb: SVM over the composed vectors of AdaRNN-w/E plus uni/bigram features

Page 10:

Experiment: overview of compared methods

[Vo and Zhang, 2015]

・Target-dep: SVM with only the T_tw and F_tw features
・Target-dep+: SVM with the S_tw features in addition to T_tw and F_tw

Page 11:

Experiment: results (comparison with existing methods)

Page 12:

Experiment: results (comparison with existing methods)

Taking target information into account improves accuracy

Performance on par with the SoTA even without adding sentiment-lexicon features

Page 13:

Experiment: results (effect of word embeddings)

Page 14:

Experiment: results (effect of word embeddings)

This shows the importance of context information for word embedding learning as both SSWEh and SSWEr do not encode any word contexts.

Page 15:

Experiment: results (effect of word embeddings)

GloVe and SSWEu perform comparably, which indicates the importance of global context for estimating a good word representation.

Page 16:

Experiment: results (comparison of GloVe dimensionalities)

LSTM-TC achieves slightly better accuracy than LSTM-TD, but at a much higher computational cost.

Page 17:

Discussion: attention-based LSTM model

We also tried an attention-based LSTM model, inspired by the recent success of attention-based neural networks in machine translation (Bahdanau et al., 2015) and document encoding (Li et al., 2015b). We implement the soft-attention mechanism (Bahdanau et al., 2015) to enhance TD-LSTM, incorporating two attention layers for the preceding LSTM and the following LSTM, respectively. The output vector of each attention layer is the weighted average of that LSTM's hidden vectors, where the weight of each hidden vector is calculated with a feedforward neural network. The outputs of the preceding and following attention models are concatenated and fed to softmax for sentiment classification. However, we could not obtain a better result with such an attention model. The accuracy of this attention model is slightly lower than the standard LSTM model (around 65%), which means that the attention component has a negative impact on the model. A potential reason might be that the attention-based LSTM has a larger number of parameters, which cannot be easily optimized on such a small corpus.
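A sketch of the attention layer described above (PyTorch, names mine): a feedforward network scores each hidden vector, and the layer outputs the weighted average of the LSTM's hidden states. One such layer would sit on top of each of the two LSTMs, with the two outputs concatenated before softmax:

    import torch
    import torch.nn as nn

    class SoftAttention(nn.Module):
        def __init__(self, hidden_dim):
            super().__init__()
            self.scorer = nn.Linear(hidden_dim, 1)   # feedforward scoring network

        def forward(self, hidden_states):            # (batch, seq_len, hidden_dim)
            scores = self.scorer(hidden_states).squeeze(-1)            # (batch, seq_len)
            weights = torch.softmax(scores, dim=-1)                    # attention weights
            return (weights.unsqueeze(-1) * hidden_states).sum(dim=1)  # weighted average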

Page 18:

References


Li Dong, Furu Wei, Chuanqi Tan, Duyu Tang, Ming Zhou, and Ke Xu. 2014. Adaptive recursive neural network for target-dependent twitter sentiment classification. In ACL, pages 49–54.

Long Jiang, Mo Yu, Ming Zhou, Xiaohua Liu, and Tiejun Zhao. 2011. Target-dependent twitter sentiment classification. In ACL, pages 151–160.

Duy-Tin Vo and Yue Zhang. 2015. Target-dependent twitter sentiment classification with rich automatic features. In IJCAI.