10/28/2014 EMNLP 2014 in Doha, Qatar

Jointly Learning Word Representations and Composition Functions

Using Predicate-Argument Structures

Kazuma Hashimoto (UT)

Pontus Stenetorp (UT)

Makoto Miwa (TTI)

Yoshimasa Tsuruoka (UT)

University of Tokyo (UT), Toyota Technological Institute (TTI)

Neural Word Vector Representations

• Neural networks + large unlabeled corpora

– Learn word (i.e., single-token) representations

• e.g., word2vec (Mikolov+ 2013; Mnih and Kavukcuoglu 2013; inter alia)

– Learn composed vector representations

• e.g., compositional neural language models for verb-object vectors (Tsubaki+ 2013)

Relation to Previous Work

                                             word2vec   Compositional neural language models   Our model
single token representations                 ✓          ✓                                      ✓
recursive structures of syntactic relations  x          x                                      ✓
pre-training                                 ✓          x                                      ✓
composition                                  x          ✓                                      ✓

A Joint Learning Model

• Learning word and composed representations

– using syntactic structures of unlabeled corpora

– without pre-trained word vectors

[Figure: example words and phrases in the learned vector space: storm, downpour, heavy rain, pay, make payment, solve, overcome, solve problem, achieve objective, bridge gap]

State-of-the-art scores for phrase similarity tasks with transitive verbs

Overview

1. Learning word representations using predicate-argument structures

2. Jointly learning word representations and composition functions

3. Evaluation on phrase similarity tasks

4. Conclusion

Predicate-Argument Structures (PASs)

• Standard dependency structures

– Relations between heads and modifiers

[Figure: dependency tree of "the heavy rain caused the car accidents" with arcs labeled det, amod, nsubj, root, det, nn, dobj]

• Predicate-Argument Structures (PASs)

– Relations between predicates and arguments

Example sentence: the heavy rain caused the car accidents

Predicate-Argument Structures (PASs)

• Each predicate in a sentence has

– a specific category

– zero or more arguments

(Enju parser (Miyao and Tsujii 2008))

Example sentence: the heavy rain caused the car accidents

Predicate   Category    Argument 1   Argument 2
heavy       adjective   rain         -
caused      verb        rain         accidents
car         noun        accidents    -

A Word Prediction Model Using PASs

• Given a PAS, discriminating between

– a word in the specific PAS and

– a word drawn from a noise distribution

• Example PAS (category: verb): rain (argument 1), cause, accident (argument 2)

– a target word: cause

– a drawn word: eat, giving the corrupted PAS "rain eat accident"

– a noise distribution: the scaled unigram distribution in (Mikolov+, 2013)

– context information: the rest of the PAS (here, rain and accident)
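The noise distribution mentioned above can be sketched roughly as follows. This is only an illustration, assuming that "scaled" means raising unigram counts to the 3/4 power as in word2vec's negative sampling; the toy counts, the function names, and the choice to re-draw when the target word itself is sampled are all assumptions, not details taken from the slides.

```python
import random

def build_noise_distribution(counts, power=0.75):
    """Return {word: probability}, proportional to count ** power (assumed scaling)."""
    scaled = {word: count ** power for word, count in counts.items()}
    total = sum(scaled.values())
    return {word: value / total for word, value in scaled.items()}

def draw_noise_word(dist, target, rng):
    """Draw a word from dist, re-drawing whenever the target itself comes up."""
    words = list(dist)
    weights = [dist[w] for w in words]
    while True:
        word = rng.choices(words, weights=weights, k=1)[0]
        if word != target:
            return word

# Toy counts for illustration only (not BNC statistics).
counts = {"cause": 900, "eat": 400, "solve": 100}
noise = build_noise_distribution(counts)
drawn = draw_noise_word(noise, target="cause", rng=random.Random(0))
```

The scaling flattens the distribution, so frequent words are drawn somewhat less often than their raw relative frequency would suggest.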

A Word Prediction Model Using PASs

• Example PAS (category: verb): rain (argument 1), cause, accident (argument 2)

• The word vectors of the arguments, $v(\text{rain})$ and $v(\text{accident})$, are combined into a prediction vector:

$p(\text{cause}) = \tanh\left(h^{\text{verb\_arg12}}_{\text{arg1}} \ast v(\text{rain}) + h^{\text{verb\_arg12}}_{\text{arg2}} \ast v(\text{accident})\right)$

• Scores for the target word (cause) and the drawn word (eat):

$s = v(\text{cause}) \cdot p(\text{cause})$

$s' = v(\text{eat}) \cdot p(\text{cause})$

• Cost:

$\text{cost} = \max(0,\ 1 - s + s')$
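The scoring and cost on this slide can be sketched in a few lines. This is a minimal illustration, assuming that the ∗ in the formula is element-wise multiplication with per-category weight vectors; the 3-dimensional toy vectors and the function names are hypothetical.

```python
import math

def hadamard(a, b):
    """Element-wise product of two equal-length vectors."""
    return [x * y for x, y in zip(a, b)]

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def prediction_vector(h_arg1, v_arg1, h_arg2, v_arg2):
    """p = tanh(h_arg1 * v(arg1) + h_arg2 * v(arg2)), applied element-wise."""
    left = hadamard(h_arg1, v_arg1)
    right = hadamard(h_arg2, v_arg2)
    return [math.tanh(l + r) for l, r in zip(left, right)]

def hinge_cost(s, s_noise):
    """max(0, 1 - s + s'): zero once the target beats the noise word by a margin of 1."""
    return max(0.0, 1.0 - s + s_noise)

# Toy word vectors and weights (illustrative values only).
v = {
    "rain": [0.5, -0.2, 0.1],
    "accident": [0.3, 0.4, -0.1],
    "cause": [1.0, 0.5, 0.0],
    "eat": [-1.0, 0.2, 0.3],
}
h_arg1 = [1.0, 1.0, 1.0]
h_arg2 = [1.0, 1.0, 1.0]

p = prediction_vector(h_arg1, v["rain"], h_arg2, v["accident"])
s = dot(v["cause"], p)        # score of the target word
s_noise = dot(v["eat"], p)    # score of the drawn (noise) word
cost = hinge_cost(s, s_noise)
```

During training, the gradient of this cost would update both the word vectors and the weight vectors; that step is omitted here.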

What We Expect from the Model

• Learning word representations based on

– specific PAS categories

– selectional preferences

"rain" can be

• a subject of "cause" (not "eat")

• a cause of "accident"

[Figure: v(rain) (argument 1) and v(accident) (argument 2) are combined to score "cause" (s) against "eat" (s′)]

Examples

• PAS "eat at restaurant" (category: preposition; argument 1: eat, predicate: at, argument 2: restaurant)

– v(eat) (argument 1) and v(at) (predicate) are combined to score "restaurant" (s) against "cupboard" (s′)

• PAS "heavy rain" (category: adjective; argument 1: rain)

– v(rain) (argument 1) is used to score "heavy" (s) against "delicious" (s′)

Adding Bag-of-Words Contexts

• Providing additional context information

– Nouns and verbs in the same sentences

[Figure: in addition to v(rain) (argument 1) and v(accident) (argument 2), the BoW vectors v(road) and v(injure) are added before scoring "cause" (s) against "eat" (s′)]

Beyond Single Word Representations

• Learning representations composed from

– multiple words and

– specific relation categories

[Figure: the PAS "heavy rain" (category: adjective; argument 1: rain) appears in the vector space as "heavy rain", alongside "storm" and "downpour"]

A Specific PAS as a Single Token

• Using connections on graphs of PASs

[Figure: in the PAS graph, "rain cause accident" (verb; argument 1, argument 2) is connected to "heavy" (adjective; argument 1: rain) and "car" (noun; argument 1: accident)]

• Parameterization: the connected PASs are represented by the single-token vectors v(heavy__rain) and v(car__accident)

• Same as previously: the two vectors (argument 1, argument 2) are combined to score "cause" (s) against "eat" (s′)

Learned PAS Representations

• Similar tokens for each PAS representation in terms of cosine similarity

heavy_rain     chief_executive      world_war
rain           general_manager      second_war
thunderstorm   vice_president       plane_crash
downpour       executive_director   riot
blizzard       project_manager      last_war
much_rain      managing_director    great_war

Learned PAS Representations

• Similar tokens for each PAS representation in terms of cosine similarity

make_payment   solve_problem           meeting_take_place
make_order     achieve_objective       hold_meeting
carry_survey   bridge_gap              event_take_place
pay_tax        improve_quality         end_season
pay            deliver_information     discussion_take_place
impose_tax     encourage_development   do_work
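Lists like the ones above come from ranking tokens by cosine similarity to a query token. The sketch below shows that lookup under simple assumptions: the 2-dimensional vectors are made-up toy values, and `nearest` is a hypothetical helper rather than anything from the talk.

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(query, vectors, k):
    """Return the k tokens most similar to `query`, excluding the query itself."""
    scored = [
        (cosine(vectors[query], vec), token)
        for token, vec in vectors.items()
        if token != query
    ]
    scored.sort(reverse=True)
    return [token for _, token in scored[:k]]

# Toy vectors for illustration only; words and PAS tokens share one space.
vectors = {
    "heavy_rain": [1.0, 0.9],
    "downpour": [0.9, 1.0],
    "rain": [1.0, 0.5],
    "chief_executive": [-1.0, 0.2],
}
neighbours = nearest("heavy_rain", vectors, k=2)
```

Because single words and PAS tokens live in the same vector space, a PAS query such as heavy_rain can retrieve both kinds of neighbours.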

2. Jointly learning word representations and composition functions

Why Composition?

• Fully parameterized PAS representations, e.g., v(heavy__rain) and v(car__accident)

– Very large number of combinations of words → data sparseness

– Ignoring information from individual words

Incorporating Composed Vectors

• Word vectors: v(heavy), v(rain), v(car), v(accident)

• Composition functions: g_adj_arg1 and g_noun_arg1

• Composed vectors: v(heavy rain) and v(car accident)

• Same as previously: the composed vectors (argument 1, argument 2) are combined to score "cause" (s) against "eat" (s′)

Composition Functions in this Work

• Simple element-wise composition functions with and without tanh

– e.g., $v(\text{heavy rain}) = g_{\text{adj\_arg1}}(v(\text{heavy}), v(\text{rain}))$

Name      Composition function $g_{\text{adj\_arg1}}$
Add_l     $v(\text{heavy}) + v(\text{rain})$
Add_nl    $\tanh\left(v(\text{heavy}) + v(\text{rain})\right)$
WAdd_l    $m^{\text{adj\_arg1}}_{\text{pred}} \ast v(\text{heavy}) + m^{\text{adj\_arg1}}_{\text{arg1}} \ast v(\text{rain})$
WAdd_nl   $\tanh\left(m^{\text{adj\_arg1}}_{\text{pred}} \ast v(\text{heavy}) + m^{\text{adj\_arg1}}_{\text{arg1}} \ast v(\text{rain})\right)$
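The four composition functions listed on this slide can be written out directly. This is a small sketch, assuming the ∗ in the weighted variants is element-wise multiplication by per-category weight vectors; the function and argument names are chosen here for illustration.

```python
import math

def add_l(v_pred, v_arg1):
    """Add_l: plain element-wise sum."""
    return [a + b for a, b in zip(v_pred, v_arg1)]

def add_nl(v_pred, v_arg1):
    """Add_nl: element-wise sum followed by tanh."""
    return [math.tanh(x) for x in add_l(v_pred, v_arg1)]

def wadd_l(m_pred, v_pred, m_arg1, v_arg1):
    """WAdd_l: element-wise weighted sum with weight vectors m_pred and m_arg1."""
    return [mp * a + ma * b for mp, a, ma, b in zip(m_pred, v_pred, m_arg1, v_arg1)]

def wadd_nl(m_pred, v_pred, m_arg1, v_arg1):
    """WAdd_nl: element-wise weighted sum followed by tanh."""
    return [math.tanh(x) for x in wadd_l(m_pred, v_pred, m_arg1, v_arg1)]

# Toy vectors (illustrative values only).
v_heavy = [1.0, 2.0]
v_rain = [3.0, 4.0]
ones = [1.0, 1.0]

linear = add_l(v_heavy, v_rain)
weighted = wadd_l(ones, v_heavy, ones, v_rain)
squashed = wadd_nl(ones, v_heavy, ones, v_rain)
```

With all weights set to one, the weighted variants reduce to the unweighted ones, which makes the difference between the four options easy to see.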

Learned Composed Vectors

make payment     solve problem         run company
make repayment   solve dilemma         run firm
make money       solve task            run industry
make indemnity   solve difficulty      run corporation
make saving      solve trouble         run enterprise
make sum         solve contradiction   run club

Learned Composed Vectors

• Similar composed representations in terms of cosine similarity

people kill animal     animal kill people     meeting take place
anyone kill animal     creature kill people   briefing take place
man kill animal        effusion kill people   party take place
person kill animal     elephant kill people   session take place
people kill bird       tiger kill people      conference take place
predator kill animal   people kill people     investiture take place

Learned Composition Weights

• L2-norms of the weight vectors of WAdd_nl

– Clearly emphasizing head words (nouns for adj_arg1 and noun_arg1, verbs for verb_arg12)

Category     Predicate   Argument 1   Argument 2
adj_arg1     2.38        6.55         -
noun_arg1    3.37        5.60         -
verb_arg12   6.78        2.57         2.18
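As a rough illustration of the quantity reported in the table, the L2-norm of a weight vector is the square root of the sum of its squared entries. The weight vectors below are hypothetical toy values, not the learned weights.

```python
import math

def l2_norm(vector):
    """Square root of the sum of squared entries."""
    return math.sqrt(sum(x * x for x in vector))

# Hypothetical weight vectors for one category (toy values).
m_pred = [0.5, 0.5, 0.5, 0.5]
m_arg1 = [3.0, 4.0, 0.0, 0.0]

# A larger norm for m_arg1 would indicate that the argument (the head noun
# in an adjective-noun PAS) contributes more to the composed vector.
head_emphasised = l2_norm(m_arg1) > l2_norm(m_pred)
```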

3. Evaluation on phrase similarity tasks

Experimental Settings

• Training data

– PASs from BNC (~6 million sentences)

• adjective-noun, noun-noun

• prepositions and verbs with 2 arguments

• Dimensionality

– 50 and 1,000

• Optimization

– AdaGrad (Duchi+ 2011)

• learning rate: 0.05, mini-batch size: 32
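For readers unfamiliar with AdaGrad, a single update step looks roughly like the sketch below. Only the learning rate of 0.05 comes from the settings above; the small constant `eps`, the function name, and the toy gradient are assumptions for illustration. In the talk's setting the gradient would come from a mini-batch of 32 training examples.

```python
import math

def adagrad_update(params, grads, accum, lr=0.05, eps=1e-8):
    """Update params in place, scaling each step by the accumulated squared gradients."""
    for i, g in enumerate(grads):
        accum[i] += g * g
        params[i] -= lr * g / (math.sqrt(accum[i]) + eps)

# Toy example with a single parameter.
params = [1.0]
accum = [0.0]
adagrad_update(params, [2.0], accum)   # first step
first = params[0]
adagrad_update(params, [2.0], accum)   # second step is smaller than the first
second = params[0]
```

Because the accumulator only grows, frequently updated parameters take progressively smaller steps, while rarely seen words keep a comparatively large effective learning rate.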

Datasets for Evaluation

• Measuring the semantic similarity between

– Adjective-Noun phrases (AN)

– Noun-Noun phrases (NN)

– Verb-Object phrases (VO)

(Mitchell and Lapata 2010)

– Subject-Verb-Object phrases (SVO)

(Grefenstette and Sadrzadeh 2011)

• Example from the AN dataset: p1: vast amount, p2: large quantity

– human annotator: similarity score 7

– model: $\cos(v(p_1), v(p_2)) = 0.85$

– evaluated by Spearman's rank correlation between the human scores and the model's cosine similarities
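The evaluation protocol above can be sketched as follows: rank the model's cosine similarities and the human scores, then correlate the ranks. This is a minimal pure-Python version, assuming ties are handled with average ranks; the similarity values and human scores below are toy numbers, not results from the talk.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy data: model cosine similarities and human scores for four phrase pairs.
model_sims = [0.85, 0.80, 0.10, 0.30]
human_scores = [7, 7, 1, 1]
rho = spearman(model_sims, human_scores)
```

Only the ordering of the similarities matters for this metric, so the model's cosine values do not need to be on the same 1 to 7 scale as the human scores.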

Examples of Phrase Pairs

• Examples of phrase pairs for noun phrase tasks

AN
phrase pair                           score
vast amount / large quantity          7
important part / significant role     7
efficient use / little room           1
early stage / dark eye                1

NN
phrase pair                              score
wage increase / tax rate                 7
education course / training programme    6
office worker / kitchen door             2
study group / news agency                1

Examples of Phrase Pairs

• Examples of phrase pairs for verb phrase tasks

VO
phrase pair                  score
start work / begin career    7
pour tea / drink water       6
shut door / close eye        1
wave hand / start work       1

SVO
phrase pair                                       score
student write name / student spell name           7
child show sign / child express sign              6
river meet sea / river visit sea                  1
system meet criterion / system visit criterion    1

Main Results (50 dim)

• Strong baselines produced by word2vec

• Nice scores for verb phrase tasks

[Figure: bar chart of correlation score (0 to 0.7) on the AN, NN, VO, and SVO tasks for Add_l, Add_nl, WAdd_l, WAdd_nl, word2vec, and Human]

Main Results (1,000 dim)

• Nice scores for verb phrase tasks

• Consistently outperforming 50-dimensional vectors

[Figure: bar chart of correlation score (0 to 0.7) on the AN, NN, VO, and SVO tasks for Add_l, Add_nl, WAdd_l, WAdd_nl, word2vec, and Human]

Comparison with Previous Work

• The AN, NN, and VO tasks

– BL: element-wise multiplications (Blacoe and Lapata 2012)

– HB: recursive neural networks with CCGs (Hermann and Blunsom 2013)

– KS: tensor-based composition models (Kartsaklis and Sadrzadeh 2013)

• The SVO task

– GS, VC: tensor-based composition models (Grefenstette and Sadrzadeh 2011), (Van de Cruys+ 2013)

The AN, NN, and VO Tasks

• 50 dim

– Comparable to state-of-the-art scores

• 1,000 dim

– New state-of-the-art score for the VO task

[Figure: bar charts of correlation score (0 to 0.7) on the AN, NN, and VO tasks for Add_nl, WAdd_nl, BL, HB, KS, and Human]

The SVO Task

• State-of-the-art models use large corpora

– e.g., ukWaC corpus (~2B words)

• Achieving the state-of-the-art score using a much smaller corpus

– BNC (~0.1B words) vs. ukWaC (~2B words)

[Figure: bar chart of correlation score (0 to 0.7) on the SVO task for WAdd_nl (BNC), GS (BNC), VC (ukWaC), and Human]

Effects of BoW Contexts

• BoW contexts are helpful for the verb phrase tasks

– The results might depend on how the BoW contexts are constructed

[Figure: bar chart of correlation score (0 to 0.7) on the AN, NN, VO, and SVO tasks for WAdd_nl without BoW, WAdd_nl with BoW, and Human]

Conclusion

• Jointly learning composition functions

– with syntactic structures

– without any pre-trained word vectors

• State-of-the-art scores for verb phrase similarity tasks

Future Work

• Incorporating more sophisticated composition functions to improve verb phrase representations

• Learning full phrase representations rather than only 2- or 3-word phrases

Thank You Very Much!

• Any questions?