
Statistical Association and Multiword Expressions

Stefan Evert
Institute of Cognitive Science

University of Osnabrück

[email protected] | purl.org/stefan.evert

RU Groningen | 10.05.2011

multiword expressions (phraseology)

Collocations vs. multiword expressions

2

multiword expressions:
• based on native speaker judgments
• linguistic criteria/tests
• different subtypes
• still no precise definition & subtype classification
• "grey area" between idiomatic MWE and free combinations

empirical collocations (Firth / Sinclair):
• quantitative empirical phenomenon
• based on corpus data
• linguistic status?
• words collocate for different "reasons"
• unclear: appropriate operationalisation ("collocation measure")

multiword expressions

Types & examples of multiword expressions

3

Types: idioms · figurative expressions · lexical collocations · light verbs (SVC, FVG) · complex lexical items (MWU) · English noun compounds · named entities · particle verbs (VPC) · (multiword) terminology · institutionalised phrases & clichés

Examples: kick the bucket · eat humble pie · spill the beans · brush one's teeth · make tea · heavy smoker · give up · hand in · in front of · déjà vu · to and fro · give a talk · draw a conclusion · law of mass action · white dwarf · apple juice · skeleton key · bucket and spade · fish and chips · Red Cross · New York City

Scales of MWE-ness

4

Three gradient dimensions:
• compositionality (semantic dimension): compositional · semi-compositional · decomposable metaphor · opaque idiom · pragmatic aspects
• modifiability (syntactic dimension): compositional syntax · limited variability · semi-fixed construction (morphosyntactic preferences) · rigid MWU / n-gram
• substitutability (lexical dimension): productive MWE pattern · partly determined (selectional restrictions) · completely determined (no substitution) · LWC


multiword expressions

Types & examples of multiword expressions

5

(type map repeated from slide 3)

Produced with UCS toolkit | http://www.collocations.de/software.html

Examples of collocations (BNC)

6

bucket (collocate | f | G2):

water 183 1064.079

spade 31 341.138

bucket 34 306.078

plastic 36 243.863

slop 15 213.303

mop 18 207.117

size 42 200.162

fill 38 195.749

record 42 174.693

throw 35 172.264

into 87 149.409

empty 18 148.807

with 191 147.546

ice 22 131.697

randomize 9 115.335

kick 19 113.238

of 488 81.765

single-record 5 81.107

large 36 80.852

shop 23 80.794

seat 20 78.645

rose (collocate | f | G2):

red 209 1113.501

shrub 68 572.793

hilaire 46 561.504

garden 133 548.936

cottage 75 418.288

bowl 66 389.237

petal 42 361.140

bush 65 324.711

net 63 321.964

white 104 319.160

pink 54 299.730

rose 66 285.331

mid-term 27 276.903

gun 60 271.541

axl 20 269.491

mary 64 265.474

wild 58 257.805

flower 63 251.148

per 74 231.465

miss 62 225.623

floyd 26 218.556

Produced with UCS toolkit | http://www.collocations.de/software.html

Examples of collocations (BNC)

7

bucket: nouns (collocate | f | G2):

water 183 1064.079

spade 31 341.138

bucket 34 306.078

plastic 36 243.863

slop 15 213.303

mop 18 207.117

size 42 200.162

record 42 174.693

ice 22 131.697

shop 23 80.794

seat 20 78.645

sand 13 68.814

brigade 10 67.080

shovel 7 64.335

coal 14 63.609

oats 7 62.659

rhino 7 60.813

champagne 10 59.556

density 10 59.132

algorithm 8 57.552

container 9 54.561

bucket: verbs (collocate | f | G2):

fill 38 195.749

throw 35 172.264

empty 18 148.807

randomize 9 115.335

kick 19 113.238

put 38 66.174

hold 31 62.765

tip 10 61.670

carry 25 59.554

fetch 9 52.665

chuck 7 50.638

store 10 48.327

pour 10 47.206

weep 7 43.396

douse 4 37.842

used 13 31.791

pack 7 29.582

use 33 28.469

slop 3 27.238

drop 10 26.855

clean 7 26.830

bucket: adjectives (collocate | f | G2):

single-record 5 81.107

large 36 80.852

cold 17 63.644

galvanized 4 51.373

full 22 49.746

steaming 4 32.883

leaky 3 29.520

empty 8 28.670

bottomless 3 28.397

galvanised 3 27.186

soggy 3 25.022

iced 3 24.535

small 20 24.033

clean 7 23.416

bowed 2 20.506

omnipresent 2 19.811

anglo-saxon 3 18.219

wooden 5 17.251

ice-cold 2 17.211

soapy 2 16.005

ten 10 15.864

http://beta.sketchengine.co.uk/

Word sketch

8


What are collocations?

9

(collocate tables for bucket, nouns / verbs / adjectives, repeated from slide 7, now annotated by type)

Annotated types: idiom · compound · technical · lexical collocation · semantic relation · facts of life

Multiword Expressions (MWE)

Why collocations are important

10

Primary tool for MWE identification
■ e.g. Evert/Krenn (2001, 2005) | MWE Workshops & Shared Task

Language description: approximation of word meaning
■ Firth (1957) | Sinclair (1991) | computational lexicography

Psycholinguistic relevance: priming & syntactic associates
■ priming effects | lexical priming (Hoey 2005) | link grammar etc.

Collostructions, subcategorisation & selectional preferences
■ "collocations" between words & syntactic patterns

Applications in NLP, e.g. long-distance adaptors for LM

Basis of distributional semantic models (term-term matrix)

Key questions for MWE and collocations

11

Linguistic definition of MWE and their subtypes

Relation between (different subtypes of) MWE and (different quantitative notions of) empirical collocations

Operationalisation of empirical collocations and appropriate quantitative measures

12

Co-occurrence and statistical association


13

Operationalising collocations

Early "definitions"■ recurrent, habitual word combinations (Firth 1957)■ greater than chance co-occurrence (Sinclair 1966, 1970)■ significant collocations (Kilgarriff & Tugwell 2002)

Ingredient 1: co-occurrence■ surface vs. textual vs. syntactic (Evert 2004, 2008)■ contingency tables of joint & marginal frequencies

Ingredient 2: statistical association■ quantitative measure for tendency of events to co-occur■ operationalises intuition of recurrent, “salient” combinations

Collocational span of 4 words (L4, R4), limited by sentence boundaries

Surface co-occurrence

14

A vast deal of coolness and a peculiar degree of judgement, are requisite in catching a hat. A man must not be precipitate, or he runs over it; he must not rush into the opposite extreme, or he loses it altogether. [...] There was a fine gentle wind, and Mr. Pickwick's hat rolled sportively before it. The wind puffed, and Mr. Pickwick puffed, and the hat rolled over and over as merrily as a lively porpoise in a strong tide; and on it might have rolled, far beyond Mr. Pickwick's reach, had not its course been providentially stopped, just as that gentleman was on the point of resigning it to its fate.

f(hat, roll) = 2
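A minimal Python sketch of this counting scheme (toy, pre-lemmatised sentences; not the UCS toolkit): every occurrence of the collocate within four tokens to the left or right of the node, inside the same sentence, counts as one co-occurrence.

```python
# Surface co-occurrence with a symmetric (L4, R4) span, limited by sentence
# boundaries; sentences are assumed pre-tokenised and lemmatised (toy data).
sentences = [
    "there be a fine gentle wind and mr pickwick 's hat roll sportively before it".split(),
    "the wind puff and mr pickwick puff and the hat roll over and over as merrily as a lively porpoise in a strong tide".split(),
]

def surface_cooc(sentences, node, collocate, span=4):
    """Count occurrences of `collocate` within +/- `span` tokens of `node`."""
    f = 0
    for sent in sentences:
        for i, tok in enumerate(sent):
            if tok == node:
                window = sent[max(0, i - span):i] + sent[i + 1:i + 1 + span]
                f += window.count(collocate)
    return f

print(surface_cooc(sentences, "hat", "roll"))  # -> 2 on this toy fragment
```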

Co-occurrence within sentences

Textual co-occurrence

15

f(hat, over) = 1

sentence | hat | over
"A vast deal of coolness and a peculiar degree of judgement, are requisite in catching a hat." | hat | —
"A man must not be precipitate, or he runs over it;" | — | over
"he must not rush into the opposite extreme, or he loses it altogether." | — | —
"There was a fine gentle wind, and Mr. Pickwick's hat rolled sportively before it." | hat | —
"The wind puffed, and Mr. Pickwick puffed, and the hat rolled over and over as merrily as a lively porpoise in a strong tide;" | hat | over

Adjectival noun modification (prenominal adjectives)

Syntactic co-occurrence

16

f(young, gentleman) = 1

In an open barouche [...] stood a stout old gentleman, in a blue coat and bright buttons, corduroy breeches and top-boots; two young ladies in scarfs and feathers; a young gentleman apparently enamoured of one of the young ladies in scarfs and feathers; a lady of doubtful age, probably the aunt of the aforesaid; and [...]

→ extracted pairs: open barouche · stout gentleman · old gentleman · blue coat · bright button · young lady · young gentleman · young lady · doubtful age

Page 5: Statistical Association and Multiword Expressions (Firth ... · white dwarf apple juice skeleton key bucket and spade fish and chips Red Cross New York City Scales of MWE-ness 4

Observed frequency

17

Collocations: "recurrent" combinations → simply use co-occurrence frequency as measure of salience?

Example: most frequent adjacent bigrams from Brown corpus

Frequent combinations don't seem to be very interesting collocations.

Mathematical reason:
■ f(is to) = 260
■ f(is) ≈ 10,000, f(to) ≈ 26,000
■ one would expect 260 co-occurrences if words were ordered randomly!

adjacent bigrams (Brown): bigram | f | rank

of the 9702 1

in the 6018 2

to the 3478 3

on the 2459 4

and the 2242 5

for the 1845 6

to be 1715 7

at the 1654 8

with the 1530 9

it is 1482 10

of a 1469 11

in a 1413 12

from the 1410 13

that the 1378 14

by the 1347 15

it was 1338 16

he was 1110 17

as a 980 18

he had 933 19

… … …

is to 260 133

adjacent bigrams (Brown): bigram | f | expected | rank

of the 9702 2186.75 1

in the 6018 1260.22 2

to the 3478 1613.01 3

on the 2459 384.84 4

and the 2242 1768.75 5

for the 1845 571.23 6

to be 1715 173.16 7

at the 1654 323.29 8

with the 1530 427.89 9

it is 1482 87.02 10

of a 1469 759.86 11

in a 1413 437.91 12

from the 1410 258.53 13

that the 1378 650.83 14

by the 1347 322.97 15

it was 1338 86.32 16

he was 1110 99.98 17

as a 980 155.42 18

he had 933 53.02 19

… … … …

is to 260 266.61 133

Observed & expected frequency

18

Collocations: "recurrent" combinations → use co-occurrence frequency as measure of salience

Example: most frequent adjacent bigrams from Brown corpus

Frequent combinations don't seem to be very interesting collocations.

Mathematical reason:
■ f(is to) = 260
■ f(is) ≈ 10,000, f(to) ≈ 26,000
■ one would expect 260 co-occurrences if words were ordered randomly!
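The arithmetic behind this argument, as a short sketch (N ≈ 1,000,000 tokens, the approximate size of the Brown corpus, is an assumption here):

```python
# Expected frequency of the bigram "is to" under random word order,
# using the approximate counts from the slide.
f_is, f_to, N = 10_000, 26_000, 1_000_000
E = f_is * f_to / N
print(E)   # -> 260.0, i.e. exactly the observed f(is to) = 260
```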

Contingency table = cross-classification of "items"
■ mathematical basis for concept of statistical association

Statistics tells us how to calculate expected cell counts

Observed & expected contingency tables

19

observed:
            w2      ¬w2
  w1       O11      O12    = R1
 ¬w1       O21      O22    = R2
          = C1     = C2    = N

expected ($E_{ij} = R_i C_j / N$):
            w2              ¬w2
  w1    E11 = R1C1/N    E12 = R1C2/N
 ¬w1    E21 = R2C1/N    E22 = R2C2/N
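A small sketch (the helper name is mine) that builds both tables from the joint frequency and the marginals, illustrating $E_{ij} = R_i C_j / N$:

```python
# Observed and expected 2x2 contingency tables from the joint frequency
# f = O11, the marginal frequencies f1 = R1 and f2 = C1, and sample size N.
def contingency(f, f1, f2, N):
    O = [[f,      f1 - f],
         [f2 - f, N - f1 - f2 + f]]
    R = [f1, N - f1]                      # row sums R1, R2
    C = [f2, N - f2]                      # column sums C1, C2
    E = [[R[i] * C[j] / N for j in (0, 1)] for i in (0, 1)]
    return O, E

# hat/over example from the following slides: f = 1, R1 = 3, C1 = 2, N = 5
O, E = contingency(1, 3, 2, 5)
print(O)   # [[1, 2], [1, 1]]
print(E)   # [[1.2, 1.8], [0.8, 1.2]]
```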

Item = sentence (or other text segment)

Textual co-occurrence

20

f(hat, over) = 1
sample size N = 5

(sentence-by-sentence markers as on slide 15)

            over    ¬over
  hat       O11      O12     R1
 ¬hat       O21      O22     R2
            C1       C2      N


Item = sentence (or other text segment)

Textual co-occurrence

21

f(hat, over) = 1
sample size N = 5

(sentence-by-sentence markers as on slide 15)

            over    ¬over
  hat        1        2       3
 ¬hat        1        1       2
             2        3       5

Item = instance of adjective-noun modification

Syntactic co-occurrence

22

f(young, gentleman) = 1
sample size N = 9

(passage and extracted adjective-noun pairs as on slide 16)

              gentleman   ¬gentleman
  young            1            2        3
 ¬young            2            4        6
                   3            6        9

Item = token

Surface co-occurrence

23

(Pickwick passage as on slide 14)

f(hat, roll) = 2
sample size N = 108

                 roll    ¬roll
  NEAR(hat)        2       18      20
 ¬NEAR(hat)        1       87      88
                   3      105     108

Data from BNC | RASP parser: http://www.informatics.susx.ac.uk/research/nlp/rasp/

Comparison

24

"sell" - V+Obj"sell" - V+Objcollocate G2

goods 1391.9

share 1231.6

product 946.0

house 665.4

property 602.1

land 478.0

ticket 453.3

asset 413.7

copy 399.4

car 306.8

business 246.8

stock 224.1

stake 205.5

home 174.3

liquor 167.0

soul 166.0

bond 141.9

produce 138.3

company 110.8

unit 105.5

painting 105.2

"sell" - (L0, R2)"sell" - (L0, R2)collocate G2

share 1312.8

goods 1309.6

product 1051.1

house 707.3

ticket 611.0

property 460.9

land 388.4

copy 383.5

car 353.6

auction 276.3

soul 236.8

liquor 223.7

asset 182.1

produce 166.9

ware 156.5

bond 149.5

insurance 144.5

stake 131.2

stock 125.6

advertising 112.6

cigarette 103.9

"sell" - (L5, R5)"sell" - (L5, R5)collocate G2

goods 2177.7

product 1752.2

share 1749.0

copy 1341.3

shop 1232.0

ticket 1146.4

property 903.0

company 869.1

price 774.8

house 728.4

dealer 632.8

car 595.5

land 589.9

asset 572.4

market 524.0

stock 501.7

business 490.5

auction 449.7

stake 362.0

liquor 335.4

store 298.6

"sell" - sentence"sell" - sentencecollocate G2

price 3496.8

company 2661.1

market 2600.8

share 2593.6

goods 2435.3

product 2363.9

shop 1922.8

sale 1460.9

copy 1418.9

dealer 1381.7

property 1348.2

business 1347.8

sales 1171.4

stock 1149.4

ticket 1145.2

profit 1109.3

buyer 1076.7

house 1048.2

auction 916.3

owner 876.1

asset 873.6


Parsing accuracy

25

How reliable is syntactic co-occurrence?

Evert/Kermes (2003) evaluate adjective-noun identification
■ German prenominal adjectives
■ TIGER Treebank used as gold standard

Verb-object and verb-subject relations are much harder
■ Charniak-Johnson parser achieves 89.3% (direct object) and 96.5% (subject) on example sentences from English Wiktionary
■ more difficult for languages with free word order (German)

Table 1: Results for Adj+N extraction task

candidates from | perfect tagging precision | perfect tagging recall | TreeTagger precision | TreeTagger recall
adjacent pairs  | 98.47% | 90.58% | 94.81% | 84.85%
window-based    | 97.14% | 96.74% | 93.85% | 90.44%
YAC chunks      | 98.16% | 97.94% | 95.51% | 91.67%

[…]forward stochastic tagger and naive window-based extraction, precision and recall values above 90% provide an excellent starting point for statistical ranking.

The considerable differences between the sets of pair types primarily affect hapaxes and have little impact on statistical methods for collocation identification (where hapaxes are rarely found among the higher-ranking candidates). Likewise, small changes in the frequency profiles of more frequent pairs have little impact on the association scores and the precise ranking of the candidates. It will be interesting to see how these results translate to more demanding extraction tasks.

References

Kenneth W. Church and Patrick Hanks. 1990. Word association norms, mutual information, and lexicography. Computational Linguistics, 16(1):22–29.

Béatrice Daille. 1994. Approche mixte pour l'extraction automatique de terminologie : statistiques lexicales et filtres linguistiques. Ph.D. thesis, Université Paris 7.

Stefan Evert and Brigitte Krenn. 2001. Methods for the qualitative evaluation of lexical association measures. In Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics, Toulouse, France.

Hannah Kermes and Stefan Evert. 2002. YAC – a recursive chunker for unrestricted German text. In Manuel González Rodríguez and Carmen Paz Suárez Araujo, editors, Proceedings of the Third International Conference on Language Resources and Evaluation (LREC), volume V, pages 1805–1812, Las Palmas, Spain.

Brigitte Krenn. 2000. The Usual Suspects: Data-Oriented Models for the Identification and Representation of Lexical Collocations. DFKI & Universität des Saarlandes, Saarbrücken.

Andreas Mengel and Wolfgang Lezius. 2000. An XML-based representation format for syntactically annotated corpora. In Proceedings of the Second International Conference on Language Resources and Evaluation (LREC), volume 1, pages 121–126, Athens, Greece.

Helmut Schmid. 1994. Probabilistic part-of-speech tagging using decision trees. In International Conference on New Methods in Language Processing, pages 44–49, Manchester, UK.

Wojciech Skut, Brigitte Krenn, Thorsten Brants, and Hans Uszkoreit. 1997. An annotation scheme for free word order languages. In Proceedings of the Fifth Conference on Applied Natural Language Processing (ANLP-97), Washington, DC.

W. Skut, T. Brants, B. Krenn, and H. Uszkoreit. 1998. A linguistically interpreted corpus of German newspaper texts. In Proceedings of the ESSLLI Workshop on Recent Advances in Corpus Annotation, Saarbrücken, Germany.

Frank Smadja. 1993. Retrieving collocations from text: Xtract. Computational Linguistics, 19(1):143–177.

See Evert (2004, 2008) for details | http://www.collocations.de/

Association measures (AM)

26

(observed and expected contingency tables as on slide 19: $E_{ij} = R_i C_j / N$)

Observed (O) vs. expected (E) co-occurrence frequency

Simple measures

27

$$\mathrm{MI} = \log_2\frac{O}{E} \qquad \mathrm{MI}^k = \log_2\frac{O^k}{E} \qquad \text{local-MI} = O\cdot\log_2\frac{O}{E}$$

$$\text{z-score} = \frac{O-E}{\sqrt{E}} \qquad \text{t-score} = \frac{O-E}{\sqrt{O}} \qquad \text{simple-ll} = 2\Bigl(O\log\frac{O}{E} - (O-E)\Bigr)$$

Comparison of full contingency tables (observed vs. expected)

Statistical measures

28

(simple measures as on slide 27, plus measures over the full contingency table:)

$$\text{chi-squared} = \sum_{ij}\frac{(O_{ij}-E_{ij})^2}{E_{ij}} \qquad \text{chi-squared}_{\mathrm{corr}} = \frac{N\,\bigl(|O_{11}O_{22}-O_{12}O_{21}| - N/2\bigr)^2}{R_1 R_2 C_1 C_2}$$

$$\text{log-likelihood} = 2\sum_{ij} O_{ij}\log\frac{O_{ij}}{E_{ij}} \qquad \text{average-MI} = \sum_{ij} O_{ij}\cdot\log_2\frac{O_{ij}}{E_{ij}}$$

$$\text{Dice} = \frac{2\,O_{11}}{R_1 + C_1} \qquad \text{odds-ratio} = \log\frac{\bigl(O_{11}+\tfrac{1}{2}\bigr)\bigl(O_{22}+\tfrac{1}{2}\bigr)}{\bigl(O_{12}+\tfrac{1}{2}\bigr)\bigl(O_{21}+\tfrac{1}{2}\bigr)}$$
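A companion Python sketch for the full-table measures (again, names are mine; O and E are the nested 2×2 lists built earlier):

```python
import math

# Association measures over the full 2x2 table (observed O, expected E).
def table_measures(O, E):
    N = sum(map(sum, O))
    R1, C1 = O[0][0] + O[0][1], O[0][0] + O[1][0]
    R2, C2 = N - R1, N - C1
    cells = [(i, j) for i in (0, 1) for j in (0, 1)]
    ll   = 2 * sum(O[i][j] * math.log(O[i][j] / E[i][j])
                   for i, j in cells if O[i][j] > 0)
    chi2 = sum((O[i][j] - E[i][j]) ** 2 / E[i][j] for i, j in cells)
    chi2_corr = (N * (abs(O[0][0] * O[1][1] - O[0][1] * O[1][0]) - N / 2) ** 2
                 / (R1 * R2 * C1 * C2))
    avg_mi = sum(O[i][j] * math.log2(O[i][j] / E[i][j])
                 for i, j in cells if O[i][j] > 0)
    dice = 2 * O[0][0] / (R1 + C1)
    odds = math.log((O[0][0] + .5) * (O[1][1] + .5)
                    / ((O[0][1] + .5) * (O[1][0] + .5)))
    return {"log-likelihood": ll, "chi-squared": chi2,
            "chi-squared_corr": chi2_corr, "average-MI": avg_mi,
            "Dice": dice, "odds-ratio": odds}
```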


See Evert (2004, 2008) for details | http://www.collocations.de/

Association measures (AM)

29

recommended measures (Evert 2008)

(formulas as on slides 27–28, with the recommended measures highlighted)

Collocates of "bucket" in BNC (from Evert 2008)

Comparison

30

collocate | f | f2 | simple-ll
water 184 37012 1083.18
a 590 2164246 449.30
spade 31 465 342.31
plastic 36 4375 247.65
size 42 14448 203.36
slop 17 166 202.30
mop 20 536 197.68
throw 38 11308 194.66
fill 37 10722 191.44
with 196 658584 171.78

collocate | f | f2 | MI
fourteen-record 4 4 13.31
ten-record 3 3 13.31
multi-record 2 2 13.31
two-record 2 2 13.31
a-row 1 1 13.31
anti-sweat 1 1 13.31
axe-blade 1 1 13.31
bastarding 1 1 13.31
dippermouth 1 1 13.31
Dok 1 1 13.31

collocate | f | f2 | t-score
a 590 2164246 15.53
water 184 37012 13.30
and 479 2616723 10.14
with 196 658584 9.38
of 497 3040670 8.89
the 832 6041238 8.26
into 87 157565 7.67
size 42 14448 6.26
in 298 1937966 6.23
record 43 29404 6.12

collocate (f ≥ 5) | f | f2 | MI
single-record 5 8 12.63
randomize 10 57 10.80
slop 17 166 10.03
spade 31 465 9.41
mop 20 536 8.57
oats 7 286 7.96
shovel 8 358 7.83
rhino 7 326 7.77
synonym 7 363 7.62
bucket 18 1356 7.08

Pecina (2005) collects 57 association measures (and some other formulae)

31

So many measures, so little time …

Measures collected by Pecina (names; formulas in Table 1 of Pecina 2005):
1 Mean component offset · 2 Variance component offset · 3 Joint probability · 4 Conditional probability · 5 Reverse conditional probability · 6 Pointwise mutual information · 7 Mutual dependency (MD) · 8 Log frequency biased MD · 9 Normalized expectation · 10 Mutual expectation · 11 Salience · 12 Pearson's χ² test · 13 Fisher's exact test · 14 t test · 15 z score · 16 Poisson significance measure · 17 Log likelihood ratio · 18 Squared log likelihood ratio

Association coefficients:
19 Russel-Rao · 20 Sokal-Michiner · 21 Rogers-Tanimoto · 22 Hamann · 23 Third Sokal-Sneath · 24 Jaccard · 25 First Kulczynsky · 26 Second Sokal-Sneath · 27 Second Kulczynski · 28 Fourth Sokal-Sneath · 29 Odds ratio · 30 Yule's ω · 31 Yule's Q · 32 Driver-Kroeber · 33 Fifth Sokal-Sneath · 34 Pearson · 35 Baroni-Urbani · 36 Braun-Blanquet · 37 Simpson · 38 Michael · 39 Mountford · 40 Fager · 41 Unigram subtuples · 42 U cost · 43 S cost · 44 R cost · 45 T combined cost · 46 Phi · 47 Kappa · 48 J measure · 49 Gini index · 50 Confidence · 51 Laplace · 52 Conviction · 53 Piatersky-Shapiro · 54 Certainty factor · 55 Added value (AV) · 56 Collective strength · 57 Klosgen

Context measures:
58 Context entropy · 59 Left context entropy · 60 Right context entropy · 61 Left context divergence · 62 Right context divergence · 63 Cross entropy · 64 Reverse cross entropy · 65 Intersection measure · 66 Euclidean norm · 67 Cosine norm · 68 L1 norm · 69 Confusion probability · 70 Reverse confusion probability · 71 Jensen-Shannon divergence · 72 Cosine of pointwise MI · 73 KL divergence · 74 Reverse KL divergence · 75 Skew divergence · 76 Reverse skew divergence · 77 Phrase word co-occurrence · 78 Word association

Cosine context similarity: 79 in boolean vector space · 80 in tf vector space · 81 in tf·idf vector space
Dice context similarity: 82 in boolean vector space · 83 in tf vector space · 84 in tf·idf vector space

Linguistic features:
85 Part of speech · 86 Dependency type · 87 Dependency structure

32

Which measure?


33

How to choose an association measure

Mathematical discussion

Direct comparison

Task-based evaluation

Geometric interpretation
■ combine with insights from task-based evaluation

34

Significance of association

Exact hypothesis tests:
$$\text{Poisson-likelihood} = e^{-E}\cdot\frac{E^O}{O!} \qquad \text{Poisson} = \sum_{k=O}^{\infty} e^{-E}\frac{E^k}{k!} \qquad \text{Fisher} = \sum_{k=O_{11}}^{\min\{R_1,C_1\}} \frac{\binom{C_1}{k}\binom{C_2}{R_1-k}}{\binom{N}{R_1}}$$

Asymptotic hypothesis tests:
$$\text{log-likelihood} = 2\sum_{ij} O_{ij}\log\frac{O_{ij}}{E_{ij}} \qquad \text{chi-squared} = \sum_{i,j}\frac{(O_{ij}-E_{ij})^2}{E_{ij}} = \frac{N}{E_{22}}\cdot\frac{(O-E)^2}{E}$$

Likelihood measures:
$$\text{simple-ll} = 2\Bigl(O\log\frac{O}{E} - (O-E)\Bigr) \qquad \text{Poisson-Stirling} = O\cdot\bigl(\log O - \log E - 1\bigr)$$

Simple hypothesis tests:
$$\text{t-score} = \frac{O-E}{\sqrt{O}} \qquad \text{z-score} = \frac{O-E}{\sqrt{E}}$$

Also shown:
$$\text{gmean} = \sqrt{p_F\,p_B} = \frac{O_{11}}{\sqrt{R_1 C_1}} \approx \frac{O}{\sqrt{NE}} \qquad \mathrm{MI} = \log_2\frac{O}{E}$$
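For the exact tests, ready-made implementations exist; a sketch using SciPy (an assumption on my part; the slides themselves use the UCS toolkit, not SciPy):

```python
from scipy.stats import fisher_exact, poisson

# Exact significance tests on the hat/over table from slide 21.
O   = [[1, 2], [1, 1]]
E11 = 1.2                                # expected O11 = R1 * C1 / N

# One-sided Fisher test: probability of observing O11 or more co-occurrences.
_, p_fisher = fisher_exact(O, alternative="greater")

# Poisson test: upper tail P(X >= O11) of a Poisson(E11) distribution.
p_poisson = poisson.sf(O[0][0] - 1, E11)

print(p_fisher, p_poisson)
```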

35

Degree of association / determination

Measures of non-independence:
$$\text{relative-risk} = \log\frac{O_{11}\,C_2}{O_{12}\,C_1} \qquad \text{odds-ratio} = \log\frac{O_{11}\,O_{22}}{O_{12}\,O_{21}} \qquad \text{gmean} = \frac{O_{11}}{\sqrt{R_1 C_1}} \approx \frac{O_{11}}{\sqrt{N E_{11}}}$$

Measures of (mutual) determination, with $p_F = \Pr(w_2 \mid w_1)$ and $p_B = \Pr(w_1 \mid w_2)$:
$$\text{Dice} = \Bigl(\frac{1}{2}\frac{1}{p_F} + \frac{1}{2}\frac{1}{p_B}\Bigr)^{-1} = \frac{2\,O_{11}}{R_1 + C_1} \qquad \text{MS} = \min\{p_F, p_B\} = \min\Bigl\{\frac{O_{11}}{R_1}, \frac{O_{11}}{C_1}\Bigr\} = \frac{O_{11}}{\max\{R_1, C_1\}}$$

Comparison of p-values on simulated data (see Evert 2004, 2008)

Direct comparison of association scores

36

[Scatterplot: scores of chi-squared, Poisson, log-likelihood and t-score plotted against Fisher's exact test p-values on simulated data; log-log axes 1–10000; see legend: chi-squared, Poisson, log-likelihood, t-score]


[Bar chart: mean average precision (0.2–0.7) of all association measures in descending order; method numbers refer to Table 1]

Figure 3: a) Mean average precision of all association measures in descending order. Methods are referred by numbers from Table 1. The solid points correspond to measures selected by the model reduction algorithm from Section 5. b) Visualization of p-values from the significance tests of difference between each method pair (order is the same for both graphs). The darker points correspond to p-values greater than α = 0.1 and indicate methods with statistically indistinguishable performance (measured by paired Wilcoxon test on values of average precision obtained from five independent data folds).

3.4 Experiments and results

In the initial experiments, we implemented all 82 association measures from Table 1, processed all morphologically and syntactically annotated sentences from PDT 2.0, and computed scores of all the association measures for each dependency bigram in the reference data. For each association measure and each of the five evaluation data folds, we computed precision-recall scores and drew an averaged precision-recall curve. Curves of some well-performing methods are depicted in Figure 2. Next, for each association measure and each data fold, we estimated scores of average precision on the narrower recall interval (0.1, 0.9), computed mean average precision, ranked the association measures according to MAP in descending order, and depicted the result in Figure 3 a). Finally, we applied a paired Wilcoxon test, detected measures with statistically indistinguishable performance, and visualized this information in Figure 3 b).

A baseline system ranking bigrams randomly operates with average precision of 20.9%. The best performing method for collocation extraction measured by mean average precision is cosine context similarity in boolean vector space (77) (MAP 66.49%), followed by 16 other association measures with nearly identical performance (Figure 3 a). They include some popular methods well known to perform reliably in this task, such as pointwise mutual information (4), Pearson's χ² test (10), z score (13), odds ratio (27), or squared log likelihood ratio (16).

The interesting point to note is that, in terms of MAP, context similarity measures, e.g. (77), slightly outperform measures based on simple occurrence frequencies, e.g. (39). In a more thorough comparison by precision-recall curves, we observe that the former very significantly predominate the latter in the first half of the recall interval and vice versa in the second half (Figure 2). This is a case where MAP is not a sufficient metric for comparison of association measure performance. It is also worth pointing out that even if two methods have the same precision-recall curves, the actual bigram rank order can be very different. The existence of such non-correlated (in terms of ranking) measures will be essential in the following sections.

4 Combining association measures

Each collocation candidate $x^i$ can be described by the feature vector $x^i = (x^i_1, \ldots, x^i_{82})^T$ consisting of 82 association scores from Table 1 and assigned a label $y^i \in \{0, 1\}$ which indicates whether the bigram is considered to be a collocation ($y = 1$) or not ($y = 0$). We look for a ranker function $f(x) \to \mathbb{R}$ that determines the strength of lexical association between components of bigram $x$ and hence has the character of an association measure. This allows us to compare it with other association measures by the same means of precision-recall curves and mean average precision. Further, we present several classification methods and demonstrate how they can be employed for ranking, i.e. what function can be used as a ranker. For references see Venables and Ripley (2002).

4.1 Linear logistic regression

An additive model for binary response is represented by a generalized linear model (GLM) in the form of logistic regression:

$$\operatorname{logit}(\pi) = \beta_0 + \beta_1 x_1 + \ldots + \beta_p x_p$$
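The excerpt above describes the ranker in R (via Venables & Ripley 2002); purely as an illustration of the same idea, a hypothetical scikit-learn version might look like the sketch below (X and y are random placeholders, not the PDT 2.0 data):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Placeholder data: one 82-dimensional vector of association scores per
# candidate bigram, plus a binary collocation label from manual annotation.
rng = np.random.default_rng(0)
X = rng.random((1000, 82))
y = rng.integers(0, 2, 1000)

# Logistic regression as ranker: the predicted probability plays the role
# of f(x), a combined "association measure" over all 82 input measures.
ranker = LogisticRegression(max_iter=1000).fit(X, y)
scores = ranker.predict_proba(X)[:, 1]
ranking = np.argsort(-scores)            # candidate indices, best first
```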

Direct comparison of AM scores

Pecina & Schlesinger (2006) perform asystematic comparison

Main result: severalgroups of highlycorrelated or evenvirtually identical AMs

37

Empirical studies: MWE evaluation

38

AM are used for ranking candidates in MWE extraction tasks

Evaluation in terms of precision of n-best lists (see the sketch below)

Gold standard
■ expert judgements of "usefulness" (for app.)
■ linguistically defined (subtypes of) MWE
■ always requires manual annotation of data!

[Plot: precision (%) of n-best lists (n = 0–5000) for the t-score measure; baseline = 11.09%]
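A minimal sketch of the n-best evaluation itself (the helper name is mine; gold labels come from the manual annotation just described):

```python
# Precision of n-best lists: rank candidates by association score and
# measure the proportion of true MWEs among the top n.
def nbest_precision(scores, gold, sizes):
    """scores: AM score per candidate; gold: 1/0 labels; sizes: list lengths."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    return {n: sum(gold[i] for i in order[:n]) / n for n in sizes}

# toy example: 6 candidates, 3 true MWEs
print(nbest_precision([5.1, 3.2, 8.7, 0.4, 2.2, 7.0],
                      [1, 0, 1, 0, 0, 1], [2, 4, 6]))
# -> {2: 1.0, 4: 0.75, 6: 0.5}
```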

Empirical studies: MWE evaluation

39

German PP-verb pairs from FR corpus (f ≥ 30)

MWE annotated by Brigitte Krenn (2000)
■ Funktionsverbgefüge (FVG)
■ figurative expressions

Data & guidelines: www.collocations.de

[Plot: precision (%) of n-best lists (n = 0–5000) for t-score, Poisson, z-score, frequency and MI; baseline = 11.09%]

[Precision-recall curves, German PNV: G2, X2, t, MI, Dice, f]

http://multiword.sf.net/mwe2008/

MWE 2008 Shared Task: DE-PNV

Shared task on German V+PP
■ FVG
■ figurative

Frequency data from FR, chunk parsed, f ≥ 30

Baseline: 11.09%

Best AM: t-score, AP = 39.79%

Frequency: AP = 33.88%

40


[Precision-recall curves, English VPC: G2, X2, t, MI, Dice, f]

http://multiword.sf.net/mwe2008/

MWE 2008 Shared Task: EN-VPC

Shared task on English particle verbs (VPC)

Frequency data from full BNC
■ adjacent pairs

Baseline: 14.29%

Best AM: t-score, AP = 29.94%

Frequency: AP = 29.01%

41

[Precision-recall curves, German AN: G2, X2, t, MI, Dice, f]

http://multiword.sf.net/mwe2008/

MWE 2008 Shared Task: DE-AN

Shared task on German Adj+N
■ evaluated by professional lexicographers

Frequency data from FR corpus

Baseline: 41.53%

Best AM: Dice, AP = 58.84%

Frequency: AP = 46.90%

42

[Precision-recall curves, Czech MWE bigrams: G2, X2, t, MI, Dice, f]

http://multiword.sf.net/mwe2008/

MWE 2008 Shared Task: CZ-MWE

Shared task on Czech MWE
■ evaluated by lexicographers
■ three judges

Baseline: 21.03%

Best AM: chi-squared, AP = 64.86%

Frequency: AP = 21.70%

43

See Evert (2004, 2008) for details

[Scatterplot of candidate pairs in the (E, O) plane, log scales (E: 0.001–10, O: 10–100), with t-score iso-lines; true MWE marked]

Geometric visualisation of AMs

44
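These iso-lines can be derived in closed form; e.g. for the t-score, solving $t = (O-E)/\sqrt{O}$ for $O$ gives $\sqrt{O} = (t + \sqrt{t^2 + 4E})/2$. A small NumPy sketch (purely illustrative, not the code behind the slides):

```python
import numpy as np

# t-score iso-line in the (E, O) plane: every point (E, t_iso(E, t))
# receives the same t-score t, since t = (O - E) / sqrt(O).
def t_iso(E, t):
    return ((t + np.sqrt(t**2 + 4 * E)) / 2) ** 2

E = np.logspace(-3, 1, 5)        # expected frequencies 0.001 .. 10
print(t_iso(E, t=2.0))           # observed frequencies along the t = 2 line
```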


See Evert (2004, 2008) for details

[(E, O) plane, log scales (expected frequency 0.001–10, observed frequency 5–100): t-score iso-lines for sample sizes n = 20, 50, 100, 200]

Geometric visualisation of AMs

45

[Four panels in the (E, O) plane: iso-lines of MI, Poisson, t-score and z-score for n = 20, 50, 100, 200]

Geometric visualisation of AMs

46

Geometric visualisation of AMs

47

Evaluation & visualisation combined

48


Evaluation & visualisation combined

49

Evaluation & visualisation combined

50

51

Room for improvement

Results from MWE 2008 Shared Task: DE-PNV

Machine learning (Pecina & Schlesinger 2006)

52

Average precision (%) by method:
Baseline 11.1 · G2 39.1 · X2 30.0 · t-score 39.8 · MI 24.4 · Dice 20.1 · freq 33.9 · PE 14.6 · EPI 22.7 · Best AM 43.9 · ML 60.6


Results from MWE 2008 Shared Task: CZ-MWE

Machine learning (Pecina & Schlesinger 2006)

53

Average precision (%) by method:
Baseline 21.0 · freq 21.7 · remaining bars (G2, X2, t-score, MI, Dice, PE, EPI, Best AM, ML): 42.5, 65.0, 24.6, 64.9, 62.6, 65.6, 79.5 (assignment uncertain; two bar values lost in extraction)

Results from MWE 2008 Shared Task: DE-AN

Machine learning (Pecina & Schlesinger 2006)

54

Average precision (%) by method:
Baseline 41.5 · G2 56.1 · X2 57.1 · t-score 52.4 · MI 54.7 · Dice 58.8 · freq 46.9 · remaining bars (PE, EPI, Best AM, ML): 40.4, 62.9, 61.3 (one bar value lost in extraction)

Upper limits: overtraining

55

What is the highest precision that a "sensible" AM can achieve in principle?

Like a highly over-trained machine learning approach

Restriction needed: simple AM

[Plot: precision (%) vs. recall (%) for t-score, Poisson, z-score, frequency and MI; baseline = 11.09%]

Upper limits: optimal simple AM

56

[(E, O) plane, log scales: t-score and z-score iso-lines vs. the optimal simple AM boundary (two panels)]


Do AMs scale up to the Web?

57

[Scatterplot in the (E, O) plane, log-log axes (E: 0.01–1e+08, O: 0.1–1e+08): TP (BNC), FP (BNC), TP (Web1T5), FP (Web1T5)]

Do AMs scale up to the Web?

58

[Same (E, O) scatterplot with t-score iso-lines]

Do AMs scale up to the Web?

59

[Same (E, O) scatterplot with t-score and G2 iso-lines]

Do AMs scale up to the Web?

60

[Same (E, O) scatterplot with t-score, G2, MI and MI³ iso-lines]


Do AMs scale up to the Web?

61

[Same (E, O) scatterplot with t-score, G2, MI and MI³ iso-lines]

What else?

62

Some current research topics (my agenda)

63

Optimised AMs for specific types of tasks and data sets
■ e.g. for identification of SVC vs. idioms
■ for very small or very large corpora, skewed frequency distributions

Extension to combinations of three or more words
■ particularly important for MWE, but also empirical collocations
■ basis for higher-order distributional semantics (tensors)

Asymmetric association measures (Michelbacher et al. 2011)
■ e.g. wellington boot, bated breath, high fidelity
■ virtually all statistical AMs are symmetric

Collocational patterns: productivity of collocations
■ integration of collocations with distributional similarity

(A)symmetry of association

64

Collocations are often asymmetric (Kjellmer 1991)
■ e.g. wellington boot, bated breath, high fidelity
■ bated breath is “right-predictive”, high fidelity is “left-predictive”
■ effect may in part be due to frequency of collocates

Well-known fact, but little research in linguistics & NLP
■ MWE and semantic relations are inherently symmetric
■ most sensible measures of 1st- and 2nd-order statistical association are also symmetric
■ including all association measures mentioned in this talk (a small numeric sketch follows below)
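To make the symmetry point concrete, here is a minimal sketch (my own illustration with invented counts, not an example from the talk): an AM like pointwise MI depends only on the joint and marginal frequencies, so it yields the same score in both directions, while the conditional probabilities P(w2|w1) and P(w1|w2) differ whenever the marginal frequencies do.

from math import log2

# Hypothetical counts for a pair like (bated, breath); all numbers invented.
N = 1_000_000   # corpus size
f1 = 50         # frequency of w1 ("bated")
f2 = 2_000      # frequency of w2 ("breath")
f12 = 45        # co-occurrence frequency

# Pointwise MI = log2(O/E) with E = f1*f2/N; swapping w1 and w2 changes nothing.
mi = log2(f12 * N / (f1 * f2))

# Directional association: how strongly does each word predict the other?
p_w2_given_w1 = f12 / f1   # 0.9000: w1 almost always cues w2 (right-predictive)
p_w1_given_w2 = f12 / f2   # 0.0225: w2 only rarely cues w1

print(f"MI = {mi:.2f} (identical in both directions)")
print(f"P(w2|w1) = {p_w2_given_w1:.4f}, P(w1|w2) = {p_w1_given_w2:.4f}")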


Asymmetric association measures

Mathematically founded derivations lead to symmetric AMs
■ how can asymmetry of association be accounted for?

Michelbacher et al. (2007): forward vs. backward rank

65

“bated” (log-likelihood)

collocate   score    rank
breath      339.10   1
with         99.75   2
waited       75.02   3
waiting      50.91   4
and           0.88   5
,             0.00   6
.            -0.11   7
the          -0.91   8

“breath” (log-likelihood)

collocate    score     rank
deep         6787.38   1
took         3207.68   2
her          2812.36   3
his          2100.52   4
…            …         …
shuddering    399.37   19
bated         376.96   20
draw          343.53   21
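Read together, the two tables already show the asymmetry that the rank approach exploits: breath is the top-ranked collocate of bated (forward rank 1), while bated is only the 20th-ranked collocate of breath (backward rank 20).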

Asymmetric association measures

Michelbacher et al. (2007): forward vs. backward rank

Asymmetric AM (AAM): score = difference between forward & backward rank

Various AAMs can be defined (one for each symmetric AM); a code sketch follows below

Plot shows distribution of forward and backward ranks
■ based on log-likelihood AM
■ for a symmetric AM, the largest bars would be on the diagonal

66

[Figure: 3D histogram of the Number of Pairs by Forward Rank vs. Backward Rank, with rank bins 1−2, 3−5, 6−10, 11−20, 21−35, 36−60, 61−100, 101−160, 161−250, 251−500, 501+ on both axes]
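A minimal sketch of the rank-based AAM idea (my own illustration, not the authors' implementation; the score dictionaries are toy stand-ins for collocate rankings produced by any symmetric AM such as log-likelihood):

# Hypothetical collocate scores per node word, from some symmetric AM.
scores = {
    "bated":  {"breath": 339.10, "with": 99.75, "waited": 75.02, "waiting": 50.91},
    "breath": {"deep": 6787.38, "took": 3207.68, "her": 2812.36, "bated": 376.96},
}

def rank(node, collocate):
    """Rank of `collocate` among the collocates of `node` (1 = strongest)."""
    ranking = sorted(scores[node], key=scores[node].get, reverse=True)
    return ranking.index(collocate) + 1

def aam(w1, w2):
    """Rank-difference AAM: backward rank minus forward rank.
    (The sign convention is my choice; positive means w1 predicts w2
    more strongly than w2 predicts w1.)"""
    forward = rank(w1, w2)    # rank of w2 among the collocates of w1
    backward = rank(w2, w1)   # rank of w1 among the collocates of w2
    return backward - forward

print(aam("bated", "breath"))  # 4 - 1 = 3 with these toy lists

With the full log-likelihood lists shown above, the same pair comes out as forward rank 1 vs. backward rank 20, marking bated breath as clearly right-predictive.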

AAM evaluation results (work in progress)

Free associations are often asymmetric

Michelbacher et al. (2007) evaluate AAM on the USF free association norms

Results are inconclusive
■ presumably because free association norms are mostly based on paradigmatic relations
■ 1st-order statistical association is syntagmatic (is it?)

67

USF free associations

cue        target    fwd A   bwd A
boys       girls     0.500   0.503
bad        good      0.750   0.758
dinner     supper    0.535   0.545
trout      fish      0.913   0.036
saddle     horse     0.879   0.103
crib       baby      0.842   0.032
exhausted  tired     0.895   0.075
bank       money     0.799   0.019
bouquet    flowers   0.828   0.053
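As a quick worked check on the table (a sketch using the values printed above; the 0.1 symmetry threshold is an arbitrary choice of mine), the paradigmatic pairs at the top come out as nearly symmetric, while pairs like trout/fish or bank/money are strongly one-directional:

# Forward/backward association strengths copied from the USF table above.
usf = {
    ("boys", "girls"):   (0.500, 0.503),
    ("trout", "fish"):   (0.913, 0.036),
    ("saddle", "horse"): (0.879, 0.103),
    ("bank", "money"):   (0.799, 0.019),
}

for (cue, target), (fwd, bwd) in usf.items():
    tag = "roughly symmetric" if abs(fwd - bwd) < 0.1 else "asymmetric"
    print(f"{cue} -> {target}: fwd={fwd:.3f}, bwd={bwd:.3f} ({tag})")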

AAM evaluation results (work in progress)

Evaluation on a new data set of free syntagmatic associations
■ similar to free association norms, but asks for a syntagmatic combination

Michelbacher et al. (2011)
■ fwd/bwd ranks for different AMs
■ compared to syntagmatic associations

68

Michelbacher, Evert and Schütze, Asymmetry in Word Associations

Table 5: Forward and backward scores and rank measures for a subset of the word pairs used in the elicitation experiment

f       b       (w1, w2)                 R→f  R←f  R→G2  R←G2  R→t  R←t  R→χ2  R←χ2

group A: rank measures and direction scores conform
0.5891  0.2545  Academy Award              1    9    1     2    1    7    1     2
0.3328  0.0010  ancestral home             1   25    1    13    1   19    1    11
0.5551  0.1609  cable television           2    7    1     4    2    5    1     2
0.0127  0.0087  cut glass                  1   75    1    46    1   58    1    40
0.6760  0.0010  felled tree                1   54    1    33    1   45    1    17
0.0683  0.0021  hunched shoulders          1   16    1     7    1   14    1     2
0.0875  0.0010  old-fashioned way          1   98    1    60    1   62    4    70
0.1667  0.0063  rightful place             1   26    1     6    1   15    2     4
0.1500  0.0496  rope ladder                1    4    1     4    1    4    2     4
0.0241  0.0010  shrewd idea                3  109    6    49    3   68   10    41
0.1719  0.0010  thick-set man              1  519    1   169    1  318    1    86
0.0641  0.0068  well-worn path             1   71    1    34    1   58    1    22
0.0127  0.0125  *impending retirement      9   18    8    14    9   18    9     9
0.0606  0.0563  *speech recognition        1    2    1     1    1    2    2     4
0.0010  0.0099  annual rent               29    2   20     1   28    2   15     7
0.0266  0.8208  Christmas decorations     11    1    8     1   11    1    9     3
0.0010  0.0101  female preferences        63   34   92    44   60   34   95    48
0.0010  0.0650  hard frost                39    1   21     1   35    1   10     1
0.0010  0.0312  legal wrangling          151    1   58     1  110    1   25     1
0.0081  0.1325  smoked mackerel            5    1    3     1    5    1    3     1
0.0010  0.0031  southern bypass           21    1   15     1   20    1   10     1
0.0046  0.0426  welcome diversion         17    3   15     1   16    3    8     3
0.0032  0.0425  *bond issuance            10    1    7     1   10    1    2     2
0.0549  0.4500  *deadly nightshade         7    1    3     1    7    1    1     1
0.0478  0.3893  *white collar             11    1    6     1    9    1    5     6

group B: rank measures and direction scores do not conform
0.0955  0.1160  healthy food               6   19    6    20    5   15   13    53
0.1562  0.1543  missile silos             16    1    8     1   16    1    2     1
0.0010  0.0063  seasoned campaigners       1    9    1     6    1    9    1     6

group C: rank measures ambivalent
0.5411  0.4620  epileptic seizure          2    3    2     1    2    3    2     1
0.0761  0.0335  dedicated follower         7    3    2     3    4    3    3     6
0.4340  0.0237  laboratory experiments     2    1    1     1    2    1    1     1
0.0683  0.1836  South East                 1    2    3     2    1    2    5     2

group D: high mutual predictiveness
0.2962  0.1337  bloody hell                1    1    1     1    1    1    1     1
0.1275  0.2833  *special needs             1    1    1     1    1    1    1     2
0.6810  0.2793  toxic waste                1    1    1     1    1    1    1     1
0.2613  0.1583  unleaded petrol            1    1    1     1    1    1    1     1
0.9521  0.0068  wishful thinking           1    1    1     1    1    1    1     1


AAM evaluation results (work in progress)

Some results good
■ previous slide

Other results are less encouraging
■ AAMs are unclear or contradict syntagmatic association

wishful thinking
■ fwd/bwd rank 1 for all AMs
■ right-predictive in human data

69


Thank you!

70

References

71

Aston, Guy and Burnard, Lou (1998). The BNC Handbook. Edinburgh University Press, Edinburgh. See also the BNC homepage at http://www.natcorp.ox.ac.uk/.

Burger, Harald; Buhofer, Annelies; Sialm, Ambros (1982). Handbuch der Phraseologie. De Gruyter, Berlin, New York.

Evert, Stefan (2004). The Statistics of Word Cooccurrences: Word Pairs and Collocations. Dissertation, Institut für maschinelle Sprachverarbeitung, University of Stuttgart. Published in 2005, URN urn:nbn:de:bsz:93-opus-23714. Also see online resources at http://www.collocations.de/.

Evert, Stefan (2008). Corpora and collocations. In A. Lüdeling and M. Kytö (eds.), Corpus Linguistics. An International Handbook, chapter 58. Mouton de Gruyter, Berlin.

Evert, Stefan and Kermes, Hannah (2003). Experiments on candidate data for collocation extraction. In Companion Volume to the Proceedings of the 10th Conference of the European Chapter of the Association for Computational Linguistics, pages 83–86.

Evert, Stefan and Krenn, Brigitte (2001). Methods for the qualitative evaluation of lexical association measures. In Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics, pages 188–195, Toulouse, France.

Evert, Stefan and Krenn, Brigitte (2005). Using small random samples for the manual evaluation of statistical association measures. Computer Speech and Language, 19(4), 450–466.

Evert, Stefan and Krenn, Brigitte (2005a). Exploratory collocation extraction. Presentation at the Phraseology 2005 Conference, Louvain-la-Neuve, Belgium.

Firth, J. R. (1957). A synopsis of linguistic theory 1930–55. In Studies in linguistic analysis, pages 1–32. The Philological Society, Oxford.

Harris, Zellig (1954). Distributional structure. Word, 10(2–3), 146–162.

Hoey, Michael (2005). Lexical Priming: A new theory of words and language. Routledge, London.

Hoffmann, Sebastian; Evert, Stefan; Smith, Nicholas; Lee, David; Berglund Prytz, Ylva (2008). Corpus Linguistics with BNCweb – a Practical Guide, volume 6 of English Corpus Linguistics. Peter Lang, Frankfurt am Main.

References

72

Kilgarriff, Adam and Tugwell, David (2002). Sketching words. In M.-H. Corréard (ed.), Lexicography and Natural Language Processing: A Festschrift in Honour of B. T. S. Atkins, pages 125–137. EURALEX.

Kilgarriff, Adam; Rychly, Pavel; Smrz, Pavel; Tugwell, David (2004). The sketch engine. In Proceedings of the 11th EURALEX International Congress, Lorient, France.

Kjellmer, Göran (1991). A mint of phrases. In K. Aijmer and B. Altenberg (eds.), English Corpus Linguistics, pages 111–127. Longman, London.

Krenn, Brigitte (2000). The Usual Suspects: Data-Oriented Models for the Identification and Representation of Lexical Collocations. Dissertation, DFKI & Universität des Saarlandes, Saarbrücken, Germany.

Landauer, Thomas K. and Dumais, Susan T. (1997). A solution to Plato’s problem: The latent semantic analysis theory of acquisition, induction and representation of knowledge. Psychological Review, 104(2), 211–240.

Lea, Diana (ed.) (2002). Oxford Collocations Dictionary for students of English. Oxford University Press, Oxford, New York.

Lin, Dekang (1998). Automatic retrieval and clustering of similar words. In Proceedings of the 17th International Conference on Computational Linguistics (COLING-ACL 1998), pages 768–774, Montreal, Canada.

Michelbacher, Lukas; Evert, Stefan; Schütze, Hinrich (2007). Asymmetric association measures. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2007), Borovets, Bulgaria.

Michelbacher, Lukas; Evert, Stefan; Schütze, Hinrich (2011). Asymmetry in corpus-derived and human word associations. Corpus Linguistics and Linguistic Theory, 7(2), 245–276.

Pecina, Pavel (2005). An extensive empirical study of collocation extraction methods. In Proceedings of the ACL Student Research Workshop, pages 13–18, Ann Arbor, MI.

Pecina, Pavel and Schlesinger, Pavel (2006). Combining association measures for collocation extraction. In Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics (COLING/ACL 2006), Poster Sessions, pages 651–658, Sydney, Australia. ACL.

Rapp, Reinhard (2004). A freely available automatically generated thesaurus of related words. In Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC 2004), pages 395–398.


References

73

Sahlgren, Magnus (2006). The Word Space Model: Using distributional analysis to represent syntagmatic and paradigmatic relations between words in high-dimensional vector spaces. Ph.D. thesis, Department of Linguistics, Stockholm University.

Schütze, Hinrich (1998). Automatic word sense discrimination. Computational Linguistics, 24(1), 97–123.

Sinclair, John McH. (1966). Beginning the study of lexis. In C. E. Bazell, J. C. Catford, M. A. K. Halliday, and R. H. Robins (eds.), In Memory of J. R. Firth, pages 410–430. Longmans, London.

Sinclair, John; Jones, Susan; Daley, Robert; Krishnamurthy, Ramesh (1970/2004). English Collocation Studies: The OSTI Report. Continuum Books, London and New York. Originally written in 1970 (unpublished).

Sinclair, John (1991). Corpus, Concordance, Collocation. Oxford University Press, Oxford.

Sinclair, John (ed.) (1995). Collins COBUILD English Dictionary. Harper Collins, London. New edition, completely revised.