TRANSCRIPT
Statistical Association and Multiword Expressions
Stefan Evert, Institute of Cognitive Science
University of Osnabrück
[email protected] | purl.org/stefan.evert
RU Groningen | 10.05.2011
Collocations vs. multiword expressions
2

multiword expressions (phraseology)
• based on native speaker judgments
• linguistic criteria/tests
• different subtypes
• still no precise definition & subtype classification
• "grey area" between idiomatic MWE and free combinations

empirical collocations (Firth / Sinclair)
• quantitative empirical phenomenon
• based on corpus data
• linguistic status? words collocate for different "reasons"
• unclear: appropriate operationalisation ("collocation measure")
multiword expressions
Types & examples of multiword expressions
3

idioms: kick the bucket, eat humble pie
figurative expressions: spill the beans
lexical collocations: brush one's teeth, make tea, heavy smoker
light verbs (SVC, FVG): give a talk, draw a conclusion
complex lexical items (MWU): in front of, déjà vu, to and fro
particle verbs (VPC): give up, hand in
English noun compounds: apple juice, skeleton key
named entities: Red Cross, New York City
(multiword) terminology: law of mass action, white dwarf
institutionalised phrases & clichés: bucket and spade, fish and chips
Scales of MWE-ness
4

semantic dimension — compositionality: opaque idiom · decomposable metaphor · semi-compositional · pragmatic aspects · compositional
syntactic dimension — modifiability: rigid MWU / n-gram · limited variability · semi-fixed construction (morphosyntactic preferences) · LWC · compositional syntax
lexical dimension — substitutability: completely determined (no substitution) · partly determined · selectional restrictions · productive MWE pattern
multiword expressions
Types & examples of multiword expressions
5

[diagram repeated from slide 3]
Produced with UCS toolkit | http://www.collocations.de/software.html
Examples of collocations (BNC)
6
bucket
collocate f G2
water 183 1064.079
spade 31 341.138
bucket 34 306.078
plastic 36 243.863
slop 15 213.303
mop 18 207.117
size 42 200.162
fill 38 195.749
record 42 174.693
throw 35 172.264
into 87 149.409
empty 18 148.807
with 191 147.546
ice 22 131.697
randomize 9 115.335
kick 19 113.238
of 488 81.765
single-record 5 81.107
large 36 80.852
shop 23 80.794
seat 20 78.645
rose
collocate f G2
red 209 1113.501
shrub 68 572.793
hilaire 46 561.504
garden 133 548.936
cottage 75 418.288
bowl 66 389.237
petal 42 361.140
bush 65 324.711
net 63 321.964
white 104 319.160
pink 54 299.730
rose 66 285.331
mid-term 27 276.903
gun 60 271.541
axl 20 269.491
mary 64 265.474
wild 58 257.805
flower 63 251.148
per 74 231.465
miss 62 225.623
floyd 26 218.556
Produced with UCS toolkit | http://www.collocations.de/software.html
Examples of collocations (BNC)
7
bucket: nouns
collocate f G2
water 183 1064.079
spade 31 341.138
bucket 34 306.078
plastic 36 243.863
slop 15 213.303
mop 18 207.117
size 42 200.162
record 42 174.693
ice 22 131.697
shop 23 80.794
seat 20 78.645
sand 13 68.814
brigade 10 67.080
shovel 7 64.335
coal 14 63.609
oats 7 62.659
rhino 7 60.813
champagne 10 59.556
density 10 59.132
algorithm 8 57.552
container 9 54.561
bucket: verbs
collocate f G2
fill 38 195.749
throw 35 172.264
empty 18 148.807
randomize 9 115.335
kick 19 113.238
put 38 66.174
hold 31 62.765
tip 10 61.670
carry 25 59.554
fetch 9 52.665
chuck 7 50.638
store 10 48.327
pour 10 47.206
weep 7 43.396
douse 4 37.842
used 13 31.791
pack 7 29.582
use 33 28.469
slop 3 27.238
drop 10 26.855
clean 7 26.830
bucket: adjectives
collocate f G2
single-record 5 81.107
large 36 80.852
cold 17 63.644
galvanized 4 51.373
full 22 49.746
steaming 4 32.883
leaky 3 29.520
empty 8 28.670
bottomless 3 28.397
galvanised 3 27.186
soggy 3 25.022
iced 3 24.535
small 20 24.033
clean 7 23.416
bowed 2 20.506
omnipresent 2 19.811
anglo-saxon 3 18.219
wooden 5 17.251
ice-cold 2 17.211
soapy 2 16.005
ten 10 15.864
http://beta.sketchengine.co.uk/
Word sketch
8
What are collocations?
9
[collocate tables for "bucket" (nouns, verbs, adjectives) repeated from slide 7]
labels: idiom · compound · technical term · lexical collocation · semantic relation · facts of life
Multiword Expressions (MWE)
Why collocations are important
10
Primary tool for MWE identification
■ e.g. Evert/Krenn (2001, 2005) | MWE Workshops & Shared Task

Language description: approximation of word meaning
■ Firth (1957) | Sinclair (1991) | computational lexicography

Psycholinguistic relevance: priming & syntactic associates
■ priming effects | lexical priming (Hoey 2005) | link grammar etc.

Collostructions, subcategorisation & selectional preferences
■ "collocations" between words & syntactic patterns

Applications in NLP, e.g. long-distance adaptors for LM

Basis of distributional semantic models (term-term matrix)
Key questions for MWE and collocations
11
Linguistic definition of MWE and their subtypes

Relation between (different subtypes of) MWE and (different quantitative notions of) empirical collocations

Operationalisation of empirical collocations and appropriate quantitative measures
12
Co-occurrence and statistical association
13
Operationalising collocations
Early "definitions"■ recurrent, habitual word combinations (Firth 1957)■ greater than chance co-occurrence (Sinclair 1966, 1970)■ significant collocations (Kilgarriff & Tugwell 2002)
Ingredient 1: co-occurrence■ surface vs. textual vs. syntactic (Evert 2004, 2008)■ contingency tables of joint & marginal frequencies
Ingredient 2: statistical association■ quantitative measure for tendency of events to co-occur■ operationalises intuition of recurrent, “salient” combinations
Collocational span of 4 words (L4, R4), limited by sentence boundaries
Surface co-occurrence
14
A vast deal of coolness and a peculiar degree of judgement, are requisite in catching a hat . A man must
not be precipitate, or he runs over it ; he must not rush into the opposite extreme, or he loses it
altogether. [. . . ] There was a fine gentle wind, and Mr. Pickwick's hat rolled sportively before it. The
wind puffed, and Mr. Pickwick puffed, and the hat rolled over and over as merrily as a lively porpoise
in a strong tide ; and on it might have rolled, far beyond Mr. Pickwick’s reach, had not its course been
providentially stopped, just as that gentleman was on the point of resigning it to its fate.
f(hat, roll) = 2
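The span convention above translates directly into a counting procedure. Below is a minimal sketch (not the UCS toolkit's actual code); the tokenised sentences and the function name are illustrative:

```python
# Count surface co-occurrences of two lemmas within a symmetric span
# of +/-4 tokens, clipped at sentence boundaries.
def surface_cooc(sentences, node, collocate, span=4):
    count = 0
    for tokens in sentences:                     # one list of lemmas per sentence
        for i, tok in enumerate(tokens):
            if tok != node:
                continue
            lo = max(0, i - span)                # window clipped at sentence start
            hi = min(len(tokens), i + span + 1)  # ... and at sentence end
            window = tokens[lo:i] + tokens[i + 1:hi]
            count += window.count(collocate)
    return count

sentences = [
    "there was a fine gentle wind and mr pickwick 's hat rolled sportively before it".split(),
    "the wind puffed and the hat rolled over and over as merrily as a lively porpoise".split(),
]
print(surface_cooc(sentences, "hat", "rolled"))  # -> 2
```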
Co-occurrence within sentences
Textual co-occurrence
15
f(hat, over) = 1
A vast deal of coolness and a peculiar degree of judgement, are requisite in catching a hat.
hat —
A man must not be precipitate, or he runs over it ; — over
he must not rush into the opposite extreme, or he loses it altogether.
— —
There was a fine gentle wind, and Mr. Pickwick's hat rolled sportively before it.
hat —
The wind puffed, and Mr. Pickwick puffed, and the hat rolled over and over as merrily as a lively porpoise in a strong tide ;
hat over
Adjectival noun modification (prenominal adjectives)
Syntactic co-occurrence
16
f(young, gentleman) = 1
In an open barouche [. . . ] stood a stout old gentleman, in a blue coat
and bright buttons, corduroy breeches and top-boots ; two
young ladies in scarfs and feathers ; a young gentleman apparently
enamoured of one of the young ladies in scarfs and feathers ; a lady
of doubtful age, probably the aunt of the aforesaid ; and [. . . ]
→ extracted pairs: open barouche, stout gentleman, old gentleman, blue coat, bright button, young lady, young gentleman, young lady, doubtful age
Observed frequency
17
Collocations: "recurrent" combinations → simply use co-occurrence frequency as measure of salience?
Example: most frequent adjacent bigrams from Brown corpus
Frequent combinations don't seem to be very interesting collocations
Mathematical reason:
■ f(is to) = 260
■ f(is) ≈ 10,000, f(to) ≈ 26,000
■ one would expect 260 co-occurrences if words were ordered randomly!
adjacent bigrams (Brown)
bigram f rank
of the 9702 1
in the 6018 2
to the 3478 3
on the 2459 4
and the 2242 5
for the 1845 6
to be 1715 7
at the 1654 8
with the 1530 9
it is 1482 10
of a 1469 11
in a 1413 12
from the 1410 13
that the 1378 14
by the 1347 15
it was 1338 16
he was 1110 17
as a 980 18
he had 933 19
… … …
is to 260 133
adjacent bigrams (Brown)
bigram f expected rank
of the 9702 2186.75 1
in the 6018 1260.22 2
to the 3478 1613.01 3
on the 2459 384.84 4
and the 2242 1768.75 5
for the 1845 571.23 6
to be 1715 173.16 7
at the 1654 323.29 8
with the 1530 427.89 9
it is 1482 87.02 10
of a 1469 759.86 11
in a 1413 437.91 12
from the 1410 258.53 13
that the 1378 650.83 14
by the 1347 322.97 15
it was 1338 86.32 16
he was 1110 99.98 17
as a 980 155.42 18
he had 933 53.02 19
… … … …
is to 260 266.61 133
Observed & expected frequency
18
Collocations: "recurrent" combinations → use co-occurrence frequency as measure of salience
Example: most frequent adjacent bigrams from Brown corpus
Frequent combinations don't seem to be very interesting collocations
Mathematical reason:
■ f(is to) = 260
■ f(is) ≈ 10,000, f(to) ≈ 26,000
■ one would expect 260 co-occurrences if words were ordered randomly!
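Spelled out, the independence calculation behind these numbers (taking the Brown corpus size as roughly N = 1,000,000 tokens, an assumption not stated on the slide):

```latex
E(\textit{is to}) = \frac{f(\textit{is}) \cdot f(\textit{to})}{N}
  \approx \frac{10{,}000 \cdot 26{,}000}{1{,}000{,}000} = 260
```

which matches the observed f(is to) = 260 almost exactly, so there is no evidence of attraction.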
Contingency table = cross-classification of "items"
■ mathematical basis for concept of statistical association

Statistics tells us how to calculate expected cell counts
Observed & expected contingency tables
19
observed contingency table:
        w2     ¬w2
w1      O11    O12    = R1
¬w1     O21    O22    = R2
        = C1   = C2   = N

expected frequencies:
E_{ij} = \frac{R_i C_j}{N}, i.e. E_{11} = \frac{R_1 C_1}{N}, E_{12} = \frac{R_1 C_2}{N}, E_{21} = \frac{R_2 C_1}{N}, E_{22} = \frac{R_2 C_2}{N}
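A minimal sketch of this computation (the function name is illustrative); applied to the hat/over table from the following slides, it reproduces the expected counts:

```python
# Expected cell counts E_ij = R_i * C_j / N for a 2-by-2
# contingency table given as ((O11, O12), (O21, O22)).
def expected_table(obs):
    (o11, o12), (o21, o22) = obs
    r1, r2 = o11 + o12, o21 + o22        # row marginals
    c1, c2 = o11 + o21, o12 + o22        # column marginals
    n = r1 + r2                          # sample size
    return ((r1 * c1 / n, r1 * c2 / n),
            (r2 * c1 / n, r2 * c2 / n))

# hat/over example (textual co-occurrence, N = 5 sentences):
print(expected_table(((1, 2), (1, 1))))  # -> ((1.2, 1.8), (0.8, 1.2))
```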
Item = sentence (or other text segment)
Textual co-occurrence
20
f(hat, over) = 1, sample size N = 5
A vast deal of coolness and a peculiar degree of judgement, are requisite in catching a hat.
hat —
A man must not be precipitate, or he runs over it ; — over
he must not rush into the opposite extreme, or he loses it altogether.
— —
There was a fine gentle wind, and Mr. Pickwick's hat rolled sportively before it.
hat —
The wind puffed, and Mr. Pickwick puffed, and the hat rolled over and over as merrily as a lively porpoise in a strong tide ;
hat over
        over   ¬over
hat     O11    O12    R1
¬hat    O21    O22    R2
        C1     C2     N
Item = sentence (or other text segment)
Textual co-occurrence
21
f(hat, over) = 1, sample size N = 5
A vast deal of coolness and a peculiar degree of judgement, are requisite in catching a hat.
hat —
A man must not be precipitate, or he runs over it ; — over
he must not rush into the opposite extreme, or he loses it altogether.
— —
There was a fine gentle wind, and Mr. Pickwick's hat rolled sportively before it.
hat —
The wind puffed, and Mr. Pickwick puffed, and the hat rolled over and over as merrily as a lively porpoise in a strong tide ;
hat over
        over   ¬over
hat     1      2      3
¬hat    1      1      2
        2      3      5
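As a check, the expected count in the top-left cell follows from the marginals:

```latex
E_{11} = \frac{R_1 C_1}{N} = \frac{3 \cdot 2}{5} = 1.2
```

so the observed O11 = 1 is actually slightly below what chance predicts.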
Item = instance of adjective-noun modification
Syntactic co-occurrence
22
f(young, gentleman) = 1, sample size N = 9
In an open barouche [. . . ] stood a stout old gentleman, in a blue coat
and bright buttons, corduroy breeches and top-boots ; two
young ladies in scarfs and feathers ; a young gentleman apparently
enamoured of one of the young ladies in scarfs and feathers ; a lady
of doubtful age, probably the aunt of the aforesaid ; and [. . . ]
→ extracted pairs: open barouche, stout gentleman, old gentleman, blue coat, bright button, young lady, young gentleman, young lady, doubtful age
            •|gent.   •|¬gent.
young|•     1         2         3
¬young|•    2         4         6
            3         6         9
Item = token
Surface co-occurrence
23
A vast deal of coolness and a peculiar degree of judgement, are requisite in catching a hat . A man must
not be precipitate, or he runs over it ; he must not rush into the opposite extreme, or he loses it
altogether. [. . . ] There was a fine gentle wind, and Mr. Pickwick’s hat rolled sportively before it . The
wind puffed, and Mr. Pickwick puffed, and the hat rolled over and over as merrily as a lively porpoise
in a strong tide ; and on it might have rolled, far beyond Mr. Pickwick’s reach, had not its course been
providentially stopped, just as that gentleman was on the point of resigning it to its fate.
f(hat, roll) = 2, sample size N = 108
             roll   ¬roll
NEAR(hat)    2      18      20
¬NEAR(hat)   1      87      88
             3      105     108
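Here the same check gives

```latex
E_{11} = \frac{R_1 C_1}{N} = \frac{20 \cdot 3}{108} \approx 0.56
```

so the observed O11 = 2 is roughly 3.6 times the chance expectation: hat and roll attract each other in this passage.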
Data from BNC | RASP parser: http://www.informatics.susx.ac.uk/research/nlp/rasp/
Comparison
24
"sell" - V+Obj"sell" - V+Objcollocate G2
goods 1391.9
share 1231.6
product 946.0
house 665.4
property 602.1
land 478.0
ticket 453.3
asset 413.7
copy 399.4
car 306.8
business 246.8
stock 224.1
stake 205.5
home 174.3
liquor 167.0
soul 166.0
bond 141.9
produce 138.3
company 110.8
unit 105.5
painting 105.2
"sell" - (L0, R2)"sell" - (L0, R2)collocate G2
share 1312.8
goods 1309.6
product 1051.1
house 707.3
ticket 611.0
property 460.9
land 388.4
copy 383.5
car 353.6
auction 276.3
soul 236.8
liquor 223.7
asset 182.1
produce 166.9
ware 156.5
bond 149.5
insurance 144.5
stake 131.2
stock 125.6
advertising 112.6
cigarette 103.9
"sell" - (L5, R5)"sell" - (L5, R5)collocate G2
goods 2177.7
product 1752.2
share 1749.0
copy 1341.3
shop 1232.0
ticket 1146.4
property 903.0
company 869.1
price 774.8
house 728.4
dealer 632.8
car 595.5
land 589.9
asset 572.4
market 524.0
stock 501.7
business 490.5
auction 449.7
stake 362.0
liquor 335.4
store 298.6
"sell" - sentence"sell" - sentencecollocate G2
price 3496.8
company 2661.1
market 2600.8
share 2593.6
goods 2435.3
product 2363.9
shop 1922.8
sale 1460.9
copy 1418.9
dealer 1381.7
property 1348.2
business 1347.8
sales 1171.4
stock 1149.4
ticket 1145.2
profit 1109.3
buyer 1076.7
house 1048.2
auction 916.3
owner 876.1
asset 873.6
Parsing accuracy
25
How reliable is syntactic co-occurrence?
Evert/Kermes (2003) evaluate adjective-noun identification
■ German prenominal adjectives
■ TIGER Treebank used as gold standard
Verb-object and verb-subject relations are much harder
■ Charniak-Johnson parser achieves 89.3% (direct object) and 96.5% (subject) on example sentences from English Wiktionary
■ more difficult for languages with free word order (German)
Table 1: Results for Adj+N extraction task

candidates from    perfect tagging            TreeTagger tagging
                   precision    recall        precision    recall
adjacent pairs     98.47%       90.58%        94.81%       84.85%
window-based       97.14%       96.74%        93.85%       90.44%
YAC chunks         98.16%       97.94%        95.51%       91.67%
… with a straightforward stochastic tagger and naive window-based extraction, precision and recall values above 90% provide an excellent starting point for statistical ranking.
The considerable differences between the sets of pair types primarily affect hapaxes and have little impact on statistical methods for collocation identification (where hapaxes are rarely found among the higher-ranking candidates). Likewise, small changes in the frequency profiles of more frequent pairs have little impact on the association scores and the precise ranking of the candidates. It will be interesting to see how these results translate to more demanding extraction tasks.
See Evert (2004, 2008) for details | http://www.collocations.de/
Association measures (AM)
26
observed contingency table:
        w2     ¬w2
w1      O11    O12    = R1
¬w1     O21    O22    = R2
        = C1   = C2   = N

expected frequencies:
E_{ij} = \frac{R_i C_j}{N}, i.e. E_{11} = \frac{R_1 C_1}{N}, E_{12} = \frac{R_1 C_2}{N}, E_{21} = \frac{R_2 C_1}{N}, E_{22} = \frac{R_2 C_2}{N}
Observed (O) vs. expected (E) co-occurrence frequency
Simple measures
27
\mathrm{MI} = \log_2 \frac{O}{E}
\mathrm{MI}^k = \log_2 \frac{O^k}{E}
\text{local-MI} = O \cdot \log_2 \frac{O}{E}
\text{z-score} = \frac{O - E}{\sqrt{E}}
\text{t-score} = \frac{O - E}{\sqrt{O}}
\text{simple-ll} = 2 \bigl( O \cdot \log \tfrac{O}{E} - (O - E) \bigr)
Comparison of full contingency tables (observed vs. expected)
Statistical measures
28
(simple measures as on the previous slide, plus:)

\text{chi-squared} = \sum_{ij} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}
\text{chi-squared}_{\mathrm{corr}} = \frac{N \bigl( |O_{11} O_{22} - O_{12} O_{21}| - N/2 \bigr)^2}{R_1 R_2 C_1 C_2}
\text{log-likelihood} = 2 \sum_{ij} O_{ij} \log \frac{O_{ij}}{E_{ij}}
\text{average-MI} = \sum_{ij} O_{ij} \cdot \log_2 \frac{O_{ij}}{E_{ij}}
\text{Dice} = \frac{2 O_{11}}{R_1 + C_1}
\text{odds-ratio} = \log \frac{(O_{11} + \frac{1}{2})(O_{22} + \frac{1}{2})}{(O_{12} + \frac{1}{2})(O_{21} + \frac{1}{2})}
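The full-table measures need the marginals as well; a sketch along the same lines (cell counts O11..O22 as arguments):

```python
from math import log

def _marginals(o11, o12, o21, o22):
    r1, r2 = o11 + o12, o21 + o22
    c1, c2 = o11 + o21, o12 + o22
    return r1, r2, c1, c2, r1 + r2

def log_likelihood(o11, o12, o21, o22):
    r1, r2, c1, c2, n = _marginals(o11, o12, o21, o22)
    obs = (o11, o12, o21, o22)
    exp = (r1 * c1 / n, r1 * c2 / n, r2 * c1 / n, r2 * c2 / n)
    # terms with O_ij = 0 contribute nothing (0 * log 0 convention)
    return 2 * sum(o * log(o / e) for o, e in zip(obs, exp) if o > 0)

def chi_squared(o11, o12, o21, o22):
    # equivalent closed form of sum (O - E)^2 / E for a 2x2 table
    r1, r2, c1, c2, n = _marginals(o11, o12, o21, o22)
    return n * (o11 * o22 - o12 * o21) ** 2 / (r1 * r2 * c1 * c2)

def dice(o11, o12, o21, o22):
    r1, _, c1, _, _ = _marginals(o11, o12, o21, o22)
    return 2 * o11 / (r1 + c1)

def odds_ratio(o11, o12, o21, o22):
    # cells discounted by 1/2 so that zero counts stay finite
    return log((o11 + .5) * (o22 + .5) / ((o12 + .5) * (o21 + .5)))
```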
See Evert (2004, 2008) for details | http://www.collocations.de/
Association measures (AM)
29
recommended measures (Evert 2008)
[formula overview repeated from the two previous slides, with the recommended measures highlighted]
Collocates of "bucket" in BNC (from Evert 2008)
Comparison
30
collocate f f2 simple-ll
water 184 37012 1083.18
a 590 2164246 449.30
spade 31 465 342.31
plastic 36 4375 247.65
size 42 14448 203.36
slop 17 166 202.30
mop 20 536 197.68
throw 38 11308 194.66
fill 37 10722 191.44
with 196 658584 171.78

collocate f f2 MI
fourteen-record 4 4 13.31
ten-record 3 3 13.31
multi-record 2 2 13.31
two-record 2 2 13.31
a-row 1 1 13.31
anti-sweat 1 1 13.31
axe-blade 1 1 13.31
bastarding 1 1 13.31
dippermouth 1 1 13.31
Dok 1 1 13.31

collocate f f2 t-score
a 590 2164246 15.53
water 184 37012 13.30
and 479 2616723 10.14
with 196 658584 9.38
of 497 3040670 8.89
the 832 6041238 8.26
into 87 157565 7.67
size 42 14448 6.26
in 298 1937966 6.23
record 43 29404 6.12

collocate (f ≥ 5) f2 MI
single-record 5 8 12.63
randomize 10 57 10.80
slop 17 166 10.03
spade 31 465 9.41
mop 20 536 8.57
oats 7 286 7.96
shovel 8 358 7.83
rhino 7 326 7.77
synonym 7 363 7.62
bucket 18 1356 7.08
Pecina (2005) collects 57 association measures (and some other formulae)
31
So many measures, so little time …
[Table 1 from Pecina (2005); the formulas did not survive transcription, so only the measure names are retained — see the paper for the definitions.]

1 Mean component offset · 2 Variance component offset · 3 Joint probability · 4 Conditional probability · 5 Reverse conditional probability · 6 Pointwise mutual information · 7 Mutual dependency (MD) · 8 Log frequency biased MD · 9 Normalized expectation · 10 Mutual expectation · 11 Salience · 12 Pearson's χ² test · 13 Fisher's exact test · 14 t test · 15 z score · 16 Poisson significance measure · 17 Log likelihood ratio · 18 Squared log likelihood ratio

Association coefficients: 19 Russel-Rao · 20 Sokal-Michiner · 21 Rogers-Tanimoto · 22 Hamann · 23 Third Sokal-Sneath · 24 Jaccard · 25 First Kulczynsky · 26 Second Sokal-Sneath · 27 Second Kulczynski · 28 Fourth Sokal-Sneath · 29 Odds ratio · 30 Yulle's ω · 31 Yulle's Q · 32 Driver-Kroeber · 33 Fifth Sokal-Sneath · 34 Pearson · 35 Baroni-Urbani · 36 Braun-Blanquet · 37 Simpson · 38 Michael · 39 Mountford · 40 Fager · 41 Unigram subtuples · 42 U cost · 43 S cost · 44 R cost · 45 T combined cost · 46 Phi · 47 Kappa · 48 J measure · 49 Gini index · 50 Confidence · 51 Laplace · 52 Conviction · 53 Piatersky-Shapiro · 54 Certainity factor · 55 Added value (AV) · 56 Collective strength · 57 Klosgen

Context measures: 58 Context entropy · 59 Left context entropy · 60 Right context entropy · 61 Left context divergence · 62 Right context divergence · 63 Cross entropy · 64 Reverse cross entropy · 65 Intersection measure · 66 Euclidean norm · 67 Cosine norm · 68 L1 norm · 69 Confusion probability · 70 Reverse confusion probability · 71 Jensen-Shannon divergence · 72 Cosine of pointwise MI · 73 KL divergence · 74 Reverse KL divergence · 75 Skew divergence · 76 Reverse skew divergence · 77 Phrase word cooccurrence · 78 Word association · 79–81 Cosine context similarity (in boolean / tf / tf·idf vector space) · 82–84 Dice context similarity (in boolean / tf / tf·idf vector space)

Linguistic features: 85 Part of speech · 86 Dependency type · 87 Dependency structure
?
32
Which measure?
33
How to choose an association measure
Mathematical discussion
Direct comparison
Task-based evaluation
Geometric interpretation
■ combine with insights from task-based evaluation
Significance of association
34

exact hypothesis tests:
\text{Poisson-likelihood} = e^{-E} \cdot \frac{E^O}{O!}
\text{Poisson} = \sum_{k = O}^{\infty} e^{-E} \frac{E^k}{k!}
\text{Fisher} = \sum_{k = O_{11}}^{\min\{R_1, C_1\}} \binom{C_1}{k} \binom{C_2}{R_1 - k} \Big/ \binom{N}{R_1}

asymptotic hypothesis tests:
\text{log-likelihood} = 2 \sum_{ij} O_{ij} \log \frac{O_{ij}}{E_{ij}}
\text{chi-squared} = \sum_{ij} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} = \frac{N}{E_{22}} \cdot \frac{(O - E)^2}{E}

likelihood measures:
\text{simple-ll} = 2 \bigl( O \cdot \log \tfrac{O}{E} - (O - E) \bigr)
\text{Poisson-Stirling} = O \cdot \bigl( \log O - \log E - 1 \bigr)

simple hypothesis tests:
\text{t-score} = \frac{O - E}{\sqrt{O}}
\text{z-score} = \frac{O - E}{\sqrt{E}}

\text{gmean} = \sqrt{p_F \cdot p_B} = \frac{O_{11}}{\sqrt{R_1 C_1}} = \frac{O}{\sqrt{N E}}
\mathrm{MI} = \log_2 \frac{O}{E}
35
Degree of association / determination

p_F = \Pr(w_2 \mid w_1) \qquad p_B = \Pr(w_1 \mid w_2)

measures of non-independence:
\text{relative-risk} = \log \frac{O_{11} C_2}{O_{12} C_1}
\text{odds-ratio} = \log \frac{O_{11} O_{22}}{O_{12} O_{21}}
\text{gmean} = \frac{O_{11}}{\sqrt{R_1 C_1}} = \frac{O_{11}}{\sqrt{N E_{11}}}

measures of (mutual) determination:
\text{Dice} = \Bigl( \tfrac{1}{2} p_F^{-1} + \tfrac{1}{2} p_B^{-1} \Bigr)^{-1} = \frac{2 O_{11}}{R_1 + C_1}
\mathrm{MS} = \min\{p_F, p_B\} = \min\Bigl\{ \frac{O_{11}}{R_1}, \frac{O_{11}}{C_1} \Bigr\} = \frac{O_{11}}{\max\{R_1, C_1\}}
Comparison of p-values on simulated data (see Evert 2004, 2008)
Direct comparison of association scores
36
[Scatter plots: p-values of chi-squared, Poisson, log-likelihood and t-score plotted against the p-values of Fisher's exact test (log–log axes, 1–10000); individual data points omitted]
[Figure 3 panels: mean average precision (0.2–0.7) of the association measures; x-axis: method numbers from Table 1]
Figure 3: a) Mean average precision of all association measures in descending order. Methods are referred to by numbers from Table 1. The solid points correspond to measures selected by the model reduction algorithm from Section 5. b) Visualization of p-values from the significance tests of difference between each method pair (order is the same for both graphs). The darker points correspond to p-values greater than α = 0.1 and indicate methods with statistically indistinguishable performance (measured by paired Wilcoxon test on values of average precision obtained from five independent data folds).
3.4 Experiments and results
In the initial experiments, we implemented all 82 association measures from Table 1, processed all morphologically and syntactically annotated sentences from PDT 2.0, and computed scores of all the association measures for each dependency bigram in the reference data. For each association measure and each of the five evaluation data folds, we computed precision-recall scores and drew an averaged precision-recall curve. Curves of some well-performing methods are depicted in Figure 2. Next, for each association measure and each data fold, we estimated scores of average precision on the narrower recall interval [0.1, 0.9], computed mean average precision, ranked the association measures according to MAP in descending order, and depicted the result in Figure 3 a). Finally, we applied a paired Wilcoxon test, detected measures with statistically indistinguishable performance, and visualized this information in Figure 3 b).

A baseline system ranking bigrams randomly operates with an average precision of 20.9%. The best performing method for collocation extraction measured by mean average precision is cosine context similarity in boolean vector space (77) (MAP 66.49%), followed by 16 other association measures with nearly identical performance (Figure 3 a). They include some popular methods well known to perform reliably in this task, such as pointwise mutual information (4), Pearson's χ² test (10), z score (13), odds ratio (27), or squared log likelihood ratio (16).

The interesting point to note is that, in terms of MAP, context similarity measures, e.g. (77), slightly outperform measures based on simple occurrence frequencies, e.g. (39). In a more thorough comparison by precision-recall curves, we observe that the former very significantly predominate the latter in the first half of the recall interval and vice versa in the second half (Figure 2). This is a case where MAP is not a sufficient metric for comparison of association measure performance. It is also worth pointing out that even if two methods have the same precision-recall curves, the actual bigram rank order can be very different. The existence of such non-correlated (in terms of ranking) measures will be essential in the following sections.

4 Combining association measures
Each collocation candidate x_i can be described by the feature vector x_i = (x_i^1, …, x_i^82)^T consisting of 82 association scores from Table 1 and assigned a label y_i ∈ {0, 1} which indicates whether the bigram is considered to be a collocation (y = 1) or not (y = 0). We look for a ranker function f(x) → R that determines the strength of lexical association between the components of bigram x and hence has the character of an association measure. This allows us to compare it with other association measures by the same means of precision-recall curves and mean average precision. Further, we present several classification methods and demonstrate how they can be employed for ranking, i.e. what function can be used as a ranker. For references see Venables and Ripley (2002).

4.1 Linear logistic regression
An additive model for binary response is represented by a generalized linear model (GLM) in the form of logistic regression:

logit(π) = β₀ + β₁x₁ + … + β_p x_p
Direct comparison of AM scores
Pecina & Schlesinger (2006) perform asystematic comparison
Main result: severalgroups of highlycorrelated or evenvirtually identical AMs
37
Empirical studies: MWE evaluation
38
AM are used for ranking candidates in MWE extraction tasks
Evaluation in terms of precision of n-best lists
Gold standard
■ expert judgements of "usefulness" (for application)
■ linguistically defined (subtypes of) MWE
■ always requires manual annotation of data!

[Plot: precision (%) of n-best lists (n = 0–5000) for the t-score measure; baseline = 11.09%]
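The evaluation protocol itself is simple enough to sketch; the candidate pairs, scores and gold labels below are invented placeholders:

```python
# Precision (%) of the n best candidates under a given AM score.
def nbest_precision(candidates, score, gold, n):
    ranked = sorted(candidates, key=lambda c: score[c], reverse=True)
    nbest = ranked[:n]
    return 100 * sum(gold[c] for c in nbest) / len(nbest)

gold  = {"give talk": True, "give book": False, "draw conclusion": True}
score = {"give talk": 12.3, "give book": 0.7, "draw conclusion": 9.8}
print(nbest_precision(list(gold), score, gold, 2))  # -> 100.0
```

A precision curve like the one on this slide is obtained by evaluating nbest_precision for n = 1 … 5000.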
Empirical studies: MWE evaluation
39
German PP-verb pairs from FR corpus (f ≥ 30)
MWE annotated by Brigitte Krenn (2000)
■ Funktionsverbgefüge (FVG)
■ figurative expressions
Data & guidelines: www.collocations.de
[Plot: precision (%) of n-best lists (n = 0–5000) for t-score, Poisson, z-score, frequency and MI; baseline = 11.09%]
[Precision–recall curves, German PNV task: G2, X2, t, MI, Dice, f]
http://multiword.sf.net/mwe2008/
MWE 2008 Shared Task: DE-PNV
Shared task on German V+PP
■ FVG
■ figurative expressions
Frequency data from FR, chunk parsed, f ≥ 30
Baseline: 11.09%

Best AM: t-score, AP = 39.79%

Frequency: AP = 33.88%
40
[Precision–recall curves, English VPC task: G2, X2, t, MI, Dice, f]
http://multiword.sf.net/mwe2008/
MWE 2008 Shared Task: EN-VPC
Shared task on English particle verbs (VPC)
Frequency data from full BNC
■ adjacent pairs

Baseline: 14.29%

Best AM: t-score, AP = 29.94%

Frequency: AP = 29.01%
41
[Precision–recall curves, German AN task: G2, X2, t, MI, Dice, f]
http://multiword.sf.net/mwe2008/
MWE 2008 Shared Task: DE-AN
Shared task on German Adj+N
■ evaluated by professional lexicographers

Frequency data from FR corpus

Baseline: 41.53%

Best AM: Dice, AP = 58.84%

Frequency: AP = 46.90%
42
[Precision–recall curves, Czech MWE bigram task: G2, X2, t, MI, Dice, f]
http://multiword.sf.net/mwe2008/
MWE 2008 Shared Task: CZ-MWE
Shared task on Czech MWE
■ evaluated by lexicographers
■ three judges

Baseline: 21.03%

Best AM: chi-squared, AP = 64.86%

Frequency: AP = 21.70%
43
See Evert (2004, 2008) for details
[Plot: observed frequency O vs. expected frequency E (log–log); t-score contour with true MWE highlighted]
Geometric visualisation of AMs
44
See Evert (2004, 2008) for details
[Plot: observed vs. expected frequency (log–log); t-score iso-lines for sample sizes n = 20, 50, 100, 200]
Geometric visualisation of AMs
45
[Four panels: observed vs. expected frequency (log–log) with iso-lines of MI, Poisson, t-score and z-score for n = 20, 50, 100, 200]
Geometric visualisation of AMs
46
Geometric visualisation of AMs
47
Evaluation & visualisation combined
48
Evaluation & visualisation combined
49
Evaluation & visualisation combined
50
51
Room for improvement
Results from MWE 2008 Shared Task: DE-PNV
Machine learning (Pecina & Schlesinger 2006)
52
[Bar chart: average precision (%) for Baseline, G2, X2, t-score, MI, Dice, freq, PE, EPI, Best AM, ML. Extracted values (pairing with bars lost): 60.6, 43.9, 22.7, 14.6, 33.9, 20.1, 24.4, 39.8, 30.0, 39.1, 11.1; baseline = 11.1, best single AM (t-score) = 39.8]
Results from MWE 2008 Shared Task: CZ-MWE
Machine learning (Pecina & Schlesinger 2006)
53
[Bar chart: average precision (%) for Baseline, G2, X2, t-score, MI, Dice, freq, PE, EPI, Best AM, ML. Extracted values (pairing with bars lost): 79.5, 65.6, 21.7, 62.6, 64.9, 24.6, 65.0, 42.5, 21.0; baseline = 21.0, best single AM (chi-squared) = 64.9]
Results from MWE 2008 Shared Task: DE-AN
Machine learning (Pecina & Schlesinger 2006)
54
[Bar chart: average precision (%) for Baseline, G2, X2, t-score, MI, Dice, freq, PE, EPI, Best AM, ML. Extracted values (pairing with bars lost): 61.3, 62.9, 40.4, 46.9, 58.8, 54.7, 52.4, 57.1, 56.1, 41.5; baseline = 41.5, best single AM (Dice) = 58.8]
Upper limits: overtraining
55
What is the highest precision that a "sensible" AM can achieve in principle?
Like a highly over-trained machine learning approach
Restriction needed: simple AM
[Precision–recall curves (0–100%) for t-score, Poisson, z-score, frequency and MI; baseline = 11.09%]
Upper limits: optimal simple AM
56
[Two plots: observed vs. expected frequency (log–log) showing decision boundaries of t-score, z-score and the optimal simple AM]
Do AMs scale up to the Web?
57–61

[Series of five build-up slides: scatter plots of observed frequency O vs. expected frequency E (log–log) for MWE candidates in the BNC and in Web1T5, successively overlaid with decision boundaries of the t-score, G2, MI and MI3 measures. Legend: TP (BNC), FP (BNC), TP (Web1T5), FP (Web1T5); individual data points omitted]
62
What else?
63
Some current research topics (my agenda)
Optimised AMs for specific types of tasks and data sets
■ e.g. for identification of SVC vs. idioms
■ for very small or very large corpora, skewed frequency dist's

Extension to combinations of three or more words
■ particularly important for MWE, but also empirical collocations
■ basis for higher-order distributional semantics (tensors)

Asymmetric association measures (Michelbacher et al. 2011)
■ e.g. wellington boot, bated breath, high fidelity
■ virtually all statistical AM are symmetric

Collocational patterns: productivity of collocations
■ integration of collocations with distributional similarity
(A)symmetry of association
64
Collocations are often asymmetric (Kjellmer 1991)
■ e.g. wellington boot, bated breath, high fidelity
■ bated breath is "right-predictive", high fidelity is "left-predictive"
■ effect may in part be due to frequency of collocates

Well-known fact, but little research in linguistics & NLP
■ MWE and semantic relations are inherently symmetric
■ most sensible measures of 1st- and 2nd-order statistical association are also symmetric
■ including all association measures mentioned in this talk
Asymmetric association measures
Mathematically founded derivations lead to symmetric AM
■ how can asymmetry of association be accounted for?
Michelbacher et al. (2007): forward vs. backward rank
65
"bated" (log-likelihood)
collocate score rank
breath 339.10 1
with 99.75 2
waited 75.02 3
waiting 50.91 4
and 0.88 5
, 0.00 6
. -0.11 7
the -0.91 8

"breath" (log-likelihood)
collocate score rank
deep 6787.38 1
took 3207.68 2
her 2812.36 3
his 2100.52 4
… … …
shuddering 399.37 19
bated 376.96 20
draw 343.53 21
Asymmetric association measures
Michelbacher et al. (2007): forward vs. backward rank
Asymmetric AM (AAM): score = difference between forward & backward rank
Various AAM can be defined (one for each symmetric AM)
Plot shows distribution of forward and backward ranks
■ based on log-likelihood AM
■ for symmetric A, largest bars would be on the diagonal
66
[3D bar chart: number of pairs by forward-rank and backward-rank bins (1–2, 3–5, 6–10, 11–20, 21–35, 36–60, 61–100, 101–160, 161–250, 251–500, 501+)]
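The rank construction can be sketched as follows: the forward rank of (w1, w2) is the rank of w2 among all collocates of w1 under some symmetric AM, the backward rank is the rank of w1 among the collocates of w2, and a large difference signals one-directional predictiveness. The toy score table below holds only three collocates per word, so the backward rank comes out as 3 rather than the 20 seen in the full BNC data:

```python
# scores[w] maps each collocate of w to its (symmetric) AM score.
def rank(scores, a, b):
    ordered = sorted(scores[a], key=lambda c: scores[a][c], reverse=True)
    return ordered.index(b) + 1  # 1 = strongest collocate of a

scores = {
    "bated":  {"breath": 339.10, "with": 99.75, "waited": 75.02},
    "breath": {"deep": 6787.38, "took": 3207.68, "bated": 376.96},
}
fwd = rank(scores, "bated", "breath")  # 1: "bated" strongly predicts "breath"
bwd = rank(scores, "breath", "bated")  # 3 in this toy table
print(fwd, bwd, bwd - fwd)             # rank difference = asymmetry score
```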
AAM evaluation results (work in progress)
Free associations are often asymmetric
Michelbacher et al. (2007) evaluate AAM on USF free A norms
Results are inconclusive
■ presumably because free association norms are mostly based on paradigmatic relations
■ 1st-order statistical A is syntagmatic (is it?)
67
USF free associations
cue target fwd A bwd A
boys girls 0.500 0.503
bad good 0.750 0.758
dinner supper 0.535 0.545
trout fish 0.913 0.036
saddle horse 0.879 0.103
crib baby 0.842 0.032
exhausted tired 0.895 0.075
bank money 0.799 0.019
bouquet flowers 0.828 0.053
AAM evaluation results (work in progress)
Evaluation on a new data set of free syntagmatic A
■ similar to free A norms, but asks for syntagmatic combination

Michelbacher et al. (2011)
■ fwd/bwd ranks for different AM
■ compared to syntagmatic A
68
Michelbacher, Evert and Schütze: Asymmetry in Word Associations
Table 5: Forward and backward scores and rank measures for a subset of the word pairs used in the elicitation experiment

f      b      (w1, w2)                 R→f  R←f  R→G2  R←G2  R→t  R←t  R→χ2  R←χ2

group A: rank measures and direction scores conform
0.5891 0.2545 Academy Award            1    9    1     2     1    7    1     2
0.3328 0.0010 ancestral home           1    25   1     13    1    19   1     11
0.5551 0.1609 cable television         2    7    1     4     2    5    1     2
0.0127 0.0087 cut glass                1    75   1     46    1    58   1     40
0.6760 0.0010 felled tree              1    54   1     33    1    45   1     17
0.0683 0.0021 hunched shoulders        1    16   1     7     1    14   1     2
0.0875 0.0010 old-fashioned way        1    98   1     60    1    62   4     70
0.1667 0.0063 rightful place           1    26   1     6     1    15   2     4
0.1500 0.0496 rope ladder              1    4    1     4     1    4    2     4
0.0241 0.0010 shrewd idea              3    109  6     49    3    68   10    41
0.1719 0.0010 thick-set man            1    519  1     169   1    318  1     86
0.0641 0.0068 well-worn path           1    71   1     34    1    58   1     22
0.0127 0.0125 *impending retirement    9    18   8     14    9    18   9     9
0.0606 0.0563 *speech recognition      1    2    1     1     1    2    2     4
0.0010 0.0099 annual rent              29   2    20    1     28   2    15    7
0.0266 0.8208 Christmas decorations    11   1    8     1     11   1    9     3
0.0010 0.0101 female preferences       63   34   92    44    60   34   95    48
0.0010 0.0650 hard frost               39   1    21    1     35   1    10    1
0.0010 0.0312 legal wrangling          151  1    58    1     110  1    25    1
0.0081 0.1325 smoked mackerel          5    1    3     1     5    1    3     1
0.0010 0.0031 southern bypass          21   1    15    1     20   1    10    1
0.0046 0.0426 welcome diversion        17   3    15    1     16   3    8     3
0.0032 0.0425 *bond issuance           10   1    7     1     10   1    2     2
0.0549 0.4500 *deadly nightshade       7    1    3     1     7    1    1     1
0.0478 0.3893 *white collar            11   1    6     1     9    1    5     6

group B: rank measures and direction scores do not conform
0.0955 0.1160 healthy food             6    19   6     20    5    15   13    53
0.1562 0.1543 missile silos            16   1    8     1     16   1    2     1
0.0010 0.0063 seasoned campaigners     1    9    1     6     1    9    1     6

group C: rank measures ambivalent
0.5411 0.4620 epileptic seizure        2    3    2     1     2    3    2     1
0.0761 0.0335 dedicated follower       7    3    2     3     4    3    3     6
0.4340 0.0237 laboratory experiments   2    1    1     1     2    1    1     1
0.0683 0.1836 South East               1    2    3     2     1    2    5     2

group D: high mutual predictiveness
0.2962 0.1337 bloody hell              1    1    1     1     1    1    1     1
0.1275 0.2833 *special needs           1    1    1     1     1    1    1     2
0.6810 0.2793 toxic waste              1    1    1     1     1    1    1     1
0.2613 0.1583 unleaded petrol          1    1    1     1     1    1    1     1
0.9521 0.0068 wishful thinking         1    1    1     1     1    1    1     1
AAM evaluation results (work in progress)
Some results are good
■ previous slide

Other results are less encouraging
■ AAM are unclear or contradict syntagmatic A

wishful thinking
■ fwd/bwd rank 1 for all AM
■ right-predictive in human data
69
[Table 5 repeated from the previous slide]
70
Thank you!
References
71
Aston, Guy and Burnard, Lou (1998). The BNC Handbook. Edinburgh University Press, Edinburgh. See also the BNC homepage at http://www.natcorp.ox.ac.uk/.
Burger, Harald; Buhofer, Annelies; Sialm, Ambros (1982). Handbuch der Phraseologie. De Gruyter, Berlin, New York.
Evert, Stefan (2004). The Statistics of Word Cooccurrences: Word Pairs and Collocations. Dissertation, Institut für maschinelle Sprachverarbeitung, University of Stuttgart. Published in 2005, URN urn:nbn:de:bsz:93-opus-23714. Also see online resources at http://www.collocations.de/.
Evert, Stefan (2008). Corpora and collocations. In A. Lüdeling and M. Kytö (eds.), Corpus Linguistics. An International Handbook, chapter 58. Mouton de Gruyter, Berlin.
Evert, Stefan and Kermes, Hannah (2003). Experiments on candidate data for collocation extraction. In Companion Volume to the Proceedings of the 10th Conference of the European Chapter of the Association for Computational Linguistics, pages 83–86.
Evert, Stefan and Krenn, Brigitte (2001). Methods for the qualitative evaluation of lexical association measures. In Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics, pages 188–195, Toulouse, France.
Evert, Stefan and Krenn, Brigitte (2005). Using small random samples for the manual evaluation of statistical association measures. Computer Speech and Language, 19(4), 450–466.
Evert, Stefan and Krenn, Brigitte (2005a). Exploratory collocation extraction. Presentation at the Phraseology 2005 Conference, Louvain-la-Neuve, Belgium.
Firth, J. R. (1957). A synopsis of linguistic theory 1930–55. In Studies in linguistic analysis, pages 1–32. The Philological Society, Oxford.
Harris, Zellig (1954). Distributional structure. Word, 10(23), 146–162.
Hoey, Michael (2005). Lexical Priming: A new theory of words and language. Routledge, London.
Hoffmann, Sebastian; Evert, Stefan; Smith, Nicholas; Lee, David; Berglund Prytz, Ylva (2008). Corpus Linguistics with BNCweb – a Practical Guide, volume 6 of English Corpus Linguistics. Peter Lang, Frankfurt am Main.
References
72
Kilgarriff, Adam and Tugwell, David (2002). Sketching words. In M.-H. Corréard (ed.), Lexicography and Natural Language Processing: A Festschrift in Honour of B. T. S. Atkins, pages 125–137. EURALEX.
Kilgarriff, Adam; Rychly, Pavel; Smrz, Pavel; Tugwell, David (2004). The sketch engine. In Proceedings of the 11th EURALEX International Congress, Lorient, France.
Kjellmer, Göran (1991). A mint of phrases. In K. Aijmer and B. Altenberg (eds.), English Corpus Linguistics, pages 111–127. Longman, London.
Krenn, Brigitte (2000). The Usual Suspects: Data-Oriented Models for the Identification and Representation of Lexical Collocations, Dissertation. DFKI & Universität des Saarlandes, Saarbrücken, Germany.
Landauer, Thomas K. and Dumais, Susan T. (1997). A solution to Plato's problem: The latent semantic analysis theory of acquisition, induction and representation of knowledge. Psychological Review, 104(2), 211–240.
Lea, Diana (ed.) (2002). Oxford Collocations Dictionary for students of English. Oxford University Press, Oxford, New York.
Lin, Dekang (1998). Automatic retrieval and clustering of similar words. In Proceedings of the 17th International Conference on Computational Linguistics (COLING-ACL 1998), pages 768–774, Montreal, Canada.
Michelbacher, Lukas; Evert, Stefan; Schütze, Hinrich (2007). Asymmetric association measures. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2007), Borovets, Bulgaria.
Michelbacher, Lukas; Evert, Stefan; Schütze, Hinrich (2011). Asymmetry in corpus-derived and human word associations. Corpus Linguistics and Linguistic Theory, 7(2), 245–276.
Pecina, Pavel (2005). An extensive empirical study of collocation extraction methods. In Proceedings of the ACL Student Research Workshop, pages 13–18, Ann Arbor, MI.
Pecina, Pavel and Schlesinger, Pavel (2006). Combining association measures for collocation extraction. In Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics (COLING/ACL 2006), Poster Sessions, pages 651–658, Sydney, Australia. ACL.
Rapp, Reinhard (2004). A freely available automatically generated thesaurus of related words. In Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC 2004), pages 395–398.
References
73
Sahlgren, Magnus (2006). The Word Space Model: Using distributional analysis to represent syntagmatic and paradigmatic relations between words in high-dimensional vector spaces. Ph.D. thesis, Department of Linguistics, Stockholm University.
Schütze, Hinrich (1998). Automatic word sense discrimination. Computational Linguistics, 24(1), 97–123.
Sinclair, John McH. (1966). Beginning the study of lexis. In C. E. Bazell, J. C. Catford, M. A. K. Halliday, and R. H. Robins (eds.), In Memory of J. R. Firth, pages 410–430. Longmans, London.
Sinclair, John; Jones, Susan; Daley, Robert; Krishnamurthy, Ramesh (1970/2004). English Collocation Studies: The OSTI Report. Continuum Books, London and New York. Originally written in 1970 (unpublished).
Sinclair, John (1991). Corpus, Concordance, Collocation. Oxford University Press, Oxford.
Sinclair, John (ed.) (1995). Collins COBUILD English Dictionary. Harper Collins, London. New edition, completely revised.