Expressing Implicit Semantic Relations without Supervision
ACL 2006
Abstract
• For a given input word pair X:Y with unspecified semantic relations, the output is a list of patterns <p1, …, pm> ranked according to how well each pattern pi expresses the relations between X and Y.
• For example, for X = ostrich and Y = bird: "X is the largest Y" and "Y such as X".
• An unsupervised learning algorithm:
  – Mine large text corpora for patterns <p1, …, pm>
  – Sort the patterns by pertinence
Introduction
• Hearst (1992): "Y such as the X"
  – X is a hyponym (type) of Y
  – For building a thesaurus
• Berland and Charniak (1999): "Y's X" and "X of the Y"
  – X is a meronym (part) of Y
  – For building a lexicon or ontology, like WordNet
• This paper addresses the inverse of this problem:
  – Given a word pair X:Y with some unspecified semantic relations
  – Mine a large text corpus for lexico-syntactic patterns that express the implicit relations between X and Y
• A corpus of web pages: 5×10^10 English words
  – From co-occurrences of the pair ostrich:bird in this corpus:
    • 516 patterns of the form "X … Y"
    • 452 patterns of the form "Y … X"
• Main challenges:
  – To find a way of ranking the patterns
  – To find a way to empirically evaluate the performance
Pertinence - 1/3
• mason:stone vs. carpenter:wood
  – high degree of relational similarity
• Assumption:
  – There is a measure of the relational similarity between pairs of words, sim_r(X1:Y1, X2:Y2)
  – Let W = {X1:Y1, …, Xn:Yn} be a set of word pairs
  – Let P = {P1, …, Pm} be a set of patterns
• The pertinence of pattern Pi to a word pair Xj:Yj is the expected relational similarity between Xj:Yj and a word pair Xk:Yk drawn according to p(Xk:Yk | Pi)
Pertinence - 2/3
• Let f_{k,i} be the number of occurrences of the word pair Xk:Yk with the pattern Pi
• pertinence(Xj:Yj, Pi) = Σ_{k=1}^{n} p(Xk:Yk | Pi) · sim_r(Xj:Yj, Xk:Yk)
  – p(Xk:Yk | Pi) is a conditional probability; sim_r is the relational similarity
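The pertinence sum above can be sketched in Python. The inputs `F` (the f_{k,i} count matrix) and `sim_r` (a precomputed relational-similarity matrix) are hypothetical placeholders, and the simple per-pattern column normalisation used for p(Xk:Yk | Pi) stands in for the paper's Bayes-with-uniform-prior estimate:

```python
import numpy as np

def pertinence(F, sim_r):
    """Pertinence of each pattern P_i to each word pair X_j:Y_j.

    F     : (n, m) array, F[k, i] = f_{k,i}, occurrences of pair
            X_k:Y_k with pattern P_i.
    sim_r : (n, n) array, sim_r[j, k] = sim_r(X_j:Y_j, X_k:Y_k).
    Returns an (n, m) array whose entry [j, i] is pertinence(X_j:Y_j, P_i).
    """
    # p(X_k:Y_k | P_i): normalise each pattern's column of counts
    col_sums = F.sum(axis=0, keepdims=True)
    p = F / np.where(col_sums == 0, 1, col_sums)
    # pertinence(X_j:Y_j, P_i) = sum_k p(X_k:Y_k | P_i) * sim_r(j, k)
    return sim_r @ p
```

With `sim_r` set to the identity matrix, each pattern's pertinence to a pair reduces to p(pair | pattern), which makes the normalisation easy to sanity-check.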
Pertinence - 3/3
• Assume p(Xj:Yj) = 1/n for all pairs in W
• p(Xj:Yj) = 1/n: Laplace smoothing
The Algorithm
• Goal:
  – Input: a set of word pairs W = {X1:Y1, …, Xn:Yn}
  – Output: a ranked list of patterns <p1, …, pm> for each input pair
• 1. Find phrases:
  – Corpus: 5×10^10 English words
  – List the phrases that begin with Xi and end with Yi
  – And a list for the opposite order
  – One to three intervening words between Xi and Yi
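Step 1 can be sketched with a regular expression over raw text; the corpus string below is illustrative, and a real run would stream over indexed web text rather than a Python string:

```python
import re

def find_phrases(corpus, x, y, min_gap=1, max_gap=3):
    """Phrases that begin with x, end with y, and have one to
    three intervening words (the slides' gap constraint)."""
    rx = re.compile(
        r"\b%s((?:\s+\w+){%d,%d})\s+%s\b"
        % (re.escape(x), min_gap, max_gap, re.escape(y)),
        re.IGNORECASE,
    )
    return [x + m.group(1) + " " + y for m in rx.finditer(corpus)]

corpus = "the ostrich is the largest bird and an ostrich cannot fly"
print(find_phrases(corpus, "ostrich", "bird"))
# → ['ostrich is the largest bird']
```

Swapping the `x` and `y` arguments produces the list for the opposite order.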
  – The first and last words in the phrase need not exactly match Xi and Yi (different suffixes are allowed)
• 2. Generate patterns:
  – For example, from the phrase "carpenter nails the wood":
    • X nails the Y
    • X nails * Y
    • X * the Y
    • X * * Y
  – Xi first and Yi last, or vice versa
  – Do not allow duplicate patterns in a list
  – Pattern frequency (analogous to term frequency in IR)
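Step 2's wildcarding can be sketched as follows: every subset of the intervening words is replaced by "*", which for "carpenter nails the wood" yields exactly the four patterns listed above:

```python
from itertools import product

def generate_patterns(phrase):
    """All patterns obtained by replacing any subset of the
    intervening words with '*' and the end words with X and Y."""
    words = phrase.split()
    inner = words[1:-1]                      # intervening words only
    patterns = set()                         # a set forbids duplicates
    for mask in product([False, True], repeat=len(inner)):
        middle = ["*" if star else w for w, star in zip(inner, mask)]
        patterns.add(" ".join(["X"] + middle + ["Y"]))
    return sorted(patterns)

print(generate_patterns("carpenter nails the wood"))
# → ['X * * Y', 'X * the Y', 'X nails * Y', 'X nails the Y']
```

A phrase with g intervening words yields at most 2^g patterns, which stays small under the one-to-three-word gap constraint.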
• 3. Count pair frequency:
  – Pair frequency (analogous to document frequency in IR) for a pattern is the number of lists that contain the given pattern
• 4. Map pairs to rows:
  – For each pair Xi:Yi, create a row for Xi:Yi and another row for Yi:Xi
• 5. Map patterns to columns:
  – For each unique pattern of the form "X … Y" (from step 2), create a column, and another column with X and Y swapped, "Y … X"
• 6. Build a sparse matrix:
  – Build a matrix X in which the value x_ij is the pattern frequency of the j-th pattern for the i-th word pair
• 7. Calculate entropy:
  – Each cell x_ij is replaced by log(x_ij), weighted by the entropy of its column
  – H(X) = −Σ_{x∈X} p(x) log2 p(x)
• 8. Apply SVD (singular value decomposition):
  – SVD is used to reduce noise and compensate for sparseness
  – X = U Σ Vᵀ
    • U and V are in column-orthonormal form; Σ is a diagonal matrix of singular values
    • If X is of rank r, then Σ is also of rank r
    • Let Σ_k (k < r) be the diagonal matrix formed from the top k singular values
    • Let U_k and V_k be the matrices produced by selecting the corresponding columns from U and V
    • k = 300
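Steps 6-8 can be sketched with NumPy. The slides only gesture at the weighting, so the particular log-entropy formula used here (log(1 + x_ij) scaled by 1 − H(column)/log2 n, in the style of Landauer and Dumais) is an assumption, not the paper's exact transformation:

```python
import numpy as np

def log_entropy_svd(X, k):
    """Log-entropy weighting followed by a rank-k truncated SVD.
    Returns U_k, the top-k singular values, and V_k^T."""
    n = X.shape[0]
    # column-wise probabilities p(x) = x_ij / sum_i x_ij
    col_sums = X.sum(axis=0, keepdims=True)
    P = np.divide(X, col_sums, out=np.zeros_like(X, dtype=float),
                  where=col_sums > 0)
    # column entropy H = -sum p log2 p, with 0*log(0) taken as 0
    logP = np.where(P > 0, np.log2(np.where(P > 0, P, 1.0)), 0.0)
    H = -(P * logP).sum(axis=0)
    weight = 1.0 - H / np.log2(n)    # ~1 for peaked columns, ~0 for flat
    W = np.log1p(X) * weight         # weighted matrix
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]
```

On the paper's scale (thousands of rows, tens of thousands of columns) a sparse truncated SVD would be used instead of the dense `np.linalg.svd` shown here.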
• 9. Calculate cosines:
  – sim_r(X1:Y1, X2:Y2) is given by the cosine of the angle between the corresponding row vectors of the matrix U_k Σ_k V_kᵀ
• 10. Calculate conditional probabilities:
  – Using Bayes' theorem and the raw frequency data
• 11. Calculate pertinence:
  – Combine the conditional probabilities and relational similarities as defined above
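Step 9 can be sketched directly: because V_k has orthonormal columns, the cosines between rows of U_k Σ_k V_kᵀ equal the cosines between rows of the much smaller matrix U_k Σ_k, so the sketch works with the latter:

```python
import numpy as np

def relational_similarity(Uk, sk):
    """Cosine of the angle between every pair of row vectors
    of U_k * Sigma_k (sk holds the diagonal of Sigma_k)."""
    R = Uk * sk                               # scale columns by singular values
    norms = np.linalg.norm(R, axis=1, keepdims=True)
    R = R / np.where(norms == 0, 1, norms)    # unit-length rows
    return R @ R.T                            # sim_r[i, j] = cos(row_i, row_j)
```

The resulting matrix is exactly the `sim_r` input assumed by the pertinence sketch earlier.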
Experiments with Word Analogies
• 374 college-level SAT analogy questions
  – Stem word pair: ostrich:bird
    • (a) lion:cat (b) goose:flock (c) ewe:sheep (d) cub:bear (e) primate:monkey
  – Rows: 374 × 6 × 2 = 4488
    • Drop pairs that do not co-occur in the corpus
    • 4191 rows remain
  – Columns:
    • 1,706,845 patterns (3,413,690 columns)
    • Drop all patterns with a frequency less than ten
    • 42,032 patterns (84,064 columns) remain
  – Matrix density is 0.91%
• 15 SAT questions were skipped
• Notation:
  – f: pattern frequency
  – F: maximum f
  – n: pair frequency
  – N: total number of word pairs
Experiments with Noun-Modifiers-1/3
• 600 noun-modifier pairs
• 5 general classes of labels, with 30 subclasses
  – flu virus: causality relation (the flu is caused by a virus)
  – causality (storm cloud), temporality (daily exercise), spatial (desert storm), participant (student protest), and quality (expensive book)
• Matrix:
  – 1184 rows and 33,698 columns
  – density is 2.57%
Experiments with Noun-Modifiers-2/3
• Leave-one-out cross-validation:
  – The testing set consists of a single noun-modifier pair and the training set consists of the 599 remaining noun-modifiers
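The leave-one-out protocol can be sketched as a nearest-neighbour classifier over a precomputed pair-similarity matrix; this is a simplification, since the paper scores classes via pertinence-ranked patterns rather than a single nearest neighbour:

```python
import numpy as np

def leave_one_out_accuracy(sim, labels):
    """For each pair, predict the label of its most similar
    other pair, and report the fraction predicted correctly."""
    n = len(labels)
    correct = 0
    for i in range(n):
        sims = sim[i].copy()
        sims[i] = -np.inf        # exclude the held-out pair itself
        correct += labels[int(np.argmax(sims))] == labels[i]
    return correct / n
```

Each iteration treats one pair as the test set and the rest as training data, mirroring the 1-vs-599 split described above.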
Experiments with Noun-Modifiers-3/3
Conclusion
• The focus is on how word pairs are similar in their relations, rather than on how individual words are similar
• The main contribution of this paper is the idea of pertinence
• Although the performance on the SAT analogy questions (54.6%) is near the level of the average senior high school student (57%), there is room for improvement.