A Word at a Time: Computing Word Relatedness using Temporal Semantic Analysis
Kira Radinsky, Eugene Agichtein, Evgeniy Gabrilovich, Shaul Markovitch
Introduction
• A rich source of information can be revealed by studying the patterns of word occurrence over time
• Example: “peace” and “war”
• Corpus: New York Times over 130 years
• Word <=> time series of its occurrence in NYT articles
• Hypothesis: correlation between the time series of two words => semantic relation
• Proposed method: Temporal Semantic Analysis (TSA)
1. TSA
Temporal Semantic Analysis
• 3 main steps:
1. Represent words as concept vectors
2. Extract temporal dynamics for each concept
3. Extend static representation with temporal dynamics
1. Words as concept vectors
2. Temporal dynamics
• c : concept represented by a sequence of words wc1,…,wck
• d : a document
• ε : proximity relaxation parameter (ε = 20 in the experiments)
• c appears in d if its words appear in d with a distance of at most ε words between each pair wci, wcj
• Example: “Great Fire of London”
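The appearance rule above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `concept_appears`, the lowercase whitespace tokenization, and the reading of "distance between each pair" as "all concept words fit inside a window spanning at most ε positions" are assumptions.

```python
from collections import Counter

def concept_appears(concept_words, doc_tokens, eps=20):
    """Return True if every concept word occurs in the document such that
    all chosen occurrences lie within eps token positions of each other.

    Hypothetical helper: names and tokenization are illustrative assumptions.
    """
    needed = {w.lower() for w in concept_words}
    tokens = [t.lower() for t in doc_tokens]
    # Keep only the positions where a concept word occurs.
    hits = [(i, t) for i, t in enumerate(tokens) if t in needed]
    counts = Counter()
    left = 0
    for pos, word in hits:
        counts[word] += 1
        # Drop occurrences that are more than eps positions behind pos.
        while pos - hits[left][0] > eps:
            old = hits[left][1]
            counts[old] -= 1
            if counts[old] == 0:
                del counts[old]
            left += 1
        # All concept words are inside the current window.
        if len(counts) == len(needed):
            return True
    return False

doc = "the great fire of london destroyed much of the city".split()
found = concept_appears(["Great", "Fire", "of", "London"], doc, eps=20)
```

With the default ε = 20 the concept is found in the sample sentence; a much smaller ε makes the match stricter.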
2. Temporal dynamics
• t1,…,tn : a sequence of consecutive discrete time points (days)
• H = D1,…,Dn : history represented by a set of document collections, where Di is a collection of documents associated with time ti
• the dynamics of a concept c is the time series of its frequency of appearance in H
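A small sketch of how such a time series could be built from H. The slides do not spell out how frequency is normalized, so dividing by the size of each daily collection Di is an assumption here, as are the names `concept_dynamics` and `appears`.

```python
def concept_dynamics(history, appears):
    """Build the time series of a concept over a history H = D1,...,Dn.

    history: list of document collections, one per time point.
    appears: callable(document) -> bool, deciding whether the concept
             appears in a document (hypothetical predicate).
    Assumption: frequency is the fraction of documents in Di containing
    the concept; empty collections contribute 0.0.
    """
    series = []
    for docs in history:
        if not docs:
            series.append(0.0)
            continue
        hits = sum(1 for d in docs if appears(d))
        series.append(hits / len(docs))
    return series

# Toy history with three "days"; the concept is the single word "peace".
history = [
    ["war and peace", "peace talks resume"],
    ["stock market falls"],
    [],
]
peace_series = concept_dynamics(history, lambda d: "peace" in d.split())
```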
3. Extend static representation
2. Using TSA for computing Semantic Relatedness
Using TSA for computing Semantic Relatedness
• Compare by weighted distance between time series of concept vectors
• Combine it with the static semantic similarity measure
Algorithm
• t1, t2 : words
• C(t1) = {c1,…,cn} and C(t2) = {c1,…,cm} : sets of concepts of t1 and t2
• Q(c1,c2) : function that determines relatedness between two concepts c1 and c2 using their dynamics (time series)
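The transcript does not preserve the algorithm itself, so the following is only one plausible way to combine the definitions above: each concept of one word is matched with its best-scoring counterpart under Q, and the two directions are averaged. The aggregation scheme and the name `word_relatedness` are assumptions, not the method from the slides.

```python
def word_relatedness(concepts1, concepts2, q):
    """Aggregate concept-level scores into a word-level relatedness score.

    concepts1, concepts2: lists of concept time series for words t1 and t2.
    q: callable(series_a, series_b) -> float, the concept relatedness Q.
    Assumption: best-match averaging in both directions (illustrative only).
    """
    if not concepts1 or not concepts2:
        return 0.0

    def directed(src, dst):
        # For each source concept, take its best match in the other set.
        return sum(max(q(a, b) for b in dst) for a in src) / len(src)

    return 0.5 * (directed(concepts1, concepts2) + directed(concepts2, concepts1))

# Toy Q: 1.0 for identical series, 0.0 otherwise.
toy_q = lambda a, b: 1.0 if a == b else 0.0
score = word_relatedness([[1, 2], [3, 4]], [[1, 2]], toy_q)
```

Any time-series similarity, such as cross correlation or a DTW-based score, can be passed in as `q`.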
Cross Correlation
• Pearson's product-moment coefficient:
• A statistical method for measuring the similarity of two random variables
• Example: “computer” and “radio”
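For reference, Pearson's coefficient of two equal-length series x and y is the covariance of x and y divided by the product of their standard deviations. A minimal standard-library sketch follows; the function name `pearson` and the choice to return 0.0 for a constant series are assumptions made for illustration.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length series.

    r = sum((x_i - mean_x) * (y_i - mean_y))
        / sqrt(sum((x_i - mean_x)^2) * sum((y_i - mean_y)^2))
    Assumption: a constant series (zero variance) yields 0.0 instead of
    raising an error.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

r_same = pearson([1, 2, 3], [2, 4, 6])      # perfectly correlated
r_opposite = pearson([1, 2, 3], [3, 2, 1])  # perfectly anti-correlated
```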
Dynamic Time Warping
• Measures similarity between two time series that may differ in time scale but are similar in shape
• Used in speech recognition
• It defines a local cost matrix
• Temporal Weighting Function
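A textbook dynamic-programming sketch of DTW, using the absolute difference as local cost. The optional `weight` hook stands in for a temporal weighting function; its form is not given in the transcript, so both the hook and the name `dtw_distance` are hypothetical.

```python
def dtw_distance(x, y, weight=None):
    """Classic DTW distance between two numeric series.

    Local cost: |x_i - y_j|, optionally multiplied by weight(i, j).
    Assumption: `weight` is an illustrative hook for temporal weighting,
    not the function used in the slides.
    """
    n, m = len(x), len(y)
    if n == 0 or m == 0:
        raise ValueError("series must be non-empty")
    inf = float("inf")
    # acc[i][j] = minimal accumulated cost of aligning x[:i] with y[:j].
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            if weight is not None:
                cost *= weight(i - 1, j - 1)
            # Extend the cheapest of: insertion, deletion, match.
            acc[i][j] = cost + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]

# Same shape, different time scale: the stretched series aligns at zero cost.
d_stretched = dtw_distance([1, 2, 3], [1, 2, 2, 3])
d_shifted = dtw_distance([0, 0], [1, 1])
```

A lower distance means more similar shapes; it would need to be turned into a similarity score before being used as Q.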
3. Experimentation
Experiments: Setup
• New York Times archive (1863–2004)
  o Each day: an average of 50 article abstracts
  o 1.42 GB of text
  o 565,540 distinct words
• A new algorithm to automatically benchmark word relatedness tasks
• Same vector representation for each method tested
• Comparison to human judgment (WS-353 and Amazon MTurk)
TSA vs. ESA
TSA vs. Temporal Word Similarity
Word Frequency Effects
Size of Temporal Concept Vector
Conclusion
• Two innovations:
  o Temporal Semantic Analysis
  o A new method for measuring semantic relatedness of terms
• Many advantages (robust, tunable, can be used to study language evolution over time)
• Significant improvements in computing word relatedness