A Random Graph Walk based Approach to Computing Semantic Relatedness Using Knowledge from Wikipedia Presenter: Ziqi Zhang OAK Research Group, Department of Computer Science, University of Sheffield Authors: Ziqi Zhang, Anna Lisa Gentile, Lei Xia, José Iria, Sam Chapman

Post on 09-Jan-2016

Page 1: A Random Graph Walk based Approach to Computing Semantic Relatedness Using Knowledge from Wikipedia

Presenter: Ziqi Zhang, OAK Research Group, Department of Computer Science, University of Sheffield

Authors: Ziqi Zhang, Anna Lisa Gentile, Lei Xia, José Iria, Sam Chapman

Page 2: In this presentation…

• Introduction to semantic relatedness

• Motivation for this research

• Methodology: random walk, Wikipedia, semantic relatedness

• Experiment and evaluation: computing semantic relatedness, semantic relatedness for named entity disambiguation

Page 3: Semantic Relatedness > Introduction

• Semantic relatedness (SR) measures how strongly words or concepts are related, taking into account all kinds of relations between them

[Figure: example terms whose mutual relatedness is in question: LREC, computer science, Malta, ACL, COLING, computational linguistics, Volcano ashes, Airline]

• It captures a broader sense than semantic similarity

• It enables many complex NLP tasks, e.g., sense disambiguation and lexicon construction

Page 4: Method and Literature > Introduction

• Typically, lexical resources (e.g., WordNet, Wikipedia) are needed to provide structural and content information about concepts

• Relatedness is computed by aggregating and balancing these “semantic” elements using a mathematical formula

• Some of the best-known works: Resnik (1995), Leacock & Chodorow (1998), Strube & Ponzetto (2006), Zesch et al. (2008), Gabrilovich & Markovitch (2007)

• Recent trend: towards using collaborative lexical resources, such as Wikipedia and Wiktionary

Page 5: Another SR measure, why? > Motivation

• Wikipedia contains rich and diverse structural and content information about concepts and entities

On a Wiki page: Title, Redirect, Content words, Links, Lists, Infobox, Category

• Which are useful for SR? Which are more useful than others?

• Can we combine them?

• How to combine them?

• Can we gain more if we combine them?

Page 6: The Research > Motivation

This paper aims to answer these questions by

• Proposing a method that naturally integrates diverse features in a balanced way, and

• Studying the importance of different features

Page 7: Overview of the method > Methodology

[Figure: pipeline overview. The input surfaces “NLP” and “Computational Linguistics” each pass through Wiki page retrieval and feature extraction. Each is then linked to its features (F.1, F.2, F.3 and F’.1, F’.2, F’.3) by edges with weights x, y and z, and a random walk over the resulting graph produces the relatedness score (Rel?).]

Page 8: Wiki page retrieval > Methodology

• Objective: given two words/phrases, find the corresponding Wikipedia pages that they refer to

• Problem: ambiguity of the input words (surfaces)

• Solution: collect all candidate pages (sense pages), compute pair-wise relatedness between all senses, and choose the pair with the maximum score

[Figure: Wiki page retrieval maps “NLP” to the sense pages “Natural Language Processing” and “National Liberal Party”, and “Computational Linguistics” to “Computational Linguistics (science)” and “Computational Linguistics (journal)”]
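The sense-selection step on this slide can be sketched roughly as follows. This is only an illustration, not the authors' implementation: `best_sense_pair`, the toy feature sets and the Jaccard-style `toy_relatedness` are hypothetical stand-ins for the random-walk relatedness measure introduced on the later slides.

```python
from itertools import product


def best_sense_pair(senses_a, senses_b, relatedness):
    """Return (score, sense_a, sense_b) for the highest-scoring pair of senses."""
    return max(
        ((relatedness(a, b), a, b) for a, b in product(senses_a, senses_b)),
        key=lambda triple: triple[0],
    )


# Hypothetical word-level features for each candidate sense page.
TOY_FEATURES = {
    "Natural Language Processing": {"language", "computer", "linguistics"},
    "National Liberal Party": {"party", "politics", "liberal"},
    "Computational Linguistics (science)": {"language", "computer", "linguistics", "science"},
    "Computational Linguistics (journal)": {"journal", "linguistics", "publication"},
}


def toy_relatedness(a, b):
    """Jaccard overlap of feature sets; a placeholder for the real SR measure."""
    fa, fb = TOY_FEATURES[a], TOY_FEATURES[b]
    return len(fa & fb) / len(fa | fb)


score, sense_a, sense_b = best_sense_pair(
    ["Natural Language Processing", "National Liberal Party"],
    ["Computational Linguistics (science)", "Computational Linguistics (journal)"],
    toy_relatedness,
)
```

With these toy features, the selected pair is “Natural Language Processing” and “Computational Linguistics (science)”, because every other combination shares fewer features.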

Page 9: Feature Extraction > Methodology

• Objective: identify useful features to represent each sense of a surface for algorithmic consumption

• Page title and redirect target

• Content words from the first section, or the top n most frequent words from the entire page

• Page categories (search depth = 2)

• Outgoing link targets in list structures

• Other outgoing link targets

• Descriptive/definitive noun (first noun phrase after “be” in the first sentence)

• All features are formulated at word level
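One of the features above, the top n most frequent words of a page, could be extracted along these lines. The sketch makes its own assumptions: the tokenisation rule, the tiny `STOPWORDS` list and the function name `top_n_content_words` are illustrative and do not come from the presentation.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; a real list would be much longer).
STOPWORDS = {"a", "an", "and", "is", "of", "the", "to", "in", "with", "that"}


def top_n_content_words(text, n, stopwords=STOPWORDS):
    """Return the n most frequent lowercase words in text, skipping stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in stopwords)
    return [word for word, _ in counts.most_common(n)]


# Hypothetical page text, used only to exercise the function.
page_text = (
    "Computational linguistics is the study of language with computational "
    "models. Language processing is central to language technology."
)
features = top_n_content_words(page_text, 2)
```

Here `features` holds the two most frequent content words of the toy text, which then serve as word-level feature nodes for that sense.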

Page 10: Random Walk – Graph Construction > Methodology

• Objective: plot surfaces, their senses and features on a single graph, so that senses are connected by shared features

[Figure: graph in which the senses “Natural Language Processing” and “Computational Linguistics (science)” are linked to feature nodes (titles T1 and T2, categories C1 to C3, links L1 to L5) by has_title, has_category and has_link edges]
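A graph of this kind might be built as in the sketch below. It is a simplified illustration that leaves out surface nodes; which features the two senses actually share is not recoverable from the slide, so the shared nodes C2 and L2 are assumptions, as is the helper name `build_graph`.

```python
from collections import defaultdict


def build_graph(sense_features):
    """Build an undirected graph linking each sense to its typed feature nodes.

    sense_features maps a sense name to a list of (relation, value) pairs.
    Feature nodes are keyed by (relation, value), so two senses that share a
    feature become connected through the same node.
    """
    graph = defaultdict(set)
    for sense, features in sense_features.items():
        for relation, value in features:
            feature_node = (relation, value)
            graph[sense].add(feature_node)
            graph[feature_node].add(sense)
    return graph


# Hypothetical feature assignment, loosely following the node labels on the slide.
graph = build_graph({
    "Natural Language Processing": [
        ("has_title", "T1"), ("has_category", "C1"), ("has_category", "C2"),
        ("has_link", "L1"), ("has_link", "L2"), ("has_link", "L3"),
    ],
    "Computational Linguistics (science)": [
        ("has_title", "T2"), ("has_category", "C2"), ("has_category", "C3"),
        ("has_link", "L2"), ("has_link", "L4"), ("has_link", "L5"),
    ],
})

# Feature nodes reachable from both senses are the "bridges" between them.
shared = graph["Natural Language Processing"] & graph["Computational Linguistics (science)"]
```

Keying feature nodes by relation type as well as value keeps, for example, a category and a link with the same name as separate nodes, which matters once different feature types carry different weights.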

Page 11: Random Walk – Graph Construction > Methodology

[Figure: the same sense and feature graph as on Page 10]

• Intuition: a walker takes n steps; in each step a random route is taken

Page 13: Random Walk – Graph Construction > Methodology

[Figure: the same sense and feature graph as on Page 10]

• Intuition: starting from a node, in n steps one can reach only a limited set of other nodes
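The reachability intuition can be made concrete with a short breadth-first sketch. The toy graph and the name `reachable_within` are hypothetical and serve only to show how the set of reachable nodes grows with the number of steps.

```python
def reachable_within(graph, start, n):
    """Return the set of nodes reachable from start in at most n steps."""
    frontier = {start}
    seen = {start}
    for _ in range(n):
        # Expand one step outwards, keeping only nodes not visited before.
        frontier = {nbr for node in frontier for nbr in graph.get(node, ())} - seen
        seen |= frontier
    return seen


# Hypothetical toy graph: two senses that share one category node.
toy_graph = {
    "sense_A": {"T1", "C2"},
    "sense_B": {"T2", "C2"},
    "T1": {"sense_A"},
    "T2": {"sense_B"},
    "C2": {"sense_A", "sense_B"},
}

one_step = reachable_within(toy_graph, "sense_A", 1)
two_steps = reachable_within(toy_graph, "sense_A", 2)
```

In this toy graph `sense_B` only becomes reachable from `sense_A` after two steps, via the shared category node.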

Page 14: Random Walk – Graph Construction > Methodology

[Figure: the same sense and feature graph as on Page 10]

• Intuition: the more routes connect the desired end nodes, and the more likely those routes are to be taken, the more related the two senses are

Page 15: Random Walk – Graph Construction > Methodology

[Figure: the same sense and feature graph as on Page 10]

• Intuition: the more routes connect the desired end nodes, and the more likely those routes are to be taken, the more related the two senses are

• Routes are established by feature extraction and graph construction

Page 16: Random Walk – Graph Construction > Methodology

[Figure: the same sense and feature graph as on Page 10]

• Intuition: the more routes connect the desired end nodes, and the more likely those routes are to be taken, the more related the two senses are

• “Likelihood” is modelled by the importance of each type of feature, which is to be studied by experiments

Page 17: Random Walk – The Math > Methodology

• The random walk is simulated via matrix calculation and transformation

• An adjacency matrix models the distribution of weights for the different features

• A t-step random walk is achieved by matrix calculation

• The resulting probability is translated into relatedness
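The slide does not give the formulas, so the following is only a generic sketch of a t-step random walk over a weighted graph, under stated assumptions: edge weights stand in for feature-type weights, rows are normalised into step probabilities, and the t-step probability of moving from one sense to the other is read off directly as a relatedness signal. All names, weights and the toy graph are hypothetical.

```python
def transition_matrix(nodes, weighted_edges):
    """Row-normalise symmetric edge weights into step probabilities."""
    index = {node: i for i, node in enumerate(nodes)}
    size = len(nodes)
    adjacency = [[0.0] * size for _ in range(size)]
    for u, v, weight in weighted_edges:
        adjacency[index[u]][index[v]] = weight
        adjacency[index[v]][index[u]] = weight
    matrix = []
    for row in adjacency:
        total = sum(row)
        matrix.append([w / total if total else 0.0 for w in row])
    return matrix


def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    size = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
        for i in range(size)
    ]


def t_step_probabilities(matrix, t):
    """Return the t-th power of the transition matrix (t >= 1)."""
    result = matrix
    for _ in range(t - 1):
        result = mat_mul(result, matrix)
    return result


# Hypothetical graph: two senses share a category node (edge weight 0.3) and
# each has its own title node (edge weight 0.1).
nodes = ["sense_A", "sense_B", "C2", "T1", "T2"]
edges = [
    ("sense_A", "C2", 0.3), ("sense_B", "C2", 0.3),
    ("sense_A", "T1", 0.1), ("sense_B", "T2", 0.1),
]
P = transition_matrix(nodes, edges)
P2 = t_step_probabilities(P, 2)
prob_a_to_b = P2[0][1]  # probability of reaching sense_B from sense_A in 2 steps
```

In this toy setting the two-step probability from `sense_A` to `sense_B` is 0.375 (up to floating-point rounding): the walker takes the shared category edge with probability 0.75 and then moves on to `sense_B` with probability 0.5. Raising the weight of the shared feature type increases that probability, which is the sense in which feature weights shape relatedness.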

Page 18: Experiment & Evaluation > Experiment

• The experiments are designed to achieve three objectives

– Analyse the importance of each proposed feature

– Evaluate the effectiveness of the random walk method for computing semantic relatedness

– Evaluate the usefulness of the method for solving other NLP problems: Named Entity Disambiguation (NED)

Page 19: Feature Analysis > Experiment

• The simulated annealing optimisation method (Nie et al., 2005) is used to perform the analysis, in which

– 200 pairs of words from WordSim353 are used

– To begin with, each feature is treated equally by assigning the same weight to all features (the weight model)

– SR is computed using the weight model and evaluated against the gold standard

– Hundreds of iterations are run; in each iteration, a different weight model is generated randomly

– The weight model that contributes to the highest performance achieved on this dataset is analysed manually, eliminating the least important features or merging them into other features that are semantically similar
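The search loop described on this slide might look roughly like the sketch below. It is a generic simulated-annealing skeleton rather than the procedure of Nie et al. (2005) or the authors' code: the cooling schedule, the step size, the feature names and the `toy_evaluate` objective are all assumptions. In practice `evaluate` would compute SR on the 200 word pairs and score it against the gold standard.

```python
import math
import random

# Hypothetical identifiers for the six feature types.
FEATURES = ["title", "first_section_words", "categories",
            "descriptive_nouns", "list_links", "other_links"]


def random_neighbour(weights, rng, step=0.1):
    """Perturb every weight a little, then renormalise so they sum to 1."""
    raw = {f: max(1e-6, w + rng.uniform(-step, step)) for f, w in weights.items()}
    total = sum(raw.values())
    return {f: w / total for f, w in raw.items()}


def anneal_weights(evaluate, iterations=300, start_temp=0.1, seed=0):
    """Search for a weight model that maximises evaluate(weights)."""
    rng = random.Random(seed)
    current = {f: 1.0 / len(FEATURES) for f in FEATURES}  # equal weights to begin
    current_score = evaluate(current)
    best, best_score = current, current_score
    for i in range(iterations):
        temperature = start_temp * (1 - i / iterations) + 1e-9
        candidate = random_neighbour(current, rng)
        score = evaluate(candidate)
        # Always accept improvements; accept worse models with a probability
        # that shrinks as the temperature cools.
        accept = score >= current_score or rng.random() < math.exp(
            (score - current_score) / temperature
        )
        if accept:
            current, current_score = candidate, score
        if current_score > best_score:
            best, best_score = current, current_score
    return best, best_score


def toy_evaluate(weights):
    """Placeholder objective: higher when 'title' and 'categories' dominate."""
    return weights["title"] + weights["categories"]


best_weights, best_score = anneal_weights(toy_evaluate)
```

The returned weight model is what the slide proposes to inspect manually afterwards, for example to spot features whose weight stays close to zero.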

Page 20: Feature Analysis - findings > Experiment

Weight  Feature
0.166   Title (incl. redirect target)
0.166   First section words
0.166   Categories
0.166   Descriptive nouns
0.166   Out links in lists
0.166   Other out links

Achieved a best accuracy of 0.45 on the data, compared to the best in the literature of 0.5 by Zesch et al. (2008)

Page 21: Feature Analysis - findings > Experiment

• The weight setting shown on Page 20 (0.166 for each of the six features) is then used for further evaluation

Page 22: Evaluating Computation of SR > Experiment

• Three datasets are chosen: a different set of 153 pairs of words from WordSim353; 65 pairs from Rubenstein & Goodenough (1965), RG65; 30 pairs from Miller & Charles (1991), MC30

• Compared against: a collection of WordNet-based algorithms and other state-of-the-art methods for SR

Method                          WordSim353-153  RG65  MC30  WordSim353-200 (feature analysis)
Ours                            0.71            0.76  0.71  0.46
Strube & Ponzetto (2006)        0.55            0.69  0.67  /
Zesch et al. ESA (2008)         0.62            /     /     0.31
Zesch et al. Wiki (2008)        0.7             0.76  0.68  0.5
Zesch et al. Wiktionary (2008)  0.7             0.84  0.84  0.6
Best of WordNet                 0.39            0.79  0.81  0.23

Page 23: Evaluating Usefulness of SR for NED > Experiment

• The NED method in a nutshell (details: Gentile et al., 2009)

• Identify surfaces of NEs that occur in a text passage and that are defined by Wikipedia, and retrieve the corresponding sense pages

• Compute SR for each pair of their underlying senses

• The sense of a surface is determined collectively by the senses of other surfaces found in the text (contexts)

• Three functions are defined to capture this collective context
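The three context functions are not specified on the slide (see Gentile et al., 2009, for the actual definitions). Purely as an illustration of the collective idea, the sketch below uses one hypothetical scoring function: a sense is scored by summing, over all other surfaces, its relatedness to the best-matching sense of each. The surfaces, senses and toy scores are invented for the example.

```python
def disambiguate(candidates, relatedness):
    """Pick, for each surface, the sense with the highest total relatedness
    to the best-matching senses of all other surfaces."""
    chosen = {}
    for surface, senses in candidates.items():
        def context_score(sense):
            return sum(
                max(relatedness(sense, other) for other in other_senses)
                for other_surface, other_senses in candidates.items()
                if other_surface != surface
            )
        chosen[surface] = max(senses, key=context_score)
    return chosen


# Hypothetical pairwise relatedness scores; unlisted pairs default to 0.05.
TOY_SCORES = {
    frozenset(("Java (programming language)", "Python (programming language)")): 0.9,
    frozenset(("Java (island)", "Python (snake)")): 0.3,
}


def toy_relatedness(a, b):
    """Look up a toy score; a placeholder for the random-walk SR measure."""
    return TOY_SCORES.get(frozenset((a, b)), 0.05)


chosen = disambiguate(
    {
        "Java": ["Java (island)", "Java (programming language)"],
        "Python": ["Python (programming language)", "Python (snake)"],
    },
    toy_relatedness,
)
```

With these toy scores both surfaces resolve to their programming-language senses, because each supports the other more strongly than the alternative readings do.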

Page 24: Evaluating Usefulness of SR for NED > Experiment

• Dataset: 20 news stories by Cucerzan (2007); each story contains 10 to 50 NEs

Method             Accuracy
Our best           91.5
Our baseline       68.7
Cucerzan baseline  51.7
Cucerzan best      91.4

Page 25: Conclusion

• Computing SR isn’t an easy task

• Different structural and content information in Wikipedia all contributes to the task, but with different weights

• Combining these different features in a uniform measure can improve performance

In future

• Can we use simpler similarity functions to obtain the same results?

• Can we integrate different lexical resources?

• How can relatedness/similarity of longer text passages be computed?

Page 26: Thank you!

References (complete list can be found in paper)

• Cucerzan, S. (2007). Large-Scale Named Entity Disambiguation Based on Wikipedia Data. In EMNLP’07

• Finkelstein, L., Gabrilovich, E., Matias, Y., Rivlin, E., Solan, Z., Wolfman, G., and Ruppin, E. (2002). Placing search in context: the concept revisited. In ACM Transactions on Information Systems, 20(1), pp. 116-131

• Gabrilovich, E., Markovitch, S. (2007). Computing semantic relatedness using Wikipedia-based explicit semantic analysis. In Proceedings of IJCAI’07, pp. 1606-1611

• Gentile, A., Zhang, Z., Xia, L., Iria, J. (2009). Graph-based semantic relatedness for named entity disambiguation. In S3T

• Leacock, C., Chodorow, M. (1998). Combining local context and WordNet similarity for word sense identification. In C. Fellbaum (Ed.), WordNet. An Electronic Lexical Database, Chp. 11, pp. 265-283

• Miller, G., Charles, W. (1991). Contextual correlates of semantic similarity. In Language and Cognitive Processes, 6(1), pp. 1-28

• Nie, Z., Zhang, Y., Wen, J., Ma, W. (2005). Object-level ranking: bringing order to web objects. In Proceedings of WWW’05

• Resnik, P. (1995). Using information content to evaluate semantic similarity in a taxonomy. In Proceedings of IJCAI-95, pp. 448-453

• Rubenstein, H., Goodenough, J. (1965). Contextual correlates of synonymy. In Communications of the ACM, 8(10), pp. 627-633

• Strube, M., Ponzetto, S. (2006). WikiRelate! Computing semantic relatedness using Wikipedia. In AAAI’06

• Zesch, T., Müller, C., Gurevych, I. (2008). Using Wiktionary for computing semantic relatedness. In Proceedings of AAAI’08