01/06/15sergey chernov 1 extracting semantic relationships between wikipedia categories by sergey...



Extracting Semantic Relationships between Wikipedia Categories

By Sergey Chernov, Tereza Iofciu, Wolfgang Nejdl, Xuan Zhou, Michal Kopycki, Przemyslaw Rys


Preliminaries

WIKIPEDIA: largest knowledge sharing system

Many pages assigned to CATEGORIES

All links are NAVIGATIONAL

Can we extract SEMANTIC links?

MOTIVATION


Wikipedia Categories Example


Possible benefits

Semi-structured queries

“find Countries which had Democratic Non-Violent Revolutions”

rephrased as

“find a page from category Countries that is connected to some page in category Non-Violent Revolutions”

Hints for authors

“you are editing a page from category Countries; do you want to add a link to a page in category Capitals?”

Raw data for manual semantic markup
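The rephrased query above can be sketched as a filter over page links. This is only an illustration: the function name `pages_connected_to`, the placeholder page names, and the decision to treat a link in either direction as a "connection" are assumptions, not something the slides specify.

```python
def pages_connected_to(links, source, target):
    """Return pages in `source` that share a link with any page in `target`.

    `links` is an iterable of (from_page, to_page) pairs. A link in either
    direction counts as a connection (an assumption made for this sketch).
    """
    hits = set()
    for a, b in links:
        if a in source and b in target:
            hits.add(a)
        if b in source and a in target:
            hits.add(b)
    return sorted(hits)


# Hypothetical toy data; the page names are placeholders, not real articles.
countries = {"Country A", "Country B", "Country C"}
revolutions = {"Revolution X", "Revolution Y"}
links = [
    ("Country A", "Revolution X"),  # outlink from a country page
    ("Revolution Y", "Country C"),  # inlink to a country page
    ("Country B", "Country A"),     # link inside one category, ignored
]

result = pages_connected_to(links, countries, revolutions)
print(result)
```

The same filter, applied to the page an author is editing, could drive the "hints for authors" idea: if the page's category is strongly connected to another category but the page itself has no such link, suggest adding one.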


Heuristics

[Diagram: pages in category Countries (Denmark, Austria, Germany, France) linked to pages in category Capitals (Berlin, Stockholm, Vienna, Paris)]

Number of Links: NL = 3

Connectivity Ratio: CR = 3/4 = 0.75
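As a minimal sketch of the two heuristics, the snippet below counts links from pages in one category to pages in another (NL) and divides that count by the size of the source category (CR). The slide gives only the totals, so the three link pairs, the function names, and the choice of the source-category size as the denominator are assumptions picked to reproduce NL = 3 and CR = 3/4.

```python
# Toy data mirroring the slide; the specific link pairs are assumed.
countries = {"Denmark", "Austria", "Germany", "France"}
capitals = {"Berlin", "Stockholm", "Vienna", "Paris"}
links = {("Germany", "Berlin"), ("Austria", "Vienna"), ("France", "Paris")}


def number_of_links(links, source, target):
    """Count links going from a page in `source` to a page in `target`."""
    return sum(1 for src, dst in links if src in source and dst in target)


def connectivity_ratio(links, source, target):
    """NL divided by the number of pages in `source` (assumed denominator)."""
    if not source:
        return 0.0
    return number_of_links(links, source, target) / len(source)


nl = number_of_links(links, countries, capitals)
cr = connectivity_ratio(links, countries, capitals)
print(f"NL = {nl}, CR = {cr:.2f}")
```

Under this reading, CR measures link density rather than raw volume, so a large category does not rank highly simply because it contains many pages.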


Dataset

INEX 2006 collection

Sample category rankings

Experiments


Manual assessment methodology

Semantic Connection Strength (SCS) Measure: 2 = strong semantic relationship, 1 = average semantic relationship, 0 = weak or no semantic relationship.

Instruction for Assessors

“category A is strongly related to category B (value 2) if you believe that every page in A should conceptually have at least one semantic link to B;”

“A and B are averagely related (value 1), if you believe 50% of pages in A should have semantic links to B;”

“otherwise, A and B are weakly related (value 0).”


Experiments with Number of Links

Average semantic connection strength for 100 sample categories, extracted using Number of Links.
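One way to read "average semantic connection strength" is as the mean of the assessors' 0/1/2 scores over the top-ranked related categories of every sample category. The sketch below follows that reading; the slides do not state the exact averaging procedure, so the function name `average_scs`, the rank cutoff `k`, and the toy scores are all assumptions.

```python
VALID_SCORES = {0, 1, 2}  # SCS scale: 2 = strong, 1 = average, 0 = weak or none


def average_scs(assessments, k):
    """Mean SCS over the top-k ranked related categories of each sample.

    `assessments` maps a sample category to the SCS scores of its related
    categories, listed in rank order (best-ranked first).
    """
    scores = []
    for ranked_scores in assessments.values():
        top = ranked_scores[:k]
        if any(score not in VALID_SCORES for score in top):
            raise ValueError("SCS scores must be 0, 1, or 2")
        scores.extend(top)
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical assessments for two sample categories (not data from the study).
assessments = {
    "Sample category 1": [2, 1, 0],
    "Sample category 2": [2, 2, 1],
}

top2 = average_scs(assessments, 2)
print(top2)
```

Running the same function on rankings produced by Number of Links and by Connectivity Ratio would give directly comparable averages for the two heuristics.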


Experiments with Connectivity Ratio

Average semantic connection strength for 100 sample categories, extracted using Connectivity Ratio.


General Results and Conclusions

Result is skewed toward Countries category

Connectivity Ratio is a better measure than Number of Links

We have observed that inlinks have better performance than outlinks.

Summary


Future Steps

More manual exploration, look for additional heuristics

Consider more categories

SCS composed of

Is this a “part of” relation? W1

Is this an “is a” relation? W2

Is this a “synonym” relation? W3

Is this an “antonym” relation? W4

Is it related in a different way? Which one? W5


Thank You!