Neural Networks for Information Retrieval
nn4ir.com/wsdm2018/slides/06_entities.pdf



157

Outline

Morning program
- Preliminaries
- Modeling user behavior
- Semantic matching
- Learning to rank

Afternoon program
- Entities
- Generating responses
- Recommender systems
- Industry insights
- Q & A

158

Entities: Entities are polysemic

“Finding entities” has multiple meanings.

Entities can be

- nodes in knowledge graphs,
- mentions in unstructured texts or queries,
- retrievable items characterized by texts.

159

Outline

Morning program
- Preliminaries
- Modeling user behavior
- Semantic matching
- Learning to rank

Afternoon program
- Entities
  - Knowledge graph embeddings
  - Entity mentions in unstructured text
  - Entity finding
- Generating responses
- Recommender systems
- Industry insights
- Q & A

160

Entities: Knowledge graphs

[Figure: knowledge graph in which Beyoncé Knowles, Kelly Rowland, and Michelle Williams are each connected to Destiny’s Child by a "member of" edge; Destiny’s Child has start date 1997 and end date 2005.]

Triples
(beyonce knowles, member of, destinys child)
(kelly rowland, member of, destinys child)
(michelle williams, member of, destinys child)
(destinys child, start date, 1997)
(destinys child, end date, 2005)
...

Nice overview on using knowledge bases in IR: [Dietz et al., 2017]
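
The triple view of a knowledge graph maps directly onto a simple data structure. The following is a minimal sketch, using the triples from the slide; the helper names `build_index` and `heads_of` are made up for illustration and do not come from any toolkit.

```python
from collections import defaultdict

# Triples copied from the slide; the helper names below are illustrative.
TRIPLES = [
    ("beyonce knowles", "member of", "destinys child"),
    ("kelly rowland", "member of", "destinys child"),
    ("michelle williams", "member of", "destinys child"),
    ("destinys child", "start date", "1997"),
    ("destinys child", "end date", "2005"),
]

def build_index(triples):
    """Index triples by (relation, tail) so heads can be looked up quickly."""
    index = defaultdict(list)
    for head, relation, tail in triples:
        index[(relation, tail)].append(head)
    return index

def heads_of(index, relation, tail):
    """Return every head h such that (h, relation, tail) is in the graph."""
    return sorted(index.get((relation, tail), []))

index = build_index(TRIPLES)
members = heads_of(index, "member of", "destinys child")
print(members)
```

Answering "who was a member of Destiny's Child?" is then a lookup of the missing head, which is the same question that link prediction tries to answer when the triple is not stored in the graph.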

161

Entities: Knowledge graphs

Tasks

- Link prediction
  Predict the missing h or t for a triple (h, r, t).
  Rank entities by score. Metrics:
  - Mean rank of correct entity
  - Hits@10
- Triple classification
  Predict if (h, r, t) is correct. Metric: accuracy.
- Relation fact extraction from free text
  Use knowledge base as weak supervision for extracting new triples. Suppose some IE system gives us (steve jobs, "was the initiator of", apple); then we want to predict the founder of relation.

Datasets

- WordNet: (car, hyponym, vehicle)
- Freebase/DBPedia: (steve jobs, founder of, apple)
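
The two link-prediction metrics can be sketched in a few lines. This assumes a distance-style model where a lower score means a more plausible candidate; the candidate scores below are invented for illustration.

```python
def rank_of(scores, correct):
    """1-based rank of `correct` when candidates are sorted by ascending score.

    Lower score = more plausible, which matches distance-based models.
    """
    ordered = sorted(scores, key=scores.get)
    return ordered.index(correct) + 1

def mean_rank(ranks):
    """Average rank of the correct entity over all test triples."""
    return sum(ranks) / len(ranks)

def hits_at_k(ranks, k=10):
    """Fraction of test triples whose correct entity is ranked in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)

# Hypothetical distances for two test triples with a missing tail.
queries = [
    ({"apple": 0.2, "microsoft": 0.9, "vehicle": 1.5}, "apple"),
    ({"vehicle": 0.7, "apple": 0.4, "microsoft": 1.1}, "vehicle"),
]
ranks = [rank_of(scores, correct) for scores, correct in queries]
print(mean_rank(ranks), hits_at_k(ranks, k=1), hits_at_k(ranks, k=10))
```

Mean rank is sensitive to a few very badly ranked triples, whereas Hits@10 only asks whether the correct entity reaches the top ten, so the two are usually reported together.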

162

Entities: Knowledge graphs

Knowledge graph embeddings

- TransE [Bordes et al., 2013]
- TransH [Wang et al., 2014]
- TransR [Lin et al., 2015]

163

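Entities: TransE

“Translation intuition”

For a triple (h, l, t): $\vec{h} + \vec{l} \approx \vec{t}$.

[Figure: heads h_i and h_j are each translated by the same relation vector l onto their tails t_i and t_j.]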

164

Entities: TransE

“Translation intuition”

For a triple (h, l, t): $\vec{h} + \vec{l} \approx \vec{t}$.

[Equation image: training objective, annotated with its positive examples, negative examples, and distance function.]

[Bordes et al., 2013]
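
The three annotated parts (positive examples, negative examples, distance function) fit together in a margin-based ranking loss. Below is a minimal sketch of that idea for a single positive/negative pair. It assumes an L2 distance and a margin of 1.0, and the 2-d embeddings are made up; it is not the authors' code.

```python
import math

def l2_distance(u, v):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def transe_distance(h, l, t):
    """d(h + l, t): small when the translated head lands near the tail."""
    translated = [a + b for a, b in zip(h, l)]
    return l2_distance(translated, t)

def margin_loss(positive, negative, margin=1.0):
    """Hinge loss max(0, margin + d(positive) - d(negative)) for one pair."""
    return max(0.0, margin + transe_distance(*positive) - transe_distance(*negative))

# Toy 2-d embeddings (made up for illustration).
h = [0.0, 0.0]
l = [1.0, 0.0]
t_true = [1.0, 0.0]      # h + l lands exactly on the true tail
t_corrupt = [0.0, 3.0]   # corrupted tail, far from h + l

loss = margin_loss((h, l, t_true), (h, l, t_corrupt))
print(loss)
```

The loss is zero once the corrupted triple is at least `margin` farther away than the observed one, so training only moves embeddings for pairs that are still too close.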

165

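Entities: TransE

“Translation intuition”

For a triple (h, l, t): $\vec{h} + \vec{l} \approx \vec{t}$.

How about:

- one-to-many relations?
- many-to-many relations?
- many-to-one relations?

[Figure: diagrams of heads h_i, h_j and tails t_i, t_j; the one-to-one case is drawn with the same translation r for both pairs, the other cases with translations labeled "r??".]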
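
To see why these relation types are a problem, note that if $\vec{h} + \vec{r} = \vec{t}$ held exactly, every head sharing the same relation and tail would be forced onto the same point. A small sketch with the many-to-one "member of" relation from the earlier slide; the embedding values are invented.

```python
def implied_head(t, r):
    """If h + r = t held exactly, the head would have to be t - r."""
    return [a - b for a, b in zip(t, r)]

# Made-up 2-d embeddings for the tail and relation of the slide's example.
destinys_child = [2.0, 1.0]
member_of = [0.5, -1.0]

members = ["beyonce knowles", "kelly rowland", "michelle williams"]
implied = {name: implied_head(destinys_child, member_of) for name in members}

for name, vec in implied.items():
    print(name, vec)
```

All three members end up with identical vectors, so a perfectly fitted model could no longer tell them apart. The same argument applies to tails in one-to-many relations.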

166

Entities: TransH

[Wang et al., 2014]

167

Entities: TransH

[Equation image: distance function.]

[Wang et al., 2014]

168

Entities: TransH

Constraints (soft constraints)

[Equation image: constraints.] I.e., the translation vector $d_r$ is in the hyperplane.

[Wang et al., 2014]
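
The hyperplane idea can be sketched as follows: project head and tail onto a relation-specific hyperplane with unit normal $w_r$, then translate by $d_r$ inside that hyperplane. This is a sketch of the scoring step only, as described in Wang et al. [2014] to the best of my reading; the 3-d vectors are made up and no training is shown.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def project_to_hyperplane(e, w):
    """Remove the component of e along the unit normal w."""
    scale = dot(w, e)
    return [a - scale * b for a, b in zip(e, w)]

def transh_score(h, t, w_r, d_r):
    """Squared L2 distance between h_perp + d_r and t_perp."""
    h_perp = project_to_hyperplane(h, w_r)
    t_perp = project_to_hyperplane(t, w_r)
    return sum((a + b - c) ** 2 for a, b, c in zip(h_perp, d_r, t_perp))

# Toy 3-d example: the hyperplane is the x-y plane (normal along z).
w_r = [0.0, 0.0, 1.0]
d_r = [1.0, 0.0, 0.0]   # lies in the hyperplane: dot(w_r, d_r) == 0
h = [0.0, 0.0, 5.0]
t1 = [1.0, 0.0, -2.0]
t2 = [1.0, 0.0, 7.0]

print(transh_score(h, t1, w_r, d_r), transh_score(h, t2, w_r, d_r))
```

Both tails fit the triple perfectly even though `t1` and `t2` are different vectors, because they only differ along the normal direction. This is how the projection leaves room for one-to-many relations.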

169

Entities: TransR

Use different embedding spaces for entities and relations

- 1 entity space
- multiple relation spaces
- perform translation in appropriate relation space

[Lin et al., 2015]

170

Entities: TransR

[Lin et al., 2015]

171

Entities: TransR

Relations: $\mathbb{R}^d$

Entities: $\mathbb{R}^k$

$M_r$ = projection matrix: $k \times d$

Constraints: [equation image]

[Lin et al., 2015]
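
With the dimensions above, the translation happens after mapping entities into the relation space through $M_r$. A minimal sketch of that scoring step, assuming row-vector entities multiplied by a $k \times d$ matrix and a squared L2 distance; the numbers are invented and the norm constraints are not enforced here.

```python
def project(entity, m_r):
    """Row vector (length k) times a k x d matrix gives a length-d vector."""
    d = len(m_r[0])
    return [sum(entity[i] * m_r[i][j] for i in range(len(entity))) for j in range(d)]

def transr_score(h, r, t, m_r):
    """Squared L2 distance between h M_r + r and t M_r."""
    h_r = project(h, m_r)
    t_r = project(t, m_r)
    return sum((a + b - c) ** 2 for a, b, c in zip(h_r, r, t_r))

# Toy sizes: entities in R^3 (k = 3), relations in R^2 (d = 2).
m_r = [
    [1.0, 0.0],
    [0.0, 1.0],
    [0.0, 0.0],
]
h = [0.0, 0.0, 4.0]
t = [1.0, 2.0, -1.0]
r = [1.0, 2.0]

print(project(t, m_r), transr_score(h, r, t, m_r))
```

Because each relation has its own matrix, two entities can be close in one relation space and far apart in another.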

172

Entities: Challenges

- How about time?
  E.g., some relations hold from a certain date, until a certain date.
- New entities/relationships
- Finding synonymous relationships/duplicate entities
  (2005, end date, destinys child)
  (destinys child, disband, 2005)
  (destinys child, last performance, 2005)
- Evaluation
  Link prediction? Relation classification? Is this fair? As in, is this even possible in all cases (for a human without any world knowledge)?

173

Entities: Resources: toolkits + knowledge bases

Source Code

- KB2E: https://github.com/thunlp/KB2E [Lin et al., 2015]
- TransE: https://everest.hds.utc.fr/

Knowledge Graphs

- Google Knowledge Graph: google.com/insidesearch/features/search/knowledge.html
- Freebase: freebase.com
- GeneOntology: geneontology.org
- WikiLinks: code.google.com/p/wiki-links


175

Entities: Entity mentions

Recognition: Detect mentions within unstructured text (e.g., query).

Linking: Link mentions to knowledge graph entities.

Utilization: Use mentions to improve search.

176

Entities: Named entity recognition

Task

EU rejects German call to boycott British lamb .
B-ORG O B-MISC O O O B-MISC O O

Vanilla RNN

[Figure: vanilla RNN tagger with inputs x (EU rejects German call), hidden states h, and outputs y (B-ORG O B-MISC O).]
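
The tag sequence in the task example uses a B-/I-/O style labeling, where B-X opens an entity of type X and O marks tokens outside any entity. A small sketch of turning such per-token tags back into entity spans; the function name `bio_to_spans` is made up for this example.

```python
def bio_to_spans(tokens, tags):
    """Group B-/I- tagged tokens into (entity_type, text) spans."""
    spans = []
    current_type, current_tokens = None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("I-") and current_type == tag[2:]:
            current_tokens.append(token)   # continue the open span
            continue
        if current_type is not None:       # close the open span, if any
            spans.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = None, []
        if tag != "O":                     # B-X (or a stray I-X) opens a span
            current_type, current_tokens = tag[2:], [token]
    if current_type is not None:
        spans.append((current_type, " ".join(current_tokens)))
    return spans

tokens = "EU rejects German call to boycott British lamb .".split()
tags = "B-ORG O B-MISC O O O B-MISC O O".split()
spans = bio_to_spans(tokens, tags)
print(spans)
```

A tagger such as the RNN on this slide only has to predict one label per token; the span grouping is a deterministic post-processing step.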

177

Entities: Named entity recognition

- A Unified Architecture for Natural Language Processing: Deep Neural Networks with Multitask Learning [Collobert and Weston, 2008]
- Natural Language Processing (Almost) from Scratch [Collobert et al., 2011]

Learning a single model to solve multiple NLP tasks. Taken from [Collobert and Weston, 2008].

Feed-forward language model architecture for different NLP tasks. Taken from [Collobert and Weston, 2008].

178

Entities: Named entity recognition

BI-LSTM-CRF model

[Figure: BI-LSTM-CRF tagger over the input "EU rejects German call"; forward and backward LSTM layers feed a CRF layer that outputs the tags B-ORG O B-MISC O.]

[Huang et al., 2015]
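
What the CRF layer adds over independent per-token predictions is a score for each tag-to-tag transition, so the best sequence is found jointly. At prediction time this is typically done with Viterbi decoding. The sketch below assumes additive log-space scores; the emission and transition values are invented, and in the real model the emissions would come from the BI-LSTM.

```python
def viterbi(emissions, transitions, tags):
    """Best tag sequence under emission + transition scores (log-space).

    emissions[i][tag]   : score of `tag` at position i (e.g. from a BI-LSTM)
    transitions[(a, b)] : score of moving from tag a to tag b (CRF layer)
    """
    best = [{tag: emissions[0][tag] for tag in tags}]
    back = []
    for i in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[-1][p] + transitions[(p, cur)])
            scores[cur] = best[-1][prev] + transitions[(prev, cur)] + emissions[i][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    last = max(tags, key=lambda tag: best[-1][tag])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])   # follow back-pointers to the start
    return list(reversed(path))

tags = ["B-ORG", "I-ORG", "O"]
transitions = {(a, b): 0.0 for a in tags for b in tags}
transitions[("O", "I-ORG")] = -10.0       # I-ORG should not follow O

# Made-up emission scores for a two-token input.
emissions = [
    {"B-ORG": 1.0, "I-ORG": 0.0, "O": 1.5},
    {"B-ORG": 0.0, "I-ORG": 2.0, "O": 0.5},
]
greedy = [max(tags, key=e.get) for e in emissions]
path = viterbi(emissions, transitions, tags)
print(greedy, path)
```

Picking the best tag per token independently yields the invalid sequence O, I-ORG, while joint decoding with the transition penalty returns B-ORG, I-ORG.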

179

Entities: Entity disambiguation

- Learn representations for documents and entities.
- Optimize a distribution of candidate entities given a document using (a) cross entropy or (b) pairwise loss.

Learn initial document representation in unsupervised pre-training stage. Taken from [He et al., 2013].

Learn similarity between document and entity representations using supervision. Taken from [He et al., 2013].
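
The two training objectives named in the bullets can be compared on one set of candidate scores. This is a generic sketch: the softmax cross entropy is standard, the hinge form of the pairwise loss and its margin are assumptions rather than the exact loss of He et al. [2013], and the similarity scores are made up.

```python
import math

def cross_entropy_loss(scores, correct):
    """-log softmax(scores)[correct] over the candidate entities."""
    m = max(scores.values())
    log_norm = m + math.log(sum(math.exp(s - m) for s in scores.values()))
    return log_norm - scores[correct]

def pairwise_hinge_loss(scores, correct, margin=1.0):
    """Sum of max(0, margin - (s_correct - s_other)) over wrong candidates."""
    return sum(
        max(0.0, margin - (scores[correct] - s))
        for entity, s in scores.items()
        if entity != correct
    )

# Hypothetical document-entity similarity scores for the mention "Michelle Williams".
scores = {"michelle williams (singer)": 2.0, "michelle williams (actress)": 0.5}

ce = cross_entropy_loss(scores, "michelle williams (singer)")
pw = pairwise_hinge_loss(scores, "michelle williams (singer)")
print(ce, pw)
```

Cross entropy keeps pushing the correct entity's probability toward 1, whereas the pairwise loss is already zero once the correct candidate beats every other candidate by the margin.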

180

Entities: Entity linking

Learn representations for the context, the mention, the entity (using surface words) and the entity class. Uses pre-trained word2vec embeddings. Taken from [Sun et al., 2015].

181

Entities: Entity linking

Encode Wikipedia descriptions, linked mentions in Wikipedia and fine-grained entity types. All representations are optimized jointly. Taken from [Gupta et al., 2017].

182

Entities: Entity linking

A single mention phrase can refer to various entities. The Multi-Prototype Mention Embedding model learns multiple sense embeddings for each mention by jointly modeling words from textual contexts and entities derived from a KB. Taken from [Cao et al., 2017].

183

Entities: Improving search using linked entities

Attention-based ranking model for word-entity duet. Learn a similarity between query and document entities. Resulting model can be used to obtain ranking signal. Taken from [Xiong et al., 2017a].


185

Entities: Entity finding

Task definition
Rank entities satisfying a topic described by a few query terms.

Not just document search: (a) topics do not typically correspond to entity names, (b) average textual description much longer than typical document.

Different instantiations of the task within varying domains:

- Wikipedia: INEX Entity Ranking Track [de Vries et al., 2007, Demartini et al., 2008, 2009, 2010] (lots of text, knowledge graph, revisions)
- Enterprise search: expert finding [Balog et al., 2006, 2012] (few entities, abundance of text per entity)
- E-commerce: product ranking [Rowley, 2000] (noisy text, customer preferences)

186

Entities: Semantic Expertise Retrieval [Van Gysel et al., 2016]

- Expert finding is a particular entity retrieval task where there is a lot of text.
- Learn representations of words and entities such that n-grams extracted from a document predict the correct expert.

Taken from slides of Van Gysel et al. [2016].
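
The "n-grams predict the correct expert" idea can be sketched as a softmax over experts for each word, combined across the words of an n-gram. This is a simplified illustration and not the model from the paper: the independence assumption across words, the two-expert setup, and all weights are made up, and the training loop is omitted.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def expert_distribution(word_vec, expert_matrix, bias):
    """P(expert | word): softmax over one logit per expert."""
    logits = [sum(w * x for w, x in zip(row, word_vec)) + b
              for row, b in zip(expert_matrix, bias)]
    return softmax(logits)

def ngram_log_scores(ngram, word_vecs, expert_matrix, bias):
    """Sum per-word log-probabilities, treating words as independent."""
    totals = [0.0] * len(expert_matrix)
    for word in ngram:
        probs = expert_distribution(word_vecs[word], expert_matrix, bias)
        totals = [t + math.log(p) for t, p in zip(totals, probs)]
    return totals

# Hypothetical experts and hand-set parameters (a trained model would learn these).
experts = ["ir researcher", "biologist"]
expert_matrix = [[2.0, 0.0], [0.0, 2.0]]   # one row of weights per expert
bias = [0.0, 0.0]
word_vecs = {"neural": [1.0, 0.0], "ranking": [1.0, 0.5], "protein": [0.0, 1.0]}

scores = ngram_log_scores(["neural", "ranking"], word_vecs, expert_matrix, bias)
best = experts[scores.index(max(scores))]
print(best)
```

At query time the same scoring is applied to the query terms, and experts are ranked by the resulting scores.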


188

Entities: Regularities in Text-based Entity Vector Spaces [Van Gysel et al., 2017c]

To what extent do entity representation models, trained only on text, encode structural regularities of the entity’s domain?

Goal: give insight into learned entity representations.

- Clusterings of experts correlate somewhat with groups that exist in the real world.
- Some representation methods encode co-authorship information into their vector space.
- Rank within organizations is learned (e.g., Professors > PhD students) as senior people typically have more published works.

189

Entities: Latent Semantic Entities [Van Gysel et al., 2016]

- Learn representations of e-commerce products and query terms for product search.
- Tackles learning objective scalability limitations from previous work.
- Useful as a semantic feature within a Learning To Rank model in addition to a lexical matching signal.

Taken from slides of Van Gysel et al. [2016].

190

Entities: Personalized Product Search [Ai et al., 2017]

- Learn representations of e-commerce products, query terms, and users for personalized e-commerce search.
- Mixes supervised (relevance triples of query, user and product) and unsupervised (language modeling) objectives.
- The query is represented as an interpolation of query term and user representations.

Personalized product search in a latent space with query $\vec{q}$, user $\vec{u}$ and product item $\vec{i}$. Taken from Ai et al. [2017].
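
The interpolation of query and user representations can be sketched as a weighted sum, followed by ranking items by their similarity to the result. The weight `lam`, the dot-product ranking, and the 2-d vectors are assumptions made for illustration; the learned model is not reproduced here.

```python
def interpolate(query_vec, user_vec, lam=0.5):
    """Personalized query: lam * q + (1 - lam) * u."""
    return [lam * q + (1.0 - lam) * u for q, u in zip(query_vec, user_vec)]

def rank_items(personalized, item_vecs):
    """Order items by dot product with the personalized query, best first."""
    def score(name):
        return sum(a * b for a, b in zip(personalized, item_vecs[name]))
    return sorted(item_vecs, key=score, reverse=True)

# Made-up latent vectors: same query ("shoes"), two different users.
q = [1.0, 0.0]
runner = [0.0, 1.0]
office_worker = [0.0, -1.0]
items = {"running shoes": [1.0, 1.0], "dress shoes": [1.0, -1.0]}

for_runner = rank_items(interpolate(q, runner), items)
for_office = rank_items(interpolate(q, office_worker), items)
print(for_runner, for_office)
```

The same query yields a different ranking for each user. Setting `lam=1.0` ignores the user and falls back to non-personalized search.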

191

Entities: Resources: toolkits

SERT : http://www.github.com/cvangysel/SERT [Van Gysel et al., 2017b]

HEM : https://ciir.cs.umass.edu/downloads/HEM [Ai et al., 2017]

192

Entities: Resources: further reading on entities/KGs

For more information, see the tutorial on “Utilizing Knowledge Graphs in Text-centric Information Retrieval” [Dietz et al., 2017] presented at last year’s WSDM.

https://github.com/laura-dietz/tutorial-utilizing-kg