
End-to-end Neural Information Retrieval
MMath thesis

Wei Yang

Cheriton School of Computer Science, University of Waterloo

April 2019


Table of Contents

1 Introduction

2 Related Work

3 End-to-end Neural Information Retrieval Architecture

4 Experiments

5 Conclusion and Discussion



Types of NLP Tasks

Sequence classification
Sequence pair classification (text matching)
Sequence labeling
Sequence-to-sequence generation


Information Retrieval

Tasks | Text 1 | Text 2 | Objective
Paraphrase Identification | string 1 | string 2 | C
Textual Entailment | text | hypothesis | C
Question Answering | question | answer | C/R
Conversation | dialog | response | C/R
Information Retrieval | query | document | R

Table: Typical text matching tasks (C: classification; R: ranking)


Problem Definition

Search microblogs:
Query: 2022 fifa soccer
Relevant document: #ps3 best sellers: fifa soccer 11 ps3 #cheaptweet https://www.amazon.com/fifa-soccer-11-playstation-3

Search newswire articles:
Query: international organized crime
Relevant document: The past few years have been characterized by an unprecedented growth in crime, changes in its characteristics, and for all practical purposes the loss of state and public control over the crime situation... More than 40 international smuggling crime groups have been identified. More than 130 "Russian" stores selling Russian antiques have been found abroad...


Why Neural Network

Why NN for NLP?
Low-dimensional semantic space
Thousands of variations
Hierarchical structure
Hardware developments

Why NN for IR?
Relevance judgments are based on a complicated human cognitive process


Related Work

before 2009: vector space models and probabilistic models (query likelihood (QL), BM25, RM3, ...)
2009: learning-to-rank models (tens of hand-crafted features)
2013: Deep Structured Semantic Model (DSSM)
2014: CDSSM, ARC-I, ARC-II (mainly for short text ranking)
2016: MatchPyramid, DRMM, ...
2017: KNRM, DUET, DeepRank, PACRR, ...
2018: HINT, MP-HCNN (hierarchical matching patterns), ...


Contribution

An end-to-end retrieval and reranking system that allows the user to apply different retrieval models and neural reranking models to different datasets.
State-of-the-art performance on two benchmark datasets (Robust04 and Microblog) for document retrieval.
Demonstrate the effectiveness of a strong baseline and its additivity with neural reranking methods.
Co-design of the MP-HCNN model for social media post search.


Architecture

Figure: The architecture of the retrieval-rerank framework. The corpus is indexed into an Anserini inverted index; at query time the top k1 documents are retrieved together with their retrieval scores, a trained reranker rescores them, and the rerank and retrieval scores are combined to produce the final top k2 documents.
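The control flow in the figure can be sketched in a few lines. This is an illustrative outline only: `retrieve`, `rerank`, and `fetch_text` are hypothetical hooks standing in for the Anserini searcher, the trained neural model, and the document store, and `k1`/`k2` correspond to the candidate and output depths in the figure.

```python
from typing import Callable, List, Tuple

def search(
    query: str,
    retrieve: Callable[[str, int], List[Tuple[str, float]]],  # -> [(doc_id, retrieval_score)]
    rerank: Callable[[str, str], float],                      # (query, doc_text) -> rerank score
    fetch_text: Callable[[str], str],                         # doc_id -> document text
    k1: int = 1000,   # candidate depth pulled from the inverted index
    k2: int = 10,     # size of the final result list
) -> List[Tuple[str, float, float]]:
    """First stage: retrieve top-k1 candidates from the index.
    Second stage: rescore each candidate with the trained neural reranker.
    Returns (doc_id, retrieval_score, rerank_score) sorted by rerank score;
    the two scores can also be interpolated, as in Eq. (1) on the next slide."""
    candidates = retrieve(query, k1)
    scored = [(doc_id, r_score, rerank(query, fetch_text(doc_id)))
              for doc_id, r_score in candidates]
    scored.sort(key=lambda t: t[2], reverse=True)
    return scored[:k2]
```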


Retrieval, Rerank and Aggregation

Retrieval: Anserini (QL, QL+RM3, BM25, BM25+RM3)

Rerank:
MatchZoo models (DSSM, CDSSM, DUET, KNRM, DRMM)
MP-HCNN
BERT

Aggregation:

rel(q, d) = λ · Reranker(q, d) + (1 − λ) · Retriever(q, d)    (1)
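Because the retrieval score (e.g., a BM25 or query-likelihood log score) and the neural reranker score live on different scales, the interpolation in Eq. (1) is normally applied after normalizing both score lists for a query. The slide does not specify the normalization; the sketch below assumes a simple per-query min-max normalization, which is one common choice.

```python
from typing import List

def minmax(scores: List[float]) -> List[float]:
    """Min-max normalize scores to [0, 1]; a constant list maps to 0.5."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.5] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]

def aggregate(retriever_scores: List[float], reranker_scores: List[float],
              lam: float = 0.5) -> List[float]:
    """Eq. (1): rel(q, d) = lam * Reranker(q, d) + (1 - lam) * Retriever(q, d),
    computed over aligned per-document score lists for a single query."""
    nn = minmax(reranker_scores)
    ret = minmax(retriever_scores)
    return [lam * a + (1.0 - lam) * b for a, b in zip(nn, ret)]

# Example: three candidate documents for one query.
fused = aggregate([12.3, 9.8, 7.1], [0.92, 0.41, 0.88], lam=0.6)
```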


MatchZoo

Figure: Details of five neural information retrieval models in MatchZoo


MP-HCNN

Figure: Overview of our Multi-Perspective Hierarchical Convolutional Neural Network (MP-HCNN), which consists of two parallel components for word-level and character-level modeling between queries, social media posts, and URLs.


BERT

Figure: The architecture of the BERT model for text matching.
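For intuition, a sentence-pair relevance scorer of this kind can be sketched with the Hugging Face transformers package. The checkpoint name, API calls, and the assumption that the classification head has already been fine-tuned on query-document relevance labels are illustrative; this is not necessarily how the thesis implementation is written.

```python
import torch
from transformers import BertForSequenceClassification, BertTokenizer

MODEL_NAME = "bert-base-uncased"  # illustrative checkpoint; fine-tune before use
tokenizer = BertTokenizer.from_pretrained(MODEL_NAME)
model = BertForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)
model.eval()

def bert_relevance_score(query: str, document: str) -> float:
    """Encode the pair as [CLS] query [SEP] document [SEP] and return the
    probability of the 'relevant' class as the rerank score."""
    inputs = tokenizer(query, document, truncation=True, max_length=512,
                       return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.softmax(logits, dim=-1)[0, 1].item()
```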


Evaluation Metrics

$$\mathrm{Precision}_q = \frac{\sum_{\langle i,d \rangle \in R_q} \mathrm{rel}_q(d)}{|R_q|} \qquad (2)$$

$$\mathrm{AP}_q = \frac{\sum_{\langle i,d \rangle \in R_q} \mathrm{Precision}_{q,i} \times \mathrm{rel}_q(d)}{\sum_{d \in D} \mathrm{rel}_q(d)} \qquad (3)$$

$$\mathrm{NDCG}_q = \frac{\mathrm{DCG}_q}{\mathrm{IDCG}_q} \qquad (4)$$

$$\mathrm{DCG}_q = \sum_{\langle i,d \rangle \in R_q} \frac{2^{\mathrm{rel}_q(d)} - 1}{\log_2(i + 1)} \qquad (5)$$
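A small reference implementation of Eqs. (2)-(5): `ranking` is the ordered list of retrieved document ids for a query and `rels` maps document ids to relevance judgments (binary for precision and AP, graded for DCG). The explicit rank cutoff k matches the P@20 and NDCG@20 numbers reported later; these names and conventions are illustrative rather than the exact trec_eval behaviour.

```python
import math
from typing import Dict, List

def precision_at_k(ranking: List[str], rels: Dict[str, int], k: int) -> float:
    """Eq. (2): fraction of the top-k retrieved documents that are relevant."""
    return sum(1 for d in ranking[:k] if rels.get(d, 0) > 0) / k

def average_precision(ranking: List[str], rels: Dict[str, int]) -> float:
    """Eq. (3): precision at each relevant rank, averaged over all relevant documents."""
    total_rel = sum(1 for r in rels.values() if r > 0)
    if total_rel == 0:
        return 0.0
    hits, score = 0, 0.0
    for i, d in enumerate(ranking, start=1):
        if rels.get(d, 0) > 0:
            hits += 1
            score += hits / i
    return score / total_rel

def dcg_at_k(ranking: List[str], rels: Dict[str, int], k: int) -> float:
    """Eq. (5): graded gain 2^rel - 1, discounted by log2(rank + 1)."""
    return sum((2 ** rels.get(d, 0) - 1) / math.log2(i + 1)
               for i, d in enumerate(ranking[:k], start=1))

def ndcg_at_k(ranking: List[str], rels: Dict[str, int], k: int) -> float:
    """Eq. (4): DCG normalized by the DCG of an ideal, relevance-sorted ranking."""
    ideal = sorted(rels, key=rels.get, reverse=True)
    idcg = dcg_at_k(ideal, rels, k)
    return dcg_at_k(ranking, rels, k) / idcg if idcg > 0 else 0.0
```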


Datasets

Test Set | 2011 | 2012 | 2013 | 2014
# query topics | 49 | 60 | 60 | 55
# query-doc pairs | 49,000 | 60,000 | 60,000 | 55,000

Table: Statistics of the TREC Microblog Track datasets

# query topics | 250
# query-doc pairs | 250,000

Table: Statistics of the Robust04 dataset


Experimental Setup

Train/test splits:
Microblog: train on 2011, 2012, and 2013; test on 2014
Robust04: five-fold cross-validation (see the sketch below)

Hyper-parameter tuning: 10% of the training data

Models:
Microblog: MatchZoo models, MP-HCNN, BERT
Robust04: MatchZoo models
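A sketch of the Robust04 protocol above: the topics are partitioned into five folds, each fold serves as the test set once, and 10% of the remaining training topics are held out for hyper-parameter tuning. The random round-robin fold assignment here is an assumption; the thesis may use a fixed, previously published fold assignment.

```python
import random
from typing import Dict, List

def five_fold_splits(topic_ids: List[str], seed: int = 42) -> List[Dict[str, List[str]]]:
    """Return five {train, dev, test} topic splits for cross-validation."""
    rng = random.Random(seed)
    topics = topic_ids[:]
    rng.shuffle(topics)
    folds = [topics[i::5] for i in range(5)]        # five roughly equal folds
    splits = []
    for i in range(5):
        test = folds[i]
        train = [t for j, fold in enumerate(folds) if j != i for t in fold]
        n_dev = max(1, len(train) // 10)            # 10% of training topics for tuning
        splits.append({"train": train[n_dev:], "dev": train[:n_dev], "test": test})
    return splits
```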


Results of Baselines: Robust04

Model | AP | P@20 | NDCG@20
QL (Guo et al.) | 0.253 | 0.369 | 0.415
BM25 (Guo et al.) | 0.255 | 0.370 | 0.418
DRMM (Guo et al.) | 0.279 | 0.382 | 0.431
MatchPyramid (Pang et al.) | 0.232 | 0.327 | 0.411
BM25 (McDonald et al.) | 0.238 | 0.354 | 0.425
PACRR (McDonald et al.) | 0.258 | 0.372 | 0.443

Table: Previous results on the Robust04 dataset

Metric | QL | QL+RM3 | BM25 | BM25+RM3
AP | 0.2465 | 0.2743 | 0.2515 | 0.3033
P@20 | 0.3508 | 0.3639 | 0.3612 | 0.3973
NDCG@20 | 0.4109 | 0.4172 | 0.4225 | 0.4514

Table: Our results with retrieval models on the Robust04 dataset


Results of End-to-end Neural IR Models: Robust04

Models | MAP | P@20 | NDCG@20
BM25+RM3 | 0.3033 | 0.3973 | 0.4514
DSSM | 0.0982− | 0.1331− | 0.1551−
CDSSM | 0.0641− | 0.0842− | 0.0772−
DRMM | 0.2543− | 0.3405− | 0.4025−
KNRM | 0.1145− | 0.1480− | 0.1512−
DUET | 0.1426− | 0.1561− | 0.1946−
DSSM+RM3 | 0.3026 | 0.3946 | 0.4491
CDSSM+RM3 | 0.2995 | 0.3944 | 0.4468
DRMM+RM3 | 0.3151+ | 0.4147+ | 0.4717+
KNRM+RM3 | 0.3036 | 0.3928 | 0.4441
DUET+RM3 | 0.3051 | 0.3986 | 0.4502

Table: Results of retrieval and reranking on the Robust04 dataset. RM: retrieval model. NRM: neural re-ranking model. Significant improvement or degradation with respect to the retrieval model is indicated (+/−) (p-value ≤ 0.05).
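The +/− markers denote statistically significant differences against the retrieval baseline at p ≤ 0.05. The slide does not name the test; assuming a two-sided paired test over per-topic AP (a common choice in IR evaluation), the marker could be computed as in this sketch, where `model_ap` and `baseline_ap` are illustrative dicts keyed by topic id.

```python
from scipy import stats

def significance_marker(model_ap: dict, baseline_ap: dict, alpha: float = 0.05) -> str:
    """Return '+', '-', or '' depending on whether the model's per-topic AP is
    significantly better, significantly worse, or not significantly different
    from the baseline under a two-sided paired t-test."""
    topics = sorted(set(model_ap) & set(baseline_ap))
    model = [model_ap[t] for t in topics]
    base = [baseline_ap[t] for t in topics]
    _, p_value = stats.ttest_rel(model, base)
    if p_value > alpha:
        return ""
    return "+" if sum(model) > sum(base) else "-"
```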


Results of Baselines: Microblog

Method | AP | P@30
QL (Rao et al.) | 0.3924 | 0.6182
RM3 (Rao et al.) | 0.4480 | 0.6339
L2R (Rao et al.) | 0.3943 | 0.6200
MP-HCNN (Rao et al.) | 0.4409 | 0.6612
BiCNN (Shi et al.) | 0.4563 | 0.6806

Table: Previous results on the Microblog datasets

Metric | QL | QL+RM3 | BM25 | BM25+RM3
AP | 0.4181 | 0.4676 | 0.3931 | 0.4374
P@30 | 0.6430 | 0.6533 | 0.6212 | 0.6442

Table: Our results with retrieval models on the Microblog datasets


Results of End-to-end Neural IR Models: Microblog

Models | AP | P@30
QL+RM3 | 0.4676 | 0.6533
DSSM | 0.2634− | 0.3836−
CDSSM | 0.1936− | 0.2636−
DRMM | 0.4477− | 0.6127−
KNRM | 0.3432− | 0.5121−
DUET | 0.2713− | 0.3533−
MP-HCNN | 0.4497 | 0.6219
BERT | 0.4646 | 0.6509
DSSM+RM3 | 0.4666 | 0.6539
CDSSM+RM3 | 0.4703 | 0.6624
DRMM+RM3 | 0.4862+ | 0.6703
KNRM+RM3 | 0.4848+ | 0.6624
DUET+RM3 | 0.4844+ | 0.6594
MP-HCNN+RM3 | 0.4902+ | 0.6712
BERT+RM3 | 0.5011+ | 0.6842+

Table: Results of retrieval and text matching on the Microblog datasets. RM: retrieval model. NRM: neural re-ranking model. Significant improvement or degradation with respect to the retrieval model is indicated (+/−) (p-value ≤ 0.05).


Per-topic Analysis: Microblog

Figure: Per-topic differences in AP between BERT+RM3 and QL+RM3.

Figure: Per-topic differences in AP between BERT and QL+RM3.
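Charts like these can be reproduced from per-topic AP scores of the two runs (e.g., parsed from trec_eval -q output). The dict inputs and the matplotlib styling below are illustrative assumptions, not the exact plotting code behind the figures.

```python
import matplotlib.pyplot as plt

def plot_per_topic_diff(run_ap: dict, baseline_ap: dict,
                        title: str = "Per-topic analysis on AP") -> None:
    """Bar chart of AP(run) - AP(baseline) per topic, sorted from largest gain
    to largest loss."""
    topics = sorted(set(run_ap) & set(baseline_ap),
                    key=lambda t: run_ap[t] - baseline_ap[t], reverse=True)
    diffs = [run_ap[t] - baseline_ap[t] for t in topics]
    plt.figure(figsize=(12, 3))
    plt.bar(range(len(topics)), diffs)
    plt.xticks(range(len(topics)), topics, rotation=90, fontsize=6)
    plt.ylabel("AP difference")
    plt.title(title)
    plt.tight_layout()
    plt.show()
```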


Per-topic Analysis: Robust04

Figure: Per-topic differences in AP between DRMM+RM3 and BM25+RM3.

Figure: Per-topic differences in AP between DRMM and BM25+RM3.


Sample Analysis: Microblog

Category | Percentage (%)
Exact word match | 100
Exact phrase match | 41
Partial paraphrase match | 64
Partial URL match | 24

Table: Matching evidence breakdown by category, based on manual analysis of the top 100 tweets for the five best-performing topics with MP-HCNN on the Microblog dataset.


Conclusion and Discussion

4 retrieval models, 7 neural reranking models, and 2 datasets combined in one end-to-end system
State-of-the-art (SOTA) results

Discussion:
relevance vs. similarity
exact matching vs. semantic matching
effectiveness vs. efficiency
external knowledge vs. domain-specific design
