Special Topics in Computer Science
Advanced Topics in Information Retrieval
Chapter 1: Introduction
Alexander Gelbukh
www.Gelbukh.com
Motivation
IR was first developed for libraries, but now it serves the WWW
Information: representation, storage, organization, access
Search engines (IR systems)
User information need
  o Plain English description → query
Concerns of modern IR:
  o modeling
  o classification, categorization, filtering
  o system architecture
  o user interfaces, visualization, query languages
Data vs. Information Retrieval
Data Retrieval
  o Precise description
  o Well-structured data
  o Precise results
  o Yes-or-no results
  o Science
Information Retrieval
  o Vague information need
  o Natural language, images, ...
  o Semantic interpretation
  o Approximate results
  o Relevance ranking
  o Art!
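The contrast on this slide can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy collection `DOCS`, the function names, and the word-overlap score, which stands in for a real relevance model. Data retrieval answers yes or no per document; information retrieval returns an ordered list of approximate matches.

```python
from collections import Counter

# Hypothetical toy collection; the texts are invented for this example.
DOCS = {
    1: "information retrieval for digital libraries",
    2: "database systems and structured query processing",
    3: "ranking web pages by relevance to a vague information need",
}

def data_retrieval(docs, term):
    """Exact match: a document either contains the term or it does not."""
    return sorted(doc_id for doc_id, text in docs.items() if term in text.split())

def information_retrieval(docs, query):
    """Approximate match: score by the number of shared words, then rank."""
    query_words = Counter(query.lower().split())
    scored = []
    for doc_id, text in docs.items():
        doc_words = Counter(text.lower().split())
        # Multiset intersection counts the words common to query and document.
        overlap = sum((query_words & doc_words).values())
        if overlap > 0:
            scored.append((overlap, doc_id))
    # Highest score first; ties are broken by the smaller document id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]
```

With this collection, `data_retrieval(DOCS, "retrieval")` selects only document 1, while `information_retrieval(DOCS, "relevance ranking for information need")` orders document 3 ahead of document 1 and omits document 2.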
Basic Concepts
User task (search)
  o Users can formulate what they need: Retrieval (classical)
  o Users cannot (or do not know what they need): Browsing (new to IR)
    Still not very well integrated
  o Filtering (user passive, contents active)
Logical view of docs
  o ... Added linguistic info... not clear if it helps
  o Full text
  o Text operations: reduce complexity to index terms
    Keywords, stopwords
    Stemming, noun groups (linguistic processing needed)
  o Categories
Trade-off: richer views are slow but good; more reduced views are fast but bad
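The text operations listed above (stopword removal and stemming) can be sketched as follows. This is only an illustration under stated assumptions: the stopword list and the suffix list are invented and far too small for real use, and `crude_stem` is a simple suffix stripper, not the Porter stemmer or any other published algorithm.

```python
import re

# Assumed, deliberately tiny stopword list; real systems use longer ones.
STOPWORDS = {"a", "an", "and", "are", "for", "in", "is", "of", "the", "to"}

# Assumed suffix list for a crude stemmer; this is not the Porter algorithm.
SUFFIXES = ("ing", "es", "ed", "s")

def crude_stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def index_terms(text):
    """Reduce full text to index terms: tokenize, drop stopwords, stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(tok) for tok in tokens if tok not in STOPWORDS]
```

For example, `index_terms("The indexing of documents is needed for searching")` reduces eight words to the four index terms `index`, `document`, `need`, `search`. The result is smaller and faster to process than the full text, at the cost of losing information.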
Past, Present, and Future
Since clay tablets
  o Alphabetical index (formal)
  o Table of contents (by storing order)
  o Classifications (by meaning)
Libraries
  o Automation of classical techniques. Catalogs.
  o Search by fields (exact match: author, title, keywords)
Web & Digital Libraries: interactive
  o Cheaper → huge amount of data
  o Networks → remote access, wider audience
  o Free publishing → unprepared, heterogeneous data
Artificial Intelligence and Linguistic methods
Main concerns
Open audience
  o Help people to formulate their information need
  o Improve retrieval quality. Intelligent methods
Efficiency (speed)
  o Development of fast techniques
Interaction
  o Watch user behavior to improve quality
  o Privacy!
Open content
  o Legal issues. Copyright. Responsibility for info quality
  o Intelligent methods
Retrieval process
Database
  o Define the logical view: text operations, text model
Index (e.g., inverted file)
User query
  o Query operations (users are not good at this!)
Retrieved docs
  o Ranked by likelihood (relevance)
Feedback cycle
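The indexing, querying, and ranking steps above can be sketched as a small inverted file. The collection `DOCS`, the function names, and the scoring rule (number of distinct query terms a document contains) are assumptions chosen to keep the example short; real systems apply text operations first and use a proper relevance model. The feedback cycle is left out.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search(index, query):
    """Rank documents by how many distinct query terms they contain."""
    scores = defaultdict(int)
    for term in set(query.lower().split()):
        # .get avoids creating empty postings for unknown terms.
        for doc_id in index.get(term, ()):
            scores[doc_id] += 1
    # Best score first; ties are broken by the smaller document id.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

# Hypothetical three-document collection.
DOCS = {
    1: "inverted file index for text",
    2: "user query operations",
    3: "ranked text retrieval with an inverted index",
}
INDEX = build_inverted_index(DOCS)
```

Here `search(INDEX, "inverted index query")` places documents 1 and 3 (two matching terms each) ahead of document 2 (one matching term). The index is built once, so each query touches only the postings of its own terms rather than scanning every document.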
The Textbook: Text IR
Models and Evaluation
  o Modeling (basic concepts)
  o Retrieval Evaluation
Improvements on Retrieval
  o Query Languages
  o Query Operations
  o Text Languages and Properties
  o Text Operations
Efficiency
  o Indexing and Searching
Conferences & Journals
Confs on IR
  o IR
  o ACM SIGIR
  o TREC
  o SPIRE
Journal
  o IR
General conferences on text processing
  o ACL
  o COLING
  o CICLing
  o DEXA (databases)
  o NLDB
Conclusions
User Information Need
  o Vague
  o Semantic, not formal
Document Relevance
  o Order, not retrieve
Huge amount of information
  o Efficiency concerns
  o Tradeoffs
IR is art more than science
Thank you!