exploring session search
DESCRIPTION
Slides from my presentation at the ECIR 2012 workshop on "Information Retrieval Over Query Sessions" (SIR2012) held in Barcelona, Spain.
Title: Exploring Session Search
Abstract: Exploratory search is typically characterized by recall-oriented information needs and by uncertainty and evolution of the information need. As searchers interact with the system, their understanding of the topic evolves in response to found information. These two characteristics, uncertainty of information need and the desire to find multiple documents, drive the need to run multiple queries. Furthermore, these queries are not independent of each other because they often retrieve overlapping sets of documents. Yet traditional information retrieval systems often treat searchers’ queries in isolation, ignoring the evolution of a person’s understanding of the information need and the historical coupling among queries. In this talk, I will describe some interface ideas we're exploring to help people incorporate their search history into their ongoing retrieval and sense-making tasks, and will touch on some issues related to retrieval algorithms and evaluation.
TRANSCRIPT
![Page 1: Exploring session search](https://reader035.vdocuments.site/reader035/viewer/2022070304/54b900724a79596a218b457a/html5/thumbnails/1.jpg)
Exploring Session Search
Gene Golovchinsky
FX Palo Alto Laboratory, Inc.
@HCIR_GeneG
![Page 2: Exploring session search](https://reader035.vdocuments.site/reader035/viewer/2022070304/54b900724a79596a218b457a/html5/thumbnails/2.jpg)
Thanks to:
Jeremy Pickens, Abdigani Diriye, Tony Dunnigan
![Page 3: Exploring session search](https://reader035.vdocuments.site/reader035/viewer/2022070304/54b900724a79596a218b457a/html5/thumbnails/3.jpg)
Exploratory search
Interactive
Information seeking
Anomalous state of knowledge
Evolving information need
Often recall-oriented
![Page 4: Exploring session search](https://reader035.vdocuments.site/reader035/viewer/2022070304/54b900724a79596a218b457a/html5/thumbnails/4.jpg)
One Query to Rule Them All
No single query satisfies a typical exploratory search information need
Search strategies involve many queries
Queries return overlapping results
![Page 5: Exploring session search](https://reader035.vdocuments.site/reader035/viewer/2022070304/54b900724a79596a218b457a/html5/thumbnails/5.jpg)
Why we’re here
1. How do we know what’s a session?
2. How do we help people deal with this complex task?
3. How do we evaluate systems and algorithms?
![Page 6: Exploring session search](https://reader035.vdocuments.site/reader035/viewer/2022070304/54b900724a79596a218b457a/html5/thumbnails/6.jpg)
THIS TALK CONTAINS EXPLICIT CONTENT
Warning
![Page 7: Exploring session search](https://reader035.vdocuments.site/reader035/viewer/2022070304/54b900724a79596a218b457a/html5/thumbnails/7.jpg)
Explicit vs. implicit sessions
Explicit sessions
1. We ask the person
2. We infer it from structural aspects of the search context
– Task context may provide strong organizing cues
– For example, genealogical searches are often tied to a person in a family tree
What about implicit sessions?
![Page 8: Exploring session search](https://reader035.vdocuments.site/reader035/viewer/2022070304/54b900724a79596a218b457a/html5/thumbnails/8.jpg)
Implicit session detection is based on implicit assumptions
How do we detect a session?
– Time heuristics
– Client connection heuristics
– Query similarity heuristics
What are we assuming?
– Person works continuously
– Person does not switch tasks
– Enough overlap in queries
How good are these assumptions?
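To make the assumptions concrete, here is a minimal sketch of implicit session detection that combines a time heuristic with a query similarity heuristic. The function names, the 30-minute gap, and the 0.2 similarity threshold are all illustrative assumptions, not values from the talk; the client connection heuristic is left out.

```python
from datetime import datetime, timedelta


def jaccard(a: str, b: str) -> float:
    """Term-overlap (Jaccard) similarity between two query strings."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def segment_sessions(log, max_gap=timedelta(minutes=30), min_sim=0.2):
    """Split a time-ordered list of (timestamp, query) pairs into sessions.

    A new session starts when the pause since the previous query exceeds
    max_gap OR the query shares too few terms with the previous one.
    Both thresholds are hypothetical defaults chosen for illustration.
    """
    sessions = []
    for stamp, query in log:
        if sessions:
            prev_stamp, prev_query = sessions[-1][-1]
            same = (stamp - prev_stamp <= max_gap
                    and jaccard(prev_query, query) >= min_sim)
            if same:
                sessions[-1].append((stamp, query))
                continue
        sessions.append([(stamp, query)])
    return sessions


t0 = datetime(2012, 4, 1, 9, 0)
log = [
    (t0, "barcelona hotels"),
    (t0 + timedelta(minutes=2), "barcelona cheap hotels"),
    (t0 + timedelta(minutes=3), "session search evaluation"),
    (t0 + timedelta(hours=2), "session search evaluation metrics"),
]
sessions = segment_sessions(log)
```

In this toy log the last two queries express the same information need but land in different sessions because of the two-hour pause, which is the kind of failure the "person works continuously" assumption invites.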
![Page 9: Exploring session search](https://reader035.vdocuments.site/reader035/viewer/2022070304/54b900724a79596a218b457a/html5/thumbnails/9.jpg)
Tradeoffs
Implicit sessions
Pros
– No explicit user input required
Cons
– Effectiveness relies on precision-oriented information needs and inter-query similarity, i.e., on redundancy
– More difficult to connect recurring or ongoing instances of the same information need
Explicit sessions
Pros
– Accurate
– Needed for collaboration
– Durable over time
Cons
– Requires manual input in some cases
![Page 10: Exploring session search](https://reader035.vdocuments.site/reader035/viewer/2022070304/54b900724a79596a218b457a/html5/thumbnails/10.jpg)
Dealing with redundancy
![Page 11: Exploring session search](https://reader035.vdocuments.site/reader035/viewer/2022070304/54b900724a79596a218b457a/html5/thumbnails/11.jpg)
Strategies
Ignore it
– The traditional approach
Manage redundancy in the UI
– Ancestry.com, Querium
Increase diversity through scoring
– Some algorithmic evaluation, but are such interactive systems deployed?
![Page 12: Exploring session search](https://reader035.vdocuments.site/reader035/viewer/2022070304/54b900724a79596a218b457a/html5/thumbnails/12.jpg)
COPING WITH REDUNDANCY Manage redundancy in the UI
![Page 13: Exploring session search](https://reader035.vdocuments.site/reader035/viewer/2022070304/54b900724a79596a218b457a/html5/thumbnails/13.jpg)
Some UI examples
Google
– +1 but no session awareness & no good persistent visual feedback
Bing
– Visible query history but no help with documents
Ancestry.com
– Flags previously saved records for current person
Querium user interface
– Variety of document- and query-centric displays
![Page 14: Exploring session search](https://reader035.vdocuments.site/reader035/viewer/2022070304/54b900724a79596a218b457a/html5/thumbnails/14.jpg)
Ancestry.com: Query overlap
How can we help people make sense of search results?
What’s new?
What’s redundant?
What’s useful?
What’s not useful?
![Page 15: Exploring session search](https://reader035.vdocuments.site/reader035/viewer/2022070304/54b900724a79596a218b457a/html5/thumbnails/15.jpg)
Querium: Filtering by process metadata
History of interaction during a search can be projected onto current results
![Page 16: Exploring session search](https://reader035.vdocuments.site/reader035/viewer/2022070304/54b900724a79596a218b457a/html5/thumbnails/16.jpg)
Querium: Visualizing re-retrieval
Document-centered retrieval history can be projected onto each search result
Indicates “important” documents
Indicates new documents
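A minimal sketch of the underlying bookkeeping, assuming a session is stored as a list of earlier result lists of document ids. This is not Querium's actual implementation; `annotate_results` and the data layout are hypothetical.

```python
from collections import Counter


def annotate_results(history, current):
    """Project retrieval history onto the current result list.

    history: result lists (document ids) from earlier queries in the session.
    current: result list for the current query, in rank order.
    Returns (doc_id, times_retrieved_before, is_new) tuples in rank order.
    """
    retrieved = Counter()
    for results in history:
        # Count each document at most once per earlier query.
        retrieved.update(set(results))
    return [(doc, retrieved[doc], retrieved[doc] == 0) for doc in current]


history = [["d1", "d2"], ["d2", "d3"]]
current = ["d2", "d4", "d1"]
annotated = annotate_results(history, current)
```

A display could then emphasize documents with a high prior count as frequently re-retrieved and mark those with `is_new` set as new.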
![Page 17: Exploring session search](https://reader035.vdocuments.site/reader035/viewer/2022070304/54b900724a79596a218b457a/html5/thumbnails/17.jpg)
Querium: Query-centric view
![Page 18: Exploring session search](https://reader035.vdocuments.site/reader035/viewer/2022070304/54b900724a79596a218b457a/html5/thumbnails/18.jpg)
Querium: Query-centric view
![Page 19: Exploring session search](https://reader035.vdocuments.site/reader035/viewer/2022070304/54b900724a79596a218b457a/html5/thumbnails/19.jpg)
Query-centric view
![Page 20: Exploring session search](https://reader035.vdocuments.site/reader035/viewer/2022070304/54b900724a79596a218b457a/html5/thumbnails/20.jpg)
PREVENTING REDUNDANCY Increasing diversity
![Page 21: Exploring session search](https://reader035.vdocuments.site/reader035/viewer/2022070304/54b900724a79596a218b457a/html5/thumbnails/21.jpg)
Some (cor)related metrics
Diversity
Redundancy
Novelty
Precision
Recall
The exact relationship is hard to pin down
![Page 22: Exploring session search](https://reader035.vdocuments.site/reader035/viewer/2022070304/54b900724a79596a218b457a/html5/thumbnails/22.jpg)
Increasing {diversity} with scoring
Pros
– Can incorporate prior explicit and implicit relevance assessments
– More focused queries may retrieve more pertinent documents at a given cutoff
Cons
– Relies on accurate assessment of relevance
– No way to recover “organic” results, so hard for people to understand effect of personalization
Black box
Diagram labels: Query, Rank docs, Session state, Displayed ranking, User feedback, Stop
![Page 23: Exploring session search](https://reader035.vdocuments.site/reader035/viewer/2022070304/54b900724a79596a218b457a/html5/thumbnails/23.jpg)
Increasing {diversity} with post-processing
Pros
– Can recover “organic” results
– Supports feedback on incorrect inference if user selects demoted doc
– Accommodates shifting info needs better
– Can be applied interactively
Cons
– Limited document set
Diagram labels: Query, Rank docs, “organic” ranking, Session state, Re-rank docs, Displayed ranking, User feedback, Stop
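The post-processing idea can be sketched as a stable re-rank that demotes documents already seen in the session while leaving the organic ranking untouched. The demotion rule and the name `rerank` are assumptions made for illustration; the talk does not specify a particular re-ranking function.

```python
def rerank(organic, seen):
    """Return a displayed ranking with already-seen documents moved last.

    organic: document ids in the engine's original rank order.
    seen: set of document ids the session state marks as already seen.
    Order within each group follows the organic ranking, and the organic
    list itself is not modified, so it can still be shown on request.
    """
    fresh = [doc for doc in organic if doc not in seen]
    demoted = [doc for doc in organic if doc in seen]
    return fresh + demoted


organic = ["a", "b", "c", "d"]
seen = {"a", "c"}
displayed = rerank(organic, seen)

# If the user selects a demoted document, the inference that it was no
# longer wanted was wrong; dropping it from the seen set undoes the demotion.
seen.discard("a")
displayed_after_feedback = rerank(organic, seen)
```

Because only the retrieved list is reordered, the approach cannot surface documents the original query missed, which is the "limited document set" drawback.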
![Page 24: Exploring session search](https://reader035.vdocuments.site/reader035/viewer/2022070304/54b900724a79596a218b457a/html5/thumbnails/24.jpg)
EVALUATION A holistic approach
![Page 25: Exploring session search](https://reader035.vdocuments.site/reader035/viewer/2022070304/54b900724a79596a218b457a/html5/thumbnails/25.jpg)
Vague generalities
Session-based search must be evaluated as a human-machine system
– Hard to account for real human behavior through simulations only
Recall and precision do not tell the whole story
– Exploratory search is inherently a learning process
– Effort, knowledge gain, frustration, serendipity important
Look at patterns of interaction that led to discovery
– Hard to evaluate marginal contribution of each query due to negative results, learning, information need drift
![Page 26: Exploring session search](https://reader035.vdocuments.site/reader035/viewer/2022070304/54b900724a79596a218b457a/html5/thumbnails/26.jpg)
Some thoughts on evaluating algorithms
Small gains in retrieval effectiveness will be swamped by interaction, good or bad
– Small statistically-significant effects are meaningless in practice
Evaluation “in the wild” relies on users for ground truth
– Use post-hoc analysis to test how algorithms predicted users’ choices
Look at system’s ability to help people recognize useful documents
– How many times was a document retrieved before it was seen?
– This works for lab and naturalistic studies
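One way to compute the "retrieved before seen" count from an interaction log, as a rough sketch. The event format (`"retrieved"` / `"viewed"` pairs) and the function name are hypothetical; a real log would need mapping into this shape.

```python
def retrievals_before_seen(events):
    """Count how often each document was retrieved before its first view.

    events: time-ordered list of (kind, doc_id) pairs, where kind is
    "retrieved" (document appeared in a result list) or "viewed"
    (the person opened it). Documents never viewed are omitted.
    """
    retrieved = {}
    first_view = {}
    for kind, doc in events:
        if doc in first_view:
            continue  # only retrievals before the first view matter
        if kind == "retrieved":
            retrieved[doc] = retrieved.get(doc, 0) + 1
        elif kind == "viewed":
            first_view[doc] = retrieved.get(doc, 0)
    return first_view


events = [
    ("retrieved", "d1"),
    ("retrieved", "d2"),
    ("retrieved", "d1"),
    ("viewed", "d1"),
    ("retrieved", "d1"),
    ("retrieved", "d2"),
]
counts = retrievals_before_seen(events)
```

Lower counts for documents that were eventually judged useful would suggest the interface helped people recognize them sooner; the same computation applies to lab logs and to naturalistic ones.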
![Page 27: Exploring session search](https://reader035.vdocuments.site/reader035/viewer/2022070304/54b900724a79596a218b457a/html5/thumbnails/27.jpg)
In closing…
Information needs evolve
Queries are approximations
Knowledge is uncertain
Design challenge: Help people plan future actions by understanding the present in the context of the past
![Page 28: Exploring session search](https://reader035.vdocuments.site/reader035/viewer/2022070304/54b900724a79596a218b457a/html5/thumbnails/28.jpg)
While I have your attention…
There is a pending proposal to create a StackExchange site for information retrieval.
Think of it as Stack Overflow for IR geeks.
We need more people to vote & promote.
http://area51.stackexchange.com/proposals/39142/information-retrieval-and-search-engines
![Page 29: Exploring session search](https://reader035.vdocuments.site/reader035/viewer/2022070304/54b900724a79596a218b457a/html5/thumbnails/29.jpg)
Do I still have your attention?
IIiX 2012
– August 21-24, 2012, Nijmegen, The Netherlands
– Deadline for papers April 9, 2012
EuroHCIR 2012
– Same place, August 25
– Deadline for papers is June 22, 2012
HCIR 2012: The 6th Symposium on Human Computer Information Retrieval
– October 4-5, 2012, Boston, Massachusetts, USA
– Submission deadline mid-summer
– Will publish works in progress and archival, full-length papers
![Page 30: Exploring session search](https://reader035.vdocuments.site/reader035/viewer/2022070304/54b900724a79596a218b457a/html5/thumbnails/30.jpg)
Image credits
http://www.flickr.com/photos/torremountain/6831414535/
http://www.flickr.com/photos/bigtallguy/233176326/
http://www.flickr.com/photos/77074420@N00/198347900/
http://www.flickr.com/photos/racatumba/93569705/
http://www.flickr.com/photos/chrisolson/3595815374/
http://www.flickr.com/photos/brymo/2813028454/
http://www.flickr.com/photos/computix/108732248/
http://www.flickr.com/photos/funadium/913303959/
http://www.flickr.com/photos/moriza/189890016/
http://www.flickr.com/photos/uhdigital/6802789537/
![Page 31: Exploring session search](https://reader035.vdocuments.site/reader035/viewer/2022070304/54b900724a79596a218b457a/html5/thumbnails/31.jpg)
![Page 32: Exploring session search](https://reader035.vdocuments.site/reader035/viewer/2022070304/54b900724a79596a218b457a/html5/thumbnails/32.jpg)
Hiding unwanted results
![Page 33: Exploring session search](https://reader035.vdocuments.site/reader035/viewer/2022070304/54b900724a79596a218b457a/html5/thumbnails/33.jpg)
Hiding unwanted results