ScentBar: A Query Suggestion Interface Visualizing the Amount of Missed Relevant Information for Intrinsically Diverse Search
Kazutoshi Umemoto, Takehiro Yamamoto, Katsumi Tanaka
Kyoto University, Japan
{umemoto,tyamamot,tanaka}@dl.kuis.kyoto-u.ac.jp
2. Intrinsically Diverse Search Tasks [1]
A user searches for extensive information covering diverse aspects of a single topic
[1] K. Raman et al. Toward Whole-Session Relevance: Exploring Intrinsic Diversity in Web Search (2013)
[Figure: the question "What if I continue smoking?" surrounded by its aspects (relaxation, stress release, lose weight, skin aging, dental health, waste of money, high blood pressure, cancer risk) and example queries (smoking effect, smoking cancer, cigarette price)]
Multiple queries are issued to fulfill intrinsically diverse search tasks
3. Decisions in Intrinsically Diverse Tasks
[Figure: a searcher with the question "What if I continue smoking?" issuing the queries "smoking effect" and "smoking cancer"]
Query stopping: Should I stop examining the SERP?
Query selection: What query should I issue next?
Session stopping: Should I stop the task session?
4. Query Stopping
When to stop examining the SERP of the current query?
[Figure: two SERPs for "smoking effect" whose results are labeled rel, very-rel, or non-rel, each with a stop point]
Stopping too late: many wasted clicks
Stopping too early: miss important docs
5. Query Selection
Which query to use for the next search?
[Figure: SERPs of an ineffective query and an effective query issued after "smoking effect", with each result marked browsed or new]
Ineffective query: many already-browsed docs, much search effort
Effective query: many novel docs, little search effort
6. Session Stopping
When to stop the whole task session?
[Figure: two sessions at their current state, one with low and one with high additional outcomes]
Low additional outcomes: continuing to search wastes time
High additional outcomes: stopping the search misses important info
7. Issues Raised by Inappropriate Decisions
Query stopping: more wasted clicks or missed important docs
Query selection: increase in the number of queries
Session stopping: wasted time or missed important info
Q. Why is it difficult to make rational decisions?
A.
• Searchers do not know what aspects exist or how important they are. It is difficult to formulate queries that effectively cover diverse aspects.
• Searchers cannot guess how much important info is explored or unexplored. It is difficult to find the appropriate timing for query/session stopping.
Objective
Help searchers make better decisions on query stopping, query selection, and session stopping
8. ScentBar: Our Proposed Query Suggestion Interface
Visualize the amount of missed information (MI) for each query
Missed information: important info that the searcher misses collecting from the SERP
[Figure: query suggestions, each shown with an MI bar: cigarettes price increase; smoking ruins your looks; smoking benefits; diseases caused by smoking; smoking cancer risk]
Bar length = amount of MI
9. ScentBar: Our Proposed Query Suggestion Interface
After browsing some docs through the task session:
How much info is available from the SERP?
How much remains unexplored?
16. Expected Benefits
At beginning of search task: understand info distribution ("Cancer seems more important than price.")
During each search: query stopping and query selection ("Little MI remains. Let's change the query.")
At end of task session: session stopping ("Having covered most aspects, I can finish the task.")
[Figure: ScentBar suggestion lists at the three stages. The first two list: cigarettes price increase; smoking ruins your looks; smoking benefits; diseases caused by smoking; smoking cancer risk. The last lists: cigars vs. cigarettes; smoking ruins your looks; smoking effects on brain; diseases caused by smoking; smoking benefits]
17. Formalization: Missed Information
Additional gain that can be obtained from unclicked search results
$\mathrm{MI}_{u,t}(q) = \mathrm{Gain}_t(D_u \cup D_q^k) - \mathrm{Gain}_t(D_u)$
$\mathrm{Gain}_t(D_u \cup D_q^k)$: total gain obtained after browsing all SERP docs
$\mathrm{Gain}_t(D_u)$: gain obtained so far
$t$: search topic, $D_q^k$: top $k$ docs retrieved for query $q$, $D_u$: set of docs browsed by user $u$
Q. How should missed information behave?
A. It should be high when SERP documents
1. cover important aspects
2. are highly relevant
3. cover unexplored aspects
Formalize Gain to satisfy these three properties
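18. Formalization: Gain
$\mathrm{Gain}_t(D) = \sum_{a \in A_t} \Pr(a \mid t) \cdot \mathrm{Gain}_a(D)$
Expected value of per-aspect gain ($A_t$: a set of aspects, $\Pr(a \mid t)$: aspect importance)
$\mathrm{Gain}_a(D) = \sum_{i=1}^{|D|} \mathrm{Rel}_a(d_i) \cdot \mathrm{Disc}_a(\{d_1, \ldots, d_{i-1}\})$
Sum up document relevance weighted by aspect novelty ($\mathrm{Rel}_a$: per-aspect document relevance, $\mathrm{Disc}_a$: discount function of aspect novelty)
$\mathrm{Disc}_a(D') = \prod_{d_j \in D'} (1 - \mathrm{Rel}_a(d_j))$
Discount aspects that have mostly been covered by browsed docs
$t$: search topic, $D$ and $D'$: document sets
Importance: $\Pr(a \mid t)$
Relevance: $\mathrm{Rel}_a$
Novelty: $\mathrm{Disc}_a$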
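The definitions of Gain, Disc, and MI can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the aspect importances in `IMPORTANCE`, the relevance scores in `REL`, and the document IDs are hypothetical toy values chosen so the arithmetic is easy to follow.

```python
from typing import Dict, List

# Hypothetical toy inputs (not taken from the study).
IMPORTANCE: Dict[str, float] = {"cancer": 0.6, "price": 0.4}   # Pr(a | t)
REL: Dict[str, Dict[str, float]] = {                           # Rel_a(d)
    "d1": {"cancer": 0.5},
    "d2": {"cancer": 0.5, "price": 0.5},
    "d3": {"price": 1.0},
}


def disc(aspect: str, docs: List[str]) -> float:
    """Disc_a(D'): product of (1 - Rel_a(d_j)) over the docs in D'."""
    result = 1.0
    for d in docs:
        result *= 1.0 - REL[d].get(aspect, 0.0)
    return result


def gain_aspect(aspect: str, docs: List[str]) -> float:
    """Gain_a(D): relevance of each doc, discounted by the docs before it."""
    return sum(
        REL[d].get(aspect, 0.0) * disc(aspect, docs[:i])
        for i, d in enumerate(docs)
    )


def gain(docs: List[str]) -> float:
    """Gain_t(D): importance-weighted sum of the per-aspect gains."""
    return sum(p * gain_aspect(a, docs) for a, p in IMPORTANCE.items())


def missed_information(browsed: List[str], serp: List[str]) -> float:
    """MI_{u,t}(q) = Gain_t(D_u ∪ D_q^k) - Gain_t(D_u)."""
    union = browsed + [d for d in serp if d not in browsed]
    return gain(union) - gain(browsed)


serp = ["d1", "d2"]                                # top-k docs of a hypothetical query
mi_fresh = missed_information([], serp)            # nothing browsed yet
mi_after_d1 = missed_information(["d1"], serp)     # d1 already browsed
mi_after_d3 = missed_information(["d3"], serp)     # price aspect already covered
mi_done = missed_information(["d1", "d2"], serp)   # whole SERP browsed
```

With these toy values MI is 0.65 for a fresh searcher, 0.35 once d1 has been browsed, 0.45 when the price aspect was already covered by d3, and 0 when the whole SERP has been browsed, which matches the three desired properties (importance, relevance, novelty).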
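19. Relation with Intent Aware Metrics
[1] O. Chapelle et al. Expected Reciprocal Rank for Graded Relevance (2009)
Gain:
$\mathrm{Gain}_t(D) = \sum_{a \in A_t} \Pr(a \mid t) \sum_{i=1}^{|D|} \mathrm{Rel}_a(d_i) \prod_{j=1}^{i-1} (1 - \mathrm{Rel}_a(d_j))$
ERR-IA [1] (similar to Gain, but discounts for lower-ranked docs):
$\mathrm{ERR\text{-}IA}(L_q) = \sum_{a \in A_q} \Pr(a \mid q) \sum_{i=1}^{|L_q|} \frac{1}{i} \, \mathrm{Rel}_a(l_i) \prod_{j=1}^{i-1} (1 - \mathrm{Rel}_a(l_j))$
$L_q$: ranked result list for query $q$, $l_i$: document at rank $i$
Gain is a set-wise metric for evaluating the utility of documents separately from the browsing order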
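The set-wise property can be checked numerically. The per-aspect sum telescopes to $1 - \prod_i (1 - \mathrm{Rel}_a(d_i))$, so it does not depend on the order of the documents, whereas the extra $1/i$ factor makes the ERR-style sum order-dependent. The sketch below works on a single aspect with hypothetical relevance scores; weighting by aspect importance is a linear combination and does not change the conclusion.

```python
from itertools import permutations
from typing import Sequence


def gain_aspect(rels: Sequence[float]) -> float:
    """Sum_i rel_i * prod_{j<i} (1 - rel_j) for a single aspect."""
    total, not_covered = 0.0, 1.0
    for r in rels:
        total += r * not_covered
        not_covered *= 1.0 - r
    return total


def err_aspect(rels: Sequence[float]) -> float:
    """Same sum with the additional 1/i rank discount used by ERR."""
    total, not_covered = 0.0, 1.0
    for rank, r in enumerate(rels, start=1):
        total += r * not_covered / rank
        not_covered *= 1.0 - r
    return total


rels = [0.2, 0.9, 0.5]  # hypothetical per-aspect relevance scores
gains = [gain_aspect(p) for p in permutations(rels)]
errs = [err_aspect(p) for p in permutations(rels)]

closed_form = 1.0 - (1 - 0.2) * (1 - 0.9) * (1 - 0.5)  # 0.96
gain_spread = max(gains) - min(gains)   # ~0: order does not matter
err_spread = max(errs) - min(errs)      # clearly > 0: order matters
```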
20. How to Estimate Missed Information
Need to know values of 3 gain components
1. $A_t$, a set of aspects for search topic $t$
2. $\Pr(a \mid t)$, an importance probability of aspect $a \in A_t$
3. $\mathrm{Rel}_a(d)$, a relevance score of document $d$ to aspect $a$
Estimate these components by subtopic mining algorithm [1]
[1] K. Tsukuda et al. Estimating Intent Types for Search Result Diversification (2013)
21. User Study
• 24 subjects (within-subject design)
• 4 search topics (from NTCIR INTENT, IMine)
  ▶ symptoms of diabetes
  ▶ clinical depression
  ▶ dress codes for wedding ceremony
  ▶ dinosaurs
• ClueWeb09-JA as document corpus
Interfaces compared: without MI (baseline) vs. with MI (proposed)
RQs: How our visualization of missed information affects
1. Query Stopping
2. Query Selection
3. Gain Acquisition Pattern
4. Session Stopping
5. Effort vs. Gain
[Figure: the same query suggestion list (cigarettes price increase; smoking ruins your looks; smoking benefits; diseases caused by smoking; smoking cancer risk) shown for the baseline and the proposed interface; a "medical domain" label also appears on the slide]
22. Experimental Procedure
[Figure: session timeline (< 2 hours): Start, then for each topic an instruction followed by the search task on that topic with its assigned UI, then Finish]
Task guideline
You were given the assignment of submitting a thorough report on topic $t$. To fully understand $t$, collect relevant information on this topic from a number of different aspects that you think are important. You may end this search task when you feel there is little important information left.
There is no time limit for each topic
▶ i.e. different participants had different task completion times
▶ To investigate the effect of ScentBar on session stopping
▶ (All participants completed the whole experiment within 2 hours)
23. Accuracy of Gain Estimation
Measure correlation between estimated/oracle gains at each browsing path
[Figure: timeline for topic $t$: Start task → Browse doc $d_1$ → Browse $d_2$ → ⋯ → Finish task. At each step the oracle gain $\mathrm{Gain}_t^*(\{d_1\}), \mathrm{Gain}_t^*(\{d_1, d_2\}), \ldots$ is compared with the estimated gain $\widehat{\mathrm{Gain}}_t(\{d_1\}), \widehat{\mathrm{Gain}}_t(\{d_1, d_2\}), \ldots$ (how consistent?)]
Topic                              Pearson's r   Spearman's ρ   Kendall's τ   Group
symptoms of diabetes               0.834         0.851          0.683         High Correlation (HC)
clinical depression                0.845         0.860          0.678         High Correlation (HC)
dress codes for wedding ceremony   0.824         0.862          0.713         High Correlation (HC)
dinosaurs                          0.710         0.702          0.529         Low Correlation (LC)
Conduct analyses for All, HC, and LC topics separately to investigate effect of gain estimation accuracy
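The three correlation coefficients in the table can be computed as sketched below. The gain values in `oracle` and `estimated` are hypothetical, not the study's data, and the Kendall variant shown is the simple tau-a, which assumes no tied values; the slides do not say which variant was used.

```python
from math import sqrt
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson's r: covariance divided by the product of standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall_tau_a(x: Sequence[float], y: Sequence[float]) -> float:
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            score += (product > 0) - (product < 0)
    return score / (n * (n - 1) / 2)


# Hypothetical gains along one browsing path (oracle vs. estimated).
oracle = [0.10, 0.25, 0.40, 0.55, 0.70]
estimated = [0.12, 0.30, 0.25, 0.50, 0.80]

r = pearson(oracle, estimated)
rho = spearman(oracle, estimated)      # 0.9: one adjacent pair is swapped
tau = kendall_tau_a(oracle, estimated) # 0.8: 9 concordant, 1 discordant pair
```

In practice the same coefficients are available as `pearsonr`, `spearmanr`, and `kendalltau` in `scipy.stats`; the hand-written versions are used here only to keep the sketch dependency-free.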
24. Analysis 1: Query Stopping
RQ: How does ScentBar affect users' decisions on when to stop the current search?
Calculate how much oracle MI decreases when the current search ends
[Figure: timeline for topic $t$: Start task → Issue query $q$ ("smoking cancer risk") → Stop searching with $q$ → Finish task]
        All topics            HC topics             LC topic
        baseline  proposed    baseline  proposed    baseline  proposed
ΔMI     0.109     0.138*      0.128     0.186*      0.069     0.071
ScentBar users collected more relevant information at query level
HC topics
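One plausible reading of this measurement is the MI of the query's SERP at the moment the query is issued minus its MI at the moment the searcher stops examining that SERP. The sketch below follows that reading with hypothetical importances, relevance scores, and document IDs. It is an assumption, not the authors' evaluation code, and it uses the closed form $1 - \prod_d (1 - \mathrm{Rel}_a(d))$ of the per-aspect gain, which follows from telescoping the sum in the Gain definition.

```python
from typing import Dict, Set

# Hypothetical toy inputs (not taken from the study).
IMPORTANCE: Dict[str, float] = {"cancer": 0.6, "price": 0.4}   # Pr(a | t)
REL: Dict[str, Dict[str, float]] = {                           # Rel_a(d)
    "d1": {"cancer": 0.5},
    "d2": {"cancer": 0.5, "price": 0.5},
}


def gain(docs: Set[str]) -> float:
    """Set-wise gain: sum_a Pr(a|t) * (1 - prod_d (1 - Rel_a(d)))."""
    total = 0.0
    for aspect, importance in IMPORTANCE.items():
        uncovered = 1.0
        for d in docs:
            uncovered *= 1.0 - REL[d].get(aspect, 0.0)
        total += importance * (1.0 - uncovered)
    return total


def mi(browsed: Set[str], serp: Set[str]) -> float:
    """Missed information of a SERP given the docs browsed so far."""
    return gain(browsed | serp) - gain(browsed)


def delta_mi(browsed_at_issue: Set[str], clicked: Set[str], serp: Set[str]) -> float:
    """Decrease in MI between issuing the query and stopping on its SERP."""
    before = mi(browsed_at_issue, serp)
    after = mi(browsed_at_issue | clicked, serp)
    return before - after


serp = {"d1", "d2"}
stop_after_d1 = delta_mi(set(), {"d1"}, serp)          # 0.65 - 0.35
stop_after_all = delta_mi(set(), {"d1", "d2"}, serp)   # 0.65 - 0.00
```

A larger value means the searcher collected more of the SERP's available gain before moving on.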
25. Analysis 2: Query Selection
RQ: How does ScentBar affect users' decisions on which query to use for the next search?
Calculate how much MI remains when the new query is issued
[Figure: timeline for topic $t$: Start task → Issue query $q$ ("smoking cancer risk") → Stop searching with $q$ → Finish task]
        All topics            HC topics             LC topic
        baseline  proposed    baseline  proposed    baseline  proposed
MI      0.211     0.241*      0.238     0.298*      0.155     0.162
ScentBar users issued queries having more missed information
HC topics
26. Analysis 3: Gain Acquisition Pattern
RQ: How does ScentBar affect the temporal change in gain that users acquire through their search process?
Calculate cumulative oracle gain that participants obtained by the minute
[Figure: three panels (All topics, HC topics, LC topic) plotting Oracle Gain (0 to 1) against Elapsed Time (0 to 40) for w/o scent (baseline) and w/ scent (proposed)]
• Little difference between interfaces at early stage (! < stuv)
• ScentBar users obtained higher oracle gain at late stage
27. Analysis 4: Session Stopping
RQ: How does ScentBar affect users' decisions on when to stop their task sessions?
Calculate how much oracle MI decreases when the whole session ends
[Figure: timeline for topic $t$: Start task → Issue query $q$ ("smoking cancer risk") → Stop searching with $q$ → Finish task]
        All topics            HC topics             LC topic
        baseline  proposed    baseline  proposed    baseline  proposed
ΔMI     0.162     0.195*      0.188     0.256*      0.106     0.110
ScentBar users collected more relevant information at session level, too
HC topics
28. Analysis 5: Effort vs. Gain
RQ: How does ScentBar affect the relationship between the effort that users expend and the gain that they obtain?
Model task completion time and oracle gain via linear regression
[Figure: three panels (All topics, HC topics, LC topic) plotting Oracle Gain (0 to 1) against Task Completion Time (0 to 40) for w/o scent (baseline) and w/ scent (proposed)]
ScentBar users obtained higher oracle gain per unit time
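A linear fit of oracle gain against task completion time can be sketched with ordinary least squares. The data points below are hypothetical, and fitting a line with an intercept is an assumption, since the slides do not give the exact regression setup. The slope of each fitted line can be read as gain per unit time.

```python
from typing import Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical (task completion time in minutes, oracle gain) observations.
minutes = [10.0, 20.0, 30.0, 40.0]
gain_baseline = [0.20, 0.30, 0.40, 0.50]
gain_proposed = [0.20, 0.35, 0.50, 0.65]

slope_base, intercept_base = fit_line(minutes, gain_baseline)
slope_prop, intercept_prop = fit_line(minutes, gain_proposed)

# A steeper slope means more oracle gain obtained per additional minute.
proposed_is_more_efficient = slope_prop > slope_base
```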
29. Search Behavior
• ScentBar users issued more suggestion queries for HC topics
• No significant difference in the number of wasted clicks prior to query stopping
                    All topics            HC topics             LC topic
                    baseline  proposed    baseline  proposed    baseline  proposed
%suggest queries    0.376     0.480       0.355     0.505*      0.437     0.405
#Clicks             0.156     0.164       0.165     0.175       0.133     0.142
%suggest queries: the fraction of suggestion queries among all issued queries
#Clicks: the number of irrelevant doc clicks prior to query stopping
30. Findings
• Effects of MI visualization on search outcomes
  ▶ Highly accurate MI enabled users to
    1. issue more promising queries
    2. obtain higher gain at the late stage of sessions
    3. obtain higher gain per unit time
  ▶ Less accurate MI worsened search performance, probably because users were confused by unreliable MI behavior
• Effects of MI visualization on search behavior
  ▶ Users interacted with query suggestions more frequently
  ▶ Users clicked more irrelevant docs just before query stopping (though not statistically significant)
31. Discussion
Q. Why did MI visualization fail to reduce wasted clicks prior to query stopping?
A. Query-level MI visualization might be less informative for making decisions to stop SERP examination
[Figure: the SERP for "smoking cancer risk", labeled "Few relevant results" and "More queries"; the searcher thinks: "Though some MI remains in this SERP, … Which doc should I assess next? Is the examination worth the effort?"]
Improvements in search outcomes are
• mainly due to better decisions on query selection
• not due to better decisions on query stopping
32. How to Better Support Query Stopping?
Possible solutions
▶ Visualize per-aspect relevance for each search result [1], as well as query-level MI presentation
▶ Re-rank search results so that those with much MI appear at high positions, so that searchers who investigate SERPs in a top-down manner can find optimal stopping points
▶ Make users aware of the search effort [2] they have expended so far, to help searchers understand their cost-benefit performance
These can be integrated into ScentBar
[1] M. Iwata et al. AspecTiles: Tile-based Visualization of Diversified Web Search Results (2012)
[2] L. Azzopardi and G. Zuccon. An Analysis of Theories of Search and Search Behavior (2015)
33. Summary: Visualization of Missed Information
ScentBar: query suggestion interface for intrinsically diverse tasks
[Figure: ScentBar suggestion list (cigarettes price increase; smoking ruins your looks; smoking benefits; diseases caused by smoking; smoking cancer risk); the searcher thinks: "Much info is still unexplored. I have to keep on searching."]
Missed Information
• Conceptualized as important info that the searcher misses collecting from the SERP
• Formalized as additional gain that can be obtained from unclicked search results
Findings
• ScentBar helped users make better decisions, especially on query selection, and made the search process more efficient when highly accurate MI was presented
• Search performance worsened when less accurate MI was presented
Future Work
• Improve gain estimation algorithm (e.g. by modeling topic aspects hierarchically)
• Utilize missed information in different ways (e.g. MI-based query suggestion algorithm)