ScentBar: A Query Suggestion Interface Visualizing the Amount of Missed Relevant Information for Intrinsically Diverse Search Kazutoshi Umemoto, Takehiro Yamamoto, Katsumi Tanaka Kyoto University, Japan {umemoto,tyamamot,tanaka}@dl.kuis.kyoto-u.ac.jp


Post on 11-Apr-2017


Slide 2: Intrinsically Diverse Search Tasks [1]

A user searches for extensive information covering diverse aspects of a single topic

[1] K. Raman et al. Toward Whole-Session Relevance: Exploring Intrinsic Diversity in Web Search (2013)

[Figure: the task "What if I continue smoking?" surrounded by its aspects (relaxation, stress release, lose weight, skin aging, dental health, waste of money, high blood pressure, cancer risk) and example queries ("smoking effect", "smoking cancer", "cigarette price")]

Multiple queries are issued to fulfill intrinsically diverse search tasks

Slide 3: Decisions in Intrinsically Diverse Tasks
[Figure: a searcher working on the task "What if I continue smoking?" with the queries "smoking effect" and "smoking cancer"]
• Query stopping: Should I stop examining the SERP?
• Query selection: What query should I issue next?
• Session stopping: Should I stop the task session?

Slide 4: Query Stopping
When to stop examining the SERP of the current query?
[Figure: two SERPs for the query "smoking effect", with results labeled rel, very-rel, or non-rel and a "Stop" point marked on each]
• Stopping too late: many wasted clicks
• Stopping too early: miss important docs

Slide 5: Query Selection
Which query to use for the next search?
[Figure: after the query "smoking effect", two candidate next queries whose results are labeled "browsed" or "new"]
• Ineffective query: many already-browsed docs, much search effort
• Effective query: many novel docs, little search effort

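Slide 6: Session Stopping
When to stop the whole task session?
[Figure: two sessions at their current point, one with low and one with high additional outcomes ahead]
• Continue searching when additional outcomes are low: time-wasting
• Stop searching when additional outcomes are high: miss important info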

Slide 7: Issues Raised by Inappropriate Decisions
• Query stopping: increase wasted clicks or miss important docs
• Query selection: increase in #queries
• Session stopping: waste time or miss important info

Q. Why is it difficult to make rational decisions?
A.
• Searchers do not know what aspects exist and how important they are: difficult to formulate queries effective for covering diverse aspects
• Searchers cannot guess how much important info is (un)explored: difficult to find appropriate timing for query/session stopping

Objective: help searchers make better decisions on query stopping, query selection, and session stopping

Slide 8: ScentBar: Our Proposed Query Suggestion Interface
Visualize the amount of missed information (MI) for each query
Missed information: important info that the searcher misses collecting from the SERP
[Figure: query suggestions, each shown with an MI bar: "cigarettes price increase", "smoking ruins your looks", "smoking benefits", "diseases caused by smoking", "smoking cancer risk"]
Bar length = amount of MI

Slide 9: ScentBar: Our Proposed Query Suggestion Interface (after browsing some docs through task session)
[Figure: the same five query suggestions with updated bars]
For each query, the bar answers two questions:
• How much info is available from the SERP?
• How much remains unexplored?
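
The bar idea can be illustrated with a small text-only sketch. Everything here is an assumption made for illustration: the function names, the two-segment layout ("#" for MI that remains, "." for gain already collected from that SERP), and the sample numbers. The slides only state that bar length encodes the amount of MI.

```python
def render_bar(available, missed, width=20):
    """Draw one suggestion's bar as text (hypothetical layout).

    available: total gain obtainable from the query's SERP, in [0, 1]
    missed:    part of it that is still unexplored (the MI), <= available
    """
    if not 0.0 <= missed <= available <= 1.0:
        raise ValueError("expected 0 <= missed <= available <= 1")
    missed_cells = round(missed * width)
    explored_cells = round(available * width) - missed_cells
    return "#" * missed_cells + "." * explored_cells


def render_suggestions(suggestions, width=20):
    """suggestions: list of (query, available, missed) tuples."""
    label_width = max(len(query) for query, _, _ in suggestions)
    return [
        f"{query.ljust(label_width)} |{render_bar(available, missed, width)}"
        for query, available, missed in suggestions
    ]


# Made-up values, not taken from the study.
EXAMPLE = [
    ("smoking cancer risk", 1.0, 0.75),
    ("cigarettes price increase", 0.5, 0.25),
]

if __name__ == "__main__":
    print("\n".join(render_suggestions(EXAMPLE)))
```

As the searcher reads more documents, `missed` shrinks and the "#" segment gets shorter, which is the cue the interface is meant to give.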

Slide 10: How It Works: When a user starts an intrinsically diverse search task…

Slide 11: How It Works: When a user is inputting a search query…

Slide 12: How It Works: When a user issues a search query…

Slide 13: How It Works: When a user clicks a search result…

Slides 14-15: How It Works: When a user finishes reading the landing page…

Slide 16: Expected Benefits
[Figure: ScentBar suggestion lists with MI bars at three stages of a task session. The first two lists show "cigarettes price increase", "smoking ruins your looks", "smoking benefits", "diseases caused by smoking", "smoking cancer risk"; the last shows "cigars vs. cigarettes", "smoking ruins your looks", "smoking effects on brain", "diseases caused by smoking", "smoking benefits"]
• At beginning of search task: understand info distribution ("Cancer seems more important than price.")
• During each search: query stopping and query selection ("Little MI remains. Let's change the query.")
• At end of task session: session stopping ("Having covered most aspects, I can finish the task.")
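
The three decisions can be written down as a toy rule to make the intended reading of the bars concrete. This is only a sketch: ScentBar leaves the decisions to the searcher, and the function name `advise`, the threshold `low`, and the sample MI values are all hypothetical.

```python
def advise(current_query, mi_by_query, low=0.05):
    """Map MI values to one of the three decisions (hypothetical heuristic).

    mi_by_query: dict from query string to its current MI, including
                 the query whose SERP is being examined.
    low:         made-up threshold below which MI counts as "little".
    """
    best_query = max(mi_by_query, key=mi_by_query.get)
    if mi_by_query[best_query] < low:
        # Little MI remains for every query: session stopping.
        return ("stop_session", None)
    if mi_by_query[current_query] < low:
        # Little MI remains in this SERP: query stopping + query selection.
        return ("switch_query", best_query)
    return ("keep_examining", current_query)


if __name__ == "__main__":
    print(advise("smoking cancer risk",
                 {"smoking cancer risk": 0.01, "smoking benefits": 0.20}))
```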

Slide 17: Formalization: Missed Information
Missed information (MI): additional gain that can be obtained from unclicked search results

$$\mathrm{MI}_{u,t}(q) = \mathrm{Gain}_t(D_u \cup D_q^k) - \mathrm{Gain}_t(D_u)$$

• $\mathrm{Gain}_t(D_u \cup D_q^k)$: total gain obtained after browsing all SERP docs
• $\mathrm{Gain}_t(D_u)$: gain obtained so far
• $t$: search topic, $D_q^k$: top $k$ docs retrieved for query $q$, $D_u$: set of docs browsed by user $u$

Q. How should missed information behave?
A. It should be high when SERP documents
1. cover important aspects
2. are highly relevant
3. cover unexplored aspects
Formalize Gain to satisfy these three properties

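Slide 18: Formalization: Gain
Expected value of per-aspect gain:

$$\mathrm{Gain}_t(D) = \sum_{a \in A_t} \Pr(a \mid t) \cdot \mathrm{Gain}_a(D)$$

• $A_t$: a set of aspects
• $\Pr(a \mid t)$: aspect importance

Sum up document relevance weighted by aspect novelty:

$$\mathrm{Gain}_a(D) = \sum_{i=1}^{|D|} \mathrm{Rel}_a(d_i) \cdot \mathrm{Disc}_a(\{d_1, \ldots, d_{i-1}\})$$

• $\mathrm{Rel}_a(d_i)$: per-aspect document relevance
• $\mathrm{Disc}_a$: discount function of aspect novelty

Discount aspects that have mostly been covered by browsed docs:

$$\mathrm{Disc}_a(D') = \prod_{d_j \in D'} \bigl(1 - \mathrm{Rel}_a(d_j)\bigr)$$

$t$: search topic, $D$ and $D'$: document sets
The three factors correspond to Importance ($\Pr(a \mid t)$), Relevance ($\mathrm{Rel}_a$), and Novelty ($\mathrm{Disc}_a$).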
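
The Gain and MI definitions on slides 17 and 18 can be traced with a short sketch. It is not the authors' implementation: the function names, the dictionary-based data layout, and the toy importance and relevance values are assumptions chosen for illustration, and relevance scores are assumed to lie in [0, 1].

```python
def aspect_gain(relevances):
    """Gain_a(D): each Rel_a(d_i) is weighted by the novelty discount
    Disc_a over the documents that precede it."""
    gain, novelty = 0.0, 1.0  # novelty = product of (1 - Rel_a) so far
    for r in relevances:
        gain += r * novelty
        novelty *= 1.0 - r
    return gain


def gain(docs, importance, rel):
    """Gain_t(D): expectation of per-aspect gain under Pr(a | t).

    importance: dict aspect -> Pr(a | t)
    rel:        dict doc id -> dict aspect -> Rel_a(d)
    """
    total = 0.0
    for aspect, prob in importance.items():
        relevances = [rel.get(d, {}).get(aspect, 0.0) for d in docs]
        total += prob * aspect_gain(relevances)
    return total


def missed_information(browsed, serp, importance, rel):
    """MI_{u,t}(q) = Gain_t(D_u ∪ D_q^k) - Gain_t(D_u)."""
    browsed = list(dict.fromkeys(browsed))  # de-duplicate, keep order
    unseen = [d for d in dict.fromkeys(serp) if d not in browsed]
    return gain(browsed + unseen, importance, rel) - gain(browsed, importance, rel)


# Toy data (hypothetical): two aspects, three documents.
IMPORTANCE = {"cancer": 0.6, "price": 0.4}
REL = {
    "d1": {"cancer": 0.5},
    "d2": {"cancer": 0.5, "price": 0.5},
    "d3": {"price": 1.0},
}

if __name__ == "__main__":
    # The user has read d1; the SERP holds d1, d2, d3.
    print(missed_information(["d1"], ["d1", "d2", "d3"], IMPORTANCE, REL))
```

With these numbers the gain so far is 0.6 × 0.5 = 0.3 and the gain after reading the whole SERP is 0.6 × 0.75 + 0.4 × 1.0 = 0.85, so the MI of the query is 0.55. Once every SERP document has been browsed the MI drops to 0.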

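Slide 19: Relation with Intent Aware Metrics
Gain is similar to ERR-IA [1].

Gain:

$$\mathrm{Gain}_t(D) = \sum_{a \in A_t} \Pr(a \mid t) \sum_{i=1}^{|D|} \mathrm{Rel}_a(d_i) \prod_{j=1}^{i-1} \bigl(1 - \mathrm{Rel}_a(d_j)\bigr)$$

ERR-IA [1]:

$$\text{ERR-IA}(L_q) = \sum_{a \in A_q} \Pr(a \mid q) \sum_{i=1}^{|L_q|} \frac{\mathrm{Rel}_a(l_i)}{i} \prod_{j=1}^{i-1} \bigl(1 - \mathrm{Rel}_a(l_j)\bigr)$$

Here $L_q$ denotes the ranked list for query $q$ and $l_i$ its $i$-th document; the factor $1/i$ discounts for lower-ranked docs.

Gain is a set-wise metric for evaluating the utility of documents separately from the browsing order

[1] O. Chapelle et al. Expected Reciprocal Rank for Graded Relevance (2009)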
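
The set-wise property can be checked with a short derivation. The shorthand $r_i$ and $P_i$ is introduced here for illustration only and is not the slides' notation.

```latex
% Shorthand (assumed): r_i = Rel_a(d_i),  P_i = \prod_{j=1}^{i} (1 - r_j),  P_0 = 1.
\begin{align*}
r_i \, P_{i-1} &= P_{i-1} - (1 - r_i)\, P_{i-1} = P_{i-1} - P_i \\
\mathrm{Gain}_a(D) &= \sum_{i=1}^{|D|} r_i \, P_{i-1}
                    = \sum_{i=1}^{|D|} \bigl(P_{i-1} - P_i\bigr)
                    = P_0 - P_{|D|}
                    = 1 - \prod_{i=1}^{|D|} \bigl(1 - \mathrm{Rel}_a(d_i)\bigr)
\end{align*}
```

The final product does not depend on the order of the documents, so the per-aspect gain, and with it the topic-level gain, is the same for any browsing order. In ERR-IA the extra rank factor $1/i$ prevents this telescoping, which is why that metric remains order-sensitive.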

Slide 20: How to Estimate Missed Information
Need to know values of 3 gain components
1. $A_t$, a set of aspects for search topic $t$
2. $\Pr(a \mid t)$, an importance probability of aspect $a \in A_t$
3. $\mathrm{Rel}_a(d)$, a relevance score of document $d$ to aspect $a$

Estimate these components by subtopic mining algorithm [1]

[1] K. Tsukuda et al. Estimating Intent Types for Search Result Diversification (2013)

Slide 21: User Study
• 24 subjects (within-subject design)
• 4 search topics (from NTCIR INTENT, IMine)
  ▶ symptoms of diabetes (medical domain)
  ▶ clinical depression (medical domain)
  ▶ dress codes for wedding ceremony
  ▶ dinosaurs
• ClueWeb09-JA as document corpus

Interfaces compared: without MI (baseline) and with MI (proposed)
[Figure: the two interfaces, each listing the suggestions "cigarettes price increase", "smoking ruins your looks", "smoking benefits", "diseases caused by smoking", "smoking cancer risk"]

RQs: How our visualization of missed information affects
1. Query Stopping
2. Query Selection
3. Gain Acquisition Pattern
4. Session Stopping
5. Effort vs. Gain

Slide 22: Experimental Procedure
Task guideline:
"You were given the assignment of submitting a thorough report on topic $t$. To fully understand $t$, collect relevant information on this topic from a number of different aspects that you think are important. You may end this search task when you feel there is little important information left."

There was no time limit for each topic
▶ i.e. different participants had different task completion times
▶ To investigate the effect of ScentBar on session stopping
▶ (All participants completed the whole experiment within 2 hours)

[Figure: experiment timeline from Start to Finish (< 2 hours in total): an instruction followed by a search topic with its assigned UI, repeated for each topic]

Slide 23: Accuracy of Gain Estimation
Measure correlation between estimated/oracle gains at each browsing path
[Figure: browsing path for topic $t$ (start task → browse doc $d_1$ → browse $d_2$ → ⋯ → finish task). At each step the oracle gain $\mathrm{Gain}^*_t(d_1)$, $\mathrm{Gain}^*_t(d_1, d_2)$, ⋯ is compared with the estimated gain $\widehat{\mathrm{Gain}}_t(d_1)$, $\widehat{\mathrm{Gain}}_t(d_1, d_2)$, ⋯: how consistent?]

Topic                              Pearson's r   Spearman's ρ   Kendall's τ   Group
symptoms of diabetes               0.834         0.851          0.683         High Correlation (HC)
clinical depression                0.845         0.860          0.678         High Correlation (HC)
dress codes for wedding ceremony   0.824         0.862          0.713         High Correlation (HC)
dinosaurs                          0.710         0.702          0.529         Low Correlation (LC)

Conduct analyses for All, HC, and LC topics separately to investigate effect of gain estimation accuracy
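
The three correlation coefficients can be computed as in the sketch below. It uses only the standard library and assumes there are no tied values; the slides do not say which Kendall variant or tie handling was used, so tau-a is an assumption, and the gain sequences are made-up numbers rather than study data.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson's r between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """Ranks starting at 1 (assumes no ties)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = float(rank)
    return result


def spearman(x, y):
    """Spearman's rho: Pearson's r computed on ranks."""
    return pearson(ranks(x), ranks(y))


def kendall(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    score = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        product = (x[i] - x[j]) * (y[i] - y[j])
        score += (product > 0) - (product < 0)
    return score / len(pairs)


# Hypothetical gains along one browsing path (not study data).
ORACLE = [0.1, 0.3, 0.4, 0.7]
ESTIMATED = [0.2, 0.25, 0.5, 0.45]

if __name__ == "__main__":
    print(pearson(ORACLE, ESTIMATED),
          spearman(ORACLE, ESTIMATED),
          kendall(ORACLE, ESTIMATED))
```

In this toy path only the last two steps are ranked in the wrong order by the estimate, which gives ρ = 0.8 and τ = 4/6.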

Slide 24: Analysis 1: Query Stopping
RQ: How does ScentBar affect users' decisions on when to stop the current search?
Method: calculate how much oracle MI decreases when the current search ends
[Figure: timeline of a task session for topic $t$ (start task → issue query $q$, e.g. "smoking cancer risk" → stop searching with $q$ → finish task)]

        All topics            HC topics             LC topic
        baseline  proposed    baseline  proposed    baseline  proposed
ΔMI     0.109     0.138*      0.128     0.186*      0.069     0.071

ScentBar users collected more relevant information at query level

Slide 25: Analysis 2: Query Selection
RQ: How does ScentBar affect users' decisions on which query to use for the next search?
Method: calculate how much MI remains when the new query is issued
[Figure: timeline of a task session for topic $t$ (start task → issue query $q$, e.g. "smoking cancer risk" → stop searching with $q$ → finish task)]

        All topics            HC topics             LC topic
        baseline  proposed    baseline  proposed    baseline  proposed
MI      0.211     0.241*      0.238     0.298*      0.155     0.162

ScentBar users issued queries having more missed information

Slide 26: Analysis 3: Gain Acquisition Pattern
RQ: How does ScentBar affect the temporal change in gain that users acquire through their search process?
Method: calculate cumulative oracle gain that participants obtained by the minute
[Figure: three line charts (panels: All topics, HC topics, LC topic); x-axis: Elapsed Time (0 to 40), y-axis: Oracle Gain (0 to 1); series: w/o scent (baseline) and w/ scent (proposed)]
• Little difference between interfaces at early stage ($t$ < [illegible])
• ScentBar users obtained higher oracle gain at late stage

Slide 27: Analysis 4: Session Stopping
RQ: How does ScentBar affect users' decisions on when to stop their task sessions?
Method: calculate how much oracle MI decreases when the whole session ends
[Figure: timeline of a task session for topic $t$ (start task → issue query $q$, e.g. "smoking cancer risk" → stop searching with $q$ → finish task)]

        All topics            HC topics             LC topic
        baseline  proposed    baseline  proposed    baseline  proposed
ΔMI     0.162     0.195*      0.188     0.256*      0.106     0.110

ScentBar users collected more relevant information at session level, too

Slide 28: Analysis 5: Effort vs. Gain
RQ: How does ScentBar affect the relationship between the effort that users expend and the gain that they obtain?
Method: model task completion time and oracle gain via linear regression
[Figure: three plots (panels: All topics, HC topics, LC topic); x-axis: Task Completion Time (0 to 40), y-axis: Oracle Gain (0 to 1); series: w/o scent (baseline) and w/ scent (proposed)]
ScentBar users obtained higher oracle gain per unit time
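
A least-squares fit of oracle gain against task completion time could look like the sketch below. The slides do not give the exact regression specification, so the intercept term, the function name `fit_line`, and the data points are assumptions for illustration only.

```python
def fit_line(times, gains):
    """Ordinary least squares for gain = slope * time + intercept."""
    n = len(times)
    mean_t, mean_g = sum(times) / n, sum(gains) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("all completion times are identical")
    sxy = sum((t - mean_t) * (g - mean_g) for t, g in zip(times, gains))
    slope = sxy / sxx
    intercept = mean_g - slope * mean_t
    return slope, intercept


# Hypothetical (completion time, oracle gain) pairs per interface.
BASELINE = ([10, 20, 30], [0.20, 0.30, 0.40])
PROPOSED = ([10, 20, 30], [0.20, 0.40, 0.60])

if __name__ == "__main__":
    for name, (times, gains) in (("baseline", BASELINE), ("proposed", PROPOSED)):
        slope, intercept = fit_line(times, gains)
        print(f"{name}: gain per unit time = {slope:.3f}, intercept = {intercept:.3f}")
```

A steeper slope for one interface corresponds to more oracle gain per unit of time spent, which is the comparison this slide draws.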

Slide 29: Search Behavior
• ScentBar users issued more suggestion queries for HC topics
• No significant difference in the number of wasted clicks prior to query stopping

                    All topics            HC topics             LC topic
                    baseline  proposed    baseline  proposed    baseline  proposed
%SuggestQueries     0.376     0.480       0.355     0.505*      0.437     0.405
#ClicksLastRel-Q    0.156     0.164       0.165     0.175       0.133     0.142

%SuggestQueries: the fraction of suggestion queries among all issued queries
#ClicksLastRel-Q: the number of irrelevant doc clicks prior to query stopping

Slide 30: Findings
• Effects of MI visualization on search outcomes
  ▶ Highly accurate MI enabled users to
    1. issue more promising queries
    2. obtain higher gain at the late stage of sessions
    3. obtain higher gain per unit time
  ▶ Less accurate MI worsened search performance, probably because users were confused by unreliable MI behavior
• Effects of MI visualization on search behavior
  ▶ Users interacted with query suggestions more frequently
  ▶ Users clicked more irrelevant docs just before query stopping (though not statistically significant)

Slide 31: Discussion
Q. Why did MI visualization fail to reduce wasted clicks prior to query stopping?
A. Query-level MI visualization might be less informative for making decisions to stop SERP examination
[Figure: a searcher examining the SERP for "smoking cancer risk" (labels: "Few relevant results", "More queries"): "Though some MI remains in this SERP, …"]
• Which doc should I assess next?
• Is the examination worth the effort?

Improvements in search outcomes are
• mainly due to better decisions on query selection
• not due to better decisions on query stopping

Slide 32: How to Better Support Query Stopping?
Possible solutions
▶ Visualize per-aspect relevance for each search result [1], as well as query-level MI presentation
▶ Re-rank search results with much MI at high positions, so that optimal stopping points can be found by searchers who investigate SERPs in a top-down manner
▶ Make users aware of search effort [2] they have expended so far, to help searchers understand their cost-benefit performance
These can be integrated into ScentBar

[1] M. Iwata et al. AspecTiles: Tile-based Visualization of Diversified Web Search Results (2012)
[2] L. Azzopardi and G. Zuccon. An Analysis of Theories of Search and Search Behavior (2015)

Slide 33: Summary: Visualization of Missed Information
ScentBar: query suggestion interface for intrinsically diverse tasks
[Figure: ScentBar suggestion list with MI bars ("cigarettes price increase", "smoking ruins your looks", "smoking benefits", "diseases caused by smoking", "smoking cancer risk"); searcher: "Much info is still unexplored. I have to keep on searching."]

Missed Information
• Conceptualized as important info that the searcher misses collecting from the SERP
• Formalized as additional gain that can be obtained from unclicked search results

Findings
• ScentBar helped users make better decisions, especially on query selection, and made the search process more efficient when highly accurate MI was presented
• Search performance was worsened when less accurate MI was presented

Future Work
• Improve gain estimation algorithm (e.g. by modeling topic aspects hierarchically)
• Utilize missed information in different ways (e.g. MI-based query suggestion algorithm)