struggling or exploring? disambiguating long search sessions ahmed hassan, ryen white, susan dumais...

Post on 14-Jan-2016


Struggling or Exploring? Disambiguating Long Search Sessions

Ahmed Hassan, Ryen White, Susan Dumais and Yi-Min Wang

Web Search Query Taxonomy (Broder, 2002)

Navigational, Informational, Transactional

Informational queries: the purpose of such queries is to find information assumed to be available on the web in a static form (Broder, 2002).

Users issue multiple queries to a search engine as they attempt to accomplish a task (Jones and Klinkner, 2009).

Moving from Queries to Sessions

At the session/task level, informational search can be:

Directed Search      Exploratory Search
Closed-ended         Open-ended
Single-faceted       Multi-faceted

In exploratory search, users generally combine querying and browsing strategies to foster learning and investigation (White and Roth, 2009)

Long Sessions: Exploring or Struggling?

• Exploring
– Users are engaged in an open-ended and multi-faceted information-seeking task to foster learning and discovery.

• Struggling
– Users are experiencing difficulty locating the required information. Note that struggling may not necessarily result in failure.

[Figure: example of a struggling session; image not preserved in the transcript]

[Figure: example of an exploring session; image not preserved in the transcript]

Characterizing Exploring vs. Struggling Behavior: Query Similarity

[Figure: average query similarity (y-axis, 0 to 1) at query positions Q1 to Q6, Exploring vs. Struggling]

Queries are more different from the first query in struggling sessions.
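The comparison to the first query can be sketched as follows. The slides do not say which similarity measure was used, so this sketch assumes Jaccard overlap between the word sets of two queries; the function names `jaccard` and `similarity_to_first` are hypothetical.

```python
def jaccard(a: str, b: str) -> float:
    """Jaccard overlap between the word sets of two queries.

    Assumption: the slides do not name the similarity measure;
    word-level Jaccard is used here only as a simple stand-in.
    """
    terms_a, terms_b = set(a.lower().split()), set(b.lower().split())
    if not terms_a and not terms_b:
        return 1.0  # two empty queries are treated as identical
    return len(terms_a & terms_b) / len(terms_a | terms_b)


def similarity_to_first(queries: list[str]) -> list[float]:
    """Similarity of every query in a session to the session's first query."""
    first = queries[0]
    return [jaccard(first, q) for q in queries]


# Toy session (made up for illustration).
session = ["cheap flights paris", "cheap flights paris france", "hotel rome"]
print(similarity_to_first(session))
```

Averaging these per-position values over many sessions of each type would give one curve per session type, in the spirit of the Q1 to Q6 plot.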

Characterizing Exploring vs. Struggling Behavior: Query Transition Strategies

[Figure: average number of terms (y-axis, 0 to 1.6) per transition type (Substitution, Addition, Removal), Exploring vs. Struggling]

Transition strategies: adding keywords, removing keywords, and substituting keywords (morphological variations, spelling corrections, and semantic variations).

When exploring, addition & removal are more popular and substitution is less popular.
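A minimal sketch of counting added, removed, and substituted terms between two consecutive queries. It assumes that a removed term and an added term count as a substitution when their character-level similarity (via `difflib.SequenceMatcher`) reaches a threshold; this only approximates spelling and morphological variants and does not detect semantic variations. The function name and the 0.7 threshold are hypothetical.

```python
from difflib import SequenceMatcher


def transition_counts(prev: str, curr: str, threshold: float = 0.7) -> dict:
    """Count term-level changes from the previous query to the current one.

    Assumption: a (removed, added) pair of terms is a substitution when
    their string similarity is >= threshold. Semantic variations are not
    covered by this heuristic.
    """
    prev_terms = prev.lower().split()
    curr_terms = curr.lower().split()
    removed = [t for t in prev_terms if t not in curr_terms]
    added = [t for t in curr_terms if t not in prev_terms]

    substituted = 0
    for old in list(removed):  # iterate over a copy; we mutate `removed`
        match = next(
            (new for new in added
             if SequenceMatcher(None, old, new).ratio() >= threshold),
            None,
        )
        if match is not None:
            removed.remove(old)
            added.remove(match)
            substituted += 1

    return {"substituted": substituted, "added": len(added), "removed": len(removed)}


# Toy example: a spelling correction plus one dropped keyword.
print(transition_counts("resturant reviews boston", "restaurant reviews"))
```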

Characterizing Exploring vs. Struggling Behavior: Clicks

[Figure: average number of clicks per query (y-axis, 0.5 to 2) at query positions Q1 to Q6, Exploring vs. Struggling]

Users click more per query when exploring. The difference gets larger as the session progresses.
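One way to compute a per-position click average like the one plotted is sketched below. It assumes each session is represented simply as a list of click counts, one per query in order; that representation and the function name are hypothetical.

```python
def avg_clicks_by_position(sessions: list[list[int]], max_positions: int = 6) -> list:
    """Average clicks per query at each query position (Q1, Q2, ...).

    Assumption: each session is a list of click counts, one entry per
    query. Positions that no session reaches yield None.
    """
    totals = [0.0] * max_positions
    counts = [0] * max_positions
    for clicks_per_query in sessions:
        for pos, n_clicks in enumerate(clicks_per_query[:max_positions]):
            totals[pos] += n_clicks
            counts[pos] += 1
    return [totals[i] / counts[i] if counts[i] else None for i in range(max_positions)]


# Toy data (made up): two short sessions.
print(avg_clicks_by_position([[1, 2, 3], [1, 0]], max_positions=3))
```

Running this separately over exploring and struggling sessions would produce the two series being compared.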

Characterizing Exploring vs. Struggling Behavior: Dwell Time

[Figure: average dwell time in seconds (y-axis, 0 to 200) at query positions Q1 to Q6, for Exploring, Struggling, and Struggling - No Last Query]

Dwell time is longer when exploring. The last query accounts for a large proportion of dwell time when struggling.
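The role of the last query can be illustrated with a small sketch that computes its share of a session's total dwell time and the mean dwell time with the last query excluded (the "No Last Query" view). The input format, a list of dwell times in seconds per query, and both function names are assumptions.

```python
def last_query_dwell_share(dwell_per_query: list[float]) -> float:
    """Fraction of the session's total dwell time spent on the last query."""
    total = sum(dwell_per_query)
    return dwell_per_query[-1] / total if total else 0.0


def mean_dwell_without_last(dwell_per_query: list[float]) -> float:
    """Mean dwell time per query, ignoring the last query of the session."""
    head = dwell_per_query[:-1]
    return sum(head) / len(head) if head else 0.0


# Toy struggling session (made-up numbers): most dwell time is on the final query.
dwell = [10.0, 20.0, 70.0]
print(last_query_dwell_share(dwell), mean_dwell_without_last(dwell))
```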

Characterizing Exploring vs. Struggling Behavior: Topics

[Figure: share of Exploring vs. Struggling sessions (x-axis, 0% to 100%) per topic. Topics shown: Download, Software, Cooking, Automotive, Directories, Regional, Education, Home Improvement, Sports, Real Estate, Shopping, Health, Restaurants, Travel, Events, Employment, Investing, Music, Lodging, People, Dictionaries, Entertainment, Travel Guides, Shopping Clothes]

The likelihood of exploring vs. struggling varies significantly depending on the topic.
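A per-topic breakdown of this kind could be computed roughly as below, assuming each labeled session is available as a `(topic, label)` pair with the label being either "exploring" or "struggling". The data layout and function name are hypothetical, and the sample values are made up.

```python
from collections import defaultdict


def exploring_share_by_topic(labeled_sessions: list[tuple[str, str]]) -> dict:
    """Fraction of sessions labeled 'exploring' within each topic.

    Assumption: every session carries exactly one topic and one label.
    """
    counts = defaultdict(lambda: [0, 0])  # topic -> [exploring, struggling]
    for topic, label in labeled_sessions:
        counts[topic][0 if label == "exploring" else 1] += 1
    return {topic: e / (e + s) for topic, (e, s) in counts.items()}


# Toy labels (not results from the study).
sample = [
    ("Travel", "exploring"),
    ("Travel", "exploring"),
    ("Travel", "struggling"),
    ("Software", "struggling"),
]
print(exploring_share_by_topic(sample))
```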

Predicting Session Type

• Click Features
– Number of clicks in session
– Number of clicks per query
– Number of queries with no clicks
– Total dwell time in session
– Dwell time per click
– Dwell time per query
– Time to first click
– Number of unique clicked URLs
– Number of unique clicked domains

• Search History Features
– Number of query impressions
– Query clickthrough rate
– Query success clickthrough rate
– Query quickback clickthrough rate
– Entropy of click distribution

• Topic Features
– Visited URLs topic
– Number of unique topics per session
– Topic distribution entropy

• Query Transition Features
– Similarity between queries
– Number of terms that exactly match the previous query
– Number of added terms
– Number of removed terms
– Number of substituted terms
– Number of query generalizations
– Number of query specifications

• Query Features
– Number of queries issued in session
– Query length in number of characters
– Query length in number of words
– Time between queries
– Number of manually typed queries
– Number of clicked queries
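A few of the listed click, topic, and query features can be derived from a session log along the following lines. This is only a sketch: the `Click` and `QueryEvent` record types, their fields, and the feature key names are assumptions, and the slides do not describe how the features were actually computed.

```python
from collections import Counter
from dataclasses import dataclass, field
from math import log2
from urllib.parse import urlparse


@dataclass
class Click:  # hypothetical log record
    url: str
    dwell_secs: float
    topic: str


@dataclass
class QueryEvent:  # hypothetical log record
    text: str
    clicks: list = field(default_factory=list)


def entropy(labels) -> float:
    """Shannon entropy (bits) of the empirical distribution of labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    if not total:
        return 0.0
    return sum((c / total) * log2(total / c) for c in counts.values())


def session_features(session: list) -> dict:
    """Compute a subset of the listed features for one session."""
    clicks = [c for q in session for c in q.clicks]
    n_queries = len(session)
    return {
        "num_queries": n_queries,
        "num_clicks": len(clicks),
        "clicks_per_query": len(clicks) / n_queries if n_queries else 0.0,
        "queries_with_no_clicks": sum(1 for q in session if not q.clicks),
        "total_dwell_secs": sum(c.dwell_secs for c in clicks),
        "unique_clicked_urls": len({c.url for c in clicks}),
        "unique_clicked_domains": len({urlparse(c.url).netloc for c in clicks}),
        "unique_topics": len({c.topic for c in clicks}),
        "topic_entropy": entropy(c.topic for c in clicks),
    }


# Toy session (made up): one query with two clicks, one abandoned query.
toy = [
    QueryEvent("paris hotels", [Click("https://example.com/x", 30.0, "Travel"),
                                Click("https://example.com/y", 45.0, "Travel")]),
    QueryEvent("paris hotels cheap"),
]
print(session_features(toy))
```

The resulting dictionary could be fed to any standard classifier; the slides do not state which learner was used.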

[Figure: Exploring vs. Struggling prediction accuracy in % (y-axis, 68 to 84) at five prediction points: First Query Text, After 1st Query, After 2nd Query, After 3rd Query, End of Session]

Prediction Accuracy improves as more behavioral information is available.

Feature Importance

10 points if feature ranked first, 9 if ranked second, etc. 0 points if ranked beyond the first 10 features
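The point scheme above maps directly to a small function. The per-group aggregation in `group_scores` is an assumption: the slides do not say how individual feature points are combined into a score per feature group, so a plain mean is used here purely for illustration. All names are hypothetical.

```python
def rank_points(rank: int, top_k: int = 10) -> int:
    """10 points for rank 1, 9 for rank 2, ..., 1 for rank 10, 0 beyond that."""
    return top_k + 1 - rank if 1 <= rank <= top_k else 0


def group_scores(ranked_features: list[str], group_of: dict) -> dict:
    """Mean points per feature group (assumed aggregation, not from the slides).

    ranked_features lists feature names from most to least important.
    """
    points = {}
    for rank, name in enumerate(ranked_features, start=1):
        points.setdefault(group_of[name], []).append(rank_points(rank))
    return {group: sum(p) / len(p) for group, p in points.items()}


# Toy ranking (made up for illustration).
ranking = ["num_clicks", "query_similarity", "num_queries"]
groups = {"num_clicks": "Click", "query_similarity": "Query Trans.", "num_queries": "Query"}
print(group_scores(ranking, groups))
```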

[Figure: feature ranking score (y-axis, 0 to 10) for each feature group (Query, Click, Query Trans., Search History, Topic) at five prediction points: 1st Query Text, After 1st Query, After 2nd Query, After 3rd Query, End of Session]

Feature contribution varies depending on the point where we make the prediction.

Implication on Success

• At a high level, user behavior in exploring and struggling is similar:
– Multiple consecutive related queries

• Multiple queries are a good sign when exploring (engagement)

• Multiple queries are a bad sign when struggling (effort)

Success Prediction

[Figure: success prediction accuracy in % (y-axis, 66 to 78); series labels not preserved in the transcript]

Integrating the session type into search success models significantly improves performance.

Thanks!

Ahmed Hassan

hassanam@microsoft.com
