Explorations in Tag Suggestion and Query Expansion

Jian Wang and Brian D. Davison

Lehigh University, USA

SSM 2008 (Workshop on Search in Social Media)



Outline

Introduction

Auto-tagging System

Tag Suggestion for Query Expansion

User Study

Conclusion


Introduction

Factors that reduce the quality of search results
– Query ambiguity: different needs result in the same query
  e.g., java may imply java tutorial or java software
– Vocabulary mismatch
– Lack of knowledge regarding document contents

Motivation

Query expansion / suggestion
– Assist users in issuing better queries

Using tags for query expansion
– Tags reflect various users’ perspectives on the same article.
– Unlike traditional topical categories, such as those of ODP or Yahoo!, tags are much easier to understand.

Automatic tagging
– Assign tags to documents that have not been tagged


Auto-tagging System

SVM for multi-class classification
– A set of binary classifiers
– Features: webpage text

Data
– 500 pages for each of the 140 most popular tags in Delicious
– Training: for each tag, M positive and K negative documents
– Testing: 503 webpages
– All documents are larger than 800 bytes
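The "set of binary classifiers" idea can be sketched as one-vs-rest training over bag-of-words features. This is a minimal illustration, not the authors' implementation: the slides do not name an SVM library, kernel, or feature weighting, so the linear SVM with Pegasos-style updates, the helper names (`featurize`, `train_binary_svm`, `train_one_vs_rest`, `suggest_tags`), and the choice to use every other tag's pages as negatives (rather than a sample of K) are all assumptions.

```python
from collections import Counter


def featurize(text):
    """Bag-of-words term counts from page text (assumed feature scheme)."""
    return Counter(text.lower().split())


def train_binary_svm(docs, labels, epochs=20, lam=0.01):
    """Linear SVM (hinge loss, L2 penalty) trained with Pegasos-style steps.

    docs: list of Counter feature vectors; labels: +1 or -1 per document.
    Returns a sparse weight vector as a dict.
    """
    w = {}
    t = 0
    for _ in range(epochs):
        for x, y in zip(docs, labels):
            t += 1
            eta = 1.0 / (lam * t)          # decaying step size
            margin = y * sum(w.get(f, 0.0) * v for f, v in x.items())
            shrink = 1.0 - 1.0 / t         # effect of the L2 penalty
            for f in w:
                w[f] *= shrink
            if margin < 1:                 # hinge loss is active
                for f, v in x.items():
                    w[f] = w.get(f, 0.0) + eta * y * v
    return w


def train_one_vs_rest(pages_by_tag):
    """One binary classifier per tag: its pages are positive, the rest negative."""
    all_docs = [(featurize(text), tag)
                for tag, texts in pages_by_tag.items() for text in texts]
    docs = [x for x, _ in all_docs]
    models = {}
    for tag in pages_by_tag:
        labels = [1 if doc_tag == tag else -1 for _, doc_tag in all_docs]
        models[tag] = train_binary_svm(docs, labels)
    return models


def suggest_tags(models, text):
    """Return every tag whose classifier scores the page above zero."""
    x = featurize(text)
    scores = {tag: sum(w.get(f, 0.0) * v for f, v in x.items())
              for tag, w in models.items()}
    return sorted(tag for tag, score in scores.items() if score > 0)
```

Because each tag has its own classifier, a page can receive several tags or none, which suits tagging better than forcing a single class per page.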

Classification Results

Precision > Recall

“String distance” reduces the effect of minor differences between tags, e.g., blogs and blogging

Only 328 webpages are evaluated with “String distance with min3”
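To illustrate why a string-distance match is more forgiving than an exact match, the sketch below scores predicted tags against a page's actual tags in both ways. The similarity measure (`difflib.SequenceMatcher.ratio`) and the 0.6 threshold are assumptions chosen for illustration; the slides do not specify the distance function used, and the meaning of "min3" is not modeled here.

```python
from difflib import SequenceMatcher


def fuzzy_match(a, b, threshold=0.6):
    """Treat two tags as equal if their similarity ratio reaches the threshold."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def exact_match(a, b):
    """Treat two tags as equal only if the strings are identical."""
    return a == b


def precision_recall(predicted, actual, match):
    """Precision and recall of predicted tags under a given matching rule."""
    if not predicted or not actual:
        return 0.0, 0.0
    matched_predicted = sum(any(match(p, a) for a in actual) for p in predicted)
    matched_actual = sum(any(match(p, a) for p in predicted) for a in actual)
    return matched_predicted / len(predicted), matched_actual / len(actual)


predicted = ["blogging", "java", "music"]
actual = ["blogs", "java"]
print(precision_recall(predicted, actual, exact_match))
print(precision_recall(predicted, actual, fuzzy_match))
```

Under the exact rule only java counts as a hit; under the fuzzy rule blogging is also credited against blogs, so both precision and recall rise.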

Comparison with Related Works

AutoTag (WWW 2006)
– A collaborative filtering approach for tagging weblog posts
– Evaluated with “String distance with min3”
– Boosted by weblog author’s tags
– 0.40 in precision@10, 0.49 in recall@10

TagAssist (ICWSM 2007)
– The approach is similar to AutoTag
– “Exact word match”
– Manual evaluation: 42.10% accuracy
– Automatic evaluation: 13.11% precision, 22.83% recall


Tag Suggestion for Query Expansion: The Steps

1. Submit the initial query to a search engine
2. Auto-tag the top documents of the initial result
3. Use popular tags to expand the initial query
4. a. The user selects query suggestions manually (Tag Suggestion System)
   b. Automatically combine the retrieval results of the various suggestions (Tag Auto-combine System)
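Steps 2 to 4 can be sketched as follows, assuming the top documents have already been auto-tagged. The function names are hypothetical, tag popularity is taken here to mean the number of top documents carrying the tag, and the round-robin merge is just one simple way to combine result lists; the slides do not say how the Tag Auto-combine System merges them.

```python
from collections import Counter


def expand_query(query, tagged_docs, top_n=8):
    """Build query suggestions by appending the most popular tags.

    tagged_docs: one list of tags per top-ranked document.
    Tags already present in the query are skipped.
    """
    query_terms = set(query.lower().split())
    counts = Counter(tag
                     for tags in tagged_docs
                     for tag in set(tags)
                     if tag not in query_terms)
    # Sort by document frequency, breaking ties alphabetically.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [f"{query} {tag}" for tag, _ in ranked[:top_n]]


def round_robin_merge(result_lists, limit=10):
    """Interleave several ranked result lists, dropping duplicates (assumed strategy)."""
    merged, seen = [], set()
    longest = max((len(results) for results in result_lists), default=0)
    for rank in range(longest):
        for results in result_lists:
            if rank < len(results) and results[rank] not in seen:
                seen.add(results[rank])
                merged.append(results[rank])
    return merged[:limit]


tagged = [
    ["java", "tutorial", "programming"],
    ["java", "software", "programming"],
    ["tutorial", "programming"],
]
print(expand_query("java", tagged, top_n=2))
```

In the manual variant (4a) the suggestions from `expand_query` would be shown to the user; in the automatic variant (4b) each suggestion would be submitted and the returned lists passed to a merge step such as `round_robin_merge`.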


First Step of User Study

Use Google as the search engine

20 unique queries from 4 graduate students

The top 50 returned pages are tagged automatically

The initial query is expanded with the top 8 popular tags to generate 8 suggestions

These are compared with suggestions from Google and Yahoo!

Users select their preferred suggestions in a random, anonymous manner

Results of User Study

Suggestion Examples

Second Step of User Study

Submit the modified query to Google

Users score the top-10 lists of all systems on a scale of 0-5

Metrics for comparison
– Metric 1 (average relative improvement)
– Metric 2 (improvement for average relevance score)
– Metric 3 (average relative improvement)
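The metric names suggest two different ways of aggregating per-query scores. The sketch below is only an illustration of that distinction using textbook definitions; the formulas are assumptions, not the ones used in the study, and the difference between Metric 1 and Metric 3 cannot be recovered from the names alone.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def average_relative_improvement(baseline, system):
    """Mean of per-query relative gains; baseline scores must be nonzero."""
    return mean([(s - b) / b for b, s in zip(baseline, system)])


def improvement_of_average_score(baseline, system):
    """Relative gain of the mean score across all queries."""
    return (mean(system) - mean(baseline)) / mean(baseline)


baseline_scores = [2, 4]   # hypothetical 0-5 ratings for the initial results
system_scores = [3, 5]     # hypothetical ratings after query expansion
print(average_relative_improvement(baseline_scores, system_scores))
print(improvement_of_average_score(baseline_scores, system_scores))
```

The first aggregation weights every query equally, so a gain on a poorly rated query counts for more; the second is dominated by the overall score level.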

Results of User Study

Significant improvement over Google's initial results


Conclusion

We build an auto-tagging system to assign tags to webpages.

Our approach focuses exclusively on textual content, and thus is applicable when no usage information is available.

Our system utilizes characteristics of tags to expand queries and provide options for users.

A user study shows better performance than the query suggestions from Google and Yahoo!.