Explorations in Tag Suggestion and Query Expansion
Jian Wang and Brian D. Davison
Lehigh University, USA
SSM 2008 (Workshop on Search in Social Media)
Outline
Introduction
Auto-tagging System
Tag Suggestion for Query Expansion
User Study
Conclusion
Introduction
Factors that reduce the quality of search results
– Query ambiguity: different needs result in the same query (ex: java may imply java tutorial or java software)
– Vocabulary mismatch
– Lack of knowledge regarding document contents
Motivation
Query expansion / suggestion
– Assist users in issuing better queries
Using tags for query expansion
– Tags reflect various users’ perspectives about a single article.
– Unlike traditional topical categorization, such as ODP or Yahoo!, tags are much easier to understand
Automatic tagging
– Assign tags to documents that have not been tagged
Auto-tagging System
SVM for multi-class classification
– A set of binary classifiers
– Features: webpage text
Data
– 500 pages for each of the 140 most popular tags in Delicious
– Training: for each tag, M positive and K negative documents
– Testing: 503 webpages
– All documents are larger than 800 bytes
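The per-tag training setup above can be sketched as follows. This is only an illustration of assembling M positive and K negative documents for one binary classifier. The `pages_by_tag` layout, the function name, and the random sampling of negatives from other tags are assumptions, not details given in the slides.

```python
import random

def build_binary_training_set(pages_by_tag, tag, m, k, seed=0):
    """Return (docs, labels) for one tag's binary classifier.

    pages_by_tag: dict mapping tag -> list of page texts (hypothetical layout).
    m: number of positive documents (pages carrying `tag`).
    k: number of negative documents (pages drawn from other tags).
    """
    rng = random.Random(seed)
    positives = list(pages_by_tag[tag])
    positive_set = set(positives)
    # Assumption: negatives are pages listed under other tags only.
    negatives = sorted({
        page
        for other, pages in pages_by_tag.items() if other != tag
        for page in pages if page not in positive_set
    })
    pos = rng.sample(positives, min(m, len(positives)))
    neg = rng.sample(negatives, min(k, len(negatives)))
    docs = pos + neg
    labels = [1] * len(pos) + [0] * len(neg)
    return docs, labels

# Toy data standing in for the Delicious pages.
pages_by_tag = {
    "java": ["java tutorial page", "java software page"],
    "blog": ["blogging tips page", "weblog hosting page"],
    "design": ["css layout page"],
}
docs, labels = build_binary_training_set(pages_by_tag, "java", m=2, k=2)
```

One such training set would be built per tag, and each would feed its own binary classifier.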
Classification Results
Precision > Recall
“String distance” reduces the effect of minor differences in tags (ex: blogs and blogging)
Only 328 webpages are evaluated by “String distance with min3”
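The slides do not define the string-distance measure or the "min3" condition, so the following is only a sketch of the general idea of approximate tag matching. It uses the similarity ratio from Python's `difflib` as a stand-in. The threshold of 0.6 and both function names are assumptions.

```python
from difflib import SequenceMatcher

def tags_match(a, b, threshold=0.6):
    """Treat two tags as equivalent if their similarity ratio reaches threshold.

    Assumption: difflib's ratio stands in for the unspecified string distance.
    """
    ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
    return ratio >= threshold

def count_matches(predicted, actual, threshold=0.6):
    """Count predicted tags that approximately match at least one actual tag."""
    return sum(
        any(tags_match(p, a, threshold) for a in actual) for p in predicted
    )

near_duplicates = tags_match("blogs", "blogging")
unrelated = tags_match("java", "design")
```

With this kind of matching, a predicted tag such as blogs can be credited against a user tag such as blogging, which exact word matching would count as a miss.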
Comparison with Related Works
AutoTag (WWW 2006)
– A collaborative filtering approach for tagging weblog posts
– Evaluated with “String distance with min3”
– Boosted by weblog author’s tags
– 0.40 in Precision@10, 0.49 in Recall@10
TagAssist (ICWSM 2007)
– The approach is similar to AutoTag
– “Exact word match”
– Manual evaluation: 42.10% accuracy
– Automatic evaluation: 13.11% precision, 22.83% recall
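For readers unfamiliar with the Precision@10 and Recall@10 figures quoted above, here is a minimal sketch of how such metrics are commonly computed under exact word match. The function name and toy tags are illustrative, and dividing precision by the number of tags actually returned (rather than by k) is an assumption.

```python
def precision_recall_at_k(ranked_tags, true_tags, k=10):
    """Precision and recall of the top-k suggested tags, exact word match."""
    top_k = ranked_tags[:k]
    truth = set(true_tags)
    hits = sum(1 for tag in top_k if tag in truth)
    # Assumption: precision is divided by the number of tags returned.
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(truth) if truth else 0.0
    return precision, recall

precision, recall = precision_recall_at_k(
    ["java", "programming", "blog", "news"],
    ["java", "programming", "tutorial"],
    k=4,
)
```

In this toy case, two of the four suggested tags are correct, and two of the three true tags are recovered.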
The Steps
1. Submit the initial query to the search engine
2. Auto-tag the top documents of the initial result
3. Use popular tags to expand the initial query
4. a. User selects query suggestions manually (Tag Suggestion System)
   b. Automatically combine retrieval results of various suggestions (Tag Auto-combine Sys.)
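Steps 2 and 3 above can be sketched as follows, assuming the auto-tagger has already produced a tag list for each top-ranked document. The function name, counting each tag once per document, skipping tags already present in the query, and breaking ties alphabetically are all assumptions made for illustration.

```python
from collections import Counter

def suggest_expansions(query, tags_per_doc, top_n=8):
    """Expand `query` with the tags that occur in the most top documents."""
    query_terms = set(query.lower().split())
    counts = Counter()
    for tags in tags_per_doc:
        # Count each tag at most once per document.
        counts.update({tag.lower() for tag in tags})
    # Sort by descending document count, then alphabetically for stable ties.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    popular = [tag for tag, _ in ranked if tag not in query_terms]
    return [f"{query} {tag}" for tag in popular[:top_n]]

# Hypothetical auto-tagger output for three top documents.
tags_per_doc = [
    ["java", "tutorial", "programming"],
    ["tutorial", "software"],
    ["tutorial", "programming", "java"],
]
suggestions = suggest_expansions("java", tags_per_doc, top_n=2)
```

The resulting suggestions could then be shown to the user (step 4a) or issued automatically and their results combined (step 4b).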
First Step of User Study
Use Google as the search engine
20 unique queries from 4 graduate students
Top 50 returned pages are tagged automatically
Expand the initial query with the top 8 popular tags to generate 8 suggestions
Compared with suggestions from Google and Yahoo!
Users select preferable suggestions in a random anonymous manner
Second Step of User Study
Submit the modified query to Google
Users score the top-10 lists from 0 to 5, for all systems
Metrics for comparison
– Metric 1 (average relative improvement)
– Metric 2 (improvement for average relevance score)
– Metric 3 (average relative improvement)
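The formulas for these metrics did not survive in the transcript, so the sketch below is not the authors' definition. It only illustrates one plausible reading of the difference between averaging per-query relative improvements and taking the relative improvement of the average score. Function names, the toy scores, and the handling of zero baselines are assumptions.

```python
def average_relative_improvement(baseline, expanded):
    """Mean over queries of (expanded - baseline) / baseline.

    Assumption: queries with a zero baseline score are skipped.
    """
    ratios = [(e - b) / b for b, e in zip(baseline, expanded) if b > 0]
    return sum(ratios) / len(ratios) if ratios else 0.0

def improvement_of_average(baseline, expanded):
    """Relative change of the mean score across all queries."""
    mean_b = sum(baseline) / len(baseline)
    mean_e = sum(expanded) / len(expanded)
    return (mean_e - mean_b) / mean_b

# Hypothetical 0-5 relevance scores for two queries.
baseline_scores = [2, 4]
expanded_scores = [3, 5]
per_query = average_relative_improvement(baseline_scores, expanded_scores)
overall = improvement_of_average(baseline_scores, expanded_scores)
```

The two aggregations generally differ: the per-query average weights low-baseline queries more heavily than the improvement of the overall mean does.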
Results of User Study
Significant improvement over Google initial result
Conclusion
We build an auto-tagging system to assign tags to webpages.
Our approach focuses exclusively on the textual content, and thus is applicable when no usage information is available.
Our system utilizes characteristics of tags to expand queries and provide options for users.
A user study is performed, showing better performance than Google suggestion and Yahoo! suggestion.