can social bookmarking improve web search?cdn.paulheymann.com/stanford/wsdm_talk_20080212.pdf ·...
TRANSCRIPT
![Page 1: Can Social Bookmarking Improve Web Search?cdn.paulheymann.com/stanford/wsdm_talk_20080212.pdf · 12/02/2008 · Can Social Bookmarking Improve Web Search? Paul Heymann, Georgia Koutrika,](https://reader033.vdocuments.site/reader033/viewer/2022060521/60505e8dd4dd7d000272f3f3/html5/thumbnails/1.jpg)
Can Social BookmarkingImprove Web Search?
Paul Heymann, Georgia Koutrika, and Hector Garcia-MolinaDepartment of Computer Science
Stanford University
February 12th, 2008
![Page 2: Can Social Bookmarking Improve Web Search?cdn.paulheymann.com/stanford/wsdm_talk_20080212.pdf · 12/02/2008 · Can Social Bookmarking Improve Web Search? Paul Heymann, Georgia Koutrika,](https://reader033.vdocuments.site/reader033/viewer/2022060521/60505e8dd4dd7d000272f3f3/html5/thumbnails/2.jpg)
Outline
Introduction
Problem Statement
Data Gathering Methodology
Analysis
Conclusions
![Page 3: Can Social Bookmarking Improve Web Search?cdn.paulheymann.com/stanford/wsdm_talk_20080212.pdf · 12/02/2008 · Can Social Bookmarking Improve Web Search? Paul Heymann, Georgia Koutrika,](https://reader033.vdocuments.site/reader033/viewer/2022060521/60505e8dd4dd7d000272f3f3/html5/thumbnails/3.jpg)
YouTube
![Page 4: Can Social Bookmarking Improve Web Search?cdn.paulheymann.com/stanford/wsdm_talk_20080212.pdf · 12/02/2008 · Can Social Bookmarking Improve Web Search? Paul Heymann, Georgia Koutrika,](https://reader033.vdocuments.site/reader033/viewer/2022060521/60505e8dd4dd7d000272f3f3/html5/thumbnails/4.jpg)
Amazon
![Page 5: Can Social Bookmarking Improve Web Search?cdn.paulheymann.com/stanford/wsdm_talk_20080212.pdf · 12/02/2008 · Can Social Bookmarking Improve Web Search? Paul Heymann, Georgia Koutrika,](https://reader033.vdocuments.site/reader033/viewer/2022060521/60505e8dd4dd7d000272f3f3/html5/thumbnails/5.jpg)
del.icio.us
![Page 6: Can Social Bookmarking Improve Web Search?cdn.paulheymann.com/stanford/wsdm_talk_20080212.pdf · 12/02/2008 · Can Social Bookmarking Improve Web Search? Paul Heymann, Georgia Koutrika,](https://reader033.vdocuments.site/reader033/viewer/2022060521/60505e8dd4dd7d000272f3f3/html5/thumbnails/6.jpg)
Outline
Introduction
Problem Statement
Data Gathering Methodology
Analysis
Conclusions
![Page 7: Can Social Bookmarking Improve Web Search?cdn.paulheymann.com/stanford/wsdm_talk_20080212.pdf · 12/02/2008 · Can Social Bookmarking Improve Web Search? Paul Heymann, Georgia Koutrika,](https://reader033.vdocuments.site/reader033/viewer/2022060521/60505e8dd4dd7d000272f3f3/html5/thumbnails/7.jpg)
Problem Statement
Can social bookmarkingimprove web search?
![Page 8: Can Social Bookmarking Improve Web Search?cdn.paulheymann.com/stanford/wsdm_talk_20080212.pdf · 12/02/2008 · Can Social Bookmarking Improve Web Search? Paul Heymann, Georgia Koutrika,](https://reader033.vdocuments.site/reader033/viewer/2022060521/60505e8dd4dd7d000272f3f3/html5/thumbnails/8.jpg)
Subproblems
SubproblemsAre there “enough” URLs?Are there “enough” tags?Are the URLs valuable?Are the tags redundant?
![Page 9: Can Social Bookmarking Improve Web Search?cdn.paulheymann.com/stanford/wsdm_talk_20080212.pdf · 12/02/2008 · Can Social Bookmarking Improve Web Search? Paul Heymann, Georgia Koutrika,](https://reader033.vdocuments.site/reader033/viewer/2022060521/60505e8dd4dd7d000272f3f3/html5/thumbnails/9.jpg)
Tags versus Other Content
![Page 10: Can Social Bookmarking Improve Web Search?cdn.paulheymann.com/stanford/wsdm_talk_20080212.pdf · 12/02/2008 · Can Social Bookmarking Improve Web Search? Paul Heymann, Georgia Koutrika,](https://reader033.vdocuments.site/reader033/viewer/2022060521/60505e8dd4dd7d000272f3f3/html5/thumbnails/10.jpg)
Outline
Introduction
Problem Statement
Data Gathering Methodology
Analysis
Conclusions
![Page 11: Can Social Bookmarking Improve Web Search?cdn.paulheymann.com/stanford/wsdm_talk_20080212.pdf · 12/02/2008 · Can Social Bookmarking Improve Web Search? Paul Heymann, Georgia Koutrika,](https://reader033.vdocuments.site/reader033/viewer/2022060521/60505e8dd4dd7d000272f3f3/html5/thumbnails/11.jpg)
del.icio.us posts
Bookmarks/Posts Triplespaul: news, uk→ bbc.co.uk (paul, news, bbc.co.uk)
08:33:25 (paul, uk, bbc.co.uk)
mary: recipes, food→ food.com (mary, recipes, food.com)08:33:23 (mary, food, food.com)
dave: tv, cnn, news→ cnn.com (dave, tv, cnn.com)08:33:21 (dave, cnn, cnn.com)
(dave, news, cnn.com)
![Page 12: Can Social Bookmarking Improve Web Search?cdn.paulheymann.com/stanford/wsdm_talk_20080212.pdf · 12/02/2008 · Can Social Bookmarking Improve Web Search? Paul Heymann, Georgia Koutrika,](https://reader033.vdocuments.site/reader033/viewer/2022060521/60505e8dd4dd7d000272f3f3/html5/thumbnails/12.jpg)
Realtime Web Crawling
![Page 13: Can Social Bookmarking Improve Web Search?cdn.paulheymann.com/stanford/wsdm_talk_20080212.pdf · 12/02/2008 · Can Social Bookmarking Improve Web Search? Paul Heymann, Georgia Koutrika,](https://reader033.vdocuments.site/reader033/viewer/2022060521/60505e8dd4dd7d000272f3f3/html5/thumbnails/13.jpg)
Outline
Introduction
Problem Statement
Data Gathering Methodology
Analysis
Conclusions
![Page 14: Can Social Bookmarking Improve Web Search?cdn.paulheymann.com/stanford/wsdm_talk_20080212.pdf · 12/02/2008 · Can Social Bookmarking Improve Web Search? Paul Heymann, Georgia Koutrika,](https://reader033.vdocuments.site/reader033/viewer/2022060521/60505e8dd4dd7d000272f3f3/html5/thumbnails/14.jpg)
Size and Growth≈ 120 thousand (≈ 105) posts/day
(versus ≈ 106 blog posts/day)60–150 million posts
12–75 million (≈ 107–108) unique URLs(versus ≈ 109–1011 total URLs)
Date
Est
imat
ed N
umbe
r of
Pos
ts
August 1, 2005 December 9, 2005 August 16, 2006
030
000
6000
090
000
1200
00
Date
Est
imat
ed N
umbe
r of
Pos
ts
November 11, 2006 July 5, 2007
3000
060
000
9000
012
0000
1500
00
![Page 15: Can Social Bookmarking Improve Web Search?cdn.paulheymann.com/stanford/wsdm_talk_20080212.pdf · 12/02/2008 · Can Social Bookmarking Improve Web Search? Paul Heymann, Georgia Koutrika,](https://reader033.vdocuments.site/reader033/viewer/2022060521/60505e8dd4dd7d000272f3f3/html5/thumbnails/15.jpg)
URL Indexing and Age
Found Initially 57.5%Indexed Within 4 Weeks 12.75%Indexed Within 6 Months 12.75%
Never Indexed 17%
Of the 57.5% found initially, modification time at time of post:
![Page 16: Can Social Bookmarking Improve Web Search?cdn.paulheymann.com/stanford/wsdm_talk_20080212.pdf · 12/02/2008 · Can Social Bookmarking Improve Web Search? Paul Heymann, Georgia Koutrika,](https://reader033.vdocuments.site/reader033/viewer/2022060521/60505e8dd4dd7d000272f3f3/html5/thumbnails/16.jpg)
URL Indexing and Age
Found Initially 57.5%Indexed Within 4 Weeks 12.75%Indexed Within 6 Months 12.75%
Never Indexed 17%
Of the 57.5% found initially, modification time at time of post:
![Page 17: Can Social Bookmarking Improve Web Search?cdn.paulheymann.com/stanford/wsdm_talk_20080212.pdf · 12/02/2008 · Can Social Bookmarking Improve Web Search? Paul Heymann, Georgia Koutrika,](https://reader033.vdocuments.site/reader033/viewer/2022060521/60505e8dd4dd7d000272f3f3/html5/thumbnails/17.jpg)
Tagging Caveats (“The Tagging 6”)
1. Title (16%)Examples: “oil”, “prices”
2. Whole Domain (20%)Examples: “news”, “cnn”
3. Page Text (50%)Example: “singapore”
4. Extended Text (80%)Example: “inflation”
5. Irrelevant (7%)Example: “stanford”
6. Subjective (<5%)Example: “funny”
![Page 18: Can Social Bookmarking Improve Web Search?cdn.paulheymann.com/stanford/wsdm_talk_20080212.pdf · 12/02/2008 · Can Social Bookmarking Improve Web Search? Paul Heymann, Georgia Koutrika,](https://reader033.vdocuments.site/reader033/viewer/2022060521/60505e8dd4dd7d000272f3f3/html5/thumbnails/18.jpg)
Tags versus Other Content
![Page 19: Can Social Bookmarking Improve Web Search?cdn.paulheymann.com/stanford/wsdm_talk_20080212.pdf · 12/02/2008 · Can Social Bookmarking Improve Web Search? Paul Heymann, Georgia Koutrika,](https://reader033.vdocuments.site/reader033/viewer/2022060521/60505e8dd4dd7d000272f3f3/html5/thumbnails/19.jpg)
Outline
Introduction
Problem Statement
Data Gathering Methodology
Analysis
Conclusions
![Page 20: Can Social Bookmarking Improve Web Search?cdn.paulheymann.com/stanford/wsdm_talk_20080212.pdf · 12/02/2008 · Can Social Bookmarking Improve Web Search? Paul Heymann, Georgia Koutrika,](https://reader033.vdocuments.site/reader033/viewer/2022060521/60505e8dd4dd7d000272f3f3/html5/thumbnails/20.jpg)
Conclusions
1. Social bookmarking URLs are new andrecent, though many tags may be redundant(given title, text, domains).
2. Social bookmarking is a large phenomenon,but not nearly as large as the web.
3. Despite this, relevant URLs are wellrepresented, and popular tags overlap withpopular queries.
Questions?Check out the full paper athttp://dbpubs.stanford.edu/
or in the proceedings!
![Page 21: Can Social Bookmarking Improve Web Search?cdn.paulheymann.com/stanford/wsdm_talk_20080212.pdf · 12/02/2008 · Can Social Bookmarking Improve Web Search? Paul Heymann, Georgia Koutrika,](https://reader033.vdocuments.site/reader033/viewer/2022060521/60505e8dd4dd7d000272f3f3/html5/thumbnails/21.jpg)
Conclusions
1. Social bookmarking URLs are new andrecent, though many tags may be redundant(given title, text, domains).
2. Social bookmarking is a large phenomenon,but not nearly as large as the web.
3. Despite this, relevant URLs are wellrepresented, and popular tags overlap withpopular queries.
Questions?Check out the full paper athttp://dbpubs.stanford.edu/
or in the proceedings!
![Page 22: Can Social Bookmarking Improve Web Search?cdn.paulheymann.com/stanford/wsdm_talk_20080212.pdf · 12/02/2008 · Can Social Bookmarking Improve Web Search? Paul Heymann, Georgia Koutrika,](https://reader033.vdocuments.site/reader033/viewer/2022060521/60505e8dd4dd7d000272f3f3/html5/thumbnails/22.jpg)
Conclusions
1. Social bookmarking URLs are new andrecent, though many tags may be redundant(given title, text, domains).
2. Social bookmarking is a large phenomenon,but not nearly as large as the web.
3. Despite this, relevant URLs are wellrepresented, and popular tags overlap withpopular queries.
Questions?Check out the full paper athttp://dbpubs.stanford.edu/
or in the proceedings!
![Page 23: Can Social Bookmarking Improve Web Search?cdn.paulheymann.com/stanford/wsdm_talk_20080212.pdf · 12/02/2008 · Can Social Bookmarking Improve Web Search? Paul Heymann, Georgia Koutrika,](https://reader033.vdocuments.site/reader033/viewer/2022060521/60505e8dd4dd7d000272f3f3/html5/thumbnails/23.jpg)
Conclusions
1. Social bookmarking URLs are new andrecent, though many tags may be redundant(given title, text, domains).
2. Social bookmarking is a large phenomenon,but not nearly as large as the web.
3. Despite this, relevant URLs are wellrepresented, and popular tags overlap withpopular queries.
Questions?Check out the full paper athttp://dbpubs.stanford.edu/
or in the proceedings!