real-time web search: the road ahead
going real-time
what is real-time search?
small delay between data creation & indexing
microblog search
status search
search against current hot queries
search in a small time window
controversial examples
1 year old photo just uploaded
first report on Michael Jackson’s death
“At work...wish I was home..love my family”
“IRS” query on Apr. 14th
real-time monitoring
real-time search, redefined
Retrieving Information with Time Value at the Right Time
sports game scores
stock prices
celebrity updates
new products
deals
home pages
wikipedia pages
recipes
old news articles
regulations
opportunity
“what’s going on right now?”
$92B market
underlying technologies
pull
- RSS
- Atom
- ping
- SUP
- API
- information extraction
- crawling
push
- XMPP
- PubSubHubBub
- tornado
- comet
- Six Apart Update Stream
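To make the pull/push distinction concrete, here is a minimal in-memory sketch. It does not implement RSS, XMPP, or any of the protocols listed above; `Feed`, `PullReader`, and `Hub` are hypothetical names introduced only for illustration. With pull, the consumer sees new items only when it polls; with push, the publisher notifies subscribers the moment an item is published.

```python
from typing import Callable, Dict, List


class Feed:
    """Toy in-memory feed standing in for a polled document (assumption)."""

    def __init__(self) -> None:
        self.items: List[str] = []

    def publish(self, item: str) -> None:
        self.items.append(item)


class PullReader:
    """Pull: the consumer asks for anything newer than what it has seen."""

    def __init__(self, feed: Feed) -> None:
        self.feed = feed
        self.seen = 0  # number of items already consumed

    def poll(self) -> List[str]:
        new_items = self.feed.items[self.seen:]
        self.seen = len(self.feed.items)
        return new_items


class Hub:
    """Push: subscribers register a callback and are notified on publish."""

    def __init__(self) -> None:
        self.subscribers: Dict[str, List[Callable[[str], None]]] = {}

    def subscribe(self, topic: str, callback: Callable[[str], None]) -> None:
        self.subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic: str, item: str) -> None:
        for callback in self.subscribers.get(topic, []):
            callback(item)


# Pull: items wait in the feed until the reader polls.
feed = Feed()
reader = PullReader(feed)
feed.publish("score update 1")
feed.publish("score update 2")
pulled = reader.poll()

# Push: the subscriber receives the item at publish time.
hub = Hub()
pushed: List[str] = []
hub.subscribe("scores", pushed.append)
hub.publish("scores", "score update 3")
```

In the pull case, freshness is bounded by the polling interval; in the push case, it is bounded by delivery time.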
real-time search landscape
not so informative
lots of dittos and spam
danger of drowning
coverage of microblogs
- twitter: 23M+ monthly UV (compete.com)
- only 5% of twitter users account for 75% of all activity
- one quarter of all tweets are generated by bots
- twitter users are biased in terms of demographics
lack of quality
- quality assessment through identifying the most popular, most authoritative, most linked-to, or most re-tweeted items
- this kind of filtering requires further processing, which decreases the freshness of information
balancing the tension among recency, relevance, and quality is not an easy problem
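One simple way to picture this tension is a weighted score in which freshness decays over time. The sketch below is only an illustration: the exponential half-life decay, the one-hour half-life, and the weights are all assumptions, not values from the talk.

```python
def freshness(age_seconds: float, half_life_seconds: float = 3600.0) -> float:
    """Exponential decay: 1.0 for a brand-new item, 0.5 after one half-life."""
    return 0.5 ** (age_seconds / half_life_seconds)


def score(
    relevance: float,
    quality: float,
    age_seconds: float,
    w_recency: float = 0.4,    # assumed weight
    w_relevance: float = 0.4,  # assumed weight
    w_quality: float = 0.2,    # assumed weight
) -> float:
    """Combine recency, relevance, and quality (each in [0, 1]) linearly."""
    return (
        w_recency * freshness(age_seconds)
        + w_relevance * relevance
        + w_quality * quality
    )


# Same relevance and quality, but one item is an hour older.
fresh_item = score(relevance=0.5, quality=0.5, age_seconds=0)
older_item = score(relevance=0.5, quality=0.5, age_seconds=3600)
```

Raising `w_quality` favors vetted items that took longer to assess; raising `w_recency` favors whatever arrived last, which is exactly the trade-off described above.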
what’s desired
- measuring the quality of streaming sources instead of posts
- broader coverage: microblogs, blogs, public media, social media, and various casts
- beyond the simple buzz monitoring tool
- topic focused, informative search results
- faster information discovery
- balancing users' need to see results in real time with the necessity to discover information from spam-free, quality sources
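As a rough illustration of scoring sources rather than individual posts, the sketch below rates each source by the share of its posts that are not repeats of text already seen in the stream. This heuristic is an assumption made for the example, not the method used by feedmil, and `source_quality` is a hypothetical name.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def source_quality(posts: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Return, per source, the fraction of its posts that were not dittos.

    `posts` is a time-ordered iterable of (source, text) pairs. A post counts
    as a ditto if its normalized text already appeared earlier in the stream.
    """
    seen = set()
    total: Dict[str, int] = defaultdict(int)
    original: Dict[str, int] = defaultdict(int)
    for source, text in posts:
        key = " ".join(text.lower().split())  # lowercase, collapse whitespace
        total[source] += 1
        if key not in seen:
            seen.add(key)
            original[source] += 1
    return {source: original[source] / total[source] for source in total}


stream = [
    ("alice", "Breaking: goal!"),
    ("bot", "breaking:  GOAL!"),
    ("bot", "Breaking: goal!"),
    ("alice", "Final score 2-1"),
]
quality = source_quality(stream)
```

Because the score is attached to the source, it can be computed ahead of time and applied to new posts immediately, without delaying them for per-post analysis.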
feedmil.com
feedmil approach
popularity
key characteristics
well known
surprising
feedmil.jp
what’s next:
- from pull web to push web
- real-location
- real-event
- information filtering
- personalization
- intelligent stream discovery
- more breakthroughs in stream publishing & consumption
Thank You
coming soon in oct. 2009