real-time web search: the road ahead

Real-Time Web Search The Road Ahead Sep. 2009 Jonghun Park [email protected] Seoul National Univ.

Upload: john-park

Post on 16-Apr-2017


Page 1: Real-Time Web Search: The Road Ahead

Real-Time Web Search: The Road Ahead

Sep. 2009

Jonghun Park [email protected] Seoul National Univ.

Page 2: Real-Time Web Search: The Road Ahead

going real-time

Page 3: Real-Time Web Search: The Road Ahead

what is real-time search?

small delay between data creation & indexing

microblog search

status search

search against current hot queries

search in a small time window
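
Two of the definitions above (a small delay between creation and indexing, and search in a small time window) can be made concrete with a minimal sketch. The `Post` record, its field names, and the substring matching are assumptions for illustration only; the slides do not describe a data model or a matching method.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Post:
    text: str
    created_at: datetime  # when the author published the item
    indexed_at: datetime  # when a (hypothetical) index ingested it


def indexing_delay(post: Post) -> timedelta:
    """Delay between data creation and indexing."""
    return post.indexed_at - post.created_at


def search_in_window(posts, query, now, window):
    """Posts containing `query` that were created within `window` of `now`, newest first."""
    cutoff = now - window
    hits = [
        p for p in posts
        if cutoff <= p.created_at <= now and query.lower() in p.text.lower()
    ]
    return sorted(hits, key=lambda p: p.created_at, reverse=True)


now = datetime(2009, 9, 1, 12, 0, tzinfo=timezone.utc)
posts = [
    Post("IRS deadline tomorrow", now - timedelta(minutes=2), now - timedelta(minutes=1)),
    Post("IRS forms explained", now - timedelta(days=30), now - timedelta(days=29)),
]
recent = search_in_window(posts, "irs", now, timedelta(minutes=10))
```

With a ten-minute window only the first post qualifies, even though both match the query text.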

Page 4: Real-Time Web Search: The Road Ahead

controversial examples

1 year old photo just uploaded

first report on Michael Jackson’s death

“At work...wish I was home..love my family”

“IRS” query on Apr. 14th

real-time monitoring

Page 5: Real-Time Web Search: The Road Ahead

real-time search, redefined

Retrieving Information with Time Value at the Right Time

sports game scores

stock prices

celebrity updates

new products

deals

home pages

wikipedia pages

recipes

old news articles

regulations

Page 6: Real-Time Web Search: The Road Ahead

opportunity

“what’s going on right now?”

$92B market

Page 7: Real-Time Web Search: The Road Ahead

underlying technologies

pull

- RSS
- Atom
- ping
- SUP
- API
- information extraction
- crawling

push

- XMPP
- PubSubHubbub
- tornado
- comet
- Six Apart Update Stream
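
The difference between the pull and push families can be sketched with an in-memory stand-in. `Feed` and `Poller` are hypothetical classes, not the API of any protocol listed on this slide; the sketch only shows who initiates delivery.

```python
from typing import Callable, List


class Feed:
    """Hypothetical in-memory stand-in for a publisher's feed."""

    def __init__(self) -> None:
        self.entries: List[str] = []
        self._subscribers: List[Callable[[str], None]] = []

    def subscribe(self, callback: Callable[[str], None]) -> None:
        # Push model: the consumer registers once and then waits.
        self._subscribers.append(callback)

    def publish(self, entry: str) -> None:
        self.entries.append(entry)
        for notify in self._subscribers:
            notify(entry)  # delivered at publish time


class Poller:
    """Pull model: the consumer asks repeatedly and keeps a cursor."""

    def __init__(self, feed: Feed) -> None:
        self.feed = feed
        self.cursor = 0

    def poll(self) -> List[str]:
        new = self.feed.entries[self.cursor:]
        self.cursor = len(self.feed.entries)
        return new


feed = Feed()
pushed: List[str] = []
feed.subscribe(pushed.append)
poller = Poller(feed)

feed.publish("score: 1-0")
first_poll = poller.poll()   # the puller sees the entry only when it asks
second_poll = poller.poll()  # nothing new, but the request was still made
```

In the pull case, freshness is bounded by the polling interval and empty polls are wasted work; in the push case, the entry arrives as soon as it is published.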

Page 8: Real-Time Web Search: The Road Ahead

real-time search landscape

Page 9: Real-Time Web Search: The Road Ahead

not so informative

Page 10: Real-Time Web Search: The Road Ahead

lots of dittos and spam

Page 11: Real-Time Web Search: The Road Ahead

danger of drowning

Page 12: Real-Time Web Search: The Road Ahead

coverage of microblogs

- twitter: 23M+ monthly UV (compete.com)
- only 5% of twitter users account for 75% of all activity
- one quarter of all tweets are generated by bots
- twitter users are biased in terms of demographics
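
A concentration figure such as "5% of users account for 75% of activity" can be computed from a list of post authors. The sketch below uses toy data constructed to reproduce that ratio; it is not the data behind the slide, and the function name `top_share` is an assumption.

```python
import math
from collections import Counter


def top_share(post_authors, top_fraction):
    """Fraction of all posts written by the most active `top_fraction` of authors."""
    counts = Counter(post_authors)
    if not counts:
        return 0.0
    k = max(1, math.ceil(len(counts) * top_fraction))
    top_total = sum(n for _, n in counts.most_common(k))
    return top_total / sum(counts.values())


# Toy data: 20 authors, one of whom writes 57 of the 76 posts.
authors = ["heavy"] * 57 + [f"user{i}" for i in range(19)]
share = top_share(authors, 0.05)
```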

Page 13: Real-Time Web Search: The Road Ahead

lack of quality

quality assessment through identifying most popular, most authoritative, most linked-to, or most re-tweeted items

- this kind of filtering requires further processing that decreases the freshness of information
- balancing the tension among recency, relevance, and quality is not an easy problem
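
One common way to express the tension among recency, relevance, and quality is a weighted blend with an exponential freshness decay. This is an illustrative sketch, not a method described in the slides; the weights, the one-hour half-life, and the assumption that relevance and quality are already scaled to [0, 1] are all made up for the example.

```python
def freshness(age_seconds, half_life_seconds):
    """Exponential decay: 1.0 for a brand-new item, 0.5 after one half-life."""
    return 0.5 ** (age_seconds / half_life_seconds)


def score(relevance, quality, age_seconds, half_life_seconds=3600.0,
          w_rel=0.5, w_qual=0.3, w_fresh=0.2):
    """Weighted blend; `relevance` and `quality` are assumed to lie in [0, 1]."""
    return (w_rel * relevance
            + w_qual * quality
            + w_fresh * freshness(age_seconds, half_life_seconds))


# Same relevance, different quality and age.
fresh_low_quality = score(relevance=0.9, quality=0.1, age_seconds=0)
older_high_quality = score(relevance=0.9, quality=0.9, age_seconds=7200)
```

With these weights the two-hour-old, high-quality item outranks the brand-new, low-quality one. Raising `w_fresh` or shortening the half-life reverses that, which is exactly the tuning problem the slide points at.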

Page 14: Real-Time Web Search: The Road Ahead

what’s desired

- measuring the quality of streaming sources instead of posts
- broader coverage: microblogs, blogs, public media, social media, and various casts
- beyond the simple buzz monitoring tool: topic-focused, informative search results
- faster information discovery
- balancing users' need to see results in real time with the necessity to discover information from spam-free, quality sources
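
Scoring sources rather than individual posts moves the expensive judgment offline, so a new post can be admitted with a cheap lookup. The sketch below is a hypothetical scheme, not the feedmil implementation: `SourceQuality`, the smoothing priors, and the 0.5 threshold are all assumptions.

```python
from collections import defaultdict


class SourceQuality:
    """Running per-source quality estimate (hypothetical scheme)."""

    def __init__(self, prior_good=1.0, prior_total=2.0):
        # The priors give a source with no history a neutral score of 0.5.
        self.good = defaultdict(lambda: prior_good)
        self.total = defaultdict(lambda: prior_total)

    def record(self, source, was_good):
        """Update a source after one of its posts has been judged offline."""
        self.total[source] += 1
        if was_good:
            self.good[source] += 1

    def score(self, source):
        return self.good[source] / self.total[source]

    def admit(self, source, threshold=0.5):
        """Cheap check at ingest time; no per-post analysis is needed."""
        return self.score(source) >= threshold


quality = SourceQuality()
for _ in range(8):
    quality.record("newsdesk", True)
    quality.record("spambot", False)
```

After these updates, posts from `newsdesk` pass `admit` immediately while posts from `spambot` are held back, so freshness is preserved for sources that have earned it.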

Page 15: Real-Time Web Search: The Road Ahead

feedmil.com

Page 16: Real-Time Web Search: The Road Ahead

feedmil approach

popularity

Page 17: Real-Time Web Search: The Road Ahead

key characteristics

well known

surprising

Page 18: Real-Time Web Search: The Road Ahead

feedmil.jp

Page 19: Real-Time Web Search: The Road Ahead

what’s next:

- from pull web to push web
- real-location
- real-event
- information filtering
- personalization
- intelligent stream discovery
- more breakthroughs in stream publishing & consumption

Page 20: Real-Time Web Search: The Road Ahead

Thank You

coming soon in oct. 2009