Empowering users to access information in the Digital Library
Corin Anderson
University of Washington
2
Empowering users
• DLs provide information to users
• Tricky: Not all users will be programmers
  – Non-programmer Web surfer
  – 5th grade student
  – Your grandmother
• How to cater to non-programming masses?
3
The Digital Library
• Many specialized DLs exist today
  – Medicine, literature, etc.
• Eventually, all DLs will be integrated: The DL
• Until then, use the Web as approximation
4
DL research at the UW
• Improving web search using popularity
• Automatic question answering
• Extracting information by demonstration
• Adaptive web sites
5
Popularity-based web search
• Want authoritative pages
  www.dodgeviper.com vs. www.homepages.com/~joe/my-car.html
• Approximate authority by freq. of web visits
  – Data gathered from NCSA web proxies
• Rank query results based on popularity
  1,000 hits  www.dodgeviper.com
  3 hits      www.homepages.com/~joe/my-car.html
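The ranking idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `rank_by_popularity` is hypothetical, and the visit counts are the two figures quoted on the slide rather than real proxy data.

```python
from collections import Counter

def rank_by_popularity(results, visits):
    """Order query results by descending visit count.

    Pages missing from `visits` count as zero hits; ties keep their
    original order because sorted() is stable.
    """
    return sorted(results, key=lambda url: -visits[url])

# Hit counts as quoted on the slide (assumed to come from proxy logs).
visits = Counter({
    "www.dodgeviper.com": 1000,
    "www.homepages.com/~joe/my-car.html": 3,
})

results = ["www.homepages.com/~joe/my-car.html", "www.dodgeviper.com"]
ranked = rank_by_popularity(results, visits)
```

Here `ranked` lists www.dodgeviper.com first, since it has the higher visit count.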
6
Automatic question answering
• Return answers, not web pages, to queries
  “What’s the tallest mountain in the world?”
  www.mountainweb.com/mountaineering/worldmtns.htm vs. “Mount Everest”
• Search web for pages that contain answers
  – “the tallest mountain is”
• Heuristics for yes/no, “which is” questions
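A rough Python sketch of the phrase-rewriting idea follows. The regular expressions and both function names are assumptions made for illustration; they handle only the "What's the X ...?" shape from the slide and are not the system's actual heuristics.

```python
import re

def question_to_phrase(question):
    """Turn "What's the X Y ...?" into the search phrase "the X Y is"."""
    m = re.match(r"(?i)what(?:'s|’s| is) (the \w+ \w+)", question.strip())
    return f"{m.group(1).lower()} is" if m else None

def extract_answer(phrase, page_text):
    """Return the run of capitalized words that follows the phrase, if any."""
    pattern = re.escape(phrase) + r"\s+((?:[A-Z][\w'-]*\s?)+)"
    m = re.search(pattern, page_text)
    return m.group(1).strip() if m else None

phrase = question_to_phrase("What's the tallest mountain in the world?")
# Hypothetical page text standing in for a retrieved web page.
page = "Of all peaks, the tallest mountain is Mount Everest, in the Himalayas."
answer = extract_answer(phrase, page)
```

With this made-up page text, `phrase` is "the tallest mountain is" and `answer` is "Mount Everest".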
7
Information extraction
• Info from DL usually used elsewhere
  – Query stock history to build a graph in Excel
  – Email a list of current movies to a friend
• Extracting info is tricky
  – Special file formats (XML, .csv) arcane
  – Custom built wrappers to select tuples
  – Building wrappers isn’t easy, either!
• Solution: demonstrate a wrapper
8
ICE-9 Wrapper generation
• User demonstrates extracting info
• ICE-9 learns generalized program
• Demonstrate on very few instances
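To make "learning a generalized program from very few instances" concrete, here is a minimal sketch of a much simpler technique, delimiter-based wrapper induction. It is not ICE-9's algorithm. The page markup, movie titles, and function names are all hypothetical. It assumes each demonstrated value occurs once in the page and that the examples share a non-empty left context.

```python
import os
import re

def learn_delimiters(page, examples):
    """Learn the text that always precedes and follows the demonstrated values."""
    lefts, rights = [], []
    for value in examples:
        start = page.index(value)          # first occurrence of the example
        end = start + len(value)
        lefts.append(page[:start][::-1])   # reversed: common suffix becomes a prefix
        rights.append(page[end:])
    left = os.path.commonprefix(lefts)[::-1]
    right = os.path.commonprefix(rights)
    return left, right

def extract_all(page, left, right):
    """Apply the learned wrapper: every value found between the delimiters."""
    pattern = re.escape(left) + r"(.*?)" + re.escape(right)
    return re.findall(pattern, page)

# Hypothetical movie listing; the user demonstrates the first two titles.
page = "\n".join([
    '<tr><td class="title">Alien</td><td>7:00</td></tr>',
    '<tr><td class="title">Brazil</td><td>9:30</td></tr>',
    '<tr><td class="title">Casablanca</td><td>8:15</td></tr>',
])
left, right = learn_delimiters(page, ["Alien", "Brazil"])
titles = extract_all(page, left, right)
```

Two demonstrations are enough here for the learned delimiters to pick up the third title as well. A real wrapper learner would need to cope with repeated values, ambiguous contexts, and multi-field tuples.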
9
ICE-9 in action
• User demonstrated two instances
• ICE-9 steps through program correctly
10
ICE-9 – current work
• Collaborative demonstration
  ICE-9 predicts each step, asks for confirmation
  – If can’t predict with confidence, just ask user
• Active learning
  – ICE-9 suggests which example the user should demonstrate
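The interaction described above might be sketched as follows. The frequency-count predictor, the 0.8 threshold, and the class and method names are all assumptions chosen for illustration, not a description of how ICE-9 estimates confidence.

```python
from collections import Counter, defaultdict

class StepPredictor:
    """Toy predictor: confidence is the share of past observations that agree."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold          # assumed cut-off for "confident"
        self.counts = defaultdict(Counter)  # context -> Counter of actions

    def observe(self, context, action):
        self.counts[context][action] += 1

    def predict(self, context):
        seen = self.counts.get(context)
        if not seen:
            return None, 0.0
        action, n = seen.most_common(1)[0]
        return action, n / sum(seen.values())

    def next_step(self, context, ask_user):
        """Collaborative demonstration: predict if confident, else ask the user."""
        action, confidence = self.predict(context)
        if action is not None and confidence >= self.threshold:
            return action, "predicted"      # the user would then confirm it
        action = ask_user(context)
        self.observe(context, action)
        return action, "asked"

    def suggest_example(self, contexts):
        """Active learning: ask for a demonstration where confidence is lowest."""
        return min(contexts, key=lambda c: self.predict(c)[1])

predictor = StepPredictor(threshold=0.8)
for _ in range(4):
    predictor.observe("title cell", "copy")
predictor.observe("time cell", "copy")
predictor.observe("time cell", "skip")
```

In this toy state the predictor is fully confident about "title cell", so it would predict that step. For "time cell" its observations disagree, so it would ask the user, and `suggest_example` would nominate that context for the next demonstration.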
11
Adaptive web sites
• Different users have different goals
  – But traditional web sites treat everyone the same
  – Everyone sees the same start page, query page, etc.
• Personalized sites can be customized
  – Customization is manual, tedious
• Want a site to learn users’ interest
  – Based on observed behavior, similarity to others
  – Adapt to individuals accordingly
12
Adaptations – structural
• Add link, remove link
• Add page (index page synthesis)
13
Adaptations – presentational
• Highlight link, content
14
From users to adaptations
• Users are clustered to find related visitors
• Models are fit to clusters to predict behavior
• Adaptation space is searched for best changes
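The three steps above (cluster, fit a model, search for an adaptation) can be illustrated with a deliberately small sketch. Everything here is an assumption made for the example: the greedy Jaccard clustering, the visit-fraction model, the single "add one link to the start page" adaptation space, and the user names.

```python
from collections import Counter

def jaccard(a, b):
    """Similarity of two sets of visited pages."""
    return len(a & b) / len(a | b) if a | b else 0.0

def cluster_users(visits, threshold=0.5):
    """Greedy clustering: join the first cluster whose seed user is similar enough."""
    clusters = []  # each cluster is a list of users; the first one is its seed
    for user, pages in visits.items():
        for cluster in clusters:
            if jaccard(pages, visits[cluster[0]]) >= threshold:
                cluster.append(user)
                break
        else:
            clusters.append([user])
    return clusters

def fit_model(cluster, visits):
    """Model a cluster as the fraction of its users who visited each page."""
    counts = Counter(page for user in cluster for page in visits[user])
    return {page: n / len(cluster) for page, n in counts.items()}

def best_adaptation(model, linked_from_start):
    """Search a tiny adaptation space: which one link to add to the start page."""
    candidates = {p: s for p, s in model.items() if p not in linked_from_start}
    return max(candidates, key=candidates.get) if candidates else None

# Hypothetical visitors and the pages each one requested.
visits = {
    "ann": {"/edu/", "/edu/courses/", "/574"},
    "bob": {"/edu/", "/574"},
    "cat": {"/info/current/"},
}
clusters = cluster_users(visits)
model = fit_model(clusters[0], visits)
link_to_add = best_adaptation(model, linked_from_start={"/edu/", "/info/current/"})
```

For the first cluster (ann and bob), the page most often visited but not yet linked from the start page is /574, so that link is the suggested structural adaptation.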
15
AWS – current work
Building, clustering user models
• Hierarchical user clustering
  – Users are leaf nodes, related groups interior
  – Influence of parent nodes decreases with distance
• Selecting adaptations from models
  – Choosing structural changes
  – Defining, selecting presentational changes
16
Summary
• Successful DLs cater to their users
• UW research concentrating on connecting users with information
• Look for us at IUI, KDD, ICML, AAAI, IJCAI, and elsewhere
17
ICE-9 in action
• ICE-9 learns from subsequent instances
  – Probabilities now 100%
18
ICE-9 – Version space algebra
19
Selecting adaptations
• Cluster models analyzed to determine interests
“The user has an interest in the page”
“The user visits the page by starting at the page – add a link between the two.”
20
User’s computer    Date and time of visit   Requested page    Referring page
some-pc.cs         22/Feb/2000 11:49:13     /                 -
some-pc.cs         22/Feb/2000 11:49:23     /edu/             http://www.cs/
some-pc.cs         22/Feb/2000 11:49:34     /edu/courses/     http://www.cs/edu/
some-other-pc.cs   22/Feb/2000 11:49:55     /                 -
some-pc.cs         22/Feb/2000 11:50:08     /574              http://www.cs/edu/courses/
some-other-pc.cs   22/Feb/2000 11:50:20     /info/current/    http://www.cs/
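As a sketch of how such a log could become per-visitor browsing paths, the Python below parses the six entries shown on this slide. It assumes whitespace-separated fields in the order host, date, time, requested page, referring page, with "-" meaning no referrer. The function names are hypothetical. Parsing "Feb" with `%b` assumes an English or C locale.

```python
from collections import defaultdict
from datetime import datetime

LOG = """\
some-pc.cs 22/Feb/2000 11:49:13 / -
some-pc.cs 22/Feb/2000 11:49:23 /edu/ http://www.cs/
some-pc.cs 22/Feb/2000 11:49:34 /edu/courses/ http://www.cs/edu/
some-other-pc.cs 22/Feb/2000 11:49:55 / -
some-pc.cs 22/Feb/2000 11:50:08 /574 http://www.cs/edu/courses/
some-other-pc.cs 22/Feb/2000 11:50:20 /info/current/ http://www.cs/
"""

def parse_log(text):
    """Yield (host, timestamp, page, referrer) for each non-empty log line."""
    for line in text.splitlines():
        if not line.strip():
            continue
        host, date, time, page, referrer = line.split()
        when = datetime.strptime(f"{date} {time}", "%d/%b/%Y %H:%M:%S")
        yield host, when, page, None if referrer == "-" else referrer

def paths_by_host(records):
    """Group requested pages by visitor, in time order."""
    paths = defaultdict(list)
    for host, _when, page, _ref in sorted(records, key=lambda r: r[1]):
        paths[host].append(page)
    return dict(paths)

paths = paths_by_host(parse_log(LOG))
```

Paths like these, one per visitor, are the kind of observed behavior that user clustering could start from.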