Empowering users to access information in the Digital Library Corin Anderson University of Washington

Post on 04-Jan-2016

Empowering users to access information in the Digital Library

Corin Anderson

University of Washington

Empowering users

• DLs provide information to users

• Tricky: Not all users will be programmers
  – Non-programmer Web surfer
  – 5th grade student
  – Your grandmother

• How to cater to non-programming masses?

The Digital Library

• Many specialized DLs exist today
  – Medicine, literature, etc.

• Eventually, all DLs will be integrated: The DL

• Until then, use the Web as approximation

DL research at the UW

• Improving web search using popularity

• Automatic question answering

• Extracting information by demonstration

• Adaptive web sites

Popularity-based web search

• Want authoritative pages
  – www.dodgeviper.com vs. www.homepages.com/~joe/my-car.html

• Approximate authority by freq. of web visits
  – Data gathered from NCSA web proxies

• Rank query results based on popularity
  – 1,000 hits: www.dodgeviper.com
  – 3 hits: www.homepages.com/~joe/my-car.html
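
A minimal sketch of the ranking idea on this slide, assuming visit counts have already been tallied from proxy logs. The function name `rank_by_popularity` and the choice to treat unseen URLs as having zero visits are illustrative assumptions; the slides do not describe the actual implementation.

```python
from collections import Counter

def rank_by_popularity(results, visit_counts):
    """Order query results by how often each URL was visited (most visited first)."""
    # URLs missing from the counts are treated as having zero visits (an assumption).
    return sorted(results, key=lambda url: visit_counts.get(url, 0), reverse=True)

# Hit counts taken from the slide's example.
visits = Counter({
    "www.dodgeviper.com": 1000,
    "www.homepages.com/~joe/my-car.html": 3,
})
matches = ["www.homepages.com/~joe/my-car.html", "www.dodgeviper.com"]
ranked = rank_by_popularity(matches, visits)
```

Here `ranked` lists the more frequently visited site first, regardless of the order in which the search engine returned the matches.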

Automatic question answering

• Return answers, not web pages, to queries
  – “What’s the tallest mountain in the world?”
  – www.mountainweb.com/mountaineering/worldmtns.htm vs. “Mount Everest”

• Search web for pages that contain answers
  – “the tallest mountain is”

• Heuristics for yes/no, “which is” questions
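
The phrase-search idea can be sketched roughly as follows. This is an illustration only: it handles a single "What is X?" question form, and the helper names `question_to_phrase` and `find_answer`, the sample page text, and the "run of capitalized words" answer heuristic are all assumptions rather than the system's actual rules.

```python
import re

def question_to_phrase(question):
    """Rewrite a "What is X?" question as the declarative prefix "X is"."""
    m = re.match(r"(?i)what(?:'s|’s| is)\s+(.+?)\?*$", question.strip())
    if m is None:
        return None  # other question forms are out of scope for this sketch
    return m.group(1) + " is"

def find_answer(phrase, page_text):
    """Return the run of capitalized words that follows the phrase, if any."""
    pattern = re.escape(phrase) + r"\s+((?:[A-Z][\w-]*\s?)+)"
    m = re.search(pattern, page_text)
    return m.group(1).strip() if m else None

phrase = question_to_phrase("What's the tallest mountain in the world?")
# Hypothetical page text standing in for a retrieved web page.
page = "Climbers agree that the tallest mountain in the world is Mount Everest, in the Himalayas."
answer = find_answer(phrase, page)
```

The point of the design is that the answer comes from text that already states it, so the system searches for the declarative form of the question instead of its keywords.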

Information extraction

• Info from DL usually used elsewhere
  – Query stock history to build a graph in Excel
  – Email a list of current movies to a friend

• Extracting info is tricky
  – Special file formats (XML, .csv) arcane
  – Custom built wrappers to select tuples
  – Building wrappers isn’t easy, either!

• Solution: demonstrate a wrapper

ICE-9 Wrapper generation

• User demonstrates extracting info

• ICE-9 learns generalized program

• Demonstrate on very few instances
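
To make "learning a generalized program from a few demonstrations" concrete, here is a deliberately simple delimiter-based wrapper learner. It is not ICE-9's algorithm, which the slides do not detail; the movie-listing page, the function names, and the left/right delimiter strategy are all assumptions chosen for illustration.

```python
import os
import re

def common_suffix(strings):
    """Longest string that every input ends with."""
    reversed_prefix = os.path.commonprefix([s[::-1] for s in strings])
    return reversed_prefix[::-1]

def learn_wrapper(page, examples):
    """Infer left/right delimiters from a few demonstrated values on the page."""
    befores, afters = [], []
    for value in examples:
        start = page.index(value)  # first occurrence of the demonstrated value
        befores.append(page[:start])
        afters.append(page[start + len(value):])
    # Generalize: keep only the context shared by every demonstration.
    return common_suffix(befores), os.path.commonprefix(afters)

def apply_wrapper(page, left, right):
    """Extract every span that sits between the learned delimiters."""
    return re.findall(re.escape(left) + r"(.*?)" + re.escape(right), page)

# Hypothetical page listing current movies.
page = ("<li><b>Gladiator</b> 7:00</li>"
        "<li><b>Chicken Run</b> 9:30</li>"
        "<li><b>X-Men</b> 4:15</li>")
left, right = learn_wrapper(page, ["Gladiator", "Chicken Run"])
titles = apply_wrapper(page, left, right)
```

Two demonstrated titles are enough here for the learned delimiters to pick up the third title as well, which is the behavior the slide describes: the user shows a couple of instances and the learned program covers the rest.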

ICE-9 in action

• User demonstrated two instances

• ICE-9 steps through program correctly

ICE-9 – current work

• Collaborative demonstration
  – ICE-9 predicts each step, asks for confirmation
  – If can’t predict with confidence, just ask user

• Active learning
  – ICE-9 suggests which example the user should demonstrate
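
The collaborative loop can be sketched as a single decision per step. The threshold value, the `confirm` and `demonstrate` callbacks, and the action names are hypothetical; the slides state only that the system predicts, asks for confirmation, and falls back to the user when it is not confident.

```python
def next_step(predictions, confirm, demonstrate, threshold=0.8):
    """Pick the next action: propose a confident prediction, otherwise ask the user.

    predictions: dict mapping candidate action -> estimated probability
    confirm:     callback(action) -> bool, the user accepts or rejects a proposal
    demonstrate: callback() -> action, the user performs the step directly
    """
    best_action = max(predictions, key=predictions.get)
    if predictions[best_action] >= threshold and confirm(best_action):
        return best_action
    # Low confidence, or the proposal was rejected: fall back to the user.
    return demonstrate()

# Confident prediction that the user accepts.
step_a = next_step({"copy title": 0.95, "copy price": 0.05},
                   confirm=lambda action: True,
                   demonstrate=lambda: "copy price")

# No confident prediction, so the user demonstrates the step.
step_b = next_step({"copy title": 0.5, "copy price": 0.5},
                   confirm=lambda action: True,
                   demonstrate=lambda: "copy price")
```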

Adaptive web sites

• Different users have different goals
  – But traditional web sites treat everyone the same
  – Everyone sees the same start page, query page, etc.

• Personalized sites can be customized
  – Customization is manual, tedious

• Want a site to learn users’ interest
  – Based on observed behavior, similarity to others
  – Adapt to individuals accordingly

Adaptations – structural

• Add link, remove link

• Add page (index page synthesis)

Adaptations – presentational

• Highlight link, content

From users to adaptations

• Users are clustered to find related visitors

• Models are fit to clusters to predict behavior

• Adaptation space is searched for best changes
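
The three stages (cluster, model, search) might look like the toy pipeline below. Everything concrete in it is an assumption made for illustration: Jaccard similarity over visited pages, greedy seed-based clustering, a per-page visit fraction as the "model", a one-link adaptation space, and the user names and page sets.

```python
from collections import Counter

def jaccard(a, b):
    """Overlap between two sets of visited pages, from 0.0 to 1.0."""
    return len(a & b) / len(a | b) if a | b else 0.0

def cluster_users(visits, min_similarity=0.3):
    """Greedy clustering: join the first cluster whose seed user is similar enough."""
    clusters = []  # each cluster is a list of users; the first entry is its seed
    for user, pages in visits.items():
        for cluster in clusters:
            if jaccard(pages, visits[cluster[0]]) >= min_similarity:
                cluster.append(user)
                break
        else:
            clusters.append([user])
    return clusters

def fit_model(cluster, visits):
    """Model a cluster as the fraction of its members that visited each page."""
    counts = Counter(page for user in cluster for page in visits[user])
    return {page: n / len(cluster) for page, n in counts.items()}

def best_adaptation(user, model, visits):
    """Search a tiny adaptation space: which single unvisited page to link to."""
    candidates = {p: s for p, s in model.items() if p not in visits[user]}
    return max(candidates, key=candidates.get) if candidates else None

visits = {
    "ann": {"/", "/edu/", "/edu/courses/"},
    "bob": {"/", "/edu/", "/edu/courses/", "/574"},
    "cat": {"/", "/info/current/"},
}
clusters = cluster_users(visits)
model = fit_model(clusters[0], visits)
suggestion = best_adaptation("ann", model, visits)
```

In this toy run, "ann" and "bob" end up in the same cluster, and the site would suggest linking "ann" to the one page her cluster visits that she has not seen.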

AWS – current work

Building, clustering user models

• Hierarchical user clustering
  – Users are leaf nodes, related groups interior
  – Influence of parent nodes decreases with distance

• Selecting adaptations from models
  – Choosing structural changes
  – Defining, selecting presentational changes
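
One way to read "influence of parent nodes decreases with distance" is as a weighted blend of the models on the path from a user's leaf to the root. The exponential decay, the `decay` parameter, and the example probabilities below are assumptions; the slides do not give the weighting scheme.

```python
def blended_score(page, path_models, decay=0.5):
    """Blend per-page scores from a user's leaf model up through its ancestors.

    path_models: list of dicts (page -> score), nearest first: the user's own
                 leaf, then its parent group, then the grandparent, and so on.
    decay:       weight multiplier applied per step away from the leaf.
    """
    weights = [decay ** distance for distance in range(len(path_models))]
    total = sum(w * model.get(page, 0.0) for w, model in zip(weights, path_models))
    return total / sum(weights)

# Hypothetical scores for one page at the leaf, parent, and root.
path = [{"/574": 1.0}, {"/574": 0.5}, {"/574": 0.0}]
score = blended_score("/574", path)
```

With `decay=0.5` the weights are 1, 0.5, and 0.25, so the user's own behavior dominates while the wider groups still contribute.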

Summary

• Successful DLs cater to their users

• UW research concentrating on connecting users with information

• Look for us at IUI, KDD, ICML, AAAI, IJCAI, and elsewhere

ICE-9 in action

• ICE-9 learns from subsequent instances
  – Probabilities now 100%

ICE-9 – Version space algebra

Selecting adaptations

• Cluster models analyzed to determine interests

“The user has an interest in the page”

“The user visits the page by starting at the page – add a link between the two.”

User’s computer    Date and time of visit   Requested page    Referring page
some-pc.cs         22/Feb/2000 11:49:13     /                 -
some-pc.cs         22/Feb/2000 11:49:23     /edu/             http://www.cs/
some-pc.cs         22/Feb/2000 11:49:34     /edu/courses/     http://www.cs/edu/
some-other-pc.cs   22/Feb/2000 11:49:55     /                 -
some-pc.cs         22/Feb/2000 11:50:08     /574              http://www.cs/edu/courses/
some-other-pc.cs   22/Feb/2000 11:50:20     /info/current/    http://www.cs/
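
A short sketch of how log entries like these could be turned into per-visitor browsing trails, the raw material for user models. The whitespace-separated five-field layout is an assumption based on the column headings, and the helper names `parse_line` and `trails` are illustrative.

```python
from collections import defaultdict
from datetime import datetime

LOG = """\
some-pc.cs 22/Feb/2000 11:49:13 / -
some-pc.cs 22/Feb/2000 11:49:23 /edu/ http://www.cs/
some-pc.cs 22/Feb/2000 11:49:34 /edu/courses/ http://www.cs/edu/
some-other-pc.cs 22/Feb/2000 11:49:55 / -
some-pc.cs 22/Feb/2000 11:50:08 /574 http://www.cs/edu/courses/
some-other-pc.cs 22/Feb/2000 11:50:20 /info/current/ http://www.cs/
"""

def parse_line(line):
    """Split one log line into (computer, timestamp, requested page, referrer)."""
    host, date, time, page, referrer = line.split()
    when = datetime.strptime(f"{date} {time}", "%d/%b/%Y %H:%M:%S")
    # A "-" referrer means the visitor arrived without following a link.
    return host, when, page, None if referrer == "-" else referrer

def trails(log_text):
    """Group requests by computer and order each group by time of visit."""
    by_host = defaultdict(list)
    for line in log_text.strip().splitlines():
        host, when, page, _ = parse_line(line)
        by_host[host].append((when, page))
    return {host: [page for _, page in sorted(hits)] for host, hits in by_host.items()}

user_trails = trails(LOG)
```

Each trail is the ordered list of pages one computer requested, so interleaved entries from different visitors are separated before any clustering or modeling takes place.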
