Recommenders Everywhere: The WikiLens Community-Maintained
Recommender System
Dan Frankowski, Shyong K. (Tony) Lam, Shilad Sen, F. Maxwell Harper, Scott Yilek, Michael Cassano, John Riedl
University of Minnesota
What Is a Recommender?
A personalized recommender recommends items based on your personal preferences
Amazon: “If you like A, you might like B” (because 80% of people who bought A also bought B)
Combining your As => personalized list of Bs
Uses collaborative filtering algorithms, e.g.,
combining ratings of users like you
combining ratings of items similar to those you rate
Requires many users and many ratings
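The "80% of people who bought A also bought B" idea can be sketched as a co-purchase fraction. This is a minimal illustration only: the function name `co_purchase_confidence` and the purchase histories are hypothetical, and it is not a description of how Amazon or MovieLens actually compute recommendations.

```python
from collections import defaultdict
from itertools import permutations

def co_purchase_confidence(baskets):
    """Return {(a, b): fraction of buyers of a who also bought b}."""
    buyers = defaultdict(int)   # item -> number of customers who bought it
    both = defaultdict(int)     # (a, b) -> customers who bought both
    for basket in baskets:
        items = set(basket)
        for a in items:
            buyers[a] += 1
        for a, b in permutations(items, 2):
            both[(a, b)] += 1
    return {pair: n / buyers[pair[0]] for pair, n in both.items()}

# Hypothetical purchase histories, one set of items per customer.
baskets = [{"A", "B"}, {"A", "B"}, {"A", "B"}, {"A", "B"}, {"A", "C"}]
conf = co_purchase_confidence(baskets)
print(conf[("A", "B")])  # 0.8: four of the five buyers of A also bought B
```

With only five customers the fraction is easy to compute but not very trustworthy, which is one way to see why this style of recommender needs many users and many ratings.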
A Recommender System
movielens.org
Started by GroupLens in 1995
120K users (several thousand active in a given month)
9K movies
13M ratings
No beer.
Tools for community-maintained sites
Suppose our beer lover wants to start a community site
Wikis (many – MediaWikis, editme.com)
Forums (millions – phpBB)
Blogs (many millions – Technorati tracks 108M)
How to start a recommender for beer? Fueled by community contribution?
We propose community-maintained recommenders, where users contribute all the content and information needed to recommend content
Small-world recommenders
Traditional recommender algorithms need large scale: many users, many ratings
Most online communities are small
We propose small-world recommenders
Provide value with little data per item
Depend on users to understand other users
Allow users to see specific individuals’ preferences
Aggregate user preferences into recommendations
Why a new system?
We looked for an existing system
We found
Libraries (Taste, MultiLens, Suggest, …)
Web services (easyutil.com)
Research (no community-maintained recommenders)
Where are the off-the-shelf systems?
Hosted: Wikipedia, editme.com
Downloadable: MediaWiki
WikiLens
Not just beer
Asked about WikiLens:
anime-planet.com
frenchtowner.com
course/teacher recs
academic projects
movielens users (for books)
…
Principle: FIND
Beeradvocate.com has 32,000 beers
Anime planet has 1000s of works of anime
FIND: Members should be able to find items that interest them
Information filtering is complex (Malone 1987)
cognitive (factual details)
economic (estimating cost/benefit)
social (friends, the crowd)
Principle: ADD
There’s a lot of interest in little-known items
“the market for books that are not even sold in the average bookstore is larger than the market for those that are.” (Anderson 2004)
People work harder for immediate satisfaction
MovieLens members who saw their added movies immediately did more work than those who only saw their movies added after review. (Cosley 2005)
ADD: Members should be able to add items immediately
Principle: DEEP CHANGE
Our beer-lover wants a beer-centric system
Information common to each beer
Fields: style, brewer, alcohol content
Why not use a Content Management System?
They support fields, but don’t support ADD
Power to the people: the community can do amazing things (Wikipedia)
DEEP CHANGE: Members should be able to uniquely identify items, and define and redefine their attributes and organization
Principle: MICRO-CONTRIBUTE
MovieLens users: rating is fun
54% said it was a top 3 reason to rate
(Bryant and Forte): Small starter tasks may be a path for a casual contributor to become a more involved one
MICRO-CONTRIBUTE: Members should be able to make small contributions
Principle: SEE OTHERS
“I’ll get by with a little help from my friends”
Every collaborative system should allow you to see other people
(Erickson 2000) social translucence (systems supporting visibility, awareness, and accountability) is a “fundamental requirement for supporting all types of communication and collaboration.”
SEE OTHERS: Members should be able to see each other and their contributions
Rebuilding beeradvocate?
Sure! Sort of, but ..
Other communities have the same needs
General (not just beer)
Anyone can start a new community
More power to the community: ADD, DEEP CHANGE
With a personalized recommender
Predicted value of an item
Weighted average of buddy ratings and overall average rating
Not like traditional collaborative filtering
We believed in buddies
We thought traditional algorithms would be too noisy with little data
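One possible reading of "weighted average of buddy ratings and overall average rating" is sketched below. The slides do not give the actual weights, so the `prior_weight` parameter and the function name `predict` are assumptions made for illustration: the overall average is treated as a fixed number of "virtual" ratings blended with the buddies' ratings.

```python
def predict(buddy_ratings, overall_average, prior_weight=1.0):
    """Blend buddy ratings with the item's overall average rating.

    prior_weight is a hypothetical knob: how many "virtual" ratings the
    overall average counts for. The slides do not state the real weights.
    """
    total = sum(buddy_ratings) + prior_weight * overall_average
    return total / (len(buddy_ratings) + prior_weight)

# Two buddies rated the item 5 and 4; the overall average is 3.
print(predict([5.0, 4.0], overall_average=3.0))  # 4.0
# With no buddies, the prediction falls back to the overall average.
print(predict([], overall_average=3.0))          # 3.0
```

Under this assumption, a user with no buddies gets a purely average-based prediction, and each additional buddy rating pulls the prediction further toward the buddies' opinion.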
System Design (DEEP CHANGE)
A page is in a category (ex: “Beer”)
A category can have fields (ex: style)
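The page/category/field relationship can be pictured with a small data model. This is a sketch under stated assumptions, not WikiLens code: the class names `Category` and `Page`, their methods, and the example beer are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Category:
    """A category (ex: "Beer") whose fields members can redefine."""
    name: str
    fields: List[str] = field(default_factory=list)

    def add_field(self, field_name: str) -> None:
        # Any member may extend the category's schema (DEEP CHANGE).
        if field_name not in self.fields:
            self.fields.append(field_name)

@dataclass
class Page:
    """An item page that belongs to exactly one category."""
    title: str
    category: Category
    values: Dict[str, str] = field(default_factory=dict)

    def set_value(self, field_name: str, value: str) -> None:
        # Only fields defined on the page's category may be filled in.
        if field_name not in self.category.fields:
            raise KeyError(f"{field_name!r} is not a field of {self.category.name}")
        self.values[field_name] = value

beer = Category("Beer", ["style", "brewer"])
beer.add_field("alcohol content")
page = Page("Hypothetical Stout", beer)
page.set_value("style", "Stout")
print(page.values)  # {'style': 'Stout'}
```

Keeping the field list on the category rather than on each page means a single edit to the category changes what every page in it can record.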
System Design (FIND)
Selecting: browsing, searching, filtering, ordering
Evaluating: item details, predictions, averages, buddy ratings, comments, page text
System Design (SEE OTHERS)
Buddies
On item pages
On category page (predictions, “likes”)
User pages (profiles and ratings)
Comments
Rating averages
Recent changes
System Design – wiki or not?
Wiki
Any user may edit items or categories
Data (including fields) is versioned
Recent changes
Not
Structured data fields with special editor
Ratings
Category with pages sorted by prediction
Experiences – wikilens.org stats
wikilens.org, April 2004 – Oct 2006
231 users
4,430 items
17,271 ratings
Experiences (ADD)
Lesson: Users will add items
43% of users added items (99 of 231)
Lesson: Broadening community of contributors is useful
Each category’s top contributor only contributed a few of the top-rated items
Ex: “MovieMaven” added 69% of movies (1357 of 1967), but only 3 of top-rated 25
“MovieMaven” has #20, 21, 25
1. Matrix, The (1999)
2. Amelie
3. Star Wars: Episode V - The Empire Strikes Back
4. Star Wars: Episode IV - A New Hope
5. Star Wars: Episode VI - Return of the Jedi
6. Being John Malkovich (1999)
7. Shawshank Redemption, The (1994)
8. Fight Club
9. Casablanca
10. Bladerunner
…
20. Eternal Sunshine of the Spotless Mind (2004)
21. American Beauty
25. Truman Show, The (1998)
“MovieMaven”
Adding 1357 movies => 12 hours!
“I did it the old fashioned way, line by line, allowing myself to become a bit too obsessed by the whole thing!”
97% of the movies he entered he had already rated in MovieLens!
“I really love the opportunity to add whatever you'd like in the film category .. It makes the site unique among its kind, at least as far as I know”
Experiences (DEEP CHANGE)
Lesson: Users understand and change categories and fields
We avoided “Movie” category, but users added it and its fields anyway
Experiences (MICRO-CONTRIBUTE)
Lesson: WikiLens supports a range of contributions, and the easiest things are participated in widely
Most users rated (86%)
Almost half added an item (43%)
A few power users changed category fields (7%, 3% of them non-GroupLens)
Experiences (FIND)
Lesson: Category pages were hubs of browsing
6 of top 10 pages browsed by logged-in users were category pages (Movie, Album, ...)
User survey in Nov 2006 (37 responses)
They use WikiLens to ‘find new items to learn more about’ (81%)
They find items by a category page (65%)
They evaluate items based on prediction value on the category page (65%)
Experiences (FIND)
Lesson: Traditional collaborative filtering is possible in small datasets
Simulation using item-based collaborative filtering
80% users as training set, 20% as test set
For test users, use 80% of ratings to recommend
Measure recall of the 20%
Surprise: collaborative filtering improves recall even for the wikilens.org dataset (small by traditional standards)
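The shape of such a simulation can be sketched as follows. This is an illustrative toy, not the authors' experiment: cosine similarity is an assumed choice (the slides do not name the similarity measure), the function names are hypothetical, and the four training users and one test user are made-up data with no bearing on the wikilens.org results.

```python
import math
from collections import defaultdict

def item_similarities(train):
    """Cosine similarity between items, from {user: {item: rating}}."""
    by_item = defaultdict(dict)
    for user, ratings in train.items():
        for item, r in ratings.items():
            by_item[item][user] = r
    norms = {i: math.sqrt(sum(r * r for r in u.values()))
             for i, u in by_item.items()}
    sims = defaultdict(dict)
    for i in by_item:
        for j in by_item:
            common = by_item[i].keys() & by_item[j].keys()
            if i != j and common:
                dot = sum(by_item[i][u] * by_item[j][u] for u in common)
                sims[i][j] = dot / (norms[i] * norms[j])
    return sims

def recommend(known, sims, n):
    """Top-n unseen items, scored by similarity to the user's known ratings."""
    scores = defaultdict(float)
    for item, r in known.items():
        for other, s in sims.get(item, {}).items():
            if other not in known:
                scores[other] += s * r
    return sorted(scores, key=lambda i: (-scores[i], i))[:n]

def recall(recommended, hidden):
    """Fraction of the held-out items that appear in the recommendations."""
    return len(set(recommended) & set(hidden)) / len(hidden)

# Hypothetical training users (the "80%" side of the user split).
train = {
    "u1": {"Matrix": 5, "Fight Club": 5, "Amelie": 1},
    "u2": {"Matrix": 4, "Fight Club": 5},
    "u3": {"Amelie": 5, "Casablanca": 5},
    "u4": {"Amelie": 4, "Casablanca": 4, "Matrix": 1},
}
sims = item_similarities(train)

# Hypothetical test user: known ratings are shown to the recommender,
# the hidden rating is held out and used only to measure recall.
known, hidden = {"Matrix": 5}, {"Fight Club"}
top = recommend(known, sims, n=1)
print(top, recall(top, hidden))
```

A real run would repeat the last step over every test user and average the recall, then compare against a non-personalized baseline such as recommending by overall average rating.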
Experiences (SEE OTHERS)
Lesson: Buddies were mostly used by preexisting social groups
Average # buddies in GroupLens: 8.8
Average # buddies non-GroupLens: 2.8
(users with at least 1 buddy)
Possible Improvements: RECS
Challenge: Users use WikiLens to find new items, but get average-based recommendations if they don’t have buddies
Improvement: Implement a personalized recommender for users without buddies (suitable for the small world)
Possible Improvements: ORGANIZATION
Challenge: Users used WikiLens to ‘keep track of items I like or dislike’ (64%), but organizing items is hard
Ex: Restaurant
Boston, Bay Area, New York, Chicago, …
Improvement: Implement hierarchical categories
Possible Improvements: USABILITY
Challenge: wikilens.org could use more contribution
At least one survey user said the interface is confusing
A few users make accounts but do not rate anything
Improvement: Make more usable, more sociable, give more incentives to contribute
Possible Improvements: TECHNOLOGY
Challenge: More people want to install WikiLens than actually do
Frenchtowner complained about the look
Improvement: Make it easier to install and change look and feel
Possible Improvements: TECHNOLOGY
Challenge: It is hard to keep wikilens.org fast
Improvement: Re-architect for fast recommendations
Challenge: It is hard to keep wikilens.org unbroken
Improvement: Make code easier to change (PHP?)
Conclusion: What Have We Learned?
We propose community-maintained recommenders that support the small world (BeerLens)
Five principles: ADD, DEEP CHANGE, MICRO-CONTRIBUTE, FIND, SEE OTHERS
Features based on these principles: item pages, fields, ratings, category pages, buddies, …
Our experiences supported many of these proposals
There is much room for improvement
Thanks!
This work is supported by NSF grants IIS 03-24851 and IIS 05-34420
Google funded my trip to WikiSym
Email: [email protected]
See http://www.wikilens.org