Personalizing the Web for Mobile Users
Corin Anderson, Pedro Domingos, Dan Weld
2
The PC-centric web
• Assume 1024x768
  – Micromanage page layout
  – Lots of real estate for important links
• Assume fast net
  – Images are okay
  – Following links okay
• Assume fast CPU
  – Client-side JavaScript
3
Doesn’t work on mobile devices
• Very few lines of text
  – Lots of scrolling
• Few pixels, colors
  – Hard to find desired link
• Slow net connectivity
  – Following links slow
• No client-side scripting
  – Page functionality lost
4
First approach: syntactic translation
• Tag-by-tag, transcode for small screen
  – Replace images with alt text
  – Flatten <table>s
• Problem: not all content is created equal
  – Shortcut links are useful
  – Links to “Our sponsors” aren’t
  – Lacks awareness of needs of each visitor
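A minimal sketch of tag-by-tag transcoding, using only Python's standard `html.parser`. The class name `AltTextTranscoder`, the `[alt]` bracket notation, and the choice to turn table rows into `<br>` line breaks are assumptions made for illustration; the slides do not say how any particular transcoder works.

```python
from html.parser import HTMLParser


class AltTextTranscoder(HTMLParser):
    """Hypothetical transcoder: images become alt text, tables are flattened."""

    TABLE_TAGS = {"table", "thead", "tbody", "tr", "td", "th"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            # Replace the image with its alt text (or a generic marker).
            alt = dict(attrs).get("alt") or "image"
            self.out.append(f"[{alt}]")
        elif tag in self.TABLE_TAGS:
            return  # drop table markup, keep the cell contents
        else:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "tr":
            self.out.append("<br>")  # one table row becomes one line
        elif tag in ("td", "th"):
            self.out.append(" ")     # separate cells with a space
        elif tag in self.TABLE_TAGS or tag == "img":
            return
        else:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)


def transcode(html):
    parser = AltTextTranscoder()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Every link and paragraph passes through unchanged, whether it is a useful shortcut or a link to “Our sponsors”, which is the weakness the slide points out.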
5
Our approach: web personalizers
• An intermediary between server and user
• Personalizes content for each user
• Personalizations include:
  – Making frequently-visited pages easier to reach
  – Highlighting content of interest
  – Eliding links and content of no interest
6
A personalizer in context
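[Diagram: palm.net, Personalizer, and www.com, with "GET /" and "<html>" exchanged on each side of the Personalizer. Labels inside the Personalizer: Access logs, Model users, User models, Personalize content.]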
7
Our implementation: Proteus
1. Build user model
   – Sequence of pages visited
   – Textual content viewed
2. Search space of web sites
   – “State” is personalized web site
   – Evaluate site with expected utility for user
8
User model – raw data
• Sequence of pages requested
  – Server or proxy logs (“Corey, the proxy’s down…”)
  – Assume 1 computer :: 1 user
  – Temporally-close requests are related (sessions)
• Text of each page

some-pc.cs        22/Feb/2000 11:49:13  -                          /
some-pc.cs        22/Feb/2000 11:49:23  www.cs/                    /education/
some-pc.cs        22/Feb/2000 11:49:34  www.cs/education/          /education/courses/
some-other-pc.cs  22/Feb/2000 11:49:55  -                          /
some-pc.cs        22/Feb/2000 11:50:08  www.cs/education/courses/  /education/courses/574
some-other-pc.cs  22/Feb/2000 11:50:20  www.cs/                    /info/current/
some-pc.cs        22/Feb/2000 12:12:36  www.cs/                    /lab/
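As a rough illustration of turning such logs into sessions, the sketch below reads the sample requests from this slide as five whitespace-separated fields (host, date, time, referrer, path) and starts a new session whenever one host is idle for more than ten minutes. The field reading, the ten-minute gap, and the function names are assumptions; the slides only say that temporally-close requests are related.

```python
from collections import defaultdict
from datetime import datetime, timedelta

LOG = """\
some-pc.cs 22/Feb/2000 11:49:13 - /
some-pc.cs 22/Feb/2000 11:49:23 www.cs/ /education/
some-pc.cs 22/Feb/2000 11:49:34 www.cs/education/ /education/courses/
some-other-pc.cs 22/Feb/2000 11:49:55 - /
some-pc.cs 22/Feb/2000 11:50:08 www.cs/education/courses/ /education/courses/574
some-other-pc.cs 22/Feb/2000 11:50:20 www.cs/ /info/current/
some-pc.cs 22/Feb/2000 12:12:36 www.cs/ /lab/
"""


def parse(line):
    """Split one log line into (host, timestamp, path); the referrer is ignored."""
    host, date, time, _referrer, path = line.split()
    when = datetime.strptime(f"{date} {time}", "%d/%b/%Y %H:%M:%S")
    return host, when, path


def sessionize(lines, gap=timedelta(minutes=10)):
    """Group requests per host (1 computer :: 1 user) into sessions.

    Assumes lines are in chronological order. `gap` is a hypothetical threshold.
    """
    sessions = defaultdict(list)  # host -> list of sessions, each a list of paths
    last_seen = {}
    for line in lines:
        if not line.strip():
            continue
        host, when, path = parse(line)
        if host not in last_seen or when - last_seen[host] > gap:
            sessions[host].append([])  # idle too long: start a new session
        sessions[host][-1].append(path)
        last_seen[host] = when
    return dict(sessions)
```

With this threshold, the 12:12:36 request from some-pc.cs falls into a second session, because more than twenty minutes separate it from that host's previous request.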
9
User model – derived data
• Frequency of page visits
  – # times user visited each page
• Probability of page visit
  – # times visited page / # sessions
• Content word vector
  – “interest” value in each word seen
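The three derived quantities can be computed from sessionized data in a few lines. This is a sketch under stated assumptions: `derive_model` and its inputs are hypothetical names, and the "interest" value of a word is taken to be the number of times the user viewed a page containing it, which the slides do not specify.

```python
from collections import Counter


def derive_model(sessions, page_text):
    """sessions: list of sessions, each a list of page paths.
    page_text: dict mapping page path -> text of that page (hypothetical input).
    """
    # Frequency: number of times the user visited each page.
    visits = Counter(page for session in sessions for page in session)

    # Probability: times visited / number of sessions, as stated on the slide.
    n_sessions = len(sessions)
    probability = {page: count / n_sessions for page, count in visits.items()}

    # Word vector: each viewing of a page adds 1 to every word on it (assumption).
    words = Counter()
    for page, count in visits.items():
        for word in page_text.get(page, "").lower().split():
            words[word] += count
    return visits, probability, words
```

Taken literally, the slide's ratio can exceed 1 for a page revisited within a session; counting each page at most once per session would keep it a true probability.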
10
Site evaluation
[Figure: two columns of words. First column: cnet, news, hardware, download, builder, game, job, auction, price, tech. Second column: help, free, email, click, here, monday, november, advertisement.]
11
Expected utility
• Value of site = expected utility of site
• E[U(site)] = E[U(first page)]
• E[U(page)] = E[U(first screen)]
• Utility from screen content: intrinsic utility
• Utility from screen navigation: extrinsic utility
12
Intrinsic utility
• Content similarity
  – Words on screen vs. words seen before
  – Scale using TFIDF
  – Scale by position in session – later pages’ words worth more
• Frequency of visits
  – How often user views screen
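One way to combine the two ingredients is a weighted sum of a TF-IDF cosine similarity and the visit frequency. The sketch below assumes exactly that; the weights `w_sim` and `w_freq`, the cosine measure, and all function names are illustrative choices, not details taken from Proteus. Scaling by position in session is left out for brevity.

```python
import math
from collections import Counter


def tfidf(counts, doc_freq, n_docs):
    """Weight raw word counts by inverse document frequency."""
    return {
        word: count * math.log(n_docs / doc_freq[word])
        for word, count in counts.items()
        if doc_freq.get(word)
    }


def cosine(a, b):
    """Cosine similarity between two sparse word vectors (dicts)."""
    dot = sum(value * b.get(word, 0.0) for word, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def intrinsic_utility(screen_words, seen_counts, doc_freq, n_docs,
                      visit_freq, w_sim=1.0, w_freq=1.0):
    """Hypothetical combination: weighted similarity plus visit frequency."""
    screen_vec = tfidf(Counter(screen_words), doc_freq, n_docs)
    seen_vec = tfidf(seen_counts, doc_freq, n_docs)
    return w_sim * cosine(screen_vec, seen_vec) + w_freq * visit_freq
```

A word that appears on every page, such as "the", gets an IDF of zero and so contributes nothing to the similarity, which is the point of the TFIDF scaling.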
13
Extrinsic utility
• User may take one or many actions
  – Follow link L1, follow link L2, …
  – Scroll down
• Each action leads to new screen: E[U(dest.)]
• Actions have probabilities: P(action)
• Actions have cost: γ(action)

$$E[U_e(s_{ij})] = P(\mathrm{scroll})\,\bigl(E[U(s_{i,j+1})] - \gamma(\mathrm{scroll})\bigr) + \sum_k P(L_k)\,\bigl(E[U(\mathrm{dest}(L_k))] - \gamma(L_k)\bigr)$$
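A small numeric sketch of how these pieces might combine: each action contributes its probability times the expected utility of its destination, less the cost of taking the action. The function name, argument layout, and the numbers in the usage note are assumptions made for illustration.

```python
def extrinsic_utility(p_scroll, eu_next_screen, cost_scroll, links):
    """Expected utility from navigating away from the current screen.

    links: iterable of (p_link, eu_destination, cost_link) tuples,
    one per link on the screen (hypothetical representation).
    """
    # Scrolling leads to the next screen of the same page.
    total = p_scroll * (eu_next_screen - cost_scroll)
    # Each link leads to the first screen of its destination page.
    for p_link, eu_destination, cost_link in links:
        total += p_link * (eu_destination - cost_link)
    return total
```

For example, with a 0.5 chance of scrolling to a screen worth 2.0 at cost 1.0, and two links each followed with probability 0.25 (worth 4.0 at cost 2.0, and 1.0 at cost 1.0), the result is 0.5·1.0 + 0.25·2.0 + 0.25·0.0 = 1.0.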
14
Search for optimal site
• Search control
  – Steepest descent
  – Run about 20 iterations
• Search operators
  – Add shortcut link
  – Elide content
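The search loop can be sketched as a greedy best-neighbor search: apply every operator to the current site, move to the highest-utility result, and stop when nothing improves or after about 20 iterations. This treats the slide's "steepest descent" as always taking the single best step while maximizing utility, which is an assumption; the toy site representation (a set of shortcut targets) and the toy utility are also made up for illustration.

```python
def search(site, operators, utility, iterations=20):
    """Greedy search over personalized sites.

    operators: functions mapping a site to a list of neighboring sites.
    utility: function scoring a site (higher is better).
    """
    best, best_score = site, utility(site)
    for _ in range(iterations):
        neighbors = [n for op in operators for n in op(best)]
        if not neighbors:
            break
        candidate = max(neighbors, key=utility)
        score = utility(candidate)
        if score <= best_score:
            break  # no neighbor improves on the current site
        best, best_score = candidate, score
    return best


# Toy example: a "site" is the set of shortcut links added so far.
VALUES = {"a": 3, "b": 1, "c": -2}  # hypothetical value of each shortcut


def add_shortcut(site):
    return [site | {page} for page in VALUES if page not in site]


def toy_utility(site):
    return sum(VALUES[page] for page in site)
```

Calling `search(frozenset(), [add_shortcut], toy_utility)` adds shortcut "a", then "b", and stops because adding "c" would lower the score.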
16
Elide content
• Remove unnecessary content
• Don’t make irrevocable changes!
  – Replace content with link to original

[Figure: content elision example, labeled "13 lines" and "1 line".]
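Elision that stays reversible can be as simple as swapping a run of lines for a single link back to the original content. In this sketch the function name `elide`, the `[more...]` label, and the URL are hypothetical.

```python
from html import escape


def elide(lines, start, end, original_url, label="[more...]"):
    """Replace lines[start:end] with one link to the unmodified content."""
    link = f'<a href="{escape(original_url, quote=True)}">{label}</a>'
    return lines[:start] + [link] + lines[end:]
```

Applied to a 15-line page with `start=1` and `end=14`, thirteen lines collapse into one link, and the user can still reach the elided content by following it.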
17
Exp’t 1: live human subjects
• Observe real users on the desktop
  – Behavior based on seeded questions
• Measure performance on mobile device
  – Unmodified vs. personalized sites
  – Tasks taken from same seed distribution
[Figure: Total links followed. x-axis: information-goal location (in chronological order): cs.washington.edu, cnet.com, cs.washington.edu, ebay.com, finance.yahoo.com, cnn.com, cs.washington.edu, finance.yahoo.com, cnn.com, cs.washington.edu, finance.yahoo.com, ebay.com. y-axis: # links (0 to 7). Series: Unmodified, Personalized.]
19
Analysis of exp’t 1
• Content elision overly aggressive
  – Proteus favors words on page over links on page
• Elision links inconspicuous
  – No different from normal links
• Shortcut links underused
  – Proteus didn’t always make them
  – Users didn’t scroll enough to find them
20
Exp’t 2: Simulated users
• Deterministic automaton
• “Scripts” made from sessionized logs
  – Use non-seeded behavior from exp’t 1’s logs
  – Goal: get to last page in session
• Sim user follows script, except:
  – If a shortcut link is there and useful, take it
  – If the next link has been elided, find it
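The simulated user can be sketched as a short deterministic loop over a script. Several details here are assumptions: a shortcut counts as "useful" if it points to a later page in the script, "finding" an elided link costs one extra link follow (the elision link back to the original content), and only link actions are counted, not scrolls.

```python
def simulate(script, shortcuts, elided):
    """Count links followed to get from script[0] to script[-1].

    script: list of pages from one logged session.
    shortcuts: dict page -> set of shortcut targets added to that page.
    elided: dict page -> set of link targets hidden behind an elision link.
    """
    actions = 0
    i = 0
    while i < len(script) - 1:
        page = script[i]
        # Useful shortcuts lead to a later page in the script; take the furthest.
        later = [j for j in range(i + 1, len(script))
                 if script[j] in shortcuts.get(page, ())]
        if later:
            i = max(later)
            actions += 1
            continue
        if script[i + 1] in elided.get(page, ()):
            actions += 1  # follow the elision link to recover the hidden link
        actions += 1      # follow the link to the next page in the script
        i += 1
    return actions
```

On a four-page script, the unmodified site costs three link follows, a shortcut from the first page to the last cuts that to one, and eliding the first needed link raises it to four.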
[Figure: Total navigation required. Categories: Links, Scrolls. y-axis: # actions (0 to 8). Series: Unmodified, Personalized. Data labels: 5.58, 2.38, 2.88, 1.65.]
[Figure: Average percent actions saved. Categories: Percent links saved, Percent scrolls saved. y-axis: percent saved (0% to 45%). Data labels: 0.32, 0.17.]
23
Continuing work: clustering
• Single user model may be starved for data
• But we have many user models available
• Cluster users, then build cluster models
• Site evaluation depends on user and cluster
26
Questions about clustering
• Is model-based clustering the right choice?
• What model should we use?
  – 0th, 1st, 2nd order Markov
• Can we model at page granularity?
• What about non-sequence user attributes?
  – User domain, registration information
27
Questions about Personalizers
• What other personalizations would be useful?
  – Create new pages?
  – Adapt layout of page?
• How should we measure intrinsic utility?
  – Word vectors? How are the terms scaled?
  – Past-visit dwell time?
28
More questions
• How can we make search faster?
  – Evaluation is slow – can we approximate?
  – Search infeasible as sites, adaptations scale
• How do we evaluate Proteus?
  – User studies are tricky
  – Can simulated web behavior be believable?
30
Conclusions
• Mobile web experience must be personalized
• Personalizers have many benefits
• Proteus successfully improves mobile web experience
[Figure: Total time required. x-axis: information-goal location (in chronological order): cs.washington.edu, cnet.com, cs.washington.edu, ebay.com, finance.yahoo.com, cnn.com, cs.washington.edu, finance.yahoo.com, cnn.com, cs.washington.edu, finance.yahoo.com, ebay.com. y-axis: time in seconds (0 to 350). Series: Unmodified, Personalized.]
[Figure: Average number of actions saved. Categories: Links saved, Scrolls saved. y-axis: # actions (0 to 4). Data labels: 0.73, 2.70.]