Improving intelligent assistants for desktop activities
Simone Stumpf, Margaret Burnett, Thomas Dietterich
Oregon State University, School of Electrical Engineering and Computer Science
2
Overview
Background
Activity switching problems
How to improve activity prediction: reducing interruptions, improving accuracy
Conclusion
3
Background: TaskTracer System
Intelligent PIM (personal information management) system
The user organizes everyday life into different activities (e.g., “teach cs534”, “iui-07 paper”), each of which has a set of resources
How it works:
The user indicates the current activity
TaskTracer tracks events (file open, etc.)
TaskTracer automatically associates resources with the current activity
TaskTracer provides useful information-finding services through intelligent assistants
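The tracking loop described on this slide can be sketched roughly as follows. This is a minimal illustration, not the TaskTracer implementation: the class `ActivityTracker`, its methods, and the file names are all hypothetical, chosen only to show how resources end up attached to whichever activity is currently declared.

```python
class ActivityTracker:
    """Toy model of the loop described above (not the real TaskTracer code)."""

    def __init__(self):
        self.current_activity = None   # set only when the user declares it
        self.resources = {}            # activity name -> set of resource names

    def switch_activity(self, name):
        # The user indicates the current activity.
        self.current_activity = name
        self.resources.setdefault(name, set())

    def on_event(self, event_type, resource):
        # A tracked event (file open, etc.) associates its resource
        # with whatever activity is currently declared.
        if self.current_activity is not None:
            self.resources[self.current_activity].add(resource)

    def resources_for(self, activity):
        # Basis for information-finding services: what belongs to an activity?
        return set(self.resources.get(activity, ()))


tracker = ActivityTracker()
tracker.switch_activity("teach cs534")
tracker.on_event("file_open", "lecture3.ppt")   # hypothetical file name
tracker.switch_activity("iui-07 paper")
tracker.on_event("file_open", "draft.doc")      # hypothetical file name
# The user goes back to teaching but forgets to switch: the data becomes noisy.
tracker.on_event("file_open", "grades.xls")     # hypothetical file name
```

In this sketch the last event is filed under “iui-07 paper” even though it belongs to teaching, which is the kind of noise that arises when the declared activity is out of date.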
5
Activity switching problems
To provide services, TaskTracer assumes that the user switches activity, so that the data is not too noisy
TaskPredictor assists by predicting the activity, based on resource use
[Figure: example resources labeled AAAI web page, IL local folder, IL netw, IL DOC, AAAI PPT]
Physical cost (mouseclicks, keypresses)
Cognitive cost (deciding to switch)
6
TaskPredictor
Window-document segment (WDS) = unbroken time period in which a window in focus is showing a single document
Assumptions:
A prediction is only necessary when the WDS changes
A prediction is only made if the predictor is confident enough
(Shen et al., IUI 2006)
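As an illustration of the WDS definition, the sketch below splits a stream of focus events into window-document segments. The event format `(timestamp, window, document)`, the `WDS` class, and the `segment` function are assumptions made for this example; the slides do not describe how TaskPredictor represents events internally.

```python
from dataclasses import dataclass


@dataclass
class WDS:
    window: str
    document: str
    start: float
    end: float


def segment(focus_events, end_time):
    """Split focus events, sorted by time, into window-document segments.

    focus_events: list of (timestamp, window, document) tuples (assumed format).
    end_time: timestamp at which observation stops.
    """
    segments = []
    for timestamp, window, document in focus_events:
        last = segments[-1] if segments else None
        if last and (last.window, last.document) == (window, document):
            continue  # same window and document: the segment is unbroken
        if last:
            last.end = timestamp  # the previous segment ends where this one starts
        segments.append(WDS(window, document, timestamp, end_time))
    return segments


events = [
    (0.0, "Word", "paper.doc"),
    (5.0, "Word", "paper.doc"),             # still the same WDS
    (12.0, "Browser", "http://example.org"),
    (30.0, "Word", "paper.doc"),
]
segments = segment(events, end_time=40.0)
```

Under the first assumption on the slide, a prediction would be considered only at the start of each segment, not on every event.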
Source of features: words in window titles, file pathnames, website URLs, (document content)
Hybrid approach: Naïve Bayes and SVM
Accuracy: 80% on 10% coverage
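One common reading of an accuracy-at-coverage figure is that the predictor speaks up only when its confidence clears a threshold; coverage is the fraction of cases where it speaks up, and accuracy is measured on those cases alone. The sketch below computes both under that assumption. The function name, the tuple format, the threshold, and the toy data are hypothetical and are not taken from Shen et al.

```python
def accuracy_at_coverage(predictions, threshold):
    """predictions: list of (predicted_activity, confidence, true_activity)."""
    # Keep only the cases where the predictor is confident enough to speak up.
    issued = [(pred, truth) for pred, conf, truth in predictions if conf >= threshold]
    coverage = len(issued) / len(predictions) if predictions else 0.0
    # Accuracy is measured on issued predictions only; undefined if none were issued.
    accuracy = sum(pred == truth for pred, truth in issued) / len(issued) if issued else None
    return accuracy, coverage


toy_predictions = [
    ("IL", 0.95, "IL"),
    ("AAAI", 0.92, "IL"),
    ("IL", 0.60, "IL"),
    ("AAAI", 0.55, "AAAI"),
    ("IL", 0.40, "AAAI"),
]
accuracy, coverage = accuracy_at_coverage(toy_predictions, threshold=0.9)
```

Raising the threshold typically trades coverage for accuracy, which is why a predictor can be right most of the time while staying silent on most segments.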
7
Reducing interruptions…
8
Problems in activity prediction
Number of potential notifications is still high
Wait to see if the user stays on the WDS, to reduce the number of notifications
Physical cost to interact (mouseclicks, keypresses)
Cognitive cost to interact (deciding to switch)
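The idea of waiting to see whether the user stays on a WDS can be sketched as a simple dwell-time filter. The thresholds `min_dwell` and `min_confidence` and the data format are assumptions for illustration; the slides do not give concrete values.

```python
def should_notify(dwell_seconds, confidence, min_dwell=5.0, min_confidence=0.9):
    # Notify only if the user has stayed on the segment long enough
    # AND the predictor is confident enough (both thresholds are hypothetical).
    return dwell_seconds >= min_dwell and confidence >= min_confidence


def count_notifications(segments, **thresholds):
    """segments: list of (dwell_seconds, confidence) pairs, one per WDS."""
    return sum(should_notify(dwell, conf, **thresholds) for dwell, conf in segments)


toy_segments = [(1.0, 0.95), (8.0, 0.95), (20.0, 0.50), (0.5, 0.99)]
without_waiting = count_notifications(toy_segments, min_dwell=0.0)
with_waiting = count_notifications(toy_segments)
```

In this toy data, brief visits to a window no longer trigger a notification once the dwell requirement is applied.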
9
Activity boundaries
Iqbal et al., CHI 2005, 2006
Interruption costs are lower on boundaries
Costs are high within a unit
So what happens if the user does stay on the WDS?
[Figure: “Prepare IL paper” with the steps Download latest version, Open document, Edit document, Save document, Upload latest version]
10
Reducing interruptions
Move from single-window prediction to multiple-window prediction (Shen et al., IJCAI 2007)
Identify user costs to make prediction
Determine opportunities intelligently
Trade-off of user cost/benefit
Make predictions at boundaries, then commit changes on user feedback
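A cost/benefit trade-off of this kind could, for example, be framed as an expected-value check: interrupt only when the expected benefit of the prediction outweighs the interruption cost, with that cost being lower at a boundary. The sketch below is purely illustrative. The cost values, the benefit scale, and both function names are assumptions and are not taken from the cited work.

```python
def interruption_cost(at_boundary, boundary_cost=1.0, mid_unit_cost=5.0):
    # Interrupting is assumed to be cheaper at a boundary than within a unit.
    # Both cost values are hypothetical placeholders.
    return boundary_cost if at_boundary else mid_unit_cost


def worth_notifying(p_correct, benefit_if_correct, at_boundary):
    # Expected benefit of a prediction, weighed against the cost of interrupting.
    expected_benefit = p_correct * benefit_if_correct
    return expected_benefit > interruption_cost(at_boundary)


at_boundary = worth_notifying(p_correct=0.8, benefit_if_correct=4.0, at_boundary=True)
mid_unit = worth_notifying(p_correct=0.8, benefit_if_correct=4.0, at_boundary=False)
```

With these made-up numbers the same prediction is worth surfacing at a boundary but not in the middle of a unit.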
11
Improving accuracy…
12
Why improve accuracy?
100% accuracy is rare
TaskPredictor and other predictors may make wrong predictions
Limited feedback – only labels
Users know more – can we harness it?
How can learning systems explain their reasoning to the user?
What is the users’ feedback to the learning system? (Stumpf et al., IUI 2007)
13
Pre-study explanation generation
[Figure: explanation generation; recoverable labels: Ripper, NB]
Enron farmer-d: 122 emails, 4 folders (Bankrupt, Enron News, Personal, Resume)
Rule-based
Keyword-based
Similarity-based
Concrete, and simplified but faithful
15
Rule-based
16
Keyword-based
5 words in email having highest positive weight
5 words in email having most negative weight
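Selecting the words for a keyword-based explanation can be sketched as follows, assuming the classifier exposes one numeric weight per word for the predicted folder. The function `keyword_explanation` and the weights are hypothetical and were made up for this example.

```python
def keyword_explanation(email_words, weights, k=5):
    """Return the k most positive and k most negative words present in the email.

    weights: mapping word -> weight for the predicted folder (assumed to be given).
    """
    present = {w: weights[w] for w in set(email_words) if w in weights}
    # Sort by weight, breaking ties alphabetically so the result is deterministic.
    by_positive = sorted(present.items(), key=lambda kv: (-kv[1], kv[0]))
    by_negative = sorted(present.items(), key=lambda kv: (kv[1], kv[0]))
    positive = [w for w, wt in by_positive if wt > 0][:k]
    negative = [w for w, wt in by_negative if wt < 0][:k]
    return positive, negative


# Hypothetical weights for the folder "Resume".
resume_weights = {
    "resume": 2.1,
    "qualifications": 1.7,
    "houston": -1.2,
    "policy": -0.4,
    "day": 0.1,
}
email = "please find my resume and qualifications attached from houston".split()
positive, negative = keyword_explanation(email, resume_weights)
```

Only words that actually occur in the email are considered, so a heavily weighted word that is absent from the message never appears in the explanation.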
17
Similarity-based
Most decrease if removed from training set
Up to 5 words in both emails having highest weights
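One way to find the training email whose removal causes the largest decrease is a leave-one-out loop: rescore the message with each training email removed in turn and keep the one that lowers the score most. The sketch below uses a small multinomial Naive Bayes score with add-one smoothing as a stand-in. The scoring function, the helper names, and the toy data are assumptions, and the study's actual computation may differ.

```python
import math
from collections import Counter


def nb_log_score(train, folder, words, alpha=1.0):
    """Log of P(folder) * product of P(word | folder), with add-alpha smoothing."""
    vocab = {w for doc, _ in train for w in doc}
    in_folder = [doc for doc, f in train if f == folder]
    if not in_folder:
        return float("-inf")  # no evidence left for this folder
    counts = Counter(w for doc in in_folder for w in doc)
    total = sum(counts.values())
    score = math.log(len(in_folder) / len(train))
    for w in words:
        score += math.log((counts[w] + alpha) / (total + alpha * len(vocab)))
    return score


def most_influential(train, folder, words):
    """Index of the training email whose removal lowers the score the most."""
    base = nb_log_score(train, folder, words)
    best_index, best_drop = None, 0.0
    for i in range(len(train)):
        reduced = train[:i] + train[i + 1:]
        drop = base - nb_log_score(reduced, folder, words)
        if drop > best_drop:
            best_index, best_drop = i, drop
    return best_index


toy_train = [
    (["resume", "qualifications", "experience"], "Resume"),
    (["resume", "attached"], "Resume"),
    (["bankruptcy", "court", "filing"], "Bankrupt"),
]
new_email = ["qualifications", "experience"]
similar = most_influential(toy_train, "Resume", new_email)
```

In the toy data the first training email shares both words with the new message, so removing it hurts the “Resume” score the most.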
18
Within-subject study design
[Figure: study timeline; items numbered 1 to 27 in three rows of nine, with the labels “Post-block questionnaire”, “Post-session questionnaire”, and “15 minutes”]
19
Giving feedback
Participants were asked to provide feedback to improve the predictions
No restrictions on form of feedback
20
Responses to explanations
Negative comments (20%): …those are arbitrary words.
Confusion (8%): I don’t understand why there is a second email.
Positive comments (19%): The Resume rules are good.
Understanding (17%): I see why it used “Houston” as negative.
Correcting or suggesting changes (32%): Different words could have been found in common, like “Agreement”, “Ken Lay.”
21
Understanding explanations
Rule-based best, then Keyword-based
Serious problems with Similarity-based
Factors:
General idea of the algorithm: I guess it went in here because it was similar to another email I had already put in that folder.
Keyword-based explanations’ negative keyword list: I guess I really don’t understand what it’s doing here. If those words weren’t in the message?
Word choices’ topical appropriateness: “Day”, “soon”, and “listed” are incredibly arbitrary keywords.
22
Preferring explanations
Preference trend follows understanding
Factors:
Perceived reasoning soundness and accuracy: I think this is a really good filter…
Clear communication of reasoning: I like this because it shows relationships between other messages in the same folder rather than just spitting out a bunch of rules with no reason behind it.
Informal wording: This is funny... (laughs) ... This seems more personable. Seems like a narration rather than just straight rules. It’s almost like a conversation.
23
The user explains back
Select different features (53%): It should put email in ‘Enron News’ if it has the keywords “changes” and “policy”.
Adjust weights (12%): The second set of words should be given more importance.
Parse/extract in a different way (10%): I think that it should look for typos in the punctuation for indicators toward ‘Personal’.
Employ feature combinations (5%): I think it would be better if it recognized a last and a first name together.
Use relational features (4%): This message should be in ‘Enron News’ since it is from the chairman of the company.
24
Underlying knowledge sources
Commonsense (36%): “Qualifications” would seem like a really good Resume word, I wonder why that’s not down here.
English (30%): Does the computer know the difference between “resumé” and “resume”?
Domain (15%): Different words could have been found in common like … “Ken Lay”.
25
Current work
More than 50% of suggestions could be easily incorporated
New algorithms to handle changes to weights and keywords:
User feedback as constraints on MLE of the parameters
Co-training
Investigate effects on accuracy using study data:
Constraints: not hurting, but not much improvement either
Co-training approach better
26
Conclusion
User costs important
Higher accuracy
Timing of prediction notifications
Usefulness of predictions
Explanations of why a prediction was made