Improving intelligent assistants for desktop activities

Simone Stumpf, Margaret Burnett, Thomas Dietterich
Oregon State University, School of Electrical Engineering and Computer Science


Overview

Background
Activity switching problems
How to improve activity prediction
  Reducing interruptions
  Improving accuracy

Conclusion


Background: TaskTracer System

Intelligent personal information management (PIM) system

The user organizes everyday life into different activities that have a set of resources

e.g., “teach cs534”, “iui-07 paper”, etc.

How it works
  The user indicates the current activity
  TaskTracer tracks events (file open, etc.)
  TaskTracer automatically associates resources with the current activity
  TaskTracer provides useful information-finding services through intelligent assistants
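The steps above can be illustrated with a minimal sketch. It is not TaskTracer's actual implementation: the class `ActivityTracker`, its method names, and the example resource names are all hypothetical, chosen only to show how declared activities and tracked events could be tied together.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class ActivityTracker:
    """Toy tracker (hypothetical): links each observed resource to the declared activity."""
    current_activity: Optional[str] = None
    resources: Dict[str, Set[str]] = field(default_factory=lambda: defaultdict(set))

    def switch_activity(self, activity: str) -> None:
        # The user explicitly indicates the current activity.
        self.current_activity = activity

    def on_event(self, event_type: str, resource: str) -> None:
        # Events such as "file_open" are attributed to whatever activity is current.
        if self.current_activity is not None:
            self.resources[self.current_activity].add(resource)

    def resources_for(self, activity: str) -> List[str]:
        # An information-finding service could list the resources of an activity.
        return sorted(self.resources.get(activity, set()))


tracker = ActivityTracker()
tracker.switch_activity("teach cs534")
tracker.on_event("file_open", "lecture1.ppt")      # hypothetical resource name
tracker.switch_activity("iui-07 paper")
tracker.on_event("file_open", "draft.doc")         # hypothetical resource name
print(tracker.resources_for("teach cs534"))
```

If the user forgets to call `switch_activity`, later events are attributed to the wrong activity, which is the source of noise discussed next.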


Activity switching problems

To provide services:
  TaskTracer assumes that the user indicates each activity switch, so the data is not too noisy
  TaskPredictor assists by predicting the activity, based on resource use

[Figure: sequence of resources in use: AAAI web page, IL local folder, IL netw, IL DOC, AAAI PPT]

Physical cost (mouseclicks, keypresses)

Cognitive cost (deciding to switch)


TaskPredictor

Window-document segment (WDS) = unbroken time period in which a window in focus is showing a single document

Assumptions
  A prediction is only necessary when the WDS changes
  A prediction is only made if the predictor is confident enough
(Shen et al., IUI 2006)

Source of features: words in window titles, file pathnames, website URLs, (document content)

Hybrid approach: Naïve Bayes and SVM

Accuracy: 80% on 10% coverage
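The trade-off between accuracy and coverage follows from the second assumption: a predictor that abstains when unsure answers less often but is right more often when it does. The sketch below illustrates only that idea. It is not the hybrid Naïve Bayes/SVM predictor; the function names, the probabilities, and the thresholds are made-up assumptions.

```python
from typing import Dict, List, Optional, Tuple


def predict_or_abstain(probs: Dict[str, float], threshold: float) -> Optional[str]:
    """Return the most probable activity, or None if confidence is below threshold."""
    activity, confidence = max(probs.items(), key=lambda kv: kv[1])
    return activity if confidence >= threshold else None


def coverage_and_accuracy(
    predictions: List[Optional[str]], truths: List[str]
) -> Tuple[float, float]:
    """Coverage = share of cases with a prediction; accuracy is measured on those only."""
    made = [(p, t) for p, t in zip(predictions, truths) if p is not None]
    coverage = len(made) / len(truths)
    accuracy = sum(p == t for p, t in made) / len(made) if made else 0.0
    return coverage, accuracy


# Hypothetical class probabilities at four WDS changes, paired with the true activity.
segments = [
    ({"AAAI": 0.95, "IL": 0.05}, "AAAI"),
    ({"AAAI": 0.55, "IL": 0.45}, "IL"),
    ({"AAAI": 0.08, "IL": 0.92}, "IL"),
    ({"AAAI": 0.60, "IL": 0.40}, "AAAI"),
]


def evaluate(threshold: float) -> Tuple[float, float]:
    preds = [predict_or_abstain(p, threshold) for p, _ in segments]
    truths = [t for _, t in segments]
    return coverage_and_accuracy(preds, truths)


print(evaluate(0.9))  # (0.5, 1.0): fewer predictions, all correct
print(evaluate(0.5))  # (1.0, 0.75): always predicts, one mistake
```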


Reducing interruptions…


Problems in activity prediction

Potential number of notifications is still high
  Wait to see if the user stays on the WDS, to reduce the number of notifications

Physical cost to interact (mouseclicks, keypresses)

Cognitive cost to interact (deciding to switch)


Activity boundaries

Iqbal et al., CHI 2005, 2006
  Interruption costs are lower on boundaries
  Costs are high within a unit

So what happens if the user does stay on the WDS?

[Figure: activity "Prepare IL paper" broken into units: Download latest version, Open document, Edit document, Save document, Upload latest version]


Reducing interruptions

Move from single-window prediction to multiple-window prediction (Shen et al., IJCAI 2007)

Identify user costs to make prediction

Determine opportunities intelligently
  Trade-off of user cost/benefit
  Make predictions at boundaries, then commit changes on user feedback
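One way to read the cost/benefit trade-off is as a simple expected-value rule: notify only when the expected benefit of a correct prediction outweighs the interruption cost, with that cost set lower at boundaries. The sketch below is an illustration under that assumption, not the method used in the system; the `Opportunity` type and all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Opportunity:
    confidence: float   # predictor's confidence that the activity has changed
    at_boundary: bool   # True if the user is between units of work


# Hypothetical values in arbitrary units; real costs would need to be measured.
BENEFIT_CORRECT = 1.0
COST_WITHIN_UNIT = 0.9
COST_AT_BOUNDARY = 0.3


def should_notify(op: Opportunity) -> bool:
    """Notify only if the expected benefit exceeds the interruption cost."""
    expected_benefit = op.confidence * BENEFIT_CORRECT
    cost = COST_AT_BOUNDARY if op.at_boundary else COST_WITHIN_UNIT
    return expected_benefit > cost


print(should_notify(Opportunity(confidence=0.6, at_boundary=True)))   # True
print(should_notify(Opportunity(confidence=0.6, at_boundary=False)))  # False
```

With these made-up numbers, the same prediction is worth surfacing at a boundary but not in the middle of a unit.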


Improving accuracy…


Why improve accuracy?

100% accuracy rare
  TaskPredictor and other predictors may make wrong predictions
  Limited feedback – only labels

Users know more – can we harness it?
  How can learning systems explain their reasoning to the user?
  What is the users’ feedback to the learning system? (Stumpf et al., IUI 2007)


Pre-study explanation generation

Data: Enron farmer-d, 122 emails, 4 folders (Bankrupt, Enron News, Personal, Resume)

Learning algorithms: Ripper, Naïve Bayes (NB)

Explanation types: Rule-based, Keyword-based, Similarity-based

Explanations: concrete, and simplified but faithful


Rule-based


Keyword-based

5 words in email having highest positive weight

5 words in email having most negative weight
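Selecting those two word lists can be sketched as follows. The function name `keyword_explanation` and the weight values are hypothetical; in the study the weights came from the learned classifier, whereas here they are simply given as a dictionary.

```python
from typing import Dict, List, Tuple


def keyword_explanation(
    email_words: List[str], weights: Dict[str, float], k: int = 5
) -> Tuple[List[str], List[str]]:
    """Return up to k most positive and up to k most negative words present in the email."""
    present = {w: weights[w] for w in set(email_words) if w in weights}
    # Sort by weight, breaking ties alphabetically so the output is deterministic.
    by_high = sorted(present.items(), key=lambda kv: (-kv[1], kv[0]))
    by_low = sorted(present.items(), key=lambda kv: (kv[1], kv[0]))
    positive = [w for w, wt in by_high if wt > 0][:k]
    negative = [w for w, wt in by_low if wt < 0][:k]
    return positive, negative


# Made-up weights for a hypothetical "Resume" folder.
weights = {
    "resume": 2.1, "qualifications": 1.4, "experience": 0.9,
    "houston": -1.2, "meeting": -0.5,
}
email = "my resume lists experience in houston".split()
print(keyword_explanation(email, weights))
# (['resume', 'experience'], ['houston'])
```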


Similarity-based

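Similar email shown: the one whose removal from the training set would cause the largest decrease in the prediction score

Highlighted words: up to 5 words appearing in both emails that have the highest weights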
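Picking the training email "whose removal hurts most" can be done by leave-one-out rescoring. The sketch below assumes a small multinomial Naïve Bayes score with add-one smoothing; the function names and the toy emails are hypothetical, and the scoring used in the study may differ.

```python
import math
from collections import Counter
from typing import List, Set, Tuple

Email = Tuple[List[str], str]  # (words, folder)


def nb_log_score(train: List[Email], words: List[str], label: str, vocab: Set[str]) -> float:
    """log P(label) + sum of log P(word | label), with add-one smoothing."""
    docs = [doc for doc, y in train if y == label]
    if not docs:
        return float("-inf")
    counts = Counter(w for doc in docs for w in doc)
    total = sum(counts.values())
    score = math.log(len(docs) / len(train))
    for w in words:
        if w in vocab:
            score += math.log((counts[w] + 1) / (total + len(vocab)))
    return score


def most_influential(train: List[Email], words: List[str], label: str) -> int:
    """Index of the training email whose removal most decreases the score for `label`."""
    vocab = {w for doc, _ in train for w in doc}  # fixed vocabulary from the full set
    base = nb_log_score(train, words, label, vocab)
    drops = []
    for i in range(len(train)):
        reduced = train[:i] + train[i + 1:]
        drops.append(base - nb_log_score(reduced, words, label, vocab))
    return max(range(len(train)), key=lambda i: drops[i])


train = [
    (["resume", "experience", "skills"], "Resume"),
    (["resume", "attached"], "Resume"),
    (["bankruptcy", "filing", "court"], "Bankrupt"),
    (["lunch", "friday"], "Personal"),
]
print(most_influential(train, ["experience", "skills"], "Resume"))  # 0
```

Retraining once per training email is affordable for a data set of this size (122 emails), though it would not scale to large collections without a cheaper approximation.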


Within-subject study design

[Figure: 27 emails in three blocks (1-9, 10-18, 19-27); labels: 15 minutes, post-block questionnaire, post-session questionnaire]


Giving feedback

Participants were asked to provide feedback to improve the predictions

No restrictions on form of feedback


Responses to explanations

Negative comments (20%)
  …those are arbitrary words.

Confusion (8%)
  I don’t understand why there is a second email.

Positive comments (19%)
  The Resume rules are good.

Understanding (17%)
  I see why it used “Houston” as negative.

Correcting or suggesting changes (32%)
  Different words could have been found in common, like “Agreement”, “Ken Lay.”


Understanding explanations

Rule-based best, then Keyword-based

Serious problems with Similarity-based

Factors:
  General idea of the algorithm
    I guess it went in here because it was similar to another email I had already put in that folder.
  Keyword-based explanations’ negative keyword list
    I guess I really don’t understand what it’s doing here. If those words weren’t in the message?
  Word choices’ topical appropriateness
    “Day”, “soon”, and “listed” are incredibly arbitrary keywords.


Preferring explanations

Preference trend follows understanding

Factors:
  Perceived reasoning soundness and accuracy
    I think this is a really good filter…
  Clear communication of reasoning
    I like this because it shows relationships between other messages in the same folder rather than just spitting out a bunch of rules with no reason behind it.
  Informal wording
    This is funny... (laughs) ... This seems more personable. Seems like a narration rather than just straight rules. It’s almost like a conversation.


The user explains back

Select different features (53%)
  It should put email in ‘Enron News’ if it has the keywords “changes” and “policy”.

Adjust weights (12%)
  The second set of words should be given more importance.

Parse/extract in different way (10%)
  I think that it should look for typos in the punctuation for indicators toward ‘Personal’.

Employ feature combinations (5%)
  I think it would be better if it recognized a last and a first name together.

Use relational features (4%)
  This message should be in ‘Enron News’ since it is from the chairman of the company.


Underlying knowledge sources

Commonsense (36%)
  “Qualifications” would seem like a really good Resume word, I wonder why that’s not down here.

English (30%)
  Does the computer know the difference between “resumé” and “resume”?

Domain (15%)
  Different words could have been found in common like … “Ken Lay”.


Current work

More than 50% of suggestions could be easily incorporated

New algorithms to handle changes to weights and keywords
  User feedback as constraints on MLE of the parameters
  Co-training

Investigate effects on accuracy using study data
  Constraints: not hurting, but not much improvement either
  Co-training approach better
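To make the idea of keyword feedback concrete, here is a deliberately simplified stand-in. It is not the constrained-MLE formulation or the co-training approach named above, whose details are not given in these slides. It assumes a Naïve Bayes model with add-one smoothing and treats a statement such as "this word belongs to that folder" as a requirement that the word be more probable under that folder than under any other, satisfied by adding pseudo-counts. All function names and the toy emails are hypothetical.

```python
from collections import Counter, defaultdict
from typing import DefaultDict, List, Tuple

Counts = DefaultDict[str, Counter]


def train_counts(emails: List[Tuple[List[str], str]]) -> Counts:
    """Per-folder word counts from (words, folder) pairs."""
    counts: Counts = defaultdict(Counter)
    for words, folder in emails:
        counts[folder].update(words)
    return counts


def word_prob(counts: Counts, vocab_size: int, word: str, folder: str) -> float:
    """Add-one smoothed estimate of P(word | folder)."""
    total = sum(counts[folder].values())
    return (counts[folder][word] + 1) / (total + vocab_size)


def apply_keyword_feedback(
    counts: Counts, vocab_size: int, word: str, folder: str, max_steps: int = 1000
) -> Counts:
    """Add pseudo-counts until `word` is more probable under `folder` than elsewhere."""
    others = [f for f in counts if f != folder]
    for _ in range(max_steps):
        p = word_prob(counts, vocab_size, word, folder)
        if all(p > word_prob(counts, vocab_size, word, f) for f in others):
            break
        counts[folder][word] += 1
    return counts


emails = [
    (["changes", "policy", "meeting"], "Personal"),
    (["enron", "stock", "news"], "Enron News"),
    (["policy", "lunch"], "Personal"),
]
vocab_size = len({w for words, _ in emails for w in words})
counts = train_counts(emails)
# User feedback: "policy" should indicate 'Enron News'.
apply_keyword_feedback(counts, vocab_size, "policy", "Enron News")
print(counts["Enron News"]["policy"])  # 3 pseudo-counts were needed
```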


Conclusion

User costs important
  Higher accuracy
  Timing of prediction notifications
  Usefulness of predictions
  Explanations of why a prediction was made
