Modeling Intention in Email

Vitor R. Carvalho

Ph.D. Thesis Defense, July 22, 2008
Language Technologies Institute, School of Computer Science, Carnegie Mellon University

Thesis Committee: William W. Cohen (chair), Tom M. Mitchell, Robert E. Kraut, Lise Getoor (Univ. Maryland)




Outline

1. Motivation

2. Email Acts

3. Email Leaks

4. Recommending Email Recipients

5. Learning Robust Rank Models

6. User Study


Why Email

The most successful e-communication application
- Great tool to collaborate, especially in different time zones.
- Very cheap, fast, and convenient.
- Multiple uses: task manager, contact manager, doc archive, to-do list, etc.

Increasingly popular
- The Clinton administration left 32 million emails to the National Archives.
- The Bush administration: more than 100 million in 2009 (expected).

Visible impact
- Office workers in the U.S. spend at least 25% of the day on email, not counting handheld use.

[Shipley & Schwalbe, 2007]


Hard to manage

People get overwhelmed
- Costly interruptions
- Serious impacts on work productivity
- Increasingly difficult to manage requests, negotiate shared tasks and keep track of different commitments

People make horrible mistakes
- Send messages to the wrong persons
- Forget to address intended recipients
- “Oops, did I just hit reply-to-all?”

[Dabbish & Kraut, CSCW-2006] [Bellotti et al., HCI-2005]


Thesis

►We present evidence that email management can be potentially improved by the effective use of machine learning techniques to model different aspects of user intention.


Outline

1. Motivation

2. Email Acts ◄

3. Preventing Email Information Leaks

4. Recommending Email Recipients

5. Learning Robust Rank Models

6. User Study


Classifying Email into Acts [Cohen, Carvalho & Mitchell, EMNLP-04]

[Figure: taxonomy of email acts. Verb nodes: Commissive, Directive, Deliver, Commit, Request, Propose, Amend. Noun nodes: Activity, Ongoing, Event, Meeting, Other, Delivery, Opinion, Data.]

- An act is described as a verb-noun pair (e.g., propose meeting, request information). Not all pairs make sense.
- A single email message may contain multiple acts.
- The taxonomy tries to describe commonly observed behaviors, rather than all possible speech acts in English.
- It also includes non-linguistic uses of email (e.g., delivery of files).


Data & Features

Data: Carnegie Mellon MBA students competition
- Semester-long project for CMU MBA students. A total of 277 students, divided into 50 teams (4 to 6 students per team). Rich in task negotiation.
- 1700+ messages (from 5 teams) were manually labeled. One of the teams was double labeled, and the inter-annotator agreement ranges from 0.72 to 0.83 (Kappa) for the most frequent acts.

Features
- N-grams: 1-gram, 2-gram, 3-gram, 4-gram and 5-gram
- Pre-processing:
  - Remove signature files and quoted lines (in-reply-to) [Jangada package]
  - Entity normalization and substitution patterns: “Sunday”…“Monday” → [day]; [number]:[number] → [hour]; “me, her, him, us or them” → [me]; “after, before, or during” → [time]; etc.
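The pre-processing and n-gram extraction described on this slide can be sketched in a few lines of Python. This is only an illustration: the pattern list below is a hypothetical subset modeled on the four examples in the slide (the full list is not given here), and the function names `normalize` and `ngrams` are made up for this sketch.

```python
import re

# Hypothetical substitution patterns, modeled on the slide's examples only.
PATTERNS = [
    (re.compile(r"\b(sunday|monday|tuesday|wednesday|thursday|friday|saturday)\b"), "[day]"),
    (re.compile(r"\b\d+:\d+\b"), "[hour]"),
    (re.compile(r"\b(me|her|him|us|them)\b"), "[me]"),
    (re.compile(r"\b(after|before|during)\b"), "[time]"),
]


def normalize(text):
    """Lowercase the text and replace entities with placeholder tokens."""
    text = text.lower()
    for pattern, token in PATTERNS:
        text = pattern.sub(token, text)
    return text


def ngrams(text, max_n=5):
    """Return all 1-gram to max_n-gram features of the normalized text."""
    tokens = normalize(text).split()
    features = []
    for n in range(1, max_n + 1):
        for start in range(len(tokens) - n + 1):
            features.append(" ".join(tokens[start:start + n]))
    return features


print(normalize("Call me Monday at 10:30"))
```

Normalizing before extracting n-grams means that "call me monday" and "call him friday" map to the same feature, which is presumably the point of the substitution step.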


Classification Performance

5-fold cross-validation over 1716 emails, SVM with linear kernel

[Carvalho & Cohen, HLT-ACTS-06][Cohen, Carvalho & Mitchell, EMNLP-04]


Predicting Acts from Surrounding Acts [Carvalho & Cohen, SIGIR-05]

[Figure: example of an email thread sequence, with messages labeled by acts such as Deliver, Request, Commit and Propose.]

- Strong correlation between the acts of the previous and next messages
- Both context and content have predictive value for email act classification
- Collective classification problem → Dependency Network


Collective Classification with Dependency Networks (DN) [Carvalho & Cohen, SIGIR-05]

In DNs, the full joint probability distribution is approximated with a set of conditional distributions that can be learned independently. The conditional probabilities are calculated for each node given its Markov blanket:

$$\Pr(X) = \prod_i \Pr(X_i \mid \mathrm{Blanket}(X_i))$$

[Figure: dependency network linking the Parent Message, Current Message and Child Message, each with act nodes such as Request, Commit and Deliver.]

Inference: temperature-driven Gibbs sampling [Heckerman et al., JMLR-00] [Neville & Jensen, JMLR-07]


Act by Act Comparative Results

Kappa values (%) with and without collective classification, averaged over four team test sets in the leave-one-team-out experiment:

Act          Baseline   Collective
Commissive   37.66      42.55
Commit       30.74      32.77
Meeting      47.81      52.42
Directive    58.27      58.37
Request      47.25      49.55
Propose      36.84      40.72
Deliver      42.01      38.69
dData        44.98      43.44

- Modest improvements over the baseline
- Only on acts related to negotiation: Request, Commit, Propose, Meet, Commissive, etc.
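The slides report Kappa values without spelling out the statistic. As a reminder, a minimal sketch of Cohen's kappa for two label sequences (two annotators, or a classifier against gold labels) is shown below. It assumes the standard two-rater definition; the slides do not say which variant was used, and the function name `cohen_kappa` is made up for this sketch.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two label sequences, corrected for chance."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items labeled identically.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the marginal label frequencies, summed over labels.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both sequences use a single identical label
    return (observed - expected) / (1.0 - expected)


# Toy example: 1 = message contains a Request, 0 = it does not.
print(cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0]))
```

In the toy example the two sequences agree on 3 of 4 items (0.75), chance agreement is 0.5, so kappa is 0.5.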


Key Ideas Summary

- Introduced a new taxonomy of acts tailored to email communication
- Good levels of inter-annotator agreement
- Showed that the classification can be automated
- Proposed a collective classification algorithm for threaded messages

Related work: Speech Act Theory [Austin, 1962; Searle, 1969], the Coordinator system [Winograd, 1987], Dialog Acts for speech recognition, machine translation, and other dialog-based systems [Stolcke et al., 2000] [Levin et al., 03], etc.

Related applications: focus message in threads/discussions [Feng et al., 2006], action-item discovery [Bennett & Carbonell, 2005], task-focused email summary [Corston-Oliver et al., 2004], predicting social roles [Leuski, 2004], etc.


Applications of Email Acts

- Iterative Learning of Email Tasks and Email Acts [Kushmerick & Khousainov, IJCAI-05]
- Predicting Social Roles and Group Leadership [Leuski, SIGIR-04] [Carvalho et al., CEAS-07]
- Detecting Focus on Threaded Discussions [Feng et al., HLT/NAACL-06]
- Semantically Enhanced Email [Scerri et al., DEXA-07]
- Email Act Taxonomy Refinements [Lampert et al., AAAI-2008 EMAIL]


Outline

1. Motivation

2. Email Acts

3. Preventing Email Information Leaks ◄

4. Recommending Email Recipients ◄

5. Learning Robust Rank Models

6. User Study


Preventing Email Info Leaks [Carvalho & Cohen, SDM-07]

Email leak: an email accidentally sent to the wrong person. Common causes:

1. Similar first or last names, aliases, etc.
2. Aggressive auto-completion of email addresses
3. Typos
4. Keyboard settings

Disastrous consequences: expensive lawsuits, brand reputation damage, negotiation setbacks, etc.


Preventing Email Info Leaks [Carvalho & Cohen, SDM-07]

Method

1. Create simulated/artificial email recipients.
2. Build a model for (msg.recipients): train a classifier on real data to detect synthetically created outliers (added to the true recipient list). Features: textual (subject, body) and network features (frequencies, co-occurrences, etc.).
3. Detect potential outliers and warn the user based on confidence.


Simulating Email Leaks

Several options: frequent typos, same/similar last names, identical/similar first names, aggressive auto-completion of addresses, etc.

We adopted the 3g-address criteria: on each trial, one of the msg recipients is randomly chosen and an outlier is generated according to:

[Figure: three-step procedure illustrated with the address Marina.wang@enron.com. Recoverable labels: "Generate a random email address NOT in Address Book"; "Else: Randomly select an address book entry".]


Data and Baselines

- Enron email dataset, with a realistic setting
- For each user, the ~10% most recent sent messages were used as test
- Some basic preprocessing

Baseline methods (textual similarity; common baselines in IR):
- Rocchio/TFIDF Centroid [1971]: create a “TfIdf centroid” for each user in the Address Book. For testing, rank according to cosine similarity between the test msg and each centroid.
- Knn-30 [Yang & Chute, 1994]: given a test msg, get the 30 most similar msgs in the training set. Rank according to the “sum of similarities” of a given user on the 30-msg set.
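The two textual baselines can be sketched as follows. This is a minimal illustration, not the experimental code: the smoothed IDF formula, the tokenized toy messages and all function names are assumptions made for this sketch, and real messages would first go through the pre-processing mentioned above.

```python
import math
from collections import Counter, defaultdict


def build_idf(docs):
    """docs: list of token lists. Smoothed IDF (an illustrative choice)."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    return {tok: math.log(1.0 + n / count) for tok, count in df.items()}


def unit(vec):
    """Scale a sparse vector (dict) to unit length."""
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {t: v / norm for t, v in vec.items()} if norm else {}


def tfidf(tokens, idf):
    tf = Counter(t for t in tokens if t in idf)
    return unit({t: c * idf[t] for t, c in tf.items()})


def cosine(u, v):
    # Both inputs are unit vectors, so the dot product is the cosine.
    return sum(w * v.get(t, 0.0) for t, w in u.items())


def centroid_ranking(train, test_tokens):
    """train: list of (tokens, recipients). Rank recipients by centroid similarity."""
    idf = build_idf([tokens for tokens, _ in train])
    sums = defaultdict(lambda: defaultdict(float))
    for tokens, recipients in train:
        vec = tfidf(tokens, idf)
        for r in recipients:
            for t, w in vec.items():
                sums[r][t] += w
    query = tfidf(test_tokens, idf)
    scores = {r: cosine(query, unit(vec)) for r, vec in sums.items()}
    return sorted(scores, key=lambda r: (-scores[r], r))


def knn_ranking(train, test_tokens, k=30):
    """Rank recipients by the sum of similarities over the k nearest messages."""
    idf = build_idf([tokens for tokens, _ in train])
    query = tfidf(test_tokens, idf)
    sims = [(cosine(query, tfidf(tokens, idf)), recipients) for tokens, recipients in train]
    sims.sort(key=lambda pair: pair[0], reverse=True)
    scores = defaultdict(float)
    for sim, recipients in sims[:k]:
        for r in recipients:
            scores[r] += sim
    return sorted(scores, key=lambda r: (-scores[r], r))


TRAIN = [
    (["budget", "meeting", "monday"], ["alice"]),
    (["budget", "report", "draft"], ["alice"]),
    (["soccer", "game", "saturday"], ["bob"]),
]
print(centroid_ranking(TRAIN, ["budget", "meeting"]))
print(knn_ranking(TRAIN, ["budget", "meeting"]))
```

For leak detection the same scores would be read from the bottom: the recipient with the lowest score for a given message is the most likely outlier.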


Using Non-Textual Features

1. Frequency features: number of received, sent and sent+received messages (from this user).
2. Co-occurrence features: number of times a user co-occurred with all other recipients.
3. Auto features: for each recipient R, find Rm (= address with max score from the 3g-address list of R), then use score(R)-score(Rm) as a feature.

Combine the non-textual features with the text-based feature (KNN30 score or TFIDF score) using perceptron-based reranking, trained on simulated leaks.
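The frequency and co-occurrence features are simple counts over the training messages. A minimal sketch, assuming the training data is available as a list of recipient lists for sent messages and a list of sender addresses for received messages (this data layout and the function names are assumptions of the sketch):

```python
def frequency_features(sent, received, candidate):
    """sent: recipient lists of messages the user sent.
    received: sender addresses of messages the user received."""
    n_sent = sum(candidate in recipients for recipients in sent)
    n_received = sum(sender == candidate for sender in received)
    return {"sent": n_sent, "received": n_received, "sent+received": n_sent + n_received}


def cooccurrence_feature(sent, candidate, other_recipients):
    """Times the candidate was addressed together with each of the other
    recipients of the current message, summed over training messages."""
    total = 0
    for recipients in sent:
        if candidate in recipients:
            total += sum(1 for other in other_recipients
                         if other != candidate and other in recipients)
    return total


SENT = [["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
RECEIVED = ["a", "a", "b"]
print(frequency_features(SENT, RECEIVED, "a"))
print(cooccurrence_feature(SENT, "a", ["b", "c"]))
```

These counts, together with the text-based score, would form the feature vector that the reranker sees for each candidate recipient.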


Email Leak Results [Carvalho & Cohen, SDM-07]

Leak prediction accuracy (Prec@1):

Method             Prec@1
Random             0.406
TfIdf              0.558
Knn30              0.56
Knn30+Frequency    0.804
Knn30+Cooccur      0.748
Knn30+Auto         0.718
Knn30+All_Above    0.814


Finding Real Leaks in Enron

How can we find them?
- Grep for “mistake”, “sorry” or “accident”, e.g. “Sorry. Sent this to you by mistake.” or “I accidentally sent you this reminder”
- Note: the message must be from one of the Enron users

Found 2 valid cases:
1. Message germany-c/sent/930: it has 20 recipients; the leak is alex.perkins@
2. Message kitchen-l/sent items/497: it has 44 recipients; the leak is rita.wynne@

Prediction results: the proposed algorithm was able to find these two leaks.


Another Email Addressing Problem

Sometimes people just forget an intended recipient


Forgetting an intended recipient [Carvalho & Cohen, ECIR-2008]

Particularly in large organizations, it is not uncommon to forget to CC an important collaborator: a manager, a colleague, a contractor, an intern, etc.

More frequent than expected (from the Enron collection):
- At least 9.27% of the users have forgotten to add a desired email recipient.
- At least 20.52% of the users were not included as recipients (even though they were intended recipients) in at least one received message.

The cost of errors in task management can be high:
- Communication delays, missed deadlines
- Wasted opportunities, costly misunderstandings, task delays


Data and Features

Easy to obtain labeled data

Two ranking problems:
- Predicting TO+CC+BCC
- Predicting CC+BCC

Features & Methods
- Textual: Rocchio (TFIDF) and KNN
- Non-textual: Frequency, Recency and Co-Occurrence
  - Frequency: number of messages received and/or sent (from/to this user)
  - Recency: how often a particular user was addressed in the last 100 msgs
  - Co-occurrence: number of times a user co-occurred with all other recipients. “Co-occur” means “two recipients were addressed in the same message in the training set”.


Email Recipient Recommendation [Carvalho & Cohen, ECIR-08]

[Figure: bar chart of MAP (y-axis, 0.15 to 0.5) on the TOCCBCC and CCBCC tasks for Frequency, Recency, TFIDF, KNN and Perceptron.]

- 36 Enron users
- 44000+ queries (avg: ~1267 q/user)
- MRR ~ 0.5


Rank Aggregation

Many “Data Fusion” methods, of 2 types:
- Normalized scores: CombSUM, CombMNZ, etc.
- Unnormalized scores: BordaCount, Reciprocal Rank Sum, etc.

Reciprocal Rank: the sum of the inverse of the rank of a document in each ranking.

$$RR(d_i) = \sum_{q \in \mathrm{Rankings}} \frac{1}{\mathrm{rank}_q(d_i)}$$

[Aslam & Montague, 2001]; [Ogilvie & Callan, 2003]; [Macdonald & Ounis, 2006]
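Reciprocal rank fusion is short enough to show in full. The sketch below sums 1/rank over the input rankings and sorts by the total. It assumes that a candidate missing from a ranking simply contributes nothing, and it breaks ties alphabetically; both choices are assumptions of this sketch rather than something stated on the slide.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings):
    """rankings: list of ranked lists (best first). Returns the fused ranking."""
    scores = defaultdict(float)
    for ranking in rankings:
        for position, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / position
    # Highest summed reciprocal rank first; ties broken alphabetically.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Toy example: three base rankers (say Frequency, Recency and TFIDF) over three candidates.
RANKINGS = [
    ["ann", "bob", "cat"],
    ["bob", "ann", "cat"],
    ["bob", "cat", "ann"],
]
print(reciprocal_rank_fusion(RANKINGS))
```

Because only ranks are used, the base rankers' scores never need to be put on a common scale, which is what makes this kind of fusion cheap to compute.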


Rank Aggregation Results

[Figure: bar chart of MAP (y-axis, 0.15 to 0.55) on the TOCCBCC and CCBCC tasks for Frequency, Recency, TFIDF, KNN, Perceptron and Fusion (MRR).]


Intelligent Email Auto-completion [Carvalho & Cohen, ECIR-08]

[Figure: two panels, TOCCBCC and CCBCC.]


Related Work

Email Leak
- [Boufaden et al., 2005]: proposed a privacy enforcement system to monitor specific privacy breaches (student names, student grades, IDs).
- [Lieberman and Miller, 2007]: prevent leaks based on faces

Recipient Recommendation
- [Pal & McCallum, 2006], [Dredze et al., 2008]: CC prediction problem, recipient prediction based on summary keywords

Expert Search in Email
- [Dom et al., 2003], [Campbell et al., 2003], [Balog & de Rijke, 2006], [Balog et al., 2006], [Soboroff, Craswell, de Vries (TREC-Enterprise 2005-06-07…)]


Outline

1. Motivation

2. Email Acts

3. Preventing Email Information Leaks

4. Recommending Email Recipients

5. Learning Robust Ranking Models ◄

6. User Study


Can we learn a better ranking function?

Learning to Rank: machine learning to improve ranking. Many recently proposed methods:
- RankSVM [Joachims, KDD-02]
- RankBoost [Freund et al., 2003]
- Committee of Perceptrons [Elsas, Carvalho & Carbonell, WSDM-08]

Meta-learning method: learn robust ranking models in the pairwise-based framework


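Pairwise-based Ranking

Goal: induce a ranking function f(d) s.t.

$$d_i \succ d_j \iff f(d_i) > f(d_j)$$

We assume a linear function f:

$$f(d_i) = \langle w, d_i \rangle = w_1 x_{i1} + w_2 x_{i2} + \dots + w_m x_{im}$$

Constraints (one per paired instance):

$$d_i \succ d_j \iff \langle w, d_i - d_j \rangle > 0$$

[Figure: ranked list d_1, d_2, ..., d_T for query q, where each document is a feature vector, e.g. $d_6 = (x_{61}, x_{62}, \dots, x_{6m})$.]

O(n) mislabels produce O(n^2) mislabeled pairs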
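The claim that O(n) mislabels produce O(n^2) mislabeled pairs is easy to check on a toy example. The sketch below builds the preference pairs implied by a list of relevance labels and counts how many pairs become wrong when labels are flipped; the binary labels and function names are illustrative assumptions.

```python
from itertools import combinations


def preference_pairs(labels):
    """Set of (i, j) pairs where document i is labeled more relevant than j."""
    pairs = set()
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] > labels[j]:
            pairs.add((i, j))
        elif labels[j] > labels[i]:
            pairs.add((j, i))
    return pairs


def mislabeled_pairs(true_labels, noisy_labels):
    """Pairs asserted by the noisy labels that the true labels do not support."""
    return preference_pairs(noisy_labels) - preference_pairs(true_labels)


TRUE_LABELS = [1] * 5 + [0] * 5     # 5 relevant, 5 non-relevant documents
ONE_FLIP = [0] + [1] * 4 + [0] * 5  # document 0 wrongly marked non-relevant
TWO_FLIPS = [0, 0] + [1] * 3 + [0] * 5

print(len(mislabeled_pairs(TRUE_LABELS, ONE_FLIP)))
print(len(mislabeled_pairs(TRUE_LABELS, TWO_FLIPS)))
```

A single flipped label already corrupts one pair per remaining relevant document, so the number of bad training pairs grows much faster than the number of bad labels. This is the motivation for a loss that tolerates pairwise outliers.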


Effect of Pairwise Outliers

[Figure: RankSVM on the SEAL-1 dataset.]


Effect of Pairwise Outliers

[Figure: RankSVM loss function; loss (y-axis, -0.5 to 2) versus pairwise score (x-axis, -3 to 3).]


Effect of Pairwise Outliers

[Figure: loss function plot; loss (y-axis, -0.5 to 2) versus pairwise score (x-axis, -3 to 3), showing the RankSVM curve and the annotation "Robust to outliers, but not convex".]


Ranking Models: 2 Stages

1. Base Ranker: base ranking model, e.g., RankSVM, Perceptron, ListNet, etc.
2. Sigmoid Rank: final model.
   - Non-convex: minimizes (a very close approximation for) the empirical error, i.e. the number of misranks
   - Robust to outliers (label noise)


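Learning

SigmoidRank loss:

$$\min_w L_{\mathrm{SigmoidRank}}(w) = \lambda \lVert w \rVert^2 + \sum_{(d_R, d_{NR}) \in RP} \left[ 1 - \mathrm{sigmoid}(\langle w, d_R - d_{NR} \rangle) \right], \qquad \mathrm{sigmoid}(x) = \frac{1}{1 + e^{-x}}$$

Learning with gradient descent:

$$w^{(k+1)} = w^{(k)} - \eta_k \nabla L_{\mathrm{SigmoidRank}}(w^{(k)})$$

$$\nabla L_{\mathrm{SigmoidRank}}(w) = 2 \lambda w - \sum_{(d_R, d_{NR}) \in RP} \mathrm{sigmoid}(\langle w, d_R - d_{NR} \rangle) \left[ 1 - \mathrm{sigmoid}(\langle w, d_R - d_{NR} \rangle) \right] (d_R - d_{NR})$$

Here d_R and d_NR denote the two documents of a pair in RP.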
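A minimal gradient-descent sketch of this second stage follows. It assumes the loss has the form λ‖w‖² plus the sum over pairs of 1 − sigmoid(⟨w, d_R − d_NR⟩), with a constant step size; the regularization weight, step size, toy data and function names are illustrative assumptions. In the two-stage setup the starting vector would come from the base ranker, whereas here it is simply zero.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sigmoid_rank_loss(w, pair_diffs, lam):
    """pair_diffs: list of difference vectors d_R - d_NR, one per training pair."""
    regularizer = lam * dot(w, w)
    misrank = sum(1.0 - sigmoid(dot(w, diff)) for diff in pair_diffs)
    return regularizer + misrank


def sigmoid_rank_gradient(w, pair_diffs, lam):
    grad = [2.0 * lam * wi for wi in w]
    for diff in pair_diffs:
        s = sigmoid(dot(w, diff))
        for k, x in enumerate(diff):
            grad[k] -= s * (1.0 - s) * x
    return grad


def train(w0, pair_diffs, lam=0.01, eta=0.1, steps=200):
    """Plain gradient descent with a constant step size eta."""
    w = list(w0)
    for _ in range(steps):
        grad = sigmoid_rank_gradient(w, pair_diffs, lam)
        w = [wi - eta * gi for wi, gi in zip(w, grad)]
    return w


# Toy pairs; the last one points the "wrong" way, like a mislabeled pair.
PAIR_DIFFS = [(1.0, 0.0), (0.5, -0.5), (1.0, 0.2), (-0.3, 0.1)]
W0 = [0.0, 0.0]
W_FINAL = train(W0, PAIR_DIFFS)
print(sigmoid_rank_loss(W0, PAIR_DIFFS, 0.01), sigmoid_rank_loss(W_FINAL, PAIR_DIFFS, 0.01))
```

Because each pair contributes at most 1 to the loss, a badly misranked (possibly mislabeled) pair cannot dominate the objective the way it can under a hinge loss.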


Email Recipient Results [Carvalho, Elsas, Cohen and Carbonell, SIGIR 2008 LR4IR]

[Figure: bar chart of MAP (y-axis, 0.3 to 0.55) on the TOCCBCC task for Frequency, Recency, TFIDF, KNN, Percep, MRR (Fusion), RankSVM, RankSVM+Sigmoid and Percep+Sigmoid.]


Email Recipient Results [Carvalho, Elsas, Cohen and Carbonell, SIGIR 2008 LR4IR]

[Figure: bar chart of MAP (y-axis, 0.3 to 0.55) on the CCBCC task for Frequency, Recency, TFIDF, KNN, Percep, MRR (Fusion), RankSVM, RankSVM+Sigmoid and Percep+Sigmoid.]


Set Expansion (SEAL) Results [Carvalho et al., SIGIR 2008 LR4IR]

[Figure: bar chart of MAP (y-axis, 0.8 to 0.94) on SEAL-1, SEAL-2 and SEAL-3 for Percep, Percep+Sigmoid, RankSVM and RankSVM+Sigmoid.]

SEAL data [Wang & Cohen, ICDM-2007]: 18 features, ~120/60 train/test splits, ~half relevant


Letor Results [Carvalho et al., SIGIR 2008 LR4IR]

[Figure: bar chart of MAP (y-axis, 0 to 0.5) on Ohsumed, Trec3 and Trec4 for Percep, Percep+Sigmoid, RankSVM and RankSVM+Sigmoid.]

#queries/#features: Ohsumed (106/25), Trec3 (50/44), Trec4 (75/44)


Related Work

Classification with non-convex loss functions: tradeoff for outlier robustness, accuracy, scalability, etc.
- [Perez-Cruz et al., 2003], [Xu et al., 2006], [Zhan & Shen, 2005], [Collobert et al., 2006], [Liu et al., 2005], [Yang and Hu, 2008]

Ranking with other non-convex loss functions
- FRank [Tsai et al., 2007]: a fidelity-based loss function optimized in the boosting framework; query normalization may be interfering in performance gains; not a general stage (meta) learner


Outline

1. Motivation

2. Email Acts

3. Preventing Email Information Leaks

4. Recommending Email Recipients

5. Learning Robust Ranking Models

6. User Study ◄


User Study

Choosing an Email System:
- Gmail, Yahoo! Mail, etc.: widely adopted, interface/compatibility issues
- Develop a new client: perfect control, longer development, low adoption
- Mozilla Thunderbird: open source community, easy mechanism to install extensions, millions of users


User Study: Cut Once

Cut Once, a Mozilla Thunderbird extension for Leak Detection and Recipient Recommendation [Balasubramanyan, Carvalho and Cohen, AAAI-2008 EMAIL]

A few issues:
- Poor documentation and limited customization of the interface
- JavaScript is slow, which imposed computational restrictions:
  - Disregard rare words and rare users.
  - Implement two ‘lightweight’ ranking methods: 1) TFIDF, 2) MRR (Frequency, Recency, TFIDF)


Cut Once Screenshots

Main Window after installation


Cut Once Screenshots

Logged: Cut Once Usage
- Time, Confidence and Position in rank of clicked recommendations
- Baseline Ranking Method


User Study Description

- 4-week-long study; most subjects from Pittsburgh
- After 1 week, qualified users were invited to continue. 20% of the compensation was paid after 1 week.
- After 4 weeks, users were fully compensated (+ final questionnaire)

26 subjects finished the study:
- 4 female and 22 male. Median age 28.5.
- Total of 2315 sent messages. Average: 113 address book entries.
- Mostly students. A few sys admins, 1 professor, 2 staff members.

Randomly assigned to two different ranking methods: TFIDF and MRR


Recipient Suggestions

- 17 subjects used the functionality (in 5.28% of their sent msgs).
- Average of 1 accepted suggestion per 24.37 sent messages.


Comparison of Ranking Methods

Clicked rank: MRR better than TFIDF
- Average Rank: 3.14 versus 3.69
- Rank Quality: 3.51 versus 3.43

Difference is not statistically significant.

Rough estimate: factor of 5.5
- 5.5*4 = 22 weeks of user study, or
- 5.5*26 = 143 subjects for 4 weeks


Results: Leak Detection

18 subjects used the leak detection (in 2.75% of their sent msgs).

The most frequently reported use was to “clean up” the addressee list:
- Removing unwanted people (inserted by Reply-all)
- Removing themselves (automatically added)

5 real leaks were reported, from 4 different users. These users did not use Cut Once to remove the leaks: they clicked on the “cancel” button and removed the addresses manually.
- Uncomfortable or unfamiliar with the interface
- Under pressure because of the 10-sec timer


Results: Leak Detection

5 leaks → 4 users:
- Network admin: two users with similar userIDs
- System admin: wrong auto-completion in 2 or 3 situations
- Undergrad: two acquaintances with similar names
- Grad student: reply-all case

Correlations:
- 2 users used TFIDF, 2 used MRR
- No significant correlation with the size of the Address Book or #sent msgs
- Correlation with “non-student” occupations (95% confidence)

Estimate: 1 leak every 463 sent messages. Assuming a binomial distribution with p=5/2315, 1066 messages are required to send at least one leak (with 90% confidence).
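The estimate can be re-derived in a few lines. The sketch below assumes that "at least one leak with 90% confidence" means the smallest n with 1 − (1 − p)^n ≥ 0.9, where p = 5/2315 is the observed leak rate per sent message; that reading and the function name are assumptions of this sketch. Under it, the calculation gives about 1065 messages, within a message of the 1066 quoted on the slide; the small gap presumably comes from rounding in the original calculation.

```python
import math


def messages_for_one_leak(p, confidence):
    """Smallest n such that P(at least one leak in n messages) >= confidence,
    assuming independent messages that each leak with probability p."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))


leaks, sent_messages = 5, 2315
p = leaks / sent_messages

print(round(1 / p))                    # messages per leak
print(messages_for_one_leak(p, 0.90))  # messages needed for 90% confidence
```

The same function shows how the number scales: lowering the confidence or raising the leak rate shrinks the number of messages needed.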


Final Questionnaire (Higher = Better)


Frequent complaints

Training and Optimization

Interface


Final Questionnaire

Compose-then-address instead of address-then-compose behavior


Conclusions

Email acts
- A taxonomy of intentions in email communication
- Categorization can be automated

Addressed two types of email addressing mistakes
- Email leaks (accidentally adding non-intended recipients)
- Recipient recommendation (forgetting intended recipients)
- Framed as supervised learning problems
- Introduced new methods for leak detection
- Proposed several models for recipient recommendation, including combinations of base methods

Proposed a new general-purpose ranking algorithm
- Robust to outliers: outperformed state-of-the-art rankers on the recipient recommendation task (and other ranking tasks)

User study using a Mozilla Thunderbird extension
- Caught 5 real leaks, and showed reasonably good prediction quality
- Showed clear potential to be adopted by a large number of email users


Proof: non-existence of a better advisor

Given the finite set of good advisors An


Proof: non-existence of a better advisor

Q.E.D.

Assuming

Given the finite set of advisors An


Thank you.