Personality Driven Differences in Paraphrase Preference

Daniel Preoţiuc-Pietro (Bloomberg LP). Joint work with Jordan Carpenter (Kenan Institute for Ethics, Duke) and Lyle Ungar (Computer & Information Science, UPenn), while at the University of Pennsylvania. 3 August 2017.


Page 1

Personality Driven Differences in Paraphrase Preference

Daniel Preoţiuc-Pietro
Bloomberg LP

Joint work with
Jordan Carpenter (Kenan Institute for Ethics, Duke)
Lyle Ungar (Computer & Information Science, UPenn)
while at the University of Pennsylvania

3 August 2017

Page 2

Motivation

User attribute prediction from text is successful:

- Age (Rao et al. 2010 ACL)
- Gender (Burger et al. 2011 EMNLP)
- Location (Eisenstein et al. 2010 EMNLP)
- Personality (Schwartz et al. 2013 PLoS One)
- Impact (Lampos et al. 2014 EACL)
- Political Orientation (Volkova et al. 2014 ACL)
- Mental Illness (Coppersmith et al. 2014 ACL)
- Occupation (Preotiuc-Pietro et al. 2015 ACL)
- Income (Preotiuc-Pietro et al. 2015 PLoS One)

Page 3

However...

Most text prediction methods uncover topical differences

[Figure: words associated with Openness to Experience; legend: relative frequency, correlation strength]

Page 4

However...

Most text prediction methods uncover topical differences

[Figure: words associated with Extraversion; legend: relative frequency, correlation strength]

Page 5

Stylistic differences

Topical differences are not useful for many practical applications that adapt to user traits:

- machine translation (Mirkin et al. 2015 EMNLP, Rabinovich et al. 2017 EACL)
- agents (e.g. customer service, tutoring)
- controlling for gender or racial bias

We need to be aware of stylistic differences rather than topical ones.

Page 6

Stylistic differences

One type of stylistic difference is phrase choice in context.

[Figure: example synonyms grouped by trait. Openness: Splendid, Excellent, Remarkable. Extraversion: Magnificent, Fabulous, Tremendous.]

Page 7

Data

We study the Big Five personality traits:

- 115,312 Facebook users
- Personality scores obtained through the MyPersonality app (Kosinski et al. 2013)
- For each trait, take top and bottom 20% of users
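The top and bottom 20% selection can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `split_extremes` and the input format (a dict from user id to trait score) are assumptions, and the slides do not say how ties or rounding were handled.

```python
def split_extremes(scores, fraction=0.2):
    """Return (low_group, high_group) lists of user ids for one trait.

    scores: dict mapping user id -> trait score (hypothetical input format).
    fraction: share of users taken from each end of the ranking.
    """
    ranked = sorted(scores, key=scores.get)  # user ids, ascending by score
    k = int(len(ranked) * fraction)          # group size, rounded down
    if k == 0:
        return [], []
    return ranked[:k], ranked[-k:]


# Usage with ten made-up users scored 0.0 .. 9.0
toy_scores = {f"u{i}": float(i) for i in range(10)}
low_group, high_group = split_extremes(toy_scores)
```

The two groups then serve as the two classes (e.g. introverts and extraverts) in the later slides.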

Page 8

Paraphrasing

Paraphrases: alternative ways to convey the same information

Paraphrase Database (PPDB) 2.0 (Pavlick et al. 2015 ACL):

- >6M automatically derived paraphrase pairs
- annotated with type and confidence (filter 'equivalent' paraphrases with >.2 confidence)
- we use only 1-3 grams
- difference in a pair more than just change of stopwords or root form of word
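A rough sketch of these filtering rules, under several assumptions: "filter" is read as "keep", the relation label string `"equivalent"` and the function signature are hypothetical (this is not the PPDB file format), the stopword list is a tiny placeholder, and the root-form check is left out because the slides do not say how root forms were computed.

```python
# Tiny illustrative stopword list (assumption; the actual list is not given).
STOPWORDS = {"a", "an", "the", "of", "to", "in", "is", "it"}


def keep_pair(phrase1, phrase2, relation, confidence,
              min_confidence=0.2, max_ngram=3):
    """Decide whether a paraphrase pair passes the filters on the slide."""
    # Keep only pairs labeled as equivalent with confidence above the threshold.
    if relation != "equivalent" or confidence <= min_confidence:
        return False
    tokens1 = phrase1.lower().split()
    tokens2 = phrase2.lower().split()
    # Keep only 1- to 3-grams on both sides.
    if not (1 <= len(tokens1) <= max_ngram and 1 <= len(tokens2) <= max_ngram):
        return False
    # Drop pairs that differ only in stopwords.
    content1 = [t for t in tokens1 if t not in STOPWORDS]
    content2 = [t for t in tokens2 if t not in STOPWORDS]
    # NOTE: pairs differing only in root form (e.g. inflections) are NOT
    # filtered here; that step would need a stemmer or lemmatizer.
    return content1 != content2
```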

Page 9

Prediction

[Figure: bar chart of accuracy (y-axis 0.50 to 0.65) per trait (Openness, Conscientiousness, Extraversion, Agreeableness, Neuroticism) for three feature sets (Paraphrases only, Phrases w/o paraphrases, All Phrases). Bar labels in source order: .603, .551, .519, .551, .549, .573, .589, .578, .553, .590, .623, .639, .597, .593, .631]

Accuracy, Naive Bayes, 90-10 training-testing, balanced data
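For readers unfamiliar with the setup, here is a minimal pure-Python sketch of a Naive Bayes classifier evaluated on a 90-10 split. The slide does not state which Naive Bayes variant, smoothing, or features were used, so the multinomial model with add-one smoothing, the function names, and the token-list input format are all assumptions; class balancing is assumed to have been done upstream.

```python
import math
import random
from collections import Counter


def train_nb(docs, labels):
    """docs: list of token lists; labels: parallel list of class labels."""
    class_counts = Counter(labels)
    token_counts = {label: Counter() for label in class_counts}
    for tokens, label in zip(docs, labels):
        token_counts[label].update(tokens)
    vocab = {t for counts in token_counts.values() for t in counts}
    return class_counts, token_counts, vocab


def predict_nb(model, tokens):
    """Return the class with the highest log posterior for a token list."""
    class_counts, token_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        counts = token_counts[label]
        denom = sum(counts.values()) + len(vocab)  # add-one smoothing
        score = math.log(class_counts[label] / total_docs)
        for t in tokens:
            if t in vocab:  # tokens never seen in training are ignored
                score += math.log((counts[t] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy_90_10(docs, labels, seed=0):
    """Shuffle, train on 90% of the documents, report accuracy on the rest."""
    idx = list(range(len(docs)))
    random.Random(seed).shuffle(idx)
    cut = int(0.9 * len(idx))
    train, test = idx[:cut], idx[cut:]
    model = train_nb([docs[i] for i in train], [labels[i] for i in train])
    hits = sum(predict_nb(model, docs[i]) == labels[i] for i in test)
    return hits / len(test)
```

With balanced classes, chance accuracy is 0.50, which is the baseline against which the bar values on this slide should be read.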

Page 10

Quantifying Preference

Straightforward measure:

$$\mathrm{Extraversion}(w) = \log\left(\frac{\mathrm{Extravert}(w)}{\mathrm{Introvert}(w)}\right) \tag{1}$$

Within a paraphrase pair (w1, w2), the difference Extraversion(w1) − Extraversion(w2) is the stylistic distance.

Used previously to study paraphrase preference across age, gender and occupational class (Preotiuc-Pietro, Xu & Ungar, AAAI 2016).
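Equation (1) and the stylistic distance can be computed as in the sketch below. It assumes that Extravert(w) and Introvert(w) are the relative frequencies of phrase w in text written by each group, which the slide does not spell out; the function names and the count dictionaries are hypothetical, and no smoothing is applied, so a phrase unseen in either group raises an error.

```python
import math


def trait_score(phrase, high_counts, low_counts, high_total, low_total):
    """Log ratio of a phrase's relative frequency in the high vs. low group."""
    high_rate = high_counts[phrase] / high_total  # e.g. Extravert(w)
    low_rate = low_counts[phrase] / low_total     # e.g. Introvert(w)
    return math.log(high_rate / low_rate)


def stylistic_distance(w1, w2, high_counts, low_counts, high_total, low_total):
    """Difference of the two scores within a paraphrase pair (w1, w2)."""
    args = (high_counts, low_counts, high_total, low_total)
    return trait_score(w1, *args) - trait_score(w2, *args)


# Made-up counts, for illustration only.
extravert_counts = {"fabulous": 30, "splendid": 10}
introvert_counts = {"fabulous": 10, "splendid": 30}
distance = stylistic_distance("fabulous", "splendid",
                              extravert_counts, introvert_counts, 1000, 1000)
```

A positive score means the phrase is relatively more frequent in the high-trait group; a positive distance means w1 is preferred over w2 by that group.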

Page 11

Linguistic Theories

Study which attributes of words in a pair are preferred by one group:

- Word Length in Characters
- Word Length in Syllables
  Simple proxies for word complexity
- Affective Norms: Valence, Arousal, Dominance
  14k rated words
  Valence: suicide (0.15) → bacon (0.70) → laughter (1)
- Concreteness
  40k rated words: spirituality (1) → morning (3.44) → tiger (5)
- Age of Acquisition
  30k rated words: great (5.05) → splendid (7.22) → tremendous (10.63)
- More in the paper ...

Page 12

Linguistic Theories

[Figure: horizontal bar chart, x-axis from -.200 to .200. Attributes: Age of Acquisition, Concreteness, Dominance, Arousal, Happiness, #Syllables, Word Length. Series: Openness, Conscientiousness, Extraversion, Agreeableness, Neuroticism. Bar labels in source order: .163, -.068, -.043, -.012, -.041, .067, .182, -.002, -.014, .036, -.001, .050, .045, .097, -.060, .010, .031, .028, .050, .047, .080, -.032, -.007, .030, .005, .040, .016, .010, -.014, .023, .000, -.024, .004, -.020, -.065]

Correlation coefficients between paraphrase pair preference and user group usage.
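One plausible way to obtain such a coefficient is sketched below: for every paraphrase pair, take the difference in a word attribute and the difference in trait score, then correlate the two across pairs. The slides do not specify the correlation type or the exact procedure, so the use of Pearson correlation, the function names, and the `trait_scores` dictionary are assumptions; word length in characters (`len`) stands in for any of the rated attributes.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def attribute_correlation(pairs, trait_scores, attribute=len):
    """Correlate attribute differences with trait-score differences.

    pairs: list of (w1, w2) paraphrase pairs.
    trait_scores: dict mapping phrase -> log-ratio score for one trait.
    attribute: function mapping a phrase to a number (default: length in chars).
    """
    attr_diffs = [attribute(w1) - attribute(w2) for w1, w2 in pairs]
    pref_diffs = [trait_scores[w1] - trait_scores[w2] for w1, w2 in pairs]
    return pearson(attr_diffs, pref_diffs)
```

To use a rated attribute such as Age of Acquisition instead of length, pass a lookup into the rating table as `attribute`, for example `attribute=aoa_ratings.get` for a hypothetical dict `aoa_ratings`, after restricting `pairs` to words that have a rating.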


Page 14

Take Aways

- Stylistic differences between user groups have important applications
- Paraphrase choice contains valuable information
- Sheds light on psycholinguistic theories
- Potential way to generate text perceived to be from a different user trait

See our EMNLP 2017 paper (Preotiuc-Pietro, Guntuku, Ungar: Controlling Human Perception of Basic User Traits)

Page 15

Thank you!

Questions?