Social media analysis with NLP Michael Miller Yoder 21 November 2019 1


Page 1: Social media analysis 21 November 2019 with NLP Michael Miller Yoderdemo.clab.cs.cmu.edu/NLP/F19/files/slides/Yoder_social... · 2019-11-22 · with NLP Michael Miller Yoder 21 November

Social media analysis with NLP

Michael Miller Yoder

21 November 2019


Overview

1. Motivation: language in social context

2. Examples of NLP approaches to modeling identity

Experiment 1: Effects of self-presentation on interaction in social media

Experiment 2: Portrayal of characters and relationships in narrative (fanfiction)


language embedded in social context


What types of social contexts is language used in?


What types of social contexts?


For NLP, what is language?


Timeline (1990, 2000, 2010): statistical machine learning NLP, followed by neural NLP. Penn Treebank: news, 1987-1989. BERT.


Diagram labels: SOCIAL; language; speakers; audience; situations; purposes


Penn Treebank

1987-1989

credit: Amir Zeldes, [Zeldes & Simonson 2016]

Typical rates in the secondary market : 8.65 % one month ; 8.65 % three months ; 8.55 % six months. BANKERS ACCEPTANCES : 8.52 % 30 days ; 8.37 % 60 days ; 8.15 % 90 days ; 7.98 % 120 days ; 7.92 % 150 days ; 7.80 % 180 days.


language is always embedded in social context


“Language is by and about people”

—Noah Smith, ACL 2017

https://homes.cs.washington.edu/~nasmith/slides/acl-8-1-17.pdf


NLP + social science: applications

hate speech detection

community norms


fairness and bias

Garg et al. 2017

media framing

https://criticalmediareview.wordpress.com/2015/10/19/what-is-media-framing/


dialectal NLP tools

Garg et al. 2017; www.tes.com


Overview

1. Motivation: language in social context

2. Examples of NLP approaches to modeling identity

Experiment 1: Effects of self-presentation on interaction in social media

Experiment 2: Portrayal of characters and relationships in narrative (fanfiction)


Models of identity


Critical identity approaches

“identity is the product rather than the source of linguistic and other semiotic practices … is social and cultural rather than primarily internal”

sociolinguistics

[Bucholtz and Hall 2005]

Diagram labels: identity; society, culture

Critical identity approaches

“As a shifting and contextual phenomenon, gender does not denote a substantive being”

gender studies

[Butler 1990]

Diagram labels: (changing) identity; society, culture

Critical identity approaches

“race and sex become grounded in experiences that actually represent only a subset of a much more complex phenomenon.”

critical race theory

[Crenshaw 1989]

Diagram label: (intersectional) identity

Critical identity approaches

“people have multiple identities connected not to their ‘internal states’ but to their performances in society”

discourse analysis

[Gee 2000]

Diagram label: identities

Computational identity approaches

“classify latent user attributes, including gender, age, regional origin, and political orientation solely from Twitter user language”

computer science

[Rao et al. 2010]

Diagram label: identity

Computational identity approaches

“Inferring latent attributes of online users has many applications in public health, politics, and marketing”

computational linguistics

[Ardehaly and Culotta 2015]

Diagram label: identity

Computational identity approaches

“a [deep neural network] can be used to identify sexual orientation from facial images”

computer vision

[Kosinski and Wang 2018]

Diagram label: identity

Can we investigate the production of identity in language with computational models?


Avoid naturalizing structures of identity and further marginalizing those who don’t fit them (Butler 1990)

Discover how notions of identity are being reinforced/challenged/reinvented

Diagram labels: language + social data; machine learning, y = f(x); ?

1. Self-presentation effects on social media

Qinlan Shen, CMU Language Technologies Institute

Alex Coda, CMU Language Technologies Institute

Carolyn P. Rosé, CMU Language Technologies Institute

Yunseok Jang, U Michigan Computer Science & Eng

Yale Song, Microsoft Research

Kapil Thadani, Yahoo Research

Explicit identity positioning

● Working identity definition: “social positioning of self and other” [Bucholtz & Hall 2010]

● How does the social positioning of self affect interaction on social media?

● Tumblr as a site with particular identity implications, as well as social interaction


Self-presentation on Tumblr

● Explicit social positioning: blog descriptions!

● Well these are messy

● "List descriptions"

○ max | 18yo | she/they | girl with dreams | twerfs don't follow

○ andre | 22 | he/him | mexican ✨trans | too many fandoms

○ hey! annie, she/hers, love me, infj

What effects of similarities and differences in self-positioning do we see on content propagation in Tumblr?

Self-positioning: blog descriptions. Content propagation: reblogging.

Reblog prediction

● Reblog "opportunity": two followees of the same follower each make a post at a similar time, and the follower reblogs one of the posts

● Learning to rank pairwise formulation

Example blog descriptions in the diagram: she/her; 25 | nyc; reylo fan
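The pairwise learning-to-rank formulation above can be sketched as follows. This is only an illustration under stated assumptions, not the model from the talk: the two toy features, the example values, and the plain logistic regression trained by gradient ascent are all hypothetical. Each reblog opportunity is turned into difference vectors (reblogged post minus non-reblogged post), so the classifier learns which features make one post preferred over another.

```python
import math

def pairwise_examples(reblogged, not_reblogged):
    """Turn one reblog opportunity into labeled difference vectors.

    For every non-reblogged post, emit (reblogged - other, 1) and the
    mirrored (other - reblogged, 0) so both classes are represented.
    """
    examples = []
    for other in not_reblogged:
        diff = [p - n for p, n in zip(reblogged, other)]
        examples.append((diff, 1))
        examples.append(([-d for d in diff], 0))
    return examples

def train_logistic(examples, n_features, lr=0.1, epochs=200):
    """Fit logistic regression weights (no bias term) by stochastic gradient ascent."""
    w = [0.0] * n_features
    for _ in range(epochs):
        for x, y in examples:
            z = sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))
            for i in range(n_features):
                w[i] += lr * (y - p) * x[i]
    return w

# Hypothetical feature order: [label match, category mismatch].
toy_examples = pairwise_examples([1.0, 0.0], [[0.0, 1.0], [0.0, 0.0]])
toy_weights = train_logistic(toy_examples, n_features=2)
print(toy_weights)
```

The sign of each learned weight can then be read the way the talk reads its coefficients: positive for features that favor the reblogged post, negative for features that favor the post that was passed over.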


Levels of identity abstraction

● Identity categories: dimensions of personal characteristics

○ age, gender, personality type

● Identity labels:

○ 17, trans man, infj

Identity category extraction

● Manually grouped the most common n-grams into 11 categories

● Refined the list with manual annotation of 1000 blog descriptions

● Used regular expressions to extract features such as "girl", "ravenclaw", "25" to represent users

Identity categories: age, ethnicity/nationality, fandoms, gender, interests, location, personality type, pronouns, relationship status, sexual orientation, zodiac sign
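A rough sketch of regular-expression label extraction from blog descriptions is shown below. The talk does not give the actual patterns, so every pattern here is a hypothetical stand-in, and only four of the eleven categories are covered.

```python
import re

# Hypothetical patterns for illustration; not the expressions used in the study.
CATEGORY_PATTERNS = {
    "age": re.compile(r"(?<!\d)(1[3-9]|[2-6]\d)(?!\d)"),
    "pronouns": re.compile(r"\b(she|he|they)/(her|hers|him|his|them|they)\b", re.I),
    "personality type": re.compile(r"\b[ie][ns][ft][jp]\b", re.I),
    "gender": re.compile(r"\b(girl|boy|woman|man|nonbinary|trans)\b", re.I),
}

def extract_labels(description):
    """Return {category: sorted list of lowercased labels found in the description}."""
    found = {}
    for category, pattern in CATEGORY_PATTERNS.items():
        labels = sorted({m.group(0).lower() for m in pattern.finditer(description)})
        if labels:
            found[category] = labels
    return found

print(extract_labels("max | 18yo | she/they | girl with dreams"))
print(extract_labels("hey! annie, she/hers, love me, infj"))
```

Keeping the category (for example "age") separate from the label (for example "18") mirrors the two levels of identity abstraction described earlier in the talk.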


Data

● Sampled 1000 users who have blog descriptions and a minimum of 10 reblogs

● Paired each reblog with up to 5 posts that were not reblogged, posted within 30 minutes of the paired reblog

Number of sampled users: 1000

Total reblog opportunities: 712,670

Timeframe: June - Nov 2018
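The pairing step can be sketched like this. The record layout (dictionaries with `id` and `time` keys), the function name, and the choice to treat "within 30 minutes" as a symmetric window are assumptions made for illustration; the talk does not specify them.

```python
from datetime import datetime, timedelta

def build_opportunities(reblogs, non_reblogged, window_minutes=30, max_negatives=5):
    """Pair each reblog with up to max_negatives posts that were not reblogged
    and were posted within window_minutes of it (before or after: an assumption)."""
    window = timedelta(minutes=window_minutes)
    opportunities = []
    for reblog in reblogs:
        nearby = [post for post in non_reblogged
                  if abs(post["time"] - reblog["time"]) <= window]
        if nearby:
            opportunities.append({
                "reblogged": reblog,
                "not_reblogged": nearby[:max_negatives],
            })
    return opportunities

t0 = datetime(2018, 6, 1, 12, 0)
reblogs = [{"id": "r1", "time": t0}]
others = [
    {"id": "p1", "time": t0 + timedelta(minutes=10)},
    {"id": "p2", "time": t0 + timedelta(minutes=45)},
    {"id": "p3", "time": t0 - timedelta(minutes=29)},
]
print(build_opportunities(reblogs, others))
```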


Features

● Baseline features:

○ Post hashtags

○ Number of likes, reblogs, comments

○ Post type (text, photo, quote, video, audio, chat, link, answer)

● Category alignment features:

○ Category match

○ Category mismatch: one user provides the category, the other does not

● Label alignment features:

○ Label match

○ Label mismatch

○ Specific label interaction count
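One way to compute the category and label alignment features for a follower and followee pair is sketched below. Only the definition of category mismatch comes from the slide; the other three definitions (both users give the category, both share a label, both give the category but share no label) are assumptions, as are the function and key names.

```python
def alignment_features(follower, followee, categories):
    """Binary alignment features per identity category.

    follower and followee map category name -> list of labels extracted
    from that user's blog description (missing key = category not given).
    """
    feats = {}
    for cat in categories:
        a = set(follower.get(cat, []))
        b = set(followee.get(cat, []))
        both = bool(a) and bool(b)
        feats[cat + ":category_match"] = int(both)
        # From the slide: one user provides the category, the other does not.
        feats[cat + ":category_mismatch"] = int(bool(a) != bool(b))
        feats[cat + ":label_match"] = int(bool(a & b))
        feats[cat + ":label_mismatch"] = int(both and not (a & b))
    return feats

follower = {"age": ["25"], "pronouns": ["she/her"]}
followee = {"age": ["25"], "fandoms": ["reylo"]}
print(alignment_features(follower, followee, ["age", "pronouns", "fandoms"]))
```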


Is there an effect?


What is the nature of this effect?

● Generally positive coefficients were learned for category and label match features, negative for mismatches

● Specific interaction features between labels sometimes most informative


Features and likelihood of reblogging:

● Follower: presents pronouns; Followee: does not

● Race/ethnicity label alignment: ↑

● Nationality label alignment: none

● Follower: cisgender; Followee: cisgender

● Similar ages (20 and 21, e.g.): ↑

● Follower: anime; Followee: design

● Follower: gaming; Followee: manga

● Follower: memes; Followee: history

Conclusion

● Evidence for an association between explicit, self-presented identity information and content propagation

○ Most studies use only content and network features to predict content propagation [Naveed et al. 2011, Zhang et al. 2016, Vosoughi et al. 2018]

● Users who presented labels that indicated shared interests or shared values were more likely to share each other’s content

2. Changes in portrayal of characters in narrative

Qinlan Shen

Luke Breitfeller

Carolyn P. Rosé

James Fiacco

Shefali Garg

Ethan Xuanyue Yang

Huiming Jin

Hariharan Muralidharan

Motivation

● Examine how others’ identity is positioned in narrative

● Can computational models capture basic changes in narrative portrayal of characters’ identity?

● Fanfiction: fiction created by fans of TV shows, movies, books, comics, etc


[Discourse Processes, in submission]


Can we capture changes in character and relationship framing in fanfiction with word embedding-based methods?

Methods

● Word embeddings [Mikolov et al. 2013a] for social questions

○ Stereotypes and bias in corpora [Garg et al. 2018]

○ Framing by different social groups [An et al. 2018]

● Can word embeddings capture social framing of relationships in fanfiction?

Hypotheses

1. Focusing on text that is relevant to characterization provides a stronger signal for learning shifts in relationship portrayal

2. Differences between canon and fanfiction vector representations in embedding space can represent changes in relationship portrayal

Data

Harry Potter stories, Archive of Our Own

>179k stories (as of 2018)

Characters

● Harry Potter

● Hermione Granger

● Draco Malfoy

● Ron Weasley

● Ginny Weasley

Pairings by popularity

● Draco/Harry

● Hermione/Ron

● Draco/Hermione

● Ginny/Harry

● Harry/Hermione

● Harry/Ron

Prediction task


● Does the relationship match canon in being romantic/not romantic?

● True if

○ romantic in canon and romantic in fanfiction or

○ not romantic in canon and not romantic in fanfiction
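The label for this prediction task reduces to an equality check, sketched below. The set of pairings treated as romantic in canon is an assumption added for illustration (the talk does not list it), and the function name is hypothetical.

```python
# Assumed canon-romantic pairings, for illustration only.
CANON_ROMANTIC = {
    frozenset(["Hermione", "Ron"]),
    frozenset(["Ginny", "Harry"]),
}

def matches_canon(pairing, romantic_in_fic, canon_romantic=CANON_ROMANTIC):
    """True when the fanfiction portrays the pairing the same way canon does:
    romantic in both, or not romantic in both."""
    romantic_in_canon = frozenset(pairing) in canon_romantic
    return romantic_in_canon == romantic_in_fic

print(matches_canon(("Draco", "Harry"), romantic_in_fic=True))
print(matches_canon(("Harry", "Ron"), romantic_in_fic=False))
```

Using `frozenset` makes the lookup independent of the order in which the two characters are named.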


Text extraction

github.com/michaelmilleryoder/fanfiction-nlp

Based on BookNLP [Bamman et al. 2014]

Relationship representations


Harry wept at the sight of Hermione in the garden.

Ron looked down at his shoe. Troll bogeys. He would have to tell Harry about this.

Pairings: Harry/Hermione; Harry/Ron

● Weighted average of word embeddings in a 10-word window around character name mentions
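A minimal sketch of a window-based context vector for a character name follows. The talk does not say how the weights are chosen or whether the 10-word window counts both sides, so this sketch assumes 5 tokens on each side and inverse-distance weights; the two-dimensional toy embeddings are also made up.

```python
def context_vector(tokens, name, embeddings, half_window=5):
    """Weighted average of embeddings of words near each mention of name.

    Assumptions: half_window tokens on each side of a mention, weight
    1/distance to the mention, and words without an embedding are skipped.
    Returns None if no context word has an embedding.
    """
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for i, tok in enumerate(tokens):
        if tok != name:
            continue
        lo = max(0, i - half_window)
        hi = min(len(tokens), i + half_window + 1)
        for j in range(lo, hi):
            if j == i or tokens[j] not in embeddings:
                continue
            weight = 1.0 / abs(j - i)
            for k, value in enumerate(embeddings[tokens[j]]):
                total[k] += weight * value
            weight_sum += weight
    if weight_sum == 0.0:
        return None
    return [t / weight_sum for t in total]

toy_embeddings = {"wept": [1.0, 0.0], "sight": [0.0, 1.0]}  # hypothetical vectors
tokens = "harry wept at the sight of hermione in the garden".split()
print(context_vector(tokens, "harry", toy_embeddings))
```

A relationship representation could then be built from the context vectors of both characters in a pairing, although how the talk combines them is not stated.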


Visualization

● Track changes in contextualized embeddings for character names across fics

○ Train RNN-based language model and take final hidden state as contextualized word representation [Peters et al. 2018]


Hermione sat in the front of the classroom. She...

Fleur whistled softly. "Hermione! Come here...

[ 0.34 0.72 0.21 … ]

[ 0.89 0.06 0.53 … ]


Canon vector is close to the center of the fanfiction vectors: harry

Canon vector is on the edge of fanfiction vectors: draco, remus, sirius
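The "center" versus "edge" observation can be made quantitative in several ways. The sketch below shows one assumed option, not the method from the talk: measure the cosine distance from each vector to the centroid of the fanfiction vectors, then report the fraction of fanfiction vectors that lie closer to the centroid than the canon vector does. Values near 0 suggest the canon vector is central; values near 1 suggest it is on the edge. All names and the toy vectors are hypothetical.

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def fraction_closer_than_canon(canon, fic_vectors):
    """Share of fanfiction vectors strictly closer to their centroid than canon is."""
    center = centroid(fic_vectors)
    canon_distance = cosine_distance(canon, center)
    closer = sum(1 for v in fic_vectors
                 if cosine_distance(v, center) < canon_distance)
    return closer / len(fic_vectors)

fic = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy fanfiction vectors
print(fraction_closer_than_canon([1.0, 1.0], fic))
print(fraction_closer_than_canon([1.0, -1.0], fic))
```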


Conclusion


● Word embedding approaches can capture types of character framing

○ See evidence of differences in characterization, relationships

● Differences often match known fanfiction trends


Computational models of identity in language

● Assumption: identity is not only reflected, but also constructed, in language

● Computational techniques to analyze and model the presentation of identity in discourse


● Shift focus from predicting latent user attributes from language to exploring how people are positioning themselves and others in language

● Enables exploring the effects of the choice of self-presentation (Experiment 1)

● Acknowledges that identities can be framed and represented in varied, changing ways in narrative (Experiment 2)


language embedded in social context


Thank you!


draco canon vector is on the edge of fanfiction vectors

Representation for Ron

Differences when cast in a canon relationship vs. when excluded

Data


● For each character pairing, sampled stories with at least 5 paragraphs with both characters mentioned

● Balanced dataset across 6 pairings

● Each instance is a particular pairing in a story


Interaction on Tumblr

● How does the social positioning of self affect interaction on social media?

● Primary form of interaction on Tumblr: "reblogging" [Xu et al. 2014]

● Reblogging as content propagation; most studies use only content and network features to predict content propagation [Naveed et al. 2011, Zhang et al. 2016, Vosoughi et al. 2018]

Identity category annotation


Prediction tasks


● Canon: does the relationship match canon in being romantic/not romantic?

● Auxiliary tasks to test if simply capturing something else

○ Romantic: is the relationship romantic?

○ M/M: is the relationship between 2 males? (Regardless of whether it's romantic.)