learning from crowds in the presence of schools of thought · 1 learning from crowds in the...

40
1 Learning from Crowds in the Presence of Schools of Thought Yuandong Tian 1 and Jun Zhu 2 1 Carnegie Mellon University 2 Tsinghua University

Upload: others

Post on 27-Jul-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Learning from Crowds in the Presence of Schools of Thought · 1 Learning from Crowds in the Presence of Schools of Thought Yuandong Tian1 and Jun Zhu2. 1. Carnegie Mellon University

1

Learning from Crowds in the Presence of Schools of Thought

Yuandong Tian1 and Jun Zhu2

1Carnegie Mellon University2Tsinghua University

Page 2: Learning from Crowds in the Presence of Schools of Thought · 1 Learning from Crowds in the Presence of Schools of Thought Yuandong Tian1 and Jun Zhu2. 1. Carnegie Mellon University

2

Crowd-sourcing

Worker 1 Worker 2 Worker 3 Worker 4Task 1 x x x

Task 2 x x

Task 3 x x x

Page 3: Learning from Crowds in the Presence of Schools of Thought · 1 Learning from Crowds in the Presence of Schools of Thought Yuandong Tian1 and Jun Zhu2. 1. Carnegie Mellon University

3

Crowd-sourcing

Objective Tasks Subjective Tasks

E.g. Demographical SurveyPersonal OpinionsCreative thoughtsIll-designed ambiguous tasks.

E.g. Labeling datasetKnowledge Test

Page 4: Learning from Crowds in the Presence of Schools of Thought · 1 Learning from Crowds in the Presence of Schools of Thought Yuandong Tian1 and Jun Zhu2. 1. Carnegie Mellon University

4

Crowd-sourcing

Objective Tasks Subjective Tasks

Noise

Page 5: Learning from Crowds in the Presence of Schools of Thought · 1 Learning from Crowds in the Presence of Schools of Thought Yuandong Tian1 and Jun Zhu2. 1. Carnegie Mellon University

5

Crowd-sourcing

Objective Tasks Subjective Tasks

Task clarityWorker reliability

Noise

Page 6: Learning from Crowds in the Presence of Schools of Thought · 1 Learning from Crowds in the Presence of Schools of Thought Yuandong Tian1 and Jun Zhu2. 1. Carnegie Mellon University

6

Previous works

Objective Tasks Subjective Tasks

Majority Voting[J. Whitehill et al., NIPS’09][V.C. Raykar et al., JMLR’10][P. Welinder et al., NIPS’10]…..

Gold standard

Worker Reliability

Page 7: Learning from Crowds in the Presence of Schools of Thought · 1 Learning from Crowds in the Presence of Schools of Thought Yuandong Tian1 and Jun Zhu2. 1. Carnegie Mellon University

7

Our Contribution

Objective Tasks Subjective Tasks

Contributions:1. Applicable to both objective and subjective tasks.2. Simple , no iterative procedure, no initial guess.

Page 8: Learning from Crowds in the Presence of Schools of Thought · 1 Learning from Crowds in the Presence of Schools of Thought Yuandong Tian1 and Jun Zhu2. 1. Carnegie Mellon University

8

Two Principles

A worker is reliable if he agrees with other workers in many tasks.

A task is clearif it has only a few answers.

Page 9: Learning from Crowds in the Presence of Schools of Thought · 1 Learning from Crowds in the Presence of Schools of Thought Yuandong Tian1 and Jun Zhu2. 1. Carnegie Mellon University

9

Clustering AnalysisTask k

A B101111010

C D E F G H L

Workers

Page 10: Learning from Crowds in the Presence of Schools of Thought · 1 Learning from Crowds in the Presence of Schools of Thought Yuandong Tian1 and Jun Zhu2. 1. Carnegie Mellon University

Group-size Matrix #Z

A DE L

G

B CF

H

Task kWorker Assign. Cluster size

A I 5

B II 3

C II 3

D I 5

E I 5

F II 3

G I 5

H III 1

L I 5

I

II

III

Page 11: Learning from Crowds in the Presence of Schools of Thought · 1 Learning from Crowds in the Presence of Schools of Thought Yuandong Tian1 and Jun Zhu2. 1. Carnegie Mellon University

12

Group-size Matrix #Z#Z

Task 1 Task 2 Task 3 Task 4 Task 5 Task 6 Task 7Worker A 5 3 2 3 4 2 6Worker B 3 3 4 5 4 3 6Worker C 3 2 2 5 2 4 6Worker D 5 3 4 5 4 4 6Worker E 5 2 2 5 2 3 2Worker F 3 2 2 5 2 4 2Worker G 5 2 4 3 1 3 6Worker H 1 1 1 1 2 2 1Worker L 5 1 4 3 4 4 6

Page 12: Learning from Crowds in the Presence of Schools of Thought · 1 Learning from Crowds in the Presence of Schools of Thought Yuandong Tian1 and Jun Zhu2. 1. Carnegie Mellon University

Task 1 Task 2 Task 3 Task 4 Task 5 Task 6 Task 7Worker A 5 3 2 3 4 2 6Worker B 3 3 4 5 4 3 6Worker C 3 2 2 5 2 4 6Worker D 5 3 4 5 4 4 6Worker E 5 2 2 5 2 3 2Worker F 3 2 2 5 2 4 2Worker G 5 2 4 3 1 3 6Worker H 1 1 1 1 2 2 1Worker L 5 1 4 3 4 4 6

13

Worker Reliability

Page 13: Learning from Crowds in the Presence of Schools of Thought · 1 Learning from Crowds in the Presence of Schools of Thought Yuandong Tian1 and Jun Zhu2. 1. Carnegie Mellon University

14

Task Clarity

Task 1 Task 2 Task 3 Task 4 Task 5 Task 6 Task 7Worker A 5 3 2 3 4 2 6Worker B 3 3 4 5 4 3 6Worker C 3 2 2 5 2 4 6Worker D 5 3 4 5 4 4 6Worker E 5 2 2 5 2 3 2Worker F 3 2 2 5 2 4 2Worker G 5 2 4 3 1 3 6Worker H 1 1 1 1 2 2 1Worker L 5 1 4 3 4 4 6

Page 14: Learning from Crowds in the Presence of Schools of Thought · 1 Learning from Crowds in the Presence of Schools of Thought Yuandong Tian1 and Jun Zhu2. 1. Carnegie Mellon University

18

Factorization

=Worker

Reliability

Task clarity

#Z > 0 λ > 0 and μ > 0

T 1 T 2 T 3 T4 T5 T6 T 7

WA 5 3 2 3 4 2 6

WB 3 3 4 5 4 3 6

WC 3 2 2 5 2 4 6

WD 5 3 4 5 4 4 6

WE 5 2 2 5 2 3 2

WF 3 2 2 5 2 4 2

WG 5 2 4 3 1 3 6

WH 1 1 1 1 2 2 1

WL 5 1 4 3 4 4 6

#Z

Perron-Frobenius theorem:

Page 15: Learning from Crowds in the Presence of Schools of Thought · 1 Learning from Crowds in the Presence of Schools of Thought Yuandong Tian1 and Jun Zhu2. 1. Carnegie Mellon University

19

Clustering Model

Task k

Page 16: Learning from Crowds in the Presence of Schools of Thought · 1 Learning from Crowds in the Presence of Schools of Thought Yuandong Tian1 and Jun Zhu2. 1. Carnegie Mellon University

20

Clustering Model

Task k

N

M

Page 17: Learning from Crowds in the Presence of Schools of Thought · 1 Learning from Crowds in the Presence of Schools of Thought Yuandong Tian1 and Jun Zhu2. 1. Carnegie Mellon University

N

M

21

Clustering Model

Task k

answers

cluster centers

cluster labels

Page 18: Learning from Crowds in the Presence of Schools of Thought · 1 Learning from Crowds in the Presence of Schools of Thought Yuandong Tian1 and Jun Zhu2. 1. Carnegie Mellon University

22

Clustering Model

Task k

N

M

Page 19: Learning from Crowds in the Presence of Schools of Thought · 1 Learning from Crowds in the Presence of Schools of Thought Yuandong Tian1 and Jun Zhu2. 1. Carnegie Mellon University

24

Clustering Model

Label assignment

Clustering Model #Z

A DE L

G

B CF

H

T 1 T 2 T 3 T4 T5 T6 T 7

W1 5 3 2 3 4 2 6

W2 3 3 4 5 4 3 6

W3 3 2 2 5 2 4 6

W4 5 3 4 5 4 4 6

W5 5 2 2 5 2 3 2

W6 3 2 2 5 2 4 2

W7 5 2 4 3 1 3 6

W8 1 1 1 1 2 2 1

W9 5 1 4 3 4 4 6

Page 20: Learning from Crowds in the Presence of Schools of Thought · 1 Learning from Crowds in the Presence of Schools of Thought Yuandong Tian1 and Jun Zhu2. 1. Carnegie Mellon University

25

Clustering Model

Label assignment

Clustering Model #Z

T 1 T 2 T 3 T4 T5 T6 T 7

W1 5 3 2 3 4 2 6

W2 3 3 4 5 4 3 6

W3 3 2 2 5 2 4 6

W4 5 3 4 5 4 4 6

W5 5 2 2 5 2 3 2

W6 3 2 2 5 2 4 2

W7 5 2 4 3 1 3 6

W8 1 1 1 1 2 2 1

W9 5 1 4 3 4 4 6

A DE L

G

B CF

H

Page 21: Learning from Crowds in the Presence of Schools of Thought · 1 Learning from Crowds in the Presence of Schools of Thought Yuandong Tian1 and Jun Zhu2. 1. Carnegie Mellon University

30

Close form solution to #Z

Page 22: Learning from Crowds in the Presence of Schools of Thought · 1 Learning from Crowds in the Presence of Schools of Thought Yuandong Tian1 and Jun Zhu2. 1. Carnegie Mellon University

31

Close form solution to #Z

Squared Euclidean Distance between worker i and worker j in task k

Page 23: Learning from Crowds in the Presence of Schools of Thought · 1 Learning from Crowds in the Presence of Schools of Thought Yuandong Tian1 and Jun Zhu2. 1. Carnegie Mellon University

32

Hyper-Parameters Estimation

Hyper-parameters:

σσ = 0.2

Page 24: Learning from Crowds in the Presence of Schools of Thought · 1 Learning from Crowds in the Presence of Schools of Thought Yuandong Tian1 and Jun Zhu2. 1. Carnegie Mellon University

33

Experiments SettingMission I: Image Classification (Sky/Building/Computer)

Mission II: Counting Objects

Mission III: Images Aesthetics

Do these images contain sky?

Do these images look pretty?

Page 25: Learning from Crowds in the Presence of Schools of Thought · 1 Learning from Crowds in the Presence of Schools of Thought Yuandong Tian1 and Jun Zhu2. 1. Carnegie Mellon University

34

Statistics

402 workers

Mission ISky Building Computer(12) (12) (12)

Mission IICounting

(4)

Mission IIIImages Aesthetics

(12 + 12)

http://www.cs.cmu.edu/~yuandong/kdd2012-dataset.zipDataset link:

Page 26: Learning from Crowds in the Presence of Schools of Thought · 1 Learning from Crowds in the Presence of Schools of Thought Yuandong Tian1 and Jun Zhu2. 1. Carnegie Mellon University

35

The Groupsize Matrix

SmallGroup Size

LargeGroup Size

Tasks

Wor

kers

Page 27: Learning from Crowds in the Presence of Schools of Thought · 1 Learning from Crowds in the Presence of Schools of Thought Yuandong Tian1 and Jun Zhu2. 1. Carnegie Mellon University

36

Rank-1 Factorization

= 0.27

Page 28: Learning from Crowds in the Presence of Schools of Thought · 1 Learning from Crowds in the Presence of Schools of Thought Yuandong Tian1 and Jun Zhu2. 1. Carnegie Mellon University

37

Rank-1 FactorizationWorker

Reliability

Page 29: Learning from Crowds in the Presence of Schools of Thought · 1 Learning from Crowds in the Presence of Schools of Thought Yuandong Tian1 and Jun Zhu2. 1. Carnegie Mellon University

38Count 2: Clarity = 69.4

Tasks’ clarity

Page 30: Learning from Crowds in the Presence of Schools of Thought · 1 Learning from Crowds in the Presence of Schools of Thought Yuandong Tian1 and Jun Zhu2. 1. Carnegie Mellon University

39Beauty1 and Beauty2: Clarity = 12.4/11.8

Task’s clarity

Page 31: Learning from Crowds in the Presence of Schools of Thought · 1 Learning from Crowds in the Presence of Schools of Thought Yuandong Tian1 and Jun Zhu2. 1. Carnegie Mellon University

40

Task’s clarity

Count 4: Clarity = 10.2

Page 32: Learning from Crowds in the Presence of Schools of Thought · 1 Learning from Crowds in the Presence of Schools of Thought Yuandong Tian1 and Jun Zhu2. 1. Carnegie Mellon University

41

Workers’ Reliability

0

10

20

30

40

50

60

70

1.52 6.62

1.5 6.55

65 workers ~ 20% 337 workers ~ 80%Count

Page 33: Learning from Crowds in the Presence of Schools of Thought · 1 Learning from Crowds in the Presence of Schools of Thought Yuandong Tian1 and Jun Zhu2. 1. Carnegie Mellon University

42

Ranking Workers

Mission ISky Building Computer(12) (12) (12)

Mission IICounting

(4)

Mission IIIImages Aesthetics

(12 + 12)

D most unreliable D most reliable

Page 34: Learning from Crowds in the Presence of Schools of Thought · 1 Learning from Crowds in the Presence of Schools of Thought Yuandong Tian1 and Jun Zhu2. 1. Carnegie Mellon University

43

Ranking Workers

02468

1012141618

Count1 Count2 Count3 Count4

Std of D bestStd of D worst

D = 10

Std.

Page 35: Learning from Crowds in the Presence of Schools of Thought · 1 Learning from Crowds in the Presence of Schools of Thought Yuandong Tian1 and Jun Zhu2. 1. Carnegie Mellon University

44

Ranking Workers

02468

10121416

Count1 Count2 Count3 Count4

Std of D bestStd of D worst

D = 30

Std.

Page 36: Learning from Crowds in the Presence of Schools of Thought · 1 Learning from Crowds in the Presence of Schools of Thought Yuandong Tian1 and Jun Zhu2. 1. Carnegie Mellon University

45

Comparison with ClusteringD

iffer

ence

in V

aria

nce

(a) Our Approach(b) Spectral Clustering

(c) PCA-Kmeans(d) Gibbs Sampling

Page 37: Learning from Crowds in the Presence of Schools of Thought · 1 Learning from Crowds in the Presence of Schools of Thought Yuandong Tian1 and Jun Zhu2. 1. Carnegie Mellon University

46

Time Cost

Methods Time (sec)

(a) Our approach 1.41± 0.05(b) Spectral Clustering 3.90±0.36(c) PCA-Kmeans 0.19±0.06(d) Gibbs Sampling 53.63±0.19

Page 38: Learning from Crowds in the Presence of Schools of Thought · 1 Learning from Crowds in the Presence of Schools of Thought Yuandong Tian1 and Jun Zhu2. 1. Carnegie Mellon University

47

Predicting Ground truth

Count1 Count2 Count3 Count4Ours, D = 5/10 65 5 8 26

Majority Voting 53.7 5.0 9.9 22.9

Majority Voting (Median) 60 5.0 8 24Learning from Crowd

[JMLR’10]56 5 8 24

Multidimensional Wisdom of Crowds [NIPS’10]

63.7 5 8 26.0

Ground truth 65 5 8 27

Page 39: Learning from Crowds in the Presence of Schools of Thought · 1 Learning from Crowds in the Presence of Schools of Thought Yuandong Tian1 and Jun Zhu2. 1. Carnegie Mellon University

49

Conclusion and Future Work

Handling possible missing entriesImproving the scalability.

Future Work

Conclusion1. Estimating workers’ reliability and tasks’ clarity in the presence of schools of thought. 2. Applicable to both objective and subjective tasks. 3. Simple solution without iteration, no initial guess.

Page 40: Learning from Crowds in the Presence of Schools of Thought · 1 Learning from Crowds in the Presence of Schools of Thought Yuandong Tian1 and Jun Zhu2. 1. Carnegie Mellon University

50

Thanks!