
Wisdom of Crowds and Rank Aggregation

Mark Steyvers
Department of Cognitive Sciences

University of California, Irvine

Joint work with: Brent Miller, Pernille Hemmer, Mike Yi, Michael Lee

Wisdom of crowds phenomenon

Aggregating over individuals in a group often leads to an estimate that is better than any of the individual estimates


Examples of wisdom of crowds phenomenon


Galton’s Ox (1907): Median of individual weight estimates came close to true answer

Prediction markets
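The Galton example can be illustrated with a small simulation. This is only a sketch on synthetic data: the true value, the number of people, and the noise level are made-up parameters, and the guesses are assumed to be independent and unbiased, which real crowds need not be.

```python
import random
import statistics

def crowd_median_demo(true_value=1200.0, n_people=101, noise_sd=100.0, seed=0):
    """Simulate independent noisy guesses and compare the median with individuals."""
    rng = random.Random(seed)
    guesses = [rng.gauss(true_value, noise_sd) for _ in range(n_people)]
    crowd_error = abs(statistics.median(guesses) - true_value)
    individual_errors = [abs(g - true_value) for g in guesses]
    # Count individuals whose own guess is at least as far off as the crowd median.
    n_not_better = sum(err >= crowd_error for err in individual_errors)
    return crowd_error, n_not_better, n_people

crowd_error, n_not_better, n_people = crowd_median_demo()
print(f"median error: {crowd_error:.1f}; "
      f"{n_not_better} of {n_people} individuals do no better")
```

With an odd number of guesses, at least half of the individuals are never closer to the truth than the median, whatever the noise distribution.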

Our research: ranking problems

What is the correct chronological order?

[Figure: five presidents (Ulysses S. Grant, James Garfield, Rutherford B. Hayes, Abraham Lincoln, Andrew Johnson) to be placed in order along a time axis]

Aggregating ranking data

[Diagram: individual rankings (D A B C, A B D C, B A D C, A C B D, A D B C) → Aggregation Algorithm → group answer (A B C D); is the group answer equal to the ground truth (A B C D)?]

Task constraints

No communication between individuals

There is always a true answer (ground truth)

Unsupervised algorithms
  no feedback is available
  ground truth only used for evaluation

Unsupervised models for ranking data

Classic models:
  Thurstone (1927)
  Mallows (1957); Fligner and Verducci (1986)
  Diaconis (1989)

Voting methods: e.g., Borda count (1770)

Machine learning applications
  Information retrieval and meta-search, e.g., Klementiev, Roth et al. (2008; 2009); Lebanon & Mao (2008); Dwork et al. (2001)
  Multi-object tracking, e.g., Huang, Guestrin, Guibas (2009); Kondor, Howard, Jebara (2007)

Many models were developed for preference rankings and voting situations, where there is no known ground truth

Unsupervised Approach

[Diagram: latent ground truth (? ? ? ?) → Generative Model → observed individual rankings (D A B C, A B D C, B A D C, A C B D, A D B C)]

Incorporate individual differences

Overview of talk

Reconstruct the order of US presidents

Effect of group size and expertise

Reconstruct the order of events

Traveling Salesman Problem


Experiment: 26 individuals order all 44 US presidents


George Washington John Adams Thomas Jefferson James Madison

James Monroe John Quincy Adams Andrew Jackson Martin Van Buren

William Henry Harrison John Tyler James Knox Polk Zachary Taylor

Millard Fillmore Franklin Pierce James Buchanan Abraham Lincoln

Andrew Johnson Ulysses S. Grant Rutherford B. Hayes James Garfield

Chester Arthur Grover Cleveland 1 Benjamin Harrison Grover Cleveland 2

William McKinley Theodore Roosevelt William Howard Taft Woodrow Wilson

Warren Harding Calvin Coolidge Herbert Hoover Franklin D. Roosevelt

Harry S. Truman Dwight Eisenhower John F. Kennedy Lyndon B. Johnson

Richard Nixon Gerald Ford James Carter Ronald Reagan

George H.W. Bush William Clinton George W. Bush Barack Obama

Measuring performance

Kendall’s Tau: the number of adjacent pairwise swaps needed to turn an individual’s ordering into the true order

Ordering by individual: A B E C D

True order: A B C D E

A B E C D → A B C E D (τ = 1) → A B C D E (τ = 1 + 1 = 2)
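The distance in this example can be computed by counting discordant pairs, which gives the same number as the minimum count of adjacent swaps. A minimal sketch follows; the function name `kendall_tau_distance` is illustrative and not taken from the talk.

```python
from itertools import combinations

def kendall_tau_distance(ordering, truth):
    """Count item pairs that appear in opposite order in the two rankings.

    This equals the minimum number of adjacent pairwise swaps needed to
    turn `ordering` into `truth`.
    """
    position = {item: idx for idx, item in enumerate(ordering)}
    return sum(
        1
        for a, b in combinations(truth, 2)  # a precedes b in the true order
        if position[a] > position[b]        # but follows b in the individual's order
    )

print(kendall_tau_distance(list("ABECD"), list("ABCDE")))  # -> 2
```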

Empirical Results

[Figure: performance of each individual, ordered from best to worst on the horizontal axis; vertical axis from 0 to 500; reference level marked for random guessing]

Thurstonian Model

A. George Washington

B. James Madison

C. Andrew Jackson

Each item has a true coordinate on some dimension

… but there is noise because of encoding errors

Each person’s mental encoding is based on a single sample from each distribution

The observed ordering is based on the ordering of the samples, e.g., A < C < B for one set of samples and A < B < C for another

Important assumption: across individuals, standard deviation can vary but not the means
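The sampling story above can be sketched in a few lines. The item coordinates and noise levels below are made-up illustrative values, and `sample_ordering` is a hypothetical helper, not code from the talk: every person shares the same means but has a personal standard deviation.

```python
import random

def sample_ordering(means, sigma, rng):
    """Draw one noisy sample per item and return the items sorted by their samples."""
    samples = {item: rng.gauss(mu, sigma) for item, mu in means.items()}
    return sorted(samples, key=samples.get)

# Hypothetical coordinates on a latent "time" dimension (illustrative values only).
means = {"A": 1.0, "B": 4.0, "C": 7.0}
rng = random.Random(1)

# Shared means, person-specific standard deviations.
for person, sigma in [("low-noise person", 0.5), ("high-noise person", 4.0)]:
    print(person, sample_ordering(means, sigma, rng))
```

A person with a small standard deviation will usually reproduce the true order, while a person with a large one will often swap neighbouring items.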

Graphical Model of Extended Thurstonian Model

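[Graphical model: plate over j individuals, with nodes $\mu$, $\sigma_j$, $x_j$, $y_j$]

Latent group means: $\mu$

Individual noise level: $\sigma_j \sim \mathrm{Gamma}(\lambda, 1/\lambda)$

Mental representation: $x_{ij} \mid \mu_i, \sigma_j \sim \mathrm{N}(\mu_i, \sigma_j)$

Observed ordering: $y_j = \mathrm{rank}(x_j)$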

Inferred Distributions for 44 US Presidents

[Figure: inferred distribution for each president, listed in chronological order; the number in parentheses is the position in the inferred ordering]

George Washington (1), John Adams (2), Thomas Jefferson (3), James Madison (4),
James Monroe (6), John Quincy Adams (5), Andrew Jackson (7), Martin Van Buren (8),
William Henry Harrison (21), John Tyler (10), James Knox Polk (18), Zachary Taylor (16),
Millard Fillmore (11), Franklin Pierce (19), James Buchanan (13), Abraham Lincoln (9),
Andrew Johnson (12), Ulysses S. Grant (17), Rutherford B. Hayes (20), James Garfield (22),
Chester Arthur (15), Grover Cleveland 1 (23), Benjamin Harrison (14), Grover Cleveland 2 (25),
William McKinley (24), Theodore Roosevelt (29), William Howard Taft (27), Woodrow Wilson (30),
Warren Harding (26), Calvin Coolidge (28), Herbert Hoover (31), Franklin D. Roosevelt (32),
Harry S. Truman (33), Dwight Eisenhower (34), John F. Kennedy (37), Lyndon B. Johnson (36),
Richard Nixon (39), Gerald Ford (35), James Carter (38), Ronald Reagan (40),
George H.W. Bush (41), William Clinton (42), George W. Bush (43), Barack Obama (44)

error bars = median and minimum sigma

Calibration of individuals

[Figure: scatter plot with one point per individual; horizontal axis: inferred noise level for each individual (0 to 0.4); vertical axis: distance to ground truth (50 to 300); R = 0.941]

Wisdom of crowds effect

[Figure: Thurstonian Model and Perturbation model compared with Individuals; horizontal axis: Individuals (1 to 20); vertical axis from 0 to 350]

Alternative Heuristic Models

Many heuristic methods from voting theory, e.g., the Borda count method

Suppose we have 10 items
  assign a count of 10 to the first item, 9 to the second item, etc.
  add counts over individuals
  order items by the Borda count

i.e., rank by average rank across people
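The Borda procedure described above can be sketched as follows, applied to the five example rankings of items A to D from the "Aggregating ranking data" slide. The function name and the alphabetical tie-break are assumptions made for this illustration.

```python
from collections import defaultdict

def borda_aggregate(rankings):
    """Aggregate rankings (each listed from first to last) with the Borda count."""
    scores = defaultdict(int)
    for ranking in rankings:
        n_items = len(ranking)
        for place, item in enumerate(ranking):
            scores[item] += n_items - place  # first item gets n, second n - 1, ...
    # Highest total first; ties broken alphabetically so the result is deterministic.
    return sorted(scores, key=lambda item: (-scores[item], item))

rankings = [list("DABC"), list("ABDC"), list("BADC"), list("ACBD"), list("ADBC")]
print(borda_aggregate(rankings))  # -> ['A', 'B', 'D', 'C']
```

Every individual's vote counts equally here, which is the property the models with individual noise levels are meant to relax.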


Model Comparison

[Figure: Thurstonian Model, Perturbation model, and Borda count compared with Individuals; horizontal axis: Individuals (1 to 30); vertical axis from 0 to 350]


Experiment

78 participants

17 ordering problems, each with 10 items
  Chronological events
  Physical measures
  Purely ordinal problems, e.g., Ten Amendments, Ten Commandments

Ordering states west-east


Oregon (1)

Utah (2)

Nebraska (3)

Iowa (4)

Alabama (6)

Ohio (5)

Virginia (7)

Delaware (8)

Connecticut (9)

Maine (10)

Ordering Ten Amendments


Freedom of speech & religion (1)

Right to bear arms (2)

No quartering of soldiers (4)

No unreasonable searches (3)

Due process (5)

Trial by Jury (6)

Civil Trial by Jury (7)

No cruel punishment (8)

Right to non-specified rights (10)

Power for the States & People (9)

Ordering Ten Commandments


Worship any other God (1)

Make a graven image (7)

Take the Lord's name in vain (2)

Break the Sabbath (3)

Dishonor your parents (4)

Murder (6)

Commit adultery (8)

Steal (5)

Bear false witness (9)

Covet (10)

Effect of Group Size: random subgroups

[Figure: performance as a function of Group Size (horizontal axis, 0 to 80; vertical axis from 7 to 14) for random subgroups; curve labels T=0, T=2, T=12]

How effective are small groups of experts?

Want to find experts endogenously – without feedback

Approach: select individuals with the smallest estimated noise levels based on previous tasks

We are identifying general expertise (“Spearman’s g”)
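The selection idea can be sketched without any ground truth. Note the substitution: instead of the noise levels inferred by the model, this sketch uses each person's total Kendall tau distance to the Borda consensus on earlier tasks as a stand-in. The data and all function names are hypothetical.

```python
from itertools import combinations

def tau(ordering, reference):
    """Kendall tau distance: number of item pairs ordered differently."""
    pos = {item: i for i, item in enumerate(ordering)}
    return sum(pos[a] > pos[b] for a, b in combinations(reference, 2))

def borda(rankings):
    """Order items by summed rank position (ties broken alphabetically)."""
    items = rankings[0]
    total_place = {it: sum(r.index(it) for r in rankings) for it in items}
    return sorted(items, key=lambda it: (total_place[it], it))

def select_experts(previous_tasks, k):
    """previous_tasks: list of dicts mapping person -> ranking for one earlier task."""
    totals = {}
    for task in previous_tasks:
        consensus = borda(list(task.values()))  # no ground truth is used
        for person, ranking in task.items():
            totals[person] = totals.get(person, 0) + tau(ranking, consensus)
    # Smallest accumulated disagreement first (a proxy for low noise).
    return sorted(totals, key=lambda p: (totals[p], p))[:k]

previous_tasks = [
    {"p1": list("ABCD"), "p2": list("ABDC"), "p3": list("DCBA")},
    {"p1": list("WXYZ"), "p2": list("XWYZ"), "p3": list("ZWYX")},
]
experts = select_experts(previous_tasks, k=2)
print(experts)  # -> ['p1', 'p2']
```

The selected subgroup would then be used to aggregate answers on a new task.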


Group Composition based on prior performance

[Figure: performance as a function of group size, best individuals first (horizontal axis, 0 to 80; vertical axis from 7 to 14), with separate curves by number of previous tasks T (labels T = 0, T = 2, T = 8, T = 12)]

Endogenous: no feedback required

Exogenous: selecting people based on actual performance

[Figure: two panels, one per selection method; horizontal axis from 0 to 40; vertical axis from 7 to 14]


Recollecting Order from Episodic Memory


Study this sequence of images

Place the images in correct sequence (serial recall)

[Figure: ten images labeled A to J]

Average results across 6 problems

[Figure: mean performance (vertical axis "Mean", 0 to 15) of the Thurstonian Model, Perturbation Model, and Borda count compared with Individuals (horizontal axis, 1 to 30)]

Calibration of individuals

[Figure: scatter plot with one point per individual; horizontal axis: inferred noise level (0 to 6); vertical axis: distance to ground truth (0 to 30); R = 0.920; pizza sequence, perturbation model]


Find the shortest route between cities

[Figure: problem B30-21, a map of 30 numbered cities]

[Figure: four tours through the 30 cities of problem B30-21: Individual 5, Individual 83, Individual 60, and Optimal]

Dataset: Vickers, Bovet, Lee, & Hughes (2003)

83 participants

7 problems of 30 cities

TSP Aggregation Problem

Data consists of city order only

No access to city locations

Heuristic Approach

Idea: find tours with edges for which many individuals agree

Calculate agreement matrix A
  A = n × n matrix, where n is the number of cities
  $a_{ij}$ indicates the number of participants that connect cities i and j

Find the tour that maximizes

$$\sum_{(i,j) \in \text{tour}} a_{ij}^{c}$$

(this itself is a non-Euclidean TSP problem)
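The agreement heuristic can be sketched on a toy problem. This is a simplification under stated assumptions: tours are closed loops over cities numbered from 0, a tour is scored by the plain sum of the agreement counts of its edges (ignoring any further weighting of the counts), and the search is exhaustive, which only works for a handful of cities. The function names and the three example tours are hypothetical.

```python
from itertools import permutations

def agreement_matrix(tours, n):
    """a[i][j] = number of participants whose closed tour links cities i and j."""
    a = [[0] * n for _ in range(n)]
    for tour in tours:
        for i, j in zip(tour, tour[1:] + tour[:1]):  # include the edge back to the start
            a[i][j] += 1
            a[j][i] += 1
    return a

def tour_agreement(tour, a):
    """Sum of agreement counts over the edges of a closed tour."""
    return sum(a[i][j] for i, j in zip(tour, tour[1:] + tour[:1]))

def best_tour_brute_force(a):
    """Exhaustive search over tours starting at city 0 (tiny problems only)."""
    n = len(a)
    candidates = ([0] + list(rest) for rest in permutations(range(1, n)))
    return max(candidates, key=lambda t: tour_agreement(t, a))

# Three hypothetical participants, five cities; only city order is used.
tours = [[0, 1, 2, 3, 4], [0, 1, 2, 4, 3], [0, 2, 1, 3, 4]]
a = agreement_matrix(tours, 5)
best = best_tour_brute_force(a)
print(best, tour_agreement(best, a))  # -> [0, 1, 2, 3, 4] 11
```

For 30-city problems the exhaustive search would have to be replaced by a TSP solver or heuristic operating on the agreement matrix, since no city coordinates are available.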

Line thickness = agreement

[Figure: problem B30-21, 30 numbered cities connected by lines of varying thickness]

Blue = Aggregate Tour (problem B30-21)

Results averaged across 7 problems

[Figure: percent over optimal (vertical axis, 0 to 18), including the aggregate]

Summary

Combine ordering / ranking data
  going beyond numerical estimates or multiple choice questions

Incorporate individual differences
  assume some individuals might be “experts”
  going beyond models that treat every vote equally

Applications
  combine multiple eyewitness accounts
  combine solutions in complex problem-solving situations
  fantasy football

That’s all


Do the experiments yourself:

http://psiexp.ss.uci.edu/


Predictive Rankings: fantasy football

South Australian Football League (32 people rank 9 teams)

Australian Football League (29 people rank 16 teams)

[Figure: for each league, Thurstonian Model, Perturbation Model, and Borda count compared with Individuals (horizontal axis, 1 to 30)]

Predicting problem difficulty

[Figure: scatter plot of the 17 ordering problems (points numbered 1 to 17); horizontal axis: std, the dispersion of noise levels across individuals (0.8 to 1.8); vertical axis: distance of group answer to ground truth (0 to 18); R = -0.752; labeled examples: ordering states geographically, city size rankings]

Related Concepts in Supervised Learning

Boosting: combining multiple classifiers

Bagging (Bootstrap Aggregating)
