computational social choice

Computational Social Choice WINE-13 Tutorial Dec 11, 2013 Lirong Xia


2011 UK Referendum

• The second nationwide referendum in the UK
– The first was in 1975
• Member of Parliament election: replace the plurality rule with the alternative vote rule?
• 68% No vs. 32% Yes

Ordinal Preference Aggregation: Social Choice

[Figure: a profile of voters' rankings is aggregated by social choice into a joint ranking.]








Ranking pictures [PGM+ AAAI-12]

[Figure: Turker 1, Turker 2, …, Turker n each submit a ranking over pictures A, B, C.]

Social choice

[Diagram: agents 1, …, n, each with a pair of rankings (Ri, Ri*), connected to a social choice mechanism.]

Ri, Ri*: full rankings over a set A of alternatives

Social Choice and Computer Science

[Timeline figure]
• PLATO et al., LULL: 4th C. B.C. to 20th C.; strategic thinking + methods/principles of aggregation
• TURING et al.: 20th C.; computational thinking + optimization algorithms
• 21st Century

Applications: real world

• People/agents often have conflicting preferences, yet they have to make a joint decision


• Multi-agent systems [Ephrati and Rosenschein 91]

• Recommendation systems [Ghosh et al. 99]

• Meta-search engines [Dwork et al. 01]

• Belief merging [Everaere et al. 07]

• Human computation (crowdsourcing) [Mao et al.


• etc.

Applications: academic world

A burgeoning area

• The area has recently been drawing a lot of attention
– IJCAI-11: 15 papers, best paper
– AAAI-11: 6 papers, best paper
– AAMAS-11: 10 full papers, best paper runner-up
– AAMAS-12: 9 full papers, best student paper
– EC-12: 3 papers
• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)
• Book in progress: Handbook of Computational Social Choice

How to design a good social choice mechanism?

What does “good” mean?

Two goals for social choice mechanisms

GOAL1: democracy
– 1. Classical Social Choice
– 2. Computational aspects

GOAL2: truth
– 3. Statistical approaches

1. Classical Social Choice

2.1 Computational aspects, Part 1

3. Statistical approaches

45 min

55 min

75 min

2.2 Computational aspects, Part 2

5 min

5 min

30 min

30 min

NP-Hard



Common voting rules (what has been done in the past two centuries)

• Mathematically, a social choice mechanism (voting rule) is a mapping from {all profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m: number of alternatives (candidates)
– n: number of agents (voters)
– D=(P1,…,Pn): a profile
• Positional scoring rules
– A score vector s1,...,sm
– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
  • Borda, with score vector (m-1, m-2, …,0)
  • Plurality, with score vector (1,0,…,0) [Used in the US]

An example

• Three alternatives {c1, c2, c3}

• Score vector (2,1,0) (=Borda)

• 3 votes: c1 ≻ c2 ≻ c3, c2 ≻ c1 ≻ c3, c3 ≻ c1 ≻ c2 (in each vote the positions score 2, 1, 0)
• c1 gets 2+1+1=4, c2 gets 1+2+0=3, c3 gets 0+0+2=2
• The winner is c1
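A minimal Python sketch of a positional scoring rule, reproducing the Borda example. The function names (`positional_scores`, `borda_vector`, `plurality_vector`) and the tuple encoding of votes are illustrative assumptions, not part of the tutorial, and ties are not broken in any principled way.

```python
def positional_scores(profile, score_vector):
    """Total points per alternative; each vote is a tuple ranked best first."""
    totals = {alt: 0 for alt in profile[0]}
    for vote in profile:
        for position, alt in enumerate(vote):
            totals[alt] += score_vector[position]
    return totals


def borda_vector(m):
    """Borda score vector (m-1, m-2, ..., 0)."""
    return list(range(m - 1, -1, -1))


def plurality_vector(m):
    """Plurality score vector (1, 0, ..., 0)."""
    return [1] + [0] * (m - 1)


profile = [("c1", "c2", "c3"), ("c2", "c1", "c3"), ("c3", "c1", "c2")]
borda = positional_scores(profile, borda_vector(3))
print(borda)                      # {'c1': 4, 'c2': 3, 'c3': 2}
print(max(borda, key=borda.get))  # c1
```

Swapping in `plurality_vector(3)` gives every alternative one point on this profile, so the choice of score vector changes the outcome.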

Plurality with runoff

• The election has two rounds
– In the first round, all alternatives except the two with the highest plurality score drop out
– In the second round, the alternative that is preferred by more voters wins
• [used in Iran, France, North Carolina State]

Example:
Voters   First round      Second round
10       a > b > c > d    a > d
7        d > a > b > c    d > a
6        c > d > a > b    d > a
3        b > c > d > a    d > a
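The two rounds can be sketched in Python on the example profile (10, 7, 6, 3 voters). The `(count, ranking)` encoding and the function name are assumptions made for illustration; ties are resolved by the order in which alternatives appear in the first ranking, which is an arbitrary choice.

```python
from collections import Counter


def plurality_with_runoff(weighted_profile):
    """weighted_profile: list of (count, ranking) pairs, ranking best first."""
    # Round 1: plurality scores (alternatives with no first places score 0).
    first_round = Counter({alt: 0 for alt in weighted_profile[0][1]})
    for count, ranking in weighted_profile:
        first_round[ranking[0]] += count
    x, y = [alt for alt, _ in first_round.most_common(2)]
    # Round 2: pairwise comparison between the two finalists.
    x_votes = sum(c for c, r in weighted_profile if r.index(x) < r.index(y))
    y_votes = sum(c for c, r in weighted_profile if r.index(y) < r.index(x))
    winner = x if x_votes >= y_votes else y
    return winner, {x: x_votes, y: y_votes}


profile = [
    (10, ("a", "b", "c", "d")),
    (7, ("d", "a", "b", "c")),
    (6, ("c", "d", "a", "b")),
    (3, ("b", "c", "d", "a")),
]
winner, tallies = plurality_with_runoff(profile)
print(winner, tallies)  # d {'a': 10, 'd': 16}
```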


Single transferable vote (STV)

• Also called instant run-off voting or alternative vote
• The election has m-1 rounds; in each round,
– The alternative with the lowest plurality score drops out, and is removed from all votes
– The last-remaining alternative is the winner
• [used in Australia and Ireland]

Example:
Voters   Round 1          Round 2      Round 3
10       a > b > c > d    a > c > d    a > c
7        d > a > b > c    d > a > c    a > c
6        c > d > a > b    c > d > a    c > a
3        b > c > d > a    c > d > a    c > a
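A short Python sketch of the elimination rounds on the same profile. The encoding and the name `stv_winner` are illustrative assumptions; when several alternatives share the lowest score, the one listed first among the remaining alternatives is dropped, which is an arbitrary tie-breaking choice.

```python
def stv_winner(weighted_profile):
    """Return (winner, elimination order) for (count, ranking) votes."""
    remaining = list(weighted_profile[0][1])
    eliminated = []
    while len(remaining) > 1:
        scores = {alt: 0 for alt in remaining}
        for count, ranking in weighted_profile:
            # Each vote counts for its highest-ranked remaining alternative.
            top = next(alt for alt in ranking if alt in scores)
            scores[top] += count
        loser = min(remaining, key=lambda alt: scores[alt])
        remaining.remove(loser)
        eliminated.append(loser)
    return remaining[0], eliminated


profile = [
    (10, ("a", "b", "c", "d")),
    (7, ("d", "a", "b", "c")),
    (6, ("c", "d", "a", "b")),
    (3, ("b", "c", "d", "a")),
]
print(stv_winner(profile))  # ('a', ['b', 'd', 'c'])
```

On this profile STV elects a, while plurality with runoff elects d, so the two rules can disagree on the same votes.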

The Kemeny rule

• Kendall tau distance
– K(V,W) = #{different pairwise comparisons}
– Example: K(b ≻ c ≻ a, a ≻ b ≻ c) = 2
• Kemeny(D) = argmin_W K(D,W) = argmin_W Σ_{P∈D} K(P,W)
• For single winner, choose the top-ranked alternative in Kemeny(D)
• [Has a statistical interpretation]
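A brute-force Python sketch of the Kendall tau distance and the Kemeny rule. It enumerates all m! rankings, so it is only meant for tiny examples; the function names and the tuple encoding are illustrative assumptions.

```python
from itertools import combinations, permutations


def kendall_tau(v, w):
    """Number of pairs of alternatives that v and w order differently."""
    pos_v = {alt: i for i, alt in enumerate(v)}
    pos_w = {alt: i for i, alt in enumerate(w)}
    return sum(
        1
        for x, y in combinations(v, 2)
        if (pos_v[x] < pos_v[y]) != (pos_w[x] < pos_w[y])
    )


def kemeny_rankings(profile):
    """All rankings W minimizing the total Kendall tau distance to the profile."""
    alternatives = sorted(profile[0])
    total = {
        w: sum(kendall_tau(p, w) for p in profile)
        for w in permutations(alternatives)
    }
    best = min(total.values())
    return [w for w, d in total.items() if d == best]


print(kendall_tau(("b", "c", "a"), ("a", "b", "c")))  # 2
profile = [("c1", "c2", "c3"), ("c2", "c1", "c3"), ("c3", "c1", "c2")]
print(kemeny_rankings(profile))  # [('c1', 'c2', 'c3')]
```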

…and many others

• Approval, Baldwin, Black, Bucklin, Coombs, Copeland, Dodgson, maximin, Nanson, Range voting, Schulze, Slater, ranked pairs, etc.

• Q: How to evaluate rules in terms of

achieving democracy?

• A: Axiomatic approach

Axiomatic approach (what has been done in the past 50 years)

• Anonymity: names of the voters do not matter
– Fairness for the voters
• Non-dictatorship: there is no dictator, whose top-ranked alternative is always the winner
– Fairness for the voters
• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives
• Consistency: if r(D1)∩r(D2)≠∅, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner, then it must win
– A Condorcet winner beats all other alternatives in pairwise elections
• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation
• Hard to manipulate: computing a beneficial false vote is hard

Which axiom is more important?

• Some of these axiomatic properties are not compatible with others

• Food for thought: how to evaluate partial satisfaction of axioms?

                           Condorcet consistency   Consistency   Easy to compute
Positional scoring rules   N                       Y             Y
Kemeny                     Y                       N             N
Ranked pairs               Y                       N             Y


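An easy fact

• Theorem. For voting rules that select a single winner, anonymity is not compatible with neutrality
– Proof sketch: take two voters and two alternatives, with votes a ≻ b and b ≻ a. If a wins, swap the names of the alternatives: by neutrality b must now win, but the swapped profile consists of the same two votes, so by anonymity a must still win. Contradiction.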

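Another easy fact [Fishburn APSR-74]

• Thm. No positional scoring rule is Condorcet consistent:
– suppose s1 > s2 > s3
– a profile of 7 votes in four groups: 3 voters, 2 voters, 1 voter, 1 voter (the rankings were shown as pictures)
– the Condorcet winner scores 3s1 + 2s2 + 2s3
– another alternative scores 3s1 + 3s2 + 1s3, which is larger because s2 > s3, so the Condorcet winner loses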
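The rankings on this slide were pictures, so the profile below is an assumption: it is one profile with groups of 3, 2, 1, 1 voters that reproduces the two score totals 3s1 + 2s2 + 2s3 and 3s1 + 3s2 + 1s3. The Python sketch checks that a is the Condorcet winner while b has the higher positional score for several score vectors; all names are illustrative.

```python
def pairwise_support(weighted_profile, x, y):
    """Number of voters who rank x above y."""
    return sum(c for c, r in weighted_profile if r.index(x) < r.index(y))


def condorcet_winner(weighted_profile):
    """Alternative beating every other one in pairwise elections, or None."""
    alternatives = weighted_profile[0][1]
    for x in alternatives:
        if all(
            pairwise_support(weighted_profile, x, y)
            > pairwise_support(weighted_profile, y, x)
            for y in alternatives
            if y != x
        ):
            return x
    return None


def scoring_totals(weighted_profile, score_vector):
    totals = {alt: 0 for alt in weighted_profile[0][1]}
    for count, ranking in weighted_profile:
        for position, alt in enumerate(ranking):
            totals[alt] += count * score_vector[position]
    return totals


# Hypothetical profile; each string lists alternatives from best to worst.
profile = [(3, "abc"), (2, "bca"), (1, "bac"), (1, "cab")]
print(condorcet_winner(profile))            # a
print(scoring_totals(profile, (2, 1, 0)))   # {'a': 8, 'b': 9, 'c': 4}
```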



Not-So-Easy facts

• Arrow’s impossibility theorem
– Google it!
• Gibbard-Satterthwaite theorem
– Next section
• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A3 if and only if it is rule X
– If you believe that A1, A2, A3 are the most desirable properties, then X is the rule to use
– (anonymity+neutrality+consistency+continuity) ⇔ positional scoring rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency) ⇔ Kemeny [Young&Levenglick SIAMAM-78]

Food for thought

• Can we quantify how well a voting rule satisfies these axiomatic properties?
– Tradeoffs between the satisfaction of different axioms
– Use computational techniques to design new voting rules
• Use AI techniques to automatically prove or discover new impossibility theorems [Tang&Lin AIJ-09]


Computational axioms

• Easy to compute:
– the winner can be computed in polynomial time
• Hard to manipulate:
– computing a beneficial false vote is hard

Which rule is easy to compute?

• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ_2^p-complete [Hemaspaandra et al. TCS-05]
– Young: Θ_2^p-complete [Rothe et al. TCS-03]
– Dodgson: Θ_2^p-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hudry EJOR-10]
• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and Schudy STOC-07]
– Fixed-parameter analysis [Betzler et al. TCS-09]

Really easy to compute?

• Easy to compute axiom: computing the winner takes polynomial time in the input size
– input size: nm log m
• What if m is extremely large?

Combinatorial domains (Multi-issue domains)

• The set of alternatives can be uniquely characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue can take, then A=D1×...×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={ main courses } × { wines } (the options were shown as pictures)
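A small Python sketch of how the alternative set A=D1×...×Dp is built and how quickly it grows. The concrete values for the two issues are hypothetical (the slide showed pictures), and the function name is an assumption.

```python
from itertools import product
from math import prod

# Hypothetical values for the two issues in the example.
domains = {
    "main course": ["fish", "beef"],
    "wine": ["white", "red"],
}

# A = D1 x ... x Dp: one alternative per combination of issue values.
alternatives = list(product(*domains.values()))
print(alternatives)
# [('fish', 'white'), ('fish', 'red'), ('beef', 'white'), ('beef', 'red')]


def number_of_alternatives(domain_sizes):
    """|A| = |D1| * ... * |Dp|, exponential in the number of issues."""
    return prod(domain_sizes)


print(number_of_alternatives([2] * 11))  # 2048
```

With p binary issues the number of alternatives is 2^p, which is why m can be extremely large even when the ballot itself is short.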

Multiple referenda

• In California, voters voted on 11 binary issues (yes / no)
– 2^11=2048 combinations in total
– 5/11 are about budget and taxes
• Prop.30: Increase sales and some income tax for education
• Prop.38: Increase income tax on almost everyone for education

Other combinatorial domains

• Belief merging [Gabbay et al. JLC-09]
– [Figure: K1, … combined by a merging operator]
• Judgment aggregation [List and Pettit EP-02]

           Action P   Action Q   Liable? (P∧Q)
Judge 1    Y          Y          Y
Judge 2    Y          N          N
Judge 3    N          Y          N
Majority   Y          Y          N
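The judgment aggregation table can be checked with a few lines of Python: taking the majority on each premise and then applying P∧Q gives a different answer than taking the majority on the conclusion directly. The dictionary layout and the helper name `majority` are illustrative assumptions.

```python
judgments = {
    "Judge 1": {"P": True, "Q": True},
    "Judge 2": {"P": True, "Q": False},
    "Judge 3": {"P": False, "Q": True},
}


def majority(votes):
    """True if strictly more than half of the votes are True."""
    votes = list(votes)
    return 2 * sum(votes) > len(votes)


majority_p = majority(j["P"] for j in judgments.values())
majority_q = majority(j["Q"] for j in judgments.values())
# Each judge's own conclusion is P and Q; aggregate those conclusions.
majority_liable = majority(j["P"] and j["Q"] for j in judgments.values())

print(majority_p, majority_q, majority_liable)  # True True False
print((majority_p and majority_q) == majority_liable)  # False
```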


Strategic behavior (of the agents)

• Manipulation: an agent (manipulator) casts a vote that does not represent her true preferences, to make herself better off
• A voting rule is strategy-proof if there is never a (beneficial) manipulation under this rule
• How important is strategy-proofness as a desired axiomatic property?
– compared to other axiomatic properties

Manipulation under plurality rule (ties are broken in favor of one fixed alternative, shown as a picture)

[Figure: three voters' rankings, including Alice's, are aggregated by the plurality rule.]

Any strategy-proof voting rule?

• No reasonable voting rule is strategy-proof
• Gibbard-Satterthwaite Theorem [Gibbard Econometrica-73, Satterthwaite JET-75]: When there are at least three alternatives, no voting rules except dictatorships satisfy
– non-imposition: every alternative wins for some profile
– unrestricted domain: voters can use any linear order as their votes
– strategy-proofness
• Axiomatic characterization for dictatorships!

A few ways out

• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued by economists
– Single-peaked preferences
– Approval voting: a voter submits 0 or 1 for each alternative

Computational thinking

• Use a voting rule that is too complicated so that nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in Venice Republic for more than 500 years [Walsh&Xia AAMAS-12]
• We want a voting rule where
– Winner determination is easy
– Manipulation is hard

Manipulation is inevitable (Gibbard-Satterthwaite Theorem)

• Why prevent manipulation?
– May lead to very undesirable outcomes
• How often?
– Seems not very often
• Can we use computational complexity as a barrier?
– Is it a strong barrier?
• Other barriers?
– Limited information
– Limited communication

Manipulation: A computational complexity perspective

If it is computationally too hard for a manipulator to compute a manipulation, she is best off voting truthfully
– Similar to cryptography

For which common voting rules is manipulation computationally hard?


Computing a manipulation

• Initiated by [Bartholdi, Tovey, & Trick SCW-89b]
• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, & Lang JACM-
– Unweighted manipulation: easy for most common rules
– Weighted manipulation: depends on the number of
• Unbounded number of alternatives (next few slides)
• Assuming the manipulators have complete information!

Unweighted coalitional manipulation (UCM) problem

• Given

– The voting rule r

– The non-manipulators’ profile PNM

– The number of manipulators n’

– The alternative c preferred by the manipulators

• We are asked whether or not there exists a profile PM (of the manipulators) such that c is the winner of PNM∪PM under r
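For intuition, UCM can be decided by exhaustive search on tiny instances. The Python sketch below makes two assumptions that are not in the slide: r is a positional scoring rule given by its score vector, and c has to be the unique winner. The function names are illustrative, and the running time is exponential in the number of manipulators.

```python
from itertools import permutations, product


def totals_for(profile, score_vector):
    totals = {alt: 0 for alt in profile[0]}
    for vote in profile:
        for position, alt in enumerate(vote):
            totals[alt] += score_vector[position]
    return totals


def find_manipulation(non_manipulators, num_manipulators, c, score_vector):
    """Return a manipulators' profile PM making c the unique winner, or None."""
    alternatives = sorted(non_manipulators[0])
    candidate_votes = list(permutations(alternatives))
    for manipulator_votes in product(candidate_votes, repeat=num_manipulators):
        combined = list(non_manipulators) + list(manipulator_votes)
        totals = totals_for(combined, score_vector)
        if all(totals[c] > totals[alt] for alt in alternatives if alt != c):
            return list(manipulator_votes)
    return None


non_manipulators = [("a", "b", "c"), ("a", "b", "c")]
borda = (2, 1, 0)
for n_prime in (1, 2, 3):
    found = find_manipulation(non_manipulators, n_prime, "c", borda)
    print(n_prime, "manipulators:", "possible" if found else "impossible")
```

On this toy instance one or two manipulators cannot make c the unique Borda winner, while three can.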


The stunningly big table for UCM

#manipulators           One manipulator        At least two
Copeland                P [BTT SCW-89b]        NPC [FHS AAMAS-08,10]
Veto                    P [ZPR AIJ-09]         P [ZPR AIJ-09]
Plurality with runoff   P [ZPR AIJ-09]         P [ZPR AIJ-09]
Cup                     P [CSL JACM-07]        P [CSL JACM-07]
Borda                   P [BTT SCW-89b]        NPC [DKN+ AAAI-11] [BNW IJCAI-11]
Maximin                 P [BTT SCW-89b]        NPC [XZP+ IJCAI-09]
Ranked pairs            NPC [XZP+ IJCAI-09]    NPC [XZP+ IJCAI-09]
Bucklin                 P [XZP+ IJCAI-09]      P [XZP+ IJCAI-09]
Nanson’s rule           NPC [NWX AAAI-11]      NPC [NWX AAAI-11]
Baldwin’s rule          NPC [NWX AAAI-11]      NPC [NWX AAAI-11]

What can we conclude?

• For some common voting rules, computational complexity provides some protection against manipulation
• Is computational complexity a strong barrier?
– NP-hardness is a worst-case concept

Probably NOT a strong barrier

1. Frequency of manipulability
2. Easiness of approximation
3. Quantitative G-S

AAMAS-14 workshop: Computational Social Choice: Beyond the Worst Case

A first angle: frequency of manipulability

• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the impartial culture assumption)
• How often can the manipulators make c win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and Rosenschein AAMAS-07]

A general result [Xia&Conitzer EC-08a]

• Theorem. For any generalized scoring rule
– Including many common voting rules
– [Figure: axis of # manipulators, running from "No power" through Θ(√n) to "All-powerful"]
• Computational complexity is not a strong barrier against manipulation
– UCM as a decision problem is easy to compute in most cases
– The case of Θ(√n) has been studied experimentally in [Walsh

A second angle: approximation

• Unweighted coalitional optimization (UCO): compute the smallest number of manipulators that can make c win
– A greedy algorithm has additive error no more than 1 for Borda [Zuckerman, Procaccia, & Rosenschein AIJ-09]

An approximation algorithm for positional scoring rules [Xia, Conitzer, & Procaccia EC-10]

• A polynomial-time approximation algorithm that works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for positional scoring rules and a class of scheduling problems
• Computational complexity is not a strong barrier against manipulation
– The cost of successful manipulation can be easily approximated (for positional scoring rules)

The scheduling problems Q|pmtn|Cmax

• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done in unit time)
• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted (and resume later maybe on another machine)
• We are asked to compute the minimum makespan
– the minimum time to complete all jobs
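For Q|pmtn|Cmax the optimal makespan has a standard closed form from the scheduling literature: the largest of (total work)/(total speed) and, for each k, (work of the k largest jobs)/(speed of the k fastest machines). The Python sketch below computes that value only; it does not build the schedule and is not the algorithm used in the tutorial's reduction. Integer job sizes and speeds are assumed, and the function name is illustrative.

```python
from fractions import Fraction


def min_makespan(job_sizes, speeds):
    """Optimal makespan value for preemptive scheduling on uniform machines."""
    jobs = sorted(job_sizes, reverse=True)
    fast = sorted(speeds, reverse=True)
    total_speed = sum(fast)
    if total_speed == 0:
        raise ValueError("at least one machine must have positive speed")
    # All the work has to fit on all the machines together.
    bound = Fraction(sum(jobs), total_speed)
    work, capacity = 0, 0
    # The k largest jobs can use at most the k fastest machines at any time.
    for k in range(min(len(jobs), len(fast))):
        work += jobs[k]
        capacity += fast[k]
        bound = max(bound, Fraction(work, capacity))
    return bound


print(min_makespan([4, 3, 2], [2, 1]))   # 3
print(min_makespan([10, 1], [2, 1]))     # 5
print(min_makespan([3, 3], [2, 1, 0]))   # 2
```

The last call uses Borda-like speeds (2, 1, 0), the kind of machine speeds that arise when Borda manipulation is viewed as scheduling.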

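Thinking about UCOpos

• Let p,p1,…,pm-1 be the total points that c,c1,…,cm-1 obtain in the non-manipulators’ profile
• [Figure: after one manipulator vote that ranks c first, the gaps to c shrink: p1 − p → p1 − p − (s1 − s2); p2 − p → p2 − p − (s1 − s4); p3 − p → p3 − p − (s1 − s3)]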

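The approximation algorithm

[Figure: the original UCO instance is converted to a scheduling problem; the solution to the scheduling problem (computed with an algorithm cited as "JACM 78") is converted back to a solution to the UCO instance, with a "No more than" error bound whose value is cut off in the source.]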

Complexity of UCM for Borda

• Manipulation of positional scoring rules = scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling where the machine speeds are m-1, m-2, …, 0
  • NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]
– UCM for Borda is NP-C for two manipulators
  • [Davies et al. AAAI-11 best paper]
  • [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]

A third angle: quantitative G-S

• G-S theorem: for any reasonable voting rule there exists a manipulation
• Quantitative G-S: for any voting rule that is “far away” from dictatorships, the number of manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut, Kalai, & Nisan FOCS-08]
– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer EC-08b, Isaksson, Kindler, & Mossel FOCS-10]
– Finally proved: [Mossel&Racz STOC-12]

Next steps

• The first attempt seems to fail
• Can we obtain positive results for a restricted setting?
– The manipulators have complete information about the non-manipulators’ votes
– The manipulators can communicate perfectly with each other

Limited information

• Limiting the manipulator’s information can make dominating manipulation computationally harder, or even impossible [Conitzer, Walsh, & Xia
• Bayesian information [Lu et al. UAI-12]

Limited communication among manipulators

• The leader-follower model
– The leader broadcasts a vote W, and the potential followers decide whether to cast W or not
  • The leader and followers have the same preferences
– Safe manipulation [Slinko&White COMSOC-08]: a vote W such that
  • No matter how many followers there are, the leader/potential followers are not worse off
  • Sometimes they are better off
– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]

Other types of strategic behavior (of the chairperson)

• Procedure control by
– {adding, deleting} × {voters, alternatives}
– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, & Trick MCM-92, Tideman SCW-07, Conitzer, Lang, & Xia IJCAI-09]
• Bribery [Faliszewski, Hemaspaandra, & Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, & Hemaspaandra CACM-10] for a survey on their computational complexity
• See [Xia arXiv-12] for a framework for studying many of these for generalized scoring rules

Food for thought

• The problem is still very open!
– Shown to be connected to integer factorization [Hemaspaandra, Hemaspaandra, & Menton STACS-13]
• What is the role of computational complexity in analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic behavior?
• Practical ways to protect elections


Outline: statistical approaches

• Condorcet’s MLE model (history)
• A general framework
• Why MLE? Why Condorcet’s model?
• Random Utility Models
• Model selection

The Condorcet Jury theorem [Condorcet 1785]

• Given
– two alternatives {a,b}
– 0.5<p<1
• Suppose
– each agent’s preferences are generated i.i.d., such that
– w/p p, the same as the ground truth
– w/p 1-p, different from the ground truth
• Then, as n→∞, the majority of agents’ preferences converges in probability to the ground truth
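The convergence in the theorem can be seen numerically. The Python sketch below computes the exact probability that a strict majority of n i.i.d. agents agrees with the ground truth, for odd n; the function name and the choice p = 0.6 are illustrative assumptions.

```python
from math import comb


def majority_correct_probability(n, p):
    """Pr(more than n/2 of n i.i.d. agents match the ground truth); n odd."""
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n // 2 + 1, n + 1)
    )


for n in (1, 3, 11, 101):
    print(n, round(majority_correct_probability(n, 0.6), 4))
```

With p = 0.6 the probability rises from 0.6 for a single agent toward 1 as n grows, as the theorem states.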

Condorcet’s MLE approach

• Parametric ranking model Mr: given a “ground truth” parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)
– Each P is a ranking
– [Figure: the “ground truth” Θ generates the votes P1, P2, …, Pn]
• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏_{P∈D} Pr(P|Θ)
– The MLE mechanism: MLE(D)=argmax_Θ L(Θ|D)
– Break ties randomly
• What if Decision space ≠ Parameter space?


Condorcet’s model

• Condorcet was not very clear about how the Condorcet Jury theorem can be extended to m>2
• Young had an interpretation [Young APSR-1988]
• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between candidates (can be cyclic)
– p<1
• Sample space
– all combinations of opinions
• Given “ground truth” opinions W and p<1, generate opinions V s.t. each opinion is i.i.d.:
– if c≻d in W, then c≻d in V w.p. p, and d≻c in V w.p. 1-p

Mallows model [Mallows 1957]

• Parameter space
– all rankings over candidates
– ϕ<1
• Sample space
– all rankings over candidates
• Given a “ground truth” ranking W and ϕ<1, generate a ranking V w.p.
– Pr(V|W) ∝ ϕ^Kendall(V,W)
• MLE ranking is the Kemeny rule
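Why the MLE ranking under the Mallows model coincides with the Kemeny rule can be spelled out in one line. The derivation below is a sketch that assumes 0 < ϕ < 1 and writes K for the Kendall tau distance and Z for the normalizing constant; Z is the same for every W, because relabeling the alternatives maps the distances from one W onto those from any other.

```latex
\begin{align*}
L(W \mid D) &= \prod_{P \in D} \Pr(P \mid W)
             = \prod_{P \in D} \frac{\phi^{K(P,W)}}{Z}
             = \frac{\phi^{\sum_{P \in D} K(P,W)}}{Z^{n}},
\qquad Z = \sum_{V} \phi^{K(V,W)}, \\
\arg\max_{W} L(W \mid D) &= \arg\min_{W} \sum_{P \in D} K(P,W) = \mathrm{Kemeny}(D)
\qquad \text{since } 0 < \phi < 1 .
\end{align*}
```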

Recent studies on Condorcet/Mallows model

• Learning [Lu and Boutilier ICML-11]
• Approximation by common voting rules [Caragiannis, Procaccia & Shah EC-13]


Statistical decision framework [Azari, Parkes, and Xia WINE poster]

[Figure: the ground truth Θ generates the votes P1, P2, …, Pn (data D). Step 1, statistical inference: given Mr, extract information about the ground truth from D. Step 2, decision making: choose a decision (winner, ranking, etc).]

Example: Kemeny

• Mr = Mallows model
• Step 1: MLE on data D = (P1, P2, …, Pn) gives the most probable ranking
• Step 2: top-1 alternative
• Decision space: a unique winner

Likelihood reasoning vs. Bayesian in general

• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails
– Do you think the next two tosses will be two heads in a row?
• Likelihood reasoning
– there is an unknown but fixed ground truth
– p = 10/14=0.714
– Pr(2heads|p=0.714) = (0.714)^2 = 0.51 > 0.5
– Yes!
• Bayesian
– the ground truth is captured by a belief distribution
– Compute Pr(p|Data) assuming uniform prior
– Compute Pr(2heads|Data) = 0.485 < 0.5
– No!

Credit: Panos Ipeirotis & Roy Radner
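Both answers to the coin question can be computed exactly in Python. The sketch assumes a uniform prior on p, so the posterior after 10 heads and 4 tails is Beta(11, 5), and it uses the standard second moment of a Beta(a, b) distribution, a(a+1)/((a+b)(a+b+1)); the variable names are illustrative.

```python
from fractions import Fraction

heads, tails = 10, 4

# Likelihood reasoning: plug in the maximum likelihood estimate of p.
p_mle = Fraction(heads, heads + tails)
likelihood_answer = p_mle**2

# Bayesian reasoning: uniform prior, posterior Beta(a, b), then E[p^2].
a, b = heads + 1, tails + 1
bayes_answer = Fraction(a * (a + 1), (a + b) * (a + b + 1))

print(likelihood_answer, float(likelihood_answer))  # 25/49, about 0.510
print(bayes_answer, float(bayes_answer))            # 33/68, about 0.485
```

The two approaches land on opposite sides of 1/2, which is the point of the slide.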

Kemeny = Likelihood approach

• Mr = Mallows model
• Step 1: MLE on data D = (P1, P2, …, Pn) gives the most probable ranking
• Step 2: top-1 alternative
• This is the Kemeny rule (for single winner)!

Kemeny = Likelihood approach (2)

• Mr = Condorcet model
• Step 1: compute the likelihood for all parameters (opinions) from data D = (P1, P2, …, Pn)
• Step 2: choose the top-alternative of the most probable ranking

Example: Bayesian [Young APSR-88]

• Mr = Condorcet’s model
• Step 1: Bayesian update on data D = (P1, P2, …, Pn) gives a posterior over rankings
• Step 2: most likely top-1

Likelihood vs. Bayesian [Azari, Parkes, and Xia WINE poster]

• Decision space: single winners
• Assume uniform prior in the Bayesian approach
• Principle: Statistical decision theory
• Table columns: Anonymity, neutrality, monotonicity | Consistency | Condorcet | Easy to compute
• Surviving table row: Bayesian (Condorcet): N, Y


Classical voting rules as MLEs [Conitzer&Sandholm UAI-05]

• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠∅, then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules are NOT MLEs
• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that output winners satisfy anonymity and consistency
– Positional scoring rules are the only voting rules that satisfy anonymity, neutrality, and consistency! [Young SIAMAM-75]

Classical voting rules as MLEs [Conitzer&Sandholm UAI-05]

• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the counterpart of consistency for rankings)
– All classical voting rules except positional scoring rules and Kemeny are NOT MLEs
• This is not (completely) a coincidence!
– Kemeny is the only preference function (that outputs rankings) that satisfies neutrality, reinforcement, and Condorcet consistency [Young&Levenglick SIAMAM-78]

Are we happy?

• Condorcet’s model
– not very natural
– computationally hard
• Other classic voting rules
– Most are not MLEs
– Models are not very natural either

New mechanisms via the statistical decision framework

• Model selection
– How can we evaluate fitness?
• Likelihood or Bayesian?
– Focus on MLE
• Computation
– How can we compute MLE efficiently?

[Figure: Data D → information about the ground truth → decision making]

Why not just a problem of machine learning or statistics?

• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social choice criteria
  • Also want to reach a compromise (achieve


Random utility model (RUM) [Thurstone 27]

• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi
• An agent’s perceived utility Ui for alternative ci is generated independently according to μi(Ui)
• Agents rank alternatives according to their perceived utilities
– Pr(c2≻c1≻c3|θ1, θ2, θ3) = Pr_{Ui∼μi}(U2>U1>U3)
• [Figure: utility distributions with parameters θ3, θ2, θ1 and sampled utilities U1, U2, U3]

Generating a preference-profile

• Pr(Data|θ1, θ2, θ3) = ∏_{R∈Data} Pr(R|θ1, θ2, θ3)
• [Figure: from the parameters θ3, θ2, θ1, Agent 1 generates P1 = c2≻c1≻c3, …, Agent n generates Pn = c1≻c2≻c3]

RUMs with Gumbel distributions

• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]
• Equivalently, there exist positive numbers λ1,…,λm such that
– Pr(c1≻c2≻…≻cm|λ1,…,λm) = λ1/(λ1+…+λm) × λ2/(λ2+…+λm) × … × λm-1/(λm-1+λm)
– first factor: c1 is the top choice in {c1,…,cm}; second factor: c2 is the top choice in {c2,…,cm}; last factor: cm-1 is preferred to cm
• Pros:
– Computationally tractable
  • Analytical solution to the likelihood function
  • The only RUM that was known to be tractable
– Widely applied in Economics [McFadden 74], learning to rank [Liu 11], and analyzing elections [GM 06,07,08,09]
• Cons: does not seem to fit very well
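The analytical form of the Plackett-Luce likelihood is short enough to sketch in Python: the probability of a ranking is a product of "top choice among the remaining alternatives" factors. The weights below are made-up values for illustration, and the function name is an assumption.

```python
from itertools import permutations


def plackett_luce_probability(ranking, weights):
    """Pr(ranking | weights): product of top-choice probabilities."""
    probability = 1.0
    remaining_weight = sum(weights[alt] for alt in ranking)
    for alt in ranking[:-1]:
        # alt is the top choice among the alternatives not yet placed.
        probability *= weights[alt] / remaining_weight
        remaining_weight -= weights[alt]
    return probability


weights = {"c1": 3.0, "c2": 2.0, "c3": 1.0}  # hypothetical lambda values
print(plackett_luce_probability(("c1", "c2", "c3"), weights))  # about 0.333

total = sum(plackett_luce_probability(r, weights) for r in permutations(weights))
print(total)  # about 1.0, the probabilities of all rankings sum to one
```

The likelihood of a whole profile is then the product of these per-vote probabilities, which is what makes MLE for this model tractable.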

RUM with normal distributions

• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]
• Pros:
– Intuitive
– Flexible
• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P|Θ) is known
– Pr(c1≻…≻cm|Θ) = ∫ μm(Um) ∫ μm-1(Um-1) … ∫ μ1(U1) dU1 … dUm-1 dUm, with Um from -∞ to ∞, Um-1 from Um to ∞, …, U1 from U2 to ∞

Unimodality of likelihood [APX. NIPS-12]

• Location family: RUMs where each μi is parameterized by its mean θi
– Normal distributions with fixed variance
– P-L
• Theorem. For any RUM in the location family, if the PDF of each μi is log-concave, then for any preference-profile D, the likelihood function Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima solutions is convex

MC-EM algorithm for RUMs [APX NIPS-12]

• Utility distributions μl’s belong to the exponential family
– Includes normal, Gamma, exponential, Binomial, Gumbel, etc.
• In each iteration t
– E-step, for any set of parameters Θ
  • Compute the expected log likelihood (ELL): ELL(Θ|Data, Θt) = f(Θ, g(Data, Θt)), approximately computed by Gibbs sampling
– M-step
  • Choose Θt+1 = argmax_Θ ELL(Θ|Data, Θt)
• Until |Pr(D|Θt)-Pr(D|Θt+1)| < ε


Model selection

• Compare RUMs with Normal distributions and PL for
– log-likelihood
– predictive log-likelihood
– Akaike information criterion (AIC)
– Bayesian information criterion (BIC)
• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters

Value(Normal) - Value(PL): 44.8 (15.8), 87.4 (30.5), -79.6 (31.6), -50.5 (31.6)

Red: statistically significant with 95% confidence

Recent progress

• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features and alternative features
• Preference elicitation based on experimental design [APX UAI-13]
– c.f. active learning
• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)

Computational thinking + optimization algorithms

Strategic thinking + methods/principles of aggregation

2. Computational aspects
• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking + game-theoretic analysis

3. Statistical approaches
• Framework based on statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM

Thank you!