Computational Social Choice
WINE-13 Tutorial
Dec 11, 2013
Lirong Xia
• The second nationwide referendum in the UK
– The first was in 1975
• Member of Parliament election: Plurality rule → Alternative vote rule?
• 68% No vs. 32% Yes
2
2011 UK Referendum
Ordinal Preference Aggregation: Social Choice
[Figure: a profile, i.e. the rankings of Alice, Bob, and Carol over alternatives A, B, and C, is fed into a social choice mechanism.]
3
4
Ranking pictures [PGM+ AAAI-12]
[Figure: Turker 1, Turker 2, …, Turker n each compare the pictures A, B, and C, and their comparisons are aggregated into a ranking.]
5
Social choice
[Figure: Profile (R1, R2, …, Rn and R1*, R2*, …, Rn*) → social choice mechanism → Outcome]
Ri, Ri*: full rankings over a set A of alternatives
Social Choice and Computer Science
[Figure: timeline]
• PLATO, 4th C. B.C.
• LULL, 13th C.
• BORDA, 18th C.
• CONDORCET, 18th C.
• ARROW, 20th C.
• TURING et al., 20th C.
• PLATO et al., 4th C. B.C. to 20th C.: Social Choice
• 21st Century: Social Choice and Computer Science (CS), linked by
– Computational thinking + optimization algorithms
– Strategic thinking + methods/principles of aggregation
6
Applications: real world
• People/agents often have conflicting
preferences, yet they have to make a
joint decision
7
• Multi-agent systems [Ephrati and Rosenschein 91]
• Recommendation systems [Ghosh et al. 99]
• Meta-search engines [Dwork et al. 01]
• Belief merging [Everaere et al. 07]
• Human computation (crowdsourcing) [Mao et al. AAAI-13]
• etc.
8
Applications: academic world
A burgeoning area
• Has recently been drawing a lot of attention
– IJCAI-11: 15 papers, best paper
– AAAI-11: 6 papers, best paper
– AAMAS-11: 10 full papers, best paper runner-up
– AAMAS-12: 9 full papers, best student paper
– EC-12: 3 papers
• Workshop: COMSOC Workshop 06, 08, 10, 12, 14
• Courses:
– Technical University Munich (Felix Brandt)
– Harvard (Yiling Chen)
– U. of Amsterdam (Ulle Endriss)
– RPI (Lirong Xia)
• Book in progress: Handbook of Computational Social Choice
9
How to design a good social choice mechanism?
10
What is being “good”?
Two goals for social choice mechanisms
GOAL1: democracy
GOAL2: truth
1. Classical Social Choice
2. Computational aspects
3. Statistical approaches
11
12
Outline
1. Classical Social Choice
2.1 Computational aspects, Part 1
2.2 Computational aspects, Part 2
3. Statistical approaches
[Timing labels on the slide: 45 min, 55 min, 75 min, 5 min, 5 min, 30 min, 30 min; three items are stamped NP-Hard]
Common voting rules (what has been done in the past two centuries)
• Mathematically, a social choice mechanism (voting rule) is a mapping from {All profiles} to {outcomes}
– an outcome is usually a winner, a set of winners, or a ranking
– m: number of alternatives (candidates)
– n: number of agents (voters)
– D=(P1,…,Pn): a profile
• Positional scoring rules
• A score vector s1,...,sm
– For each vote V, the alternative ranked in the i-th position gets si points
– The alternative with the most total points is the winner
– Special cases
• Borda, with score vector (m-1, m-2, …,0)
• Plurality, with score vector (1,0,…,0) [Used in the US]
An example
• Three alternatives {c1, c2, c3}
• Score vector (2,1,0) (=Borda)
• 3 votes: c1 ≻ c2 ≻ c3, c2 ≻ c1 ≻ c3, c3 ≻ c1 ≻ c2 (positions are worth 2, 1, 0 points)
• c1 gets 2+1+1=4, c2 gets 1+2+0=3, c3 gets 0+0+2=2
• The winner is c1
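The Borda example above can be checked with a short Python sketch. It is not part of the tutorial: the function names `positional_scores` and `positional_winners` are illustrative, and tied alternatives are all returned as co-winners because the slides do not fix a tie-breaking rule.

```python
def positional_scores(profile, score_vector):
    """Total points per alternative; each ranking lists alternatives best first."""
    totals = {a: 0 for a in profile[0]}
    for ranking in profile:
        for position, alternative in enumerate(ranking):
            # The alternative in the i-th position receives s_i points.
            totals[alternative] += score_vector[position]
    return totals


def positional_winners(profile, score_vector):
    """All alternatives with the highest total (co-winners on ties: an assumption)."""
    totals = positional_scores(profile, score_vector)
    best = max(totals.values())
    return sorted(a for a, s in totals.items() if s == best)


profile = [("c1", "c2", "c3"), ("c2", "c1", "c3"), ("c3", "c1", "c2")]
borda = (2, 1, 0)
plurality = (1, 0, 0)

print(positional_scores(profile, borda))
print(positional_winners(profile, borda))
print(positional_winners(profile, plurality))
```

Swapping the score vector is all it takes to move between Borda and plurality, which is why the two rules are treated as special cases of one family.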
• The election has two rounds
– In the first round, all alternatives except the two
with the highest plurality score drop out
– In the second round, the alternative that is
preferred by more voters wins
• [used in Iran, France, North Carolina State]
15
Plurality with runoff
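Example (number of voters, vote, vote after the first round):
10 voters: a > b > c > d → a > d
7 voters: d > a > b > c → d > a
6 voters: c > d > a > b → d > a
3 voters: b > c > d > a → d > a
Winner: d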
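A minimal Python sketch of plurality with runoff on the profile above. The function name and the tie handling (sort by plurality score, then by name) are assumptions made for illustration; the slides do not specify how ties are broken.

```python
def plurality_with_runoff(weighted_profile):
    """Return (finalists, winner) for a list of (count, ranking) pairs.

    Each ranking lists every alternative, best first.
    Assumption: ties are resolved by plurality score, then alphabetically.
    """
    alternatives = sorted(weighted_profile[0][1])
    first_place = {a: 0 for a in alternatives}
    for count, ranking in weighted_profile:
        first_place[ranking[0]] += count

    # Round 1: only the two alternatives with the highest plurality scores survive.
    x, y = sorted(alternatives, key=lambda a: (-first_place[a], a))[:2]

    # Round 2: pairwise contest between the two finalists.
    total = sum(count for count, _ in weighted_profile)
    x_votes = sum(count for count, r in weighted_profile if r.index(x) < r.index(y))
    winner = x if x_votes >= total - x_votes else y
    return (x, y), winner


profile = [
    (10, ("a", "b", "c", "d")),
    (7, ("d", "a", "b", "c")),
    (6, ("c", "d", "a", "b")),
    (3, ("b", "c", "d", "a")),
]
finalists, winner = plurality_with_runoff(profile)
print(finalists, winner)
```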
• Also called instant run-off voting or
alternative vote
• The election has m-1 rounds, in each round,
– The alternative with the lowest plurality score
drops out, and is removed from all votes
– The last-remaining alternative is the winner
• [used in Australia and Ireland]
16
Single transferable vote (STV)
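Example (number of voters, vote, vote after each elimination):
10 voters: a > b > c > d → a > c > d → a > c
7 voters: d > a > b > c → d > a > c → a > c
6 voters: c > d > a > b → c > d > a → c > a
3 voters: b > c > d > a → c > d > a → c > a
Winner: a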
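The elimination rounds above can be traced with the following sketch. It is an illustration only: `stv` is a hypothetical helper, and ties for the lowest plurality score are broken alphabetically, which is an assumption not made in the slides.

```python
def stv(weighted_profile):
    """Return (elimination_order, winner) for a list of (count, ranking) pairs."""
    remaining = set(weighted_profile[0][1])
    eliminated = []
    while len(remaining) > 1:
        scores = {a: 0 for a in remaining}
        for count, ranking in weighted_profile:
            # A vote counts for its highest-ranked alternative still in the race.
            top = next(a for a in ranking if a in remaining)
            scores[top] += count
        # Drop the alternative with the lowest plurality score
        # (alphabetical tie-breaking is an assumption).
        loser = min(sorted(remaining), key=lambda a: scores[a])
        remaining.remove(loser)
        eliminated.append(loser)
    return eliminated, remaining.pop()


profile = [
    (10, ("a", "b", "c", "d")),
    (7, ("d", "a", "b", "c")),
    (6, ("c", "d", "a", "b")),
    (3, ("b", "c", "d", "a")),
]
order, winner = stv(profile)
print(order, winner)
```

On this profile plurality with runoff elects d while STV elects a, so the two rules can disagree on the same votes.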
• Kendall tau distance
– K(V,W) = #{pairwise comparisons on which V and W differ}
• Kemeny(D) = argminW K(D,W) = argminW ΣP∈D K(P,W)
• For single winner, choose the top-ranked alternative in Kemeny(D)
• [Has a statistical interpretation]
17
The Kemeny rule
Example: K(b ≻ c ≻ a, a ≻ b ≻ c) = 1 + 1 = 2 (the rankings disagree on {a,b} and on {a,c})
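The Kendall tau distance and the Kemeny rule can be sketched by brute force. This is an illustration under the assumption that m is tiny: the search enumerates all m! rankings, so it is not one of the practical algorithms cited later, and the function names are hypothetical.

```python
from itertools import combinations, permutations


def kendall_tau(v, w):
    """Number of pairs of alternatives that v and w order differently."""
    pos_v = {a: i for i, a in enumerate(v)}
    pos_w = {a: i for i, a in enumerate(w)}
    return sum(
        1
        for a, b in combinations(v, 2)
        if (pos_v[a] < pos_v[b]) != (pos_w[a] < pos_w[b])
    )


def kemeny(profile):
    """Return (list of optimal rankings, minimum total distance) by exhaustive search."""
    best, best_cost = [], None
    for w in permutations(sorted(profile[0])):
        cost = sum(kendall_tau(p, w) for p in profile)
        if best_cost is None or cost < best_cost:
            best, best_cost = [w], cost
        elif cost == best_cost:
            best.append(w)
    return best, best_cost


print(kendall_tau(("b", "c", "a"), ("a", "b", "c")))
profile = [("c1", "c2", "c3"), ("c2", "c1", "c3"), ("c3", "c1", "c2")]
print(kemeny(profile))
```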
• Approval, Baldwin, Black, Bucklin,
Coombs, Copeland, Dodgson, maximin,
Nanson, Range voting, Schulze, Slater,
ranked pairs, etc…
18
…and many others
19
• Q: How to evaluate rules in terms of
achieving democracy?
• A: Axiomatic approach
20
Axiomatic approach (what has been done in the past 50 years)
• Anonymity: names of the voters do not matter
– Fairness for the voters
• Non-dictatorship: there is no dictator, whose top-ranked alternative is always the winner
– Fairness for the voters
• Neutrality: names of the alternatives do not matter
– Fairness for the alternatives
• Consistency: if r(D1)∩r(D2)≠∅, then r(D1∪D2)=r(D1)∩r(D2)
• Condorcet consistency: if there exists a Condorcet winner, then it must win
– A Condorcet winner beats all other alternatives in pairwise elections
• Easy to compute: winner determination is in P
– Computational efficiency of preference aggregation
• Hard to manipulate: computing a beneficial false vote is hard
21
Which axiom is more important?
• Some of these axiomatic properties are not compatible with others
• Food for thought: how to evaluate partial satisfaction of axioms?
Rule                      Condorcet consistency  Consistency  Easy to compute
Positional scoring rules  N                      Y            Y
Kemeny                    Y                      N            N
Ranked pairs              Y                      N            Y
22
An easy fact
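• Theorem. For voting rules that select a single winner, anonymity is not compatible with neutrality
– Proof: [Figure: Alice and Bob hold opposite rankings of two alternatives. W.L.O.G. fix one alternative as the winner. Swapping the names of the alternatives must change the winner by neutrality, but the swapped profile equals the original one with the voters exchanged, so anonymity requires the same winner (≠).]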
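• Thm. No positional scoring rule is Condorcet consistent:
– suppose s1 > s2 > s3
23
Another easy fact [Fishburn APSR-74]
[Figure: a profile over three alternatives pictured on the slide, with four groups of votes: 3 voters, 2 voters, 1 voter, 1 voter]
• One alternative is the Condorcet winner, with total score 3s1 + 2s2 + 2s3
• Another alternative has total score 3s1 + 3s2 + 1s3
• 3s1 + 2s2 + 2s3 < 3s1 + 3s2 + 1s3, because s2 > s3
• CONTRADICTION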
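The counting argument can be replayed in Python. The concrete rankings below are an assumption: the slide shows the alternatives as pictures, so this sketch uses one profile over {a, b, c} whose group sizes (3, 2, 1, 1) and position counts match the totals quoted above. The helper names are hypothetical.

```python
# Assumed profile, chosen to match the slide's totals:
# a scores 3*s1 + 2*s2 + 2*s3 and b scores 3*s1 + 3*s2 + 1*s3.
profile = [
    (3, ("a", "b", "c")),
    (2, ("b", "c", "a")),
    (1, ("b", "a", "c")),
    (1, ("c", "a", "b")),
]


def position_counts(weighted_profile, alternative):
    """How many voters rank the alternative 1st, 2nd, 3rd, ..."""
    counts = [0] * len(weighted_profile[0][1])
    for n, ranking in weighted_profile:
        counts[ranking.index(alternative)] += n
    return counts


def condorcet_winner(weighted_profile):
    """The alternative that beats every other one in pairwise majority, or None."""
    alternatives = weighted_profile[0][1]
    total = sum(n for n, _ in weighted_profile)
    for x in alternatives:
        if all(
            sum(n for n, r in weighted_profile if r.index(x) < r.index(y)) > total / 2
            for y in alternatives
            if y != x
        ):
            return x
    return None


def score(alternative, s):
    """Positional score of an alternative under score vector s."""
    return sum(c * w for c, w in zip(position_counts(profile, alternative), s))


print(condorcet_winner(profile))
print(position_counts(profile, "a"), position_counts(profile, "b"))
print(score("a", (2, 1, 0)), score("b", (2, 1, 0)))
```

Any score vector with s2 > s3 gives b a strictly higher total than the Condorcet winner a, which is the contradiction on the slide.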
24
Not-So-Easy facts
• Arrow’s impossibility theorem
– Google it!
• Gibbard-Satterthwaite theorem
– Next section
• Axiomatic characterization
– Template: A voting rule satisfies axioms A1, A2, A3 if and only if it is rule X
– If you believe that A1, A2, A3 are the most desirable properties, then X is optimal
– (anonymity+neutrality+consistency+continuity) ⇔ positional scoring rules [Young SIAMAM-75]
– (neutrality+consistency+Condorcet consistency) ⇔ Kemeny [Young&Levenglick SIAMAM-78]
• Can we quantify a voting rule’s satisfiability
of these axiomatic properties?
– Tradeoffs between satisfiability of axioms
– Use computational techniques to design new
voting rules
• use AI techniques to automatically prove or
discover new impossibility theorems [Tang&Lin AIJ-09]
25
Food for thought
26
Outline
1. Classical Social Choice
2.1 Computational aspects, Part 1
2.2 Computational aspects, Part 2
3. Statistical approaches
[Timing labels on the slide: 45 min, 55 min, 75 min, 5 min, 5 min, 15 min, 30 min]
• Easy to compute:
– the winner can be computed in polynomial time
• Hard to manipulate:
– computing a beneficial false vote is hard
27
Computational axioms
• Almost all common voting rules, except
– Kemeny: NP-hard [Bartholdi et al. 89], Θ₂ᵖ-complete [Hemaspaandra et al. TCS-05]
– Young: Θ₂ᵖ-complete [Rothe et al. TCS-03]
– Dodgson: Θ₂ᵖ-complete [Hemaspaandra et al. JACM-97]
– Slater: NP-complete [Hudry EJOR-10]
• Practical algorithms for Kemeny (also for others)
– ILP [Conitzer, Davenport, & Kalagnanam AAAI-06]
– Approximation [Ailon, Charikar, & Newman STOC-05]
– PTAS [Kenyon-Mathieu and Schudy STOC-07]
– Fixed-parameter analysis [Betzler et al. TCS-09]
29
Which rule is easy to compute?
• Easy to compute axiom: computing the
winner takes polynomial time in the input
size
– input size: nm log m
• What if m is extremely large?
30
Really easy to compute?
Combinatorial domains (Multi-issue domains)
• The set of alternatives can be uniquely characterized by multiple issues
• Let I={x1,...,xp} be the set of p issues
• Let Di be the set of values that the i-th issue can take, then A=D1×...×Dp
• Example:
– Issues={ Main course, Wine }
– Alternatives={ main courses } × { wines } (pictured on the slide)
31
• In California, voters voted on 11 binary issues
– 2^11=2048 combinations in total
– 5 of the 11 are about budget and taxes
32
Multiple referenda
• Prop. 30: Increase sales and some income tax for education
• Prop. 38: Increase income tax on almost everyone for education
• Belief merging [Gabbay et al. JLC-09]
• Judgment aggregation [List and Pettit EP-02]
37
Other combinatorial domains
[Figure: belief bases K1, K2, …, Kn are combined by a merging operator]
          Action P  Action Q  Liable? (P∧Q)
Judge 1   Y         Y         Y
Judge 2   Y         N         N
Judge 3   N         Y         N
Majority  Y         Y         N
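The judgment aggregation table can be reproduced in a few lines. This sketch only illustrates why proposition-wise majority is problematic here; the dictionary layout and the helper `majority` are assumptions made for the example.

```python
judgments = {
    "Judge 1": {"P": True, "Q": True},
    "Judge 2": {"P": True, "Q": False},
    "Judge 3": {"P": False, "Q": True},
}


def majority(values):
    """True if strictly more than half of the values are True."""
    values = list(values)
    return sum(values) > len(values) / 2


# Majority on each premise separately.
majority_p = majority(j["P"] for j in judgments.values())
majority_q = majority(j["Q"] for j in judgments.values())

# Majority on the conclusion, where each judge derives "liable" as P and Q.
majority_liable = majority(j["P"] and j["Q"] for j in judgments.values())

print(majority_p, majority_q, majority_liable)
# The majority accepts P and accepts Q, yet rejects (P and Q).
print((majority_p and majority_q) == majority_liable)
```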
• Easy to compute:
– the winner can be computed in polynomial time
• Hard to manipulate:
– computing a beneficial false vote is hard
38
Computational axioms
Strategic behavior (of the agents)
• Manipulation: an agent (manipulator) casts a
vote that does not represent her true
preferences, to make herself better off
• A voting rule is strategy-proof if there is never a
(beneficial) manipulation under this rule
• How important is strategy-proofness as a desired axiomatic property?
– compared to other axiomatic properties
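Manipulation under plurality rule (ties are broken in favor of an alternative pictured on the slide)
[Figure: the rankings of Alice, Bob, and Carol over three pictured alternatives, aggregated by the plurality rule]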
Any strategy-proof voting rule?
• No reasonable voting rule is strategy-proof
• Gibbard-Satterthwaite Theorem [Gibbard Econometrica-73, Satterthwaite JET-75]: When there are at least three alternatives, no voting rules except dictatorships satisfy
– non-imposition: every alternative wins for some profile
– unrestricted domain: voters can use any linear order as their votes
– strategy-proofness
• Axiomatic characterization for dictatorships!
• Relax non-dictatorship: use a dictatorship
• Restrict the number of alternatives to 2
• Relax unrestricted domain: mainly pursued
by economists
– Single-peaked preferences:
– Approval voting: A voter submits 0 or 1 for each alternative
42
A few ways out
• Use a voting rule that is too complicated so that
nobody can easily predict the winner
– Dodgson
– Kemeny
– The randomized voting rule used in the Venetian Republic for more than 500 years [Walsh&Xia AAMAS-12]
• We want a voting rule where
– Winner determination is easy
– Manipulation is hard
43
Computational thinking
44
Overview
[Flowchart]
• Manipulation is inevitable (Gibbard-Satterthwaite Theorem)
• Why prevent manipulation? May lead to very undesirable outcomes
• How often? Seems not very often
• Can we use computational complexity as a barrier? (branches: Yes / No)
• Is it a strong barrier?
• Other barriers? Limited information, limited communication
If it is computationally too hard for a manipulator to compute a manipulation, she is best off voting truthfully
– Similar to cryptography
For which common voting rules is manipulation computationally hard?
45
Manipulation: A computational complexity perspective
NP-Hard
• Initiated by [Bartholdi, Tovey, &Trick SCW-89b]
• Votes are weighted or unweighted
• Bounded number of alternatives [Conitzer, Sandholm, &Lang JACM-07]
– Unweighted manipulation: easy for most common rules
– Weighted manipulation: depends on the number of manipulators
• Unbounded number of alternatives (next few slides)
• Assuming the manipulators have complete information!
46
Computing a manipulation
Unweighted coalitional manipulation (UCM) problem
• Given
– The voting rule r
– The non-manipulators’ profile PNM
– The number of manipulators n’
– The alternative c preferred by the manipulators
• We are asked whether or not there exists a profile PM (of the manipulators) such that c is the winner of PNM∪PM under r
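For very small instances, UCM can be decided by trying every manipulator profile. The sketch below is only meant to make the problem statement concrete: it is exponential in both m and n’, it uses Borda as the rule r, and it requires c to be the unique winner. All three choices are assumptions of this example, and the function names are hypothetical.

```python
from itertools import combinations_with_replacement, permutations


def borda_winners(profile):
    """Co-winners under Borda; each ranking lists alternatives best first."""
    m = len(profile[0])
    scores = {a: 0 for a in profile[0]}
    for ranking in profile:
        for position, a in enumerate(ranking):
            scores[a] += m - 1 - position
    best = max(scores.values())
    return sorted(a for a in scores if scores[a] == best)


def find_coalitional_manipulation(winners, p_nm, n_manipulators, c):
    """Return a manipulator profile making c the unique winner, or None."""
    all_votes = list(permutations(sorted(p_nm[0])))
    # Votes are anonymous, so multisets of manipulator votes suffice.
    for p_m in combinations_with_replacement(all_votes, n_manipulators):
        if winners(list(p_nm) + list(p_m)) == [c]:
            return list(p_m)
    return None


p_nm = [("a", "b", "c"), ("a", "c", "b")]
one = find_coalitional_manipulation(borda_winners, p_nm, 1, "c")
two = find_coalitional_manipulation(borda_winners, p_nm, 2, "c")
print(one)
print(two)
```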
47
48
The stunningly big table for UCM
Rule                   One manipulator      At least two manipulators
Copeland               P [BTT SCW-89b]      NPC [FHS AAMAS-08,10]
STV                    NPC [BO SCW-91]      NPC [BO SCW-91]
Veto                   P [ZPR AIJ-09]       P [ZPR AIJ-09]
Plurality with runoff  P [ZPR AIJ-09]       P [ZPR AIJ-09]
Cup                    P [CSL JACM-07]      P [CSL JACM-07]
Borda                  P [BTT SCW-89b]      NPC [DKN+ AAAI-11] [BNW IJCAI-11]
Maximin                P [BTT SCW-89b]      NPC [XZP+ IJCAI-09]
Ranked pairs           NPC [XZP+ IJCAI-09]  NPC [XZP+ IJCAI-09]
Bucklin                P [XZP+ IJCAI-09]    P [XZP+ IJCAI-09]
Nanson’s rule          NPC [NWX AAAI-11]    NPC [NWX AAAI-11]
Baldwin’s rule         NPC [NWX AAAI-11]    NPC [NWX AAAI-11]
• For some common voting rules,
computational complexity provides some
protection against manipulation
• Is computational complexity a strong
barrier?
– NP-hardness is a worst-case concept
49
What can we conclude?
50
Probably NOT a strong barrier
1. Frequency of manipulability
2. Easiness of approximation
3. Quantitative G-S
AAMAS-14 workshop: Computational Social Choice: Beyond the Worst Case
• Non-manipulators’ votes are drawn i.i.d.
– E.g. i.i.d. uniformly over all linear orders (the
impartial culture assumption)
• How often can the manipulators make c win?
– Specific voting rules [Peleg T&D-79, Baharad&Neeman RED-02, Slinko T&D-02, Slinko MSS-04, Procaccia and Rosenschein AAMAS-07]
51
A first angle: frequency of manipulability
• Theorem. For any generalized scoring rule
– Including many common voting rules
[Figure: axis of the number of manipulators with threshold Θ(√n); below the threshold the manipulators have no power, above it they are all-powerful]
• Computational complexity is not a strong barrier against manipulation
– UCM as a decision problem is easy to compute in most cases
– The case of Θ(√n) has been studied experimentally in [Walsh IJCAI-09]
52
A general result [Xia&Conitzer EC-08a]
• Unweighted coalitional optimization
(UCO): compute the smallest number of
manipulators that can make c win
– A greedy algorithm has additive error no more
than 1 for Borda [Zuckerman, Procaccia,
&Rosenschein AIJ-09]
53
A second angle: approximation
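A greedy heuristic for UCO under Borda can be sketched as follows. This is an illustration in the spirit of the greedy algorithm cited above, not a transcription of it: each new manipulator ranks c first and the other alternatives in increasing order of their current scores. Requiring c to be the unique winner and breaking score ties by name are assumptions of this example.

```python
def greedy_borda_manipulation(nm_scores, c):
    """Add manipulator votes one at a time until c is the unique Borda winner.

    nm_scores maps each alternative to its Borda score in the
    non-manipulators' profile. Returns the list of manipulator votes.
    """
    scores = dict(nm_scores)
    m = len(scores)
    votes = []
    while any(scores[a] >= scores[c] for a in scores if a != c):
        # c goes first; the currently strongest rival goes last and gets 0 points.
        others = sorted((a for a in scores if a != c), key=lambda a: (scores[a], a))
        vote = [c] + others
        for position, a in enumerate(vote):
            scores[a] += m - 1 - position
        votes.append(tuple(vote))
    return votes


# Borda scores of the two non-manipulator votes a>b>c and a>c>b.
nm_scores = {"a": 4, "b": 1, "c": 1}
votes = greedy_borda_manipulation(nm_scores, "c")
print(len(votes), votes)
```

The loop terminates whenever m ≥ 2, because each added vote gives c exactly m-1 points and every rival at most m-2.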
• A polynomial-time approximation algorithm that
works for all positional scoring rules
– Additive error is no more than m-2
– Based on a new connection between UCO for positional
scoring rules and a class of scheduling problems
• Computational complexity is not a strong barrier
against manipulation
– The cost of successful manipulation can be easily
approximated (for positional scoring rules)
54
An approximation algorithm for positional scoring rules [Xia, Conitzer, & Procaccia EC-10]
The scheduling problems Q|pmtn|Cmax
• m* parallel uniform machines M1,…,Mm*
– Machine i’s speed is si (the amount of work done in unit time)
• n* jobs J1,…,Jn*
• preemption: jobs are allowed to be interrupted (and resumed later, maybe on another machine)
• We are asked to compute the minimum makespan
– the minimum time to complete all jobs
55
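For Q|pmtn|Cmax the optimal makespan has a classical closed form: the k largest jobs cannot finish faster than the k fastest machines can process them, and all jobs together cannot finish faster than all usable machines allow. The sketch below evaluates that bound. The job sizes are an assumption of the example (the slide only names the jobs), the function name is hypothetical, and at least one machine is assumed to have positive speed.

```python
def min_makespan(job_sizes, speeds):
    """Optimal preemptive makespan on uniform machines (closed-form bound)."""
    p = sorted(job_sizes, reverse=True)   # largest jobs first
    s = sorted(speeds, reverse=True)      # fastest machines first
    k_max = min(len(p), len(s))

    # All work divided by the total speed of the machines that can be used.
    bound = sum(p) / sum(s[:k_max])

    # The k largest jobs can occupy at most the k fastest machines at a time.
    work = speed = 0.0
    for k in range(k_max - 1):
        work += p[k]
        speed += s[k]
        bound = max(bound, work / speed)
    return bound


print(min_makespan([6, 1], [2, 1]))
print(min_makespan([3, 3, 3], [2, 1]))
```

In the first call the single large job dominates (6 units on the speed-2 machine); in the second the total work dominates (9 units at total speed 3).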
Thinking about UCOpos
• Let p, p1, …, pm-1 be the total points that c, c1, …, cm-1 obtain in the non-manipulators’ profile PNM
• Each ci is a job Ji whose workload is the gap pi − p: (J1) p1 − p, (J2) p2 − p, (J3) p3 − p
• The machine speeds are s1 − s2, s1 − s3, s1 − s4
• Adding one manipulator vote, PNM ∪ {V1 = [c ≻ c1 ≻ c3 ≻ c2]}, changes the gaps as follows
– p1 − p → p1 − p − (s1 − s2)
– p2 − p → p2 − p − (s1 − s4)
– p3 − p → p3 − p − (s1 − s3)
56
57
The approximation algorithm
[Flowchart: original UCO → scheduling problem → solution to the scheduling problem [Gonzalez&Sahni JACM 78] → rounding → solution to the UCO, no more than OPT+m-2]
• Manipulation of positional scoring rules = scheduling (preemptions at integer time points)
– Borda manipulation corresponds to scheduling where the machine speeds are m-1, m-2, …, 0
• NP-hard [Yu, Hoogeveen, & Lenstra J.Scheduling 2004]
– UCM for Borda is NP-C for two manipulators
• [Davies et al. AAAI-11 best paper]
• [Betzler, Niedermeier, & Woeginger IJCAI-11 best paper]
58
Complexity of UCM for Borda
• G-S theorem: for any reasonable voting rule there
exists a manipulation
• Quantitative G-S: for any voting rule that is “far
away” from dictatorships, the number of
manipulable situations is non-negligible
– First work: 3 alternatives, neutral rule [Friedgut, Kalai, &Nisan FOCS-08]
– Extensions: [Dobzinski&Procaccia WINE-08, Xia&Conitzer EC-08b, Isaksson,Kindler,&Mossel FOCS-10]
– Finally proved: [Mossel&Racz STOC-12]
59
A third angle: quantitative G-S
• The first attempt seems to fail
• Can we obtain positive results for a
restricted setting?
– The manipulators have complete information about the non-manipulators’ votes
– The manipulators can perfectly discuss their strategies
60
Next steps
• Limiting the manipulator’s information can make dominating manipulation computationally harder, or even impossible [Conitzer,Walsh,&Xia AAAI-11]
• Bayesian information [Lu et al. UAI-12]
61
Limited information
• The leader-follower model
– The leader broadcasts a vote W, and the potential followers decide whether to cast W or not
• The leader and followers have the same preferences
– Safe manipulation [Slinko&White COMSOC-08]: a vote W such that
• No matter how many followers there are, the leader/potential followers are not worse off
• Sometimes they are better off
– Complexity: [Hazon&Elkind SAGT-10, Ianovski et al. IJCAI-11]
62
Limited communication among manipulators
• Procedure control by
– {adding, deleting} × {voters, alternatives}
– partitioning voters/alternatives
– introducing clones of alternatives
– changing the agenda of voting
– [Bartholdi, Tovey, &Trick MCM-92, Tideman SCW-07, Conitzer,Lang,&Xia IJCAI-09]
• Bribery [Faliszewski, Hemaspaandra, &Hemaspaandra JAIR-09]
• See [Faliszewski, Hemaspaandra, &Hemaspaandra CACM-10] for a survey on their computational complexity
• See [Xia arXiv-12] for a framework for studying many of these for generalized scoring rules
70
Other types of strategic behavior
(of the chairperson)
71
Food for thought
• The problem is still very open!
– Shown to be connected to integer factorization [Hemaspaandra, Hemaspaandra, & Menton STACS-13]
• What is the role of computational complexity in
analyzing human/self-interested agents’ behavior?
– Explore information/communication assumptions
– In general, why do we want to prevent strategic
behavior?
• Practical ways to protect elections
72
Outline
1. Classical Social Choice
2.1 Computational aspects, Part 1
2.2 Computational aspects, Part 2
3. Statistical approaches
[Timing labels on the slide: 45 min, 55 min, 75 min, 5 min, 5 min, 15 min, 30 min]
73
Ranking pictures [PGM+ AAAI-12]
[Figure: Turker 1, Turker 2, …, Turker n each compare the pictures A, B, and C, and their comparisons are aggregated into a ranking.]
Two goals for social choice mechanisms
GOAL1: democracy
GOAL2: truth
1. Classical Social Choice
2. Computational aspects
3. Statistical approaches
74
Outline: statistical approaches
75
• Condorcet’s MLE model (history)
• A general framework
• Why MLE? Why Condorcet’s model?
• Random Utility Models
• Model selection
The Condorcet Jury theorem.
• Given
– two alternatives {a,b}
– 0.5<p<1
• Suppose
– each agent’s preferences are generated i.i.d., such that
– w/p p, they are the same as the ground truth
– w/p 1-p, they are different from the ground truth
• Then, as n→∞, the majority of agents’ preferences converges in probability to the ground truth
76
The Condorcet Jury theorem [Condorcet 1785]
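The convergence can be observed numerically by summing the binomial probabilities that more than half of the n agents agree with the ground truth. This is a minimal sketch; the function name is hypothetical, and n is taken to be odd so that the majority is always well defined.

```python
from math import comb


def majority_correct_probability(n, p):
    """Probability that a strict majority of n i.i.d. agents matches the ground truth."""
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n // 2 + 1, n + 1)
    )


for n in (1, 3, 11, 101, 501):
    print(n, majority_correct_probability(n, 0.6))
```

With p = 0.6 the probability is 0.6 for a single agent and increases toward 1 as n grows.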
Condorcet’s MLE approach
• Parametric ranking model Mr: given a “ground truth” parameter Θ
– each vote P is drawn i.i.d. conditioned on Θ, according to Pr(P|Θ)
– Each P is a ranking
• For any profile D=(P1,…,Pn),
– The likelihood of Θ is L(Θ|D)=Pr(D|Θ)=∏P∈D Pr(P|Θ)
– The MLE mechanism: MLE(D)=argmaxΘ L(Θ|D)
– Break ties randomly
• What if Decision space ≠ Parameter space?
[Figure: the “ground truth” Θ generates the votes P1, P2, …, Pn]
77
• Condorcet was not very clear about how the Condorcet Jury theorem can be extended to m>2
• Young had an interpretation [Young APSR-1988]
• Parameter space
– all combinations of opinions: an opinion is a pairwise comparison between candidates (can be cyclic)
– p<1
• Sample space
– all combinations of opinions
• Given “ground truth” opinions W and p<1, generate opinions V s.t. each opinion is i.i.d.:
– if c≻d in W, then c≻d in V w/p p and d≻c in V w/p 1-p
78
Condorcet’s model
• Parameter space
– all rankings over candidates
– ϕ<1
• Sample space
– all rankings over candidates
• Given a “ground truth” ranking W and ϕ<1, generate a ranking V w.p.
– Pr(V|W) ∝ ϕ^Kendall(V,W)
• MLE ranking is the Kemeny rule
79
Mallows model [Mallows 1957]
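The link between the Mallows model and the Kemeny rule can be checked by brute force on a tiny profile. The sketch normalizes ϕ^Kendall(V,W) over all rankings and maximizes the log-likelihood over W. It enumerates m! rankings, so it is for illustration only, and the function names are hypothetical.

```python
from itertools import combinations, permutations
from math import log


def kendall_tau(v, w):
    """Number of pairs of alternatives that v and w order differently."""
    pos_v = {a: i for i, a in enumerate(v)}
    pos_w = {a: i for i, a in enumerate(w)}
    return sum(
        1
        for a, b in combinations(v, 2)
        if (pos_v[a] < pos_v[b]) != (pos_w[a] < pos_w[b])
    )


def mallows_probability(v, w, phi):
    """Pr(V = v | W = w) with the normalizing constant computed explicitly."""
    z = sum(phi ** kendall_tau(u, w) for u in permutations(w))
    return phi ** kendall_tau(v, w) / z


def mallows_mle(profile, phi):
    """Ground-truth ranking W with the highest log-likelihood of the profile."""
    return max(
        permutations(sorted(profile[0])),
        key=lambda w: sum(log(mallows_probability(p, w, phi)) for p in profile),
    )


profile = [("c1", "c2", "c3"), ("c2", "c1", "c3"), ("c3", "c1", "c2")]
print(mallows_mle(profile, 0.5))
```

Because ϕ<1 and the normalizing constant does not depend on W, maximizing the likelihood is the same as minimizing the total Kendall tau distance, which is the Kemeny objective.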
• Learning [Lu and Boutilier ICML-11]
• Approximation by common voting rules [Caragiannis, Procaccia & Shah EC-13]
80
Recent studies on Condorcet/Mallows model
Outline: statistical approaches
81
• Condorcet/Mallows model (history)
• A general framework
• Why MLE? Why Condorcet’s model?
82
Statistical decision framework [Azari, Parkes, and Xia WINE poster]
[Figure: ground truth Θ → (model Mr) → Data D = (P1, P2, …, Pn) → Step 1: statistical inference, given Mr → information about the ground truth → Step 2: decision making → Decision (winner, ranking, etc.)]
83
Example: Kemeny
• Mr = Mallows model
• Step 1: MLE (Data D = (P1, P2, …, Pn) → the most probable ranking)
• Step 2: top-1 alternative (winner)
• Decision space: a unique winner
Likelihood reasoning vs. Bayesian in general
• You have a biased coin: head w/p p
– You observe 10 heads, 4 tails
– Do you think the next two tosses will be two heads in a row?
• Likelihood reasoning
– there is an unknown but fixed ground truth
– p = 10/14 = 0.714
– Pr(2 heads | p=0.714) = (0.714)^2 = 0.51 > 0.5
– Yes!
• Bayesian
– the ground truth is captured by a belief distribution
– Compute Pr(p|Data) assuming uniform prior
– Compute Pr(2 heads | Data) = 0.485 < 0.5
– No!
Credit: Panos Ipeirotis & Roy Radner
84
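Both numbers in the coin example can be reproduced exactly. The sketch assumes, as stated on the slide, a uniform prior on p; the posterior after 10 heads and 4 tails is then a Beta(11, 5) distribution, and the probability of two further heads is its second moment E[p²] = a(a+1)/((a+b)(a+b+1)).

```python
from fractions import Fraction

heads, tails = 10, 4

# Likelihood reasoning: plug in the maximum-likelihood estimate of p.
p_hat = Fraction(heads, heads + tails)
likelihood_answer = p_hat**2

# Bayesian reasoning: uniform prior gives a Beta(heads + 1, tails + 1) posterior.
a, b = heads + 1, tails + 1
bayesian_answer = Fraction(a * (a + 1), (a + b) * (a + b + 1))

print(float(likelihood_answer))  # above 0.5, so "Yes"
print(float(bayesian_answer))    # below 0.5, so "No"
```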
85
Kemeny = Likelihood approach
• Mr = Mallows model
• Step 1: MLE (Data D = (P1, P2, …, Pn) → the most probable ranking)
• Step 2: top-1 alternative (winner)
• This is the Kemeny rule (for single winner)!
86
Kemeny = Likelihood approach (2)
• Mr = Condorcet model
• Step 1: compute the likelihood for all parameters (opinions)
• Step 2: choose the top alternative of the most probable ranking
87
Example: Bayesian [Young APSR-88]
• Mr = Condorcet’s model
• Step 1: Bayesian update (Data D = (P1, P2, …, Pn) → posterior over rankings)
• Step 2: most likely top-1 alternative (winner)
Likelihood vs. Bayesian [Azari, Parkes, and Xia WINE poster]
                      Anonymity, neutrality, monotonicity  Consistency  Condorcet  Easy to compute
Likelihood (Mallows)  Y                                    N            Y          N
Bayesian (Condorcet)  Y                                    N            N          Y
• Decision space: single winners
• Assume uniform prior in the Bayesian approach
• Principle: Statistical decision theory
88
Outline: statistical approaches
89
• Condorcet’s MLE model (history)
• A general framework
• Why MLE? Why Condorcet’s model?
• When the outcomes are winning alternatives
– MLE rules must satisfy consistency: if r(D1)∩r(D2)≠∅, then r(D1∪D2)=r(D1)∩r(D2)
– All classical voting rules except positional scoring rules are NOT MLEs
• Positional scoring rules are MLEs
• This is NOT a coincidence!
– All MLE rules that output winners satisfy anonymity and consistency
– Positional scoring rules are the only voting rules that satisfy anonymity, neutrality, and consistency! [Young SIAMAM-75]
90
Classical voting rules as MLEs [Conitzer&Sandholm UAI-05]
• When the outcomes are winning rankings
– MLE rules must satisfy reinforcement (the
counterpart of consistency for rankings)
– All classical voting rules except positional scoring
rules and Kemeny are NOT MLEs
• This is not (completely) a coincidence!
– Kemeny is the only preference function (that outputs rankings) that satisfies neutrality, reinforcement, and Condorcet consistency [Young&Levenglick SIAMAM-78]
91
Classical voting rules as MLEs [Conitzer&Sandholm UAI-05]
• Condorcet’s model
– not very natural
– computationally hard
• Other classic voting rules
– Most are not MLEs
– Models are not very natural either
92
Are we happy?
New mechanisms via the statistical decision framework
• Model selection
– How can we evaluate fitness?
• Likelihood or Bayesian?
– Focus on MLE
• Computation
– How can we compute MLE efficiently?
[Figure: Data D → inference → information about the ground truth → decision making → Decision]
93
• Closely related, but
– We need economic insight to build the model
– We care about satisfaction of traditional social
choice criteria
• Also want to reach a compromise (achieve
democracy)
94
Why not just a problem of machine learning or statistics?
Outline: statistical approaches
95
• Condorcet’s MLE model (history)
• A general framework
• Why MLE? Why Condorcet’s model?
• Random Utility Models
• Continuous parameters: Θ=(θ1,…, θm)
– m: number of alternatives
– Each alternative is modeled by a utility distribution μi
– θi: a vector that parameterizes μi
• An agent’s perceived utility Ui for alternative ci is generated
independently according to μi(Ui)
• Agents rank alternatives according to their perceived utilities
– Pr(c2≻c1≻c3 | θ1, θ2, θ3) = Pr_{Ui∼μi}(U2>U1>U3)
96
Random utility model (RUM) [Thurstone 27]
[Figure: utility distributions with parameters θ3, θ2, θ1 and sampled utilities U1, U2, U3]
• Pr(Data |θ1, θ2, θ3) = ∏R∈Data Pr(R |θ1, θ2, θ3)
97
Generating a preference-profile
[Figure: from the parameters θ3, θ2, θ1, Agent 1 generates P1 = c2≻c1≻c3, …, Agent n generates Pn = c1≻c2≻c3]
• μi’s are Gumbel distributions
– A.k.a. the Plackett-Luce (P-L) model [BM 60, Yellott 77]
• Equivalently, there exist positive numbers λ1,…,λm such that
– Pr(c1≻c2≻…≻cm | λ1,…,λm) = λ1/(λ1+…+λm) × λ2/(λ2+…+λm) × … × λm-1/(λm-1+λm)
– first factor: c1 is the top choice in { c1,…,cm }; second factor: c2 is the top choice in { c2,…,cm }; last factor: cm-1 is preferred to cm
• Pros:
– Computationally tractable
• Analytical solution to the likelihood function
– The only RUM that was known to be tractable
• Widely applied in Economics [McFadden 74], learning to rank [Liu 11], and analyzing elections [GM 06,07,08,09]
• Cons: does not seem to fit very well
98
RUMs with Gumbel distributions
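The Plackett-Luce probability of a ranking is a product of "top choice among the remaining alternatives" factors, which the following sketch evaluates directly. The weights in `lam` are made-up values for illustration, and the function name is hypothetical.

```python
from itertools import permutations


def pl_probability(ranking, lam):
    """Plackett-Luce probability of a full ranking (best first).

    lam maps each alternative to its positive weight.
    """
    prob = 1.0
    remaining = sum(lam[a] for a in ranking)
    for a in ranking[:-1]:
        # a is chosen as the top alternative among those not yet ranked.
        prob *= lam[a] / remaining
        remaining -= lam[a]
    return prob


lam = {"c1": 3.0, "c2": 2.0, "c3": 1.0}  # illustrative weights
print(pl_probability(("c1", "c2", "c3"), lam))
print(sum(pl_probability(r, lam) for r in permutations(lam)))
```

The closed form is what makes the likelihood of a whole profile cheap to evaluate: it is a product of such terms over all votes.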
• μi’s are normal distributions
– Thurstone’s Case V [Thurstone 27]
• Pros:
– Intuitive
– Flexible
• Cons: believed to be computationally intractable
– No analytical solution for the likelihood function Pr(P|Θ) is known
– Pr(c1≻…≻cm | Θ) is a nested integral over the utilities, with Um ranging from -∞ to ∞, Um-1 from Um to ∞, …, U1 from U2 to ∞
99
RUM with normal distributions
• Location family: RUMs where each μi is parameterized
by its mean θi
– Normal distributions with fixed variance
– P-L
• Theorem. For any RUM in the location family, if the PDF of each μi is log-concave, then for any preference-profile D, the likelihood function Pr(D|Θ) is log-concave
– Local optimality = global optimality
– The set of global maxima is convex
100
Unimodality of likelihood [APX. NIPS-12]
• Utility distributions μi’s belong to the exponential family (EF)
– Includes normal, Gamma, exponential, Binomial, Gumbel, etc.
• In each iteration t
• E-step, for any set of parameters Θ
– Computes the expected log likelihood (ELL), approximately computed by Gibbs sampling:
ELL(Θ| Data, Θt) = f (Θ, g(Data, Θt))
• M-step
– Choose Θt+1 = argmaxΘ ELL(Θ| Data, Θt)
• Until |Pr(D|Θt)-Pr(D|Θt+1)|< ε
101
MC-EM algorithm for RUMs [APX NIPS-12]
Outline: statistical approaches
102
• Condorcet’s MLE model (history)
• A general framework
• Why MLE? Why Condorcet’s model?
• Random Utility Models
• Model selection
103
Model selection
• Compare RUMs with Normal distributions and PL for
– log-likelihood (LL)
– predictive log-likelihood (Pred. LL)
– Akaike information criterion (AIC)
– Bayesian information criterion (BIC)
• Tested on an election dataset
– 9 alternatives, randomly chosen 50 voters
Value(Normal) - Value(PL):
LL          Pred. LL    AIC          BIC
44.8(15.8)  87.4(30.5)  -79.6(31.6)  -50.5(31.6)
Red: statistically significant with 95% confidence
• Generalized RUM [APX UAI-13]
– Learn the relationship between agent features
and alternative features
• Preference elicitation based on experimental
design [APX UAI-13]
– c.f. active learning
• Faster algorithms [ACPX, NIPS-13]
– Generalized Method of Moments (GMM)
104
Recent progress
[Figure: CS and Social Choice linked by two arrows, labeled “Computational thinking + optimization algorithms” and “Strategic thinking + methods/principles of aggregation”]
2. Computational aspects
• Easy-to-compute axiom
• Hard-to-manipulate axiom
• Computational thinking + game-theoretic analysis
3. Statistical approaches
• Framework based on statistical decision theory
• Model selection
• Condorcet/Mallows vs. RUM
Thank you!