

Markov degree of the Birkhoff model

Akimichi Takemura (1, 2)

joint with T. Yamaguchi (1) and M. Ogawa (1)

(1) University of Tokyo

(2) JST CREST

January, 2014 at CASTA2014

1 / 33

Outline

1 The Birkhoff model

2 Markov degree

3 (n, r)-Birkhoff model

4 Main theorem and idea of our proof

  Latin squares

  Sketch of our proof

2 / 33

Reference

T. Yamaguchi, M. Ogawa and A. Takemura, “Markov degree of the Birkhoff model”, Journal of Algebraic Combinatorics, DOI: 10.1007/s10801-013-0488-z

3 / 33

Definition of the Birkhoff model

Voters are asked to rank n candidates (“total ranking”).

Let S_n = {σ_1, ..., σ_{n!}} be the set of possible votes.

σ = (σ(1), ..., σ(n)) ∈ S_n, where σ(j) denotes the candidate ranked at the jth position.

p_σ denotes the probability of observing σ ∈ S_n.

The Birkhoff model

\[ \log p_\sigma = \psi_0 + \sum_{j=1}^{n} \psi_{j\sigma(j)} \]

4 / 33

Definition of the Birkhoff model (cont.)

The Birkhoff model

\[ \log p_\sigma = \psi_0 + \sum_{j=1}^{n} \psi_{j\sigma(j)} \]

If ψ_{jk} is large, candidate k is likely to be ranked at the jth position.

The sufficient statistic consists of the numbers of voters who rank candidate k at the jth position.
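As a rough illustration of the model formula, the sketch below enumerates all rankings for a small n and normalizes exp(Σ_j ψ_{jσ(j)}) so that the probabilities sum to one (this fixes ψ_0). The function name `birkhoff_probabilities` and the parameter values are hypothetical and chosen only for the example; candidates and positions are 0-based here.

```python
import itertools
import math

def birkhoff_probabilities(psi):
    """Return {sigma: p_sigma} with log p_sigma = psi_0 + sum_j psi[j][sigma(j)].

    psi[j][k] is the parameter for candidate k at position j (0-based).
    psi_0 is not passed in; it is determined by normalization.
    """
    n = len(psi)
    weights = {
        sigma: math.exp(sum(psi[j][sigma[j]] for j in range(n)))
        for sigma in itertools.permutations(range(n))
    }
    total = sum(weights.values())
    return {sigma: w / total for sigma, w in weights.items()}

# Hypothetical parameters: candidate 0 is favored at position 0.
psi = [[2.0, 0.0, 0.0],
       [0.0, 0.0, 0.0],
       [0.0, 0.0, 0.0]]
probs = birkhoff_probabilities(psi)
for sigma, p in sorted(probs.items()):
    print(sigma, round(p, 4))
```

With these values the two rankings that put candidate 0 first receive the same, larger probability, which matches the interpretation of a large ψ_{jk}.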

5 / 33

Hypothesis testing

Does the Birkhoff model fit the observed data?

Conditional testing with the chi-square statistic.

The size of the fiber |F_t| is huge. ⇒ Metropolis–Hastings algorithm with Markov bases (Diaconis & Sturmfels 1998)

6 / 33

Markov basis

Markov basis

x: vector of frequencies
t: the sufficient statistic for x
A: configuration matrix satisfying t = Ax

B ⊂ ker A ∩ Z^{n!} is a Markov basis ⇔ for all t and all x, y ∈ F_t, there exist M > 0, z_i ∈ B, ϵ_i ∈ {−1, 1}, i = 1, ..., M, such that

\[ y = x + \sum_{i=1}^{M} \epsilon_i z_i, \qquad x + \sum_{i=1}^{m} \epsilon_i z_i \in F_t, \quad 1 \le m \le M. \]

If we have a Markov basis, then we can construct a Markov chain.
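The following is a minimal sketch of such a chain for n = 3, not the authors' implementation. It assumes the columns are ordered (123), (132), (213), (231), (312), (321), uses the single move "even permutations minus odd permutations" (the kernel of A is one-dimensional for n = 3), and targets the conditional distribution proportional to 1/∏ x_σ! in the style of Diaconis and Sturmfels. All function names and the starting table are hypothetical.

```python
import itertools
import math
import random

PERMS = list(itertools.permutations((1, 2, 3)))  # (123), (132), ..., (321)
MOVE = (1, -1, -1, 1, 1, -1)                     # sign of each permutation

def sufficient_statistic(x):
    """Number of voters placing candidate k at position j, keyed by (j, k)."""
    t = {}
    for count, sigma in zip(x, PERMS):
        for j, k in enumerate(sigma, start=1):
            t[(j, k)] = t.get((j, k), 0) + count
    return t

def log_weight(x):
    """Log of the unnormalized conditional weight 1 / prod_sigma x_sigma!."""
    return -sum(math.lgamma(c + 1) for c in x)

def metropolis_step(x, rng):
    """Propose x + eps * MOVE; stay at x if infeasible or rejected."""
    eps = rng.choice((-1, 1))
    y = tuple(c + eps * m for c, m in zip(x, MOVE))
    if min(y) < 0:
        return x  # proposal leaves the fiber
    accept_prob = math.exp(min(0.0, log_weight(y) - log_weight(x)))
    return y if rng.random() < accept_prob else x

def run_chain(x0, steps, seed=0):
    rng = random.Random(seed)
    states = [tuple(x0)]
    for _ in range(steps):
        states.append(metropolis_step(states[-1], rng))
    return states

chain = run_chain((2, 1, 1, 2, 2, 1), steps=200)
print(len(set(chain)), "distinct tables visited")
```

Every state in `chain` has nonnegative entries and the same sufficient statistic as the starting table, so the walk stays inside the fiber F_t.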

7 / 33

Configuration matrix of the Birkhoff model

Configuration matrix A of the Birkhoff model for n = 3:

        (123) (132) (213) (231) (312) (321)
(1, 1)    1     1     0     0     0     0
(1, 2)    0     0     1     1     0     0
(1, 3)    0     0     0     0     1     1
(2, 1)    0     0     1     0     1     0
(2, 2)    1     0     0     0     0     1
(2, 3)    0     1     0     1     0     0
(3, 1)    0     0     0     1     0     1
(3, 2)    0     1     0     0     1     0
(3, 3)    1     0     1     0     0     0

The columns are labeled by σ ∈ S_3 and the rows are labeled by (j, k) = (position, candidate).
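The matrix above can be generated mechanically. The sketch below, with the hypothetical helper name `configuration_matrix`, lists permutations in lexicographic order and sets the entry in row (j, k) and column σ to 1 exactly when σ(j) = k.

```python
import itertools

def configuration_matrix(n):
    """Rows are (position j, candidate k); columns are permutations of 1..n
    in lexicographic order. Entry is 1 when sigma(j) == k, else 0."""
    perms = list(itertools.permutations(range(1, n + 1)))
    rows = [(j, k) for j in range(1, n + 1) for k in range(1, n + 1)]
    matrix = [[1 if sigma[j - 1] == k else 0 for sigma in perms]
              for (j, k) in rows]
    return rows, perms, matrix

rows, perms, A = configuration_matrix(3)
for label, row in zip(rows, A):
    print(label, row)
```

Each column contains exactly n ones, one for every position of the ranking.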

8 / 33

Markov degree

Degree of moves

The degree of a move z ∈ B is

\[ \deg z = \sum_{z_\sigma > 0} z_\sigma = \sum_{z_\sigma < 0} (-z_\sigma) = \|z\|_1 / 2 \]

Markov degree

Markov degree = maximum degree of moves in the minimal Markov basis

Conjecture of Diaconis & Eriksson 2006

The Markov degree of the Birkhoff model is three (i.e. the toric ideal is generated by binomials of degree at most three, but not two).
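To make the degree concrete, the sketch below checks the n = 3 case: the vector that assigns +1 to even permutations and −1 to odd permutations lies in ker A and has degree three. The helper names are hypothetical, and the column order is assumed to be lexicographic.

```python
import itertools

def permutation_sign(sigma):
    """+1 for an even permutation, -1 for an odd one (by counting inversions)."""
    inversions = sum(
        1
        for i in range(len(sigma))
        for j in range(i + 1, len(sigma))
        if sigma[i] > sigma[j]
    )
    return -1 if inversions % 2 else 1

def degree(z):
    """Sum of the positive entries, which equals ||z||_1 / 2 for a move."""
    positive = sum(v for v in z if v > 0)
    negative = sum(-v for v in z if v < 0)
    if positive != negative:
        raise ValueError("positive and negative parts of a move must balance")
    return positive

def image_under_A(z, perms):
    """Compute A z as a dict keyed by (position, candidate)."""
    image = {}
    for value, sigma in zip(z, perms):
        for j, k in enumerate(sigma, start=1):
            image[(j, k)] = image.get((j, k), 0) + value
    return image

perms = list(itertools.permutations((1, 2, 3)))
z = [permutation_sign(sigma) for sigma in perms]
print(z, degree(z), set(image_under_A(z, perms).values()))
```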

9 / 33

Size of the Markov bases for the Birkhoff model

Let n be the number of candidates. The table shows the size of the minimal Markov basis for the Birkhoff model, by degree of move. (Diaconis & Eriksson 2006)

n   deg.2    deg.3   deg.4  deg.5  deg.6
3       0        1       0      0      0
4      18      160       0      0      0
5    1050    28840       0      0      0
6   57510  7056240       0      0      0

We know many examples of configurations whose Markov degree is two. However, not many configurations with Markov degree three are known.

10 / 33

(n, r)-Birkhoff model

Consider an election in which each voter is asked to give r (≤ n) preferred candidates and to rank them (“partial ranking”).

For this case we define the (n, r)-Birkhoff model.

The set of possible votes is S_{n,r} = {σ_1, ..., σ_{n!/(n−r)!}}, with elements

σ = (σ(1), ..., σ(r)) ∈ S_{n,r},

where σ(j) denotes the candidate at the jth position.

11 / 33

(n, r)-Birkhoff model (cont.)

p_σ: the probability of observing σ ∈ S_{n,r}.

(n, r)-Birkhoff model

\[ \log p_\sigma = \psi_0 + \sum_{j=1}^{r} \psi_{j\sigma(j)} \]

The sufficient statistic consists of the numbers of voters who rank candidate k at the jth position (the same as in the Birkhoff model).

12 / 33

Configuration of the (n, r)-Birkhoff model

The configuration matrix A of the (3, 2)-Birkhoff model is

        (12) (13) (21) (23) (31) (32)
(1, 1)   1    1    0    0    0    0
(1, 2)   0    0    1    1    0    0
(1, 3)   0    0    0    0    1    1
(2, 1)   0    0    1    0    1    0
(2, 2)   1    0    0    0    0    1
(2, 3)   0    1    0    1    0    0

The columns are labeled by σ ∈ S_{3,2} and the rows are labeled by (j, k) = (position, candidate).
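A small sketch of the partial-ranking configuration, assuming votes are listed in lexicographic order. The helper name `partial_ranking_configuration` is hypothetical. It reproduces the (3, 2) matrix above and confirms that the number of columns is n!/(n−r)!.

```python
import itertools
import math

def partial_ranking_configuration(n, r):
    """Rows are (position j, candidate k) for j = 1..r and k = 1..n;
    columns are ordered r-tuples of distinct candidates from 1..n."""
    votes = list(itertools.permutations(range(1, n + 1), r))
    rows = [(j, k) for j in range(1, r + 1) for k in range(1, n + 1)]
    matrix = [[int(vote[j - 1] == k) for vote in votes] for (j, k) in rows]
    return rows, votes, matrix

rows, votes, A = partial_ranking_configuration(3, 2)
for label, row in zip(rows, A):
    print(label, row)

# Number of possible votes for n = 5, r = 3 is 5!/2! = 60.
n_votes = len(partial_ranking_configuration(5, 3)[1])
print(n_votes, math.factorial(5) // math.factorial(2))
```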

13 / 33

Size of the Markov bases

The number of moves in the minimal Markov bases for the (n, r)-Birkhoff model:

n  r   deg.2   deg.3   deg.4 or more
3  2       0       1   0
4  2       6       4   0
4  3      18     160   0
5  2      30      10   0
5  3     360    1000   0
5  4    1050   28840   0
6  2      90      20   0
6  3    2160    3680   0
7  2     210      35   0
7  3    8190   10325   0
8  2     420      56   0
8  3   23940   24416   0

14 / 33

Main result

Theorem

For n ≥ 3, r ≥ 2, the Markov degree of the (n, r)-Birkhoff model is three.

Our proof is based on the proof of Jacobson & Matthews (1998) for Latin squares.

15 / 33

Latin squares

A Latin square contains all symbols in each row and column.

a b c d e
b c d e a
c d e a b
d e a b c
e a b c d

Every Latin square can be generated by swap operations among at most three rows. (Jacobson & Matthews 1998)

16 / 33

Example of swap operation

If we swap a in the (1, 1) entry and b in the (2, 1) entry, the resulting table is not a Latin square.

Before:

a b c d e
b c d e a
c d e a b
d e a b c
e a b c d

After:

b b c d e
a c d e a
c d e a b
d e a b c
e a b c d
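The failure in this example can be checked directly. The sketch below uses a hypothetical helper `is_latin_square` that tests whether every row and every column contains each symbol exactly once, and applies it to the cyclic square before and after the swap.

```python
def is_latin_square(square):
    """True if every row and every column contains each symbol exactly once."""
    n = len(square)
    symbols = set(square[0])
    if len(symbols) != n:
        return False
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    if not rows_ok:
        return False
    return all({row[c] for row in square} == symbols for c in range(n))

cyclic = [list("abcde"), list("bcdea"), list("cdeab"),
          list("deabc"), list("eabcd")]

# Swap the (1, 1) and (2, 1) entries (0-based indices [0][0] and [1][0]).
swapped = [row[:] for row in cyclic]
swapped[0][0], swapped[1][0] = swapped[1][0], swapped[0][0]

print(is_latin_square(cyclic), is_latin_square(swapped))
```

The first row of the swapped table contains b twice and the second row contains a twice, so the check fails on the rows.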

17 / 33

Representation of the dataset

N voters rank r preferred candidates chosen from n candidates. The observed dataset is represented by an N × r matrix:

\[ \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1r} \\ x_{21} & x_{22} & \cdots & x_{2r} \\ \vdots & \vdots & & \vdots \\ x_{N1} & x_{N2} & \cdots & x_{Nr} \end{pmatrix} \]

x_{ij} denotes the candidate in the ith vote (row) at the jth position. Note that every candidate is ranked at most once in each vote.

Denote the sufficient statistic by t and the fiber by F_t.
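In this representation the sufficient statistic is simply the table of per-column candidate counts. A minimal sketch, using a hypothetical helper `sufficient_statistic` and an illustrative dataset with N = 4, r = 3 and candidates a to e:

```python
from collections import Counter

def sufficient_statistic(dataset):
    """Count how many votes place each candidate at each position.

    dataset is a list of votes; each vote lists r distinct candidates.
    Keys of the result are (position j, candidate), with j starting at 1.
    """
    counts = Counter()
    for vote in dataset:
        if len(set(vote)) != len(vote):
            raise ValueError("a candidate is ranked more than once in a vote")
        for j, candidate in enumerate(vote, start=1):
            counts[(j, candidate)] += 1
    return counts

# Illustrative dataset: one vote per row.
dataset = [list("abc"), list("bcd"), list("cde"), list("dea")]
t = sufficient_statistic(dataset)
print(sorted(t.items()))
```

Two datasets belong to the same fiber F_t exactly when this function returns the same counts for both.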

18 / 33

Operation on the fiber Ft

Every element in F_t is obtained by swaps of the candidates in the same column.

Each swap corresponds to a move.

To prove that the Markov degree is three, it is enough to show that we only need swap operations involving at most three rows to generate all elements in F_t.

19 / 33

Example of the operation on Ft

It is not allowed to stop the operation when some row contains the same candidate more than once. We have to somehow resolve the “collision”.

Before:

a b c
b c d
c d e
d e a

After swapping a and b in the first column (the first row now contains b twice):

b b c
a c d
c d e
d e a

20 / 33

Example of the operation on Ft

The following is an example of a possible operation:

a b c
b c d
c d e
d e a

becomes

b d c
a c d
c b e
d e a

21 / 33

Example of the operation on Ft

This operation involves four rows at the same time.

a b c
b c d
c d e
d e a

becomes

d b c
a c d
b d e
c e a

22 / 33

Example of the operation on Ft

In this operation, each swap involves at most three rows.

a b c
b c d
c d e
d e a

becomes

d b c
b c d
a d e
c e a

and then

d b c
a c d
b d e
c e a
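The three tables in this example can be checked mechanically. The sketch below, with hypothetical helper names, verifies that each table is a valid dataset with the same column counts, and reports which rows change at each step.

```python
from collections import Counter

def column_counts(matrix):
    """Per-column candidate counts, i.e. the sufficient statistic."""
    width = len(matrix[0])
    return [Counter(row[c] for row in matrix) for c in range(width)]

def is_proper(matrix):
    """True if no row ranks the same candidate more than once."""
    return all(len(set(row)) == len(row) for row in matrix)

def rows_changed(before, after):
    """0-based indices of the rows that differ between two tables."""
    return [i for i, (p, q) in enumerate(zip(before, after)) if p != q]

start = ["abc", "bcd", "cde", "dea"]
middle = ["dbc", "bcd", "ade", "cea"]
end = ["dbc", "acd", "bde", "cea"]
collision = ["bbc", "acd", "cde", "dea"]  # table with a repeated candidate

for table in (start, middle, end):
    print(is_proper(table), column_counts(table) == column_counts(start))

print(rows_changed(start, middle))  # first step touches three rows
print(rows_changed(middle, end))    # second step touches two rows
print(rows_changed(start, end))     # done in one step, it would touch four
```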

23 / 33

Improper dataset (needed just for proof)

Let F̃_t be the set of N × r matrices obtained by adding to F_t the matrices with one element of the form a + b − c, so that F_t ⊂ F̃_t.

We call a + b − c an improper element.

a + b − c is understood as a and b each appearing once and c appearing −1 times. Every row should contain each candidate zero times or once in total.

We call a matrix in F̃_t \ F_t an improper matrix and a matrix in F_t a proper matrix.

24 / 33

Example of an improper matrix

The following matrix is an example of an improper matrix in the case of N = 4, r = 3, n = 5. The set of candidates is {a, b, c, d, e}.

d          b  c
a + b − d  c  d
c          d  e
d          e  a

25 / 33

Operation on F̃_t

Consider the operations on F̃_t. We add ±(a − b) preserving the sufficient statistic.

For F̃_t, we only consider operations involving two rows of a matrix in F̃_t at a time.

a b c
b c d
c d e
d e a

becomes, by adding ±(a − d) in the first column of the first two rows,

d          b  c
a + b − d  c  d
c          d  e
d          e  a

26 / 33

Resolvable pair

If there exists an improper element a + b − c, then c appears in the same column in another row.

In this case, the improper matrix can be transformed to a proper matrix by the operation among these two rows.

d          b  c
a + b − d  c  d

becomes, by adding ±(a − d),

a  b  c
b  c  d

We call the pair of rows above a resolvable pair.
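One way to experiment with improper elements is to store each matrix entry as a map from candidate to signed multiplicity, so that a + b − d becomes {a: 1, b: 1, d: −1}. The sketch below follows that assumed encoding; the helper names are hypothetical. It creates the improper matrix from the proper one by adding ±(a − d) and then resolves it again.

```python
def from_strings(rows):
    """Encode a proper matrix: each entry is {candidate: 1}."""
    return [[{ch: 1} for ch in row] for row in rows]

def add_difference(matrix, row_plus, row_minus, col, a, d):
    """Add (a - d) to matrix[row_plus][col] and (d - a) to matrix[row_minus][col].
    Column counts, and hence the sufficient statistic, are unchanged."""
    new = [[dict(entry) for entry in row] for row in matrix]
    for row, sign in ((row_plus, 1), (row_minus, -1)):
        entry = new[row][col]
        entry[a] = entry.get(a, 0) + sign
        entry[d] = entry.get(d, 0) - sign
        new[row][col] = {k: v for k, v in entry.items() if v != 0}
    return new

def row_is_valid(row):
    """Each candidate must appear zero times or once in total within a row."""
    total = {}
    for entry in row:
        for k, v in entry.items():
            total[k] = total.get(k, 0) + v
    return all(v in (0, 1) for v in total.values())

def is_proper(matrix):
    """True if every entry is a single candidate with multiplicity one."""
    return all(len(e) == 1 and list(e.values()) == [1]
               for row in matrix for e in row)

proper = from_strings(["abc", "bcd", "cde", "dea"])
# Row 1 (0-based) gains a - d, row 0 gains d - a, both in column 0.
improper = add_difference(proper, row_plus=1, row_minus=0, col=0, a="a", d="d")
# Rows 0 and 1 form a resolvable pair: undo the operation.
resolved = add_difference(improper, row_plus=0, row_minus=1, col=0, a="a", d="d")
print(improper[0][0], improper[1][0], resolved == proper)
```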

27 / 33

Lemma

We call a pair of votes R of two improper matrices I, I′ a compatible pair if there exists a common resolvable pair R′ of I, I′ such that |R ∪ R′| ≤ 3.

Lemma

All elements of F_t can be generated by the operations on the compatible pairs.

28 / 33

Sketch of the proof

For two proper matrices P, P′ ∈ F_t, we can transform P to P′ by the operations on the compatible pairs.

Decompose the process from P to P′ into segments that consist of transformations from a proper dataset to another proper dataset:

P_1 ←→ I_1 ←→ · · · ←→ I_j ←→ I_{j+1} ←→ · · · ←→ I_m ←→ P_m

Each ←→ denotes an operation among two rows.

29 / 33

Sketch of the proof (cont.)

For each pair of two improper matrices I_j, I_{j+1}, there exist proper matrices P_j, P′_j, P′_{j+1} satisfying

P_j ←→ I_j ←→ I_{j+1} ←→ P′_{j+1}    (1)

P′_j ←→ I_j ←→ P_j    (2)

P_j, P′_j, P′_{j+1} are inserted in order to avoid improper matrices temporarily.

30 / 33

Sketch of the proof (cont.)

The operations in (2) involve three rows in total, since both of the operations P′_j ←→ I_j and I_j ←→ P_j involve a common improper element.

The transformation of (1) is achieved by operations involving three rows, because:

  By the compatibility, the operation in I_j ←→ I_{j+1} involves one of the rows of a resolvable pair.

  By the operations on the resolvable pairs, I_j and I_{j+1} can be transformed to proper matrices P_j, P′_{j+1}.

31 / 33

Sketch of the proof (cont.)

The transformations from P_j to P′_{j+1} and from P′_j to P_j can each be performed by operations on F_t involving at most three rows.

This proves the theorem.

32 / 33

Conclusion

We have proved the conjecture of Diaconis & Eriksson 2006 in a generalized setting:

For n ≥ 3, r ≥ 2, the Markov degree of the (n, r)-Birkhoff model is three.

33 / 33